We follow strict internal guidelines and processes to regulate the quality of our output
and operate to the highest standards.
The customer developed a web portal that has indexed over 1.5 million recipes. The portal acts as a search engine for finding recipes across cookbooks, magazines and blogs. It is also a platform for exchanging recommendations, finding new ideas and following blogs written by cooking experts.
While the goal is to continuously index more recipe books, the manual process used for crawling and indexing was too slow to add data quickly. It also required tedious coordination between crawler staff, indexer staff, editors and portal administrators, because all data had to pass careful quality assurance before being added to the portal.
Our team studied the various book formats, the data formats and the existing manual process for crawling and indexing. We then proposed developing and implementing an automated crawling and indexing solution, which would provide a significant benefit to the business.
#webapplicationdevelopment #crawling #Indexing #Angular #HTML5 #MSSQL
The project can be divided into two parts: the first was system analysis and solution envisioning, and the second was solution development.
A. System Analysis
Ace Publishing spent 4.5 weeks on a detailed analysis of the possible solution, with the following goals in mind:
The analysis phase produced tangible outcomes in terms of the benefits mentioned below. In addition, the screen designs were completed, the workflow was finalized and the technical architecture was created for the crawling and indexing solution:
The following are the statistics from the analysis:
B. Crawling & Indexing Solution Development
Timeline for this Project