Time Sensitive Search Engine
Deliverables: test phase 1, test phase 2, SRS, design phase, and coding (final deliverable)
Project Domain / Category
Information Mining and Retrieval.
Abstract/Introduction
The purpose of a search engine is to extract requested information from the huge body of resources available on the internet. Search engines have become an important day-to-day tool for finding information without knowing exactly where it is stored. Present-day search engines such as MSN, Yahoo and Google return millions of records for a single query, so finding the relevant information is difficult and time consuming for users. These search engines match information based on the keywords in the query. A time sensitive search engine can give priority to the times mentioned in the query. It considers only the times mentioned inside the text (page contents), not the time at which the page was created, updated or published.
Students should be very careful when crawling the web to build indexes and when extracting times from page contents, because times may appear in different formats. All forms of times and time references need to be handled and converted to the ISO standard (ISO 8601) format, such as “hh:mm:ss”. Students need to maintain a list or local database that stores the times and the offsets at which they occur in each document.
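The normalization step described above can be sketched in Java as follows. This is a minimal illustration, not a complete solution: it assumes only two surface forms (12-hour times such as "3:30 pm" or "9 a.m.", and 24-hour times such as "21:05:10"), and the names TimeExtractor, TimeMention and extract are hypothetical. Other time references (words such as "noon", time zones, ranges) would need additional rules.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TimeExtractor {
    /** One time found in a page: the normalized value and its character offset. */
    static final class TimeMention {
        final String isoTime; // ISO 8601 extended form, hh:mm:ss
        final int offset;     // index of the first character of the match

        TimeMention(String isoTime, int offset) {
            this.isoTime = isoTime;
            this.offset = offset;
        }
    }

    // hour, optional :minute, optional :second, optional am/pm marker (assumed forms only)
    private static final Pattern TIME = Pattern.compile(
        "\\b(\\d{1,2})(?::(\\d{2}))?(?::(\\d{2}))?(?:\\s*([ap])\\.?m\\b\\.?)?",
        Pattern.CASE_INSENSITIVE);

    private static final DateTimeFormatter ISO = DateTimeFormatter.ofPattern("HH:mm:ss");

    static List<TimeMention> extract(String text) {
        List<TimeMention> found = new ArrayList<>();
        Matcher m = TIME.matcher(text);
        while (m.find()) {
            boolean hasMinutes = m.group(2) != null;
            boolean hasMarker = m.group(4) != null;
            if (!hasMinutes && !hasMarker) {
                continue; // a bare number is not treated as a time
            }
            int hour = Integer.parseInt(m.group(1));
            int minute = hasMinutes ? Integer.parseInt(m.group(2)) : 0;
            int second = m.group(3) != null ? Integer.parseInt(m.group(3)) : 0;
            if (hasMarker) {
                if (hour < 1 || hour > 12) {
                    continue; // not a valid 12-hour clock value
                }
                boolean pm = m.group(4).equalsIgnoreCase("p");
                hour = (hour % 12) + (pm ? 12 : 0); // 12 am -> 0, 12 pm -> 12
            }
            if (hour > 23 || minute > 59 || second > 59) {
                continue; // out-of-range values are skipped
            }
            String iso = LocalTime.of(hour, minute, second).format(ISO);
            found.add(new TimeMention(iso, m.start()));
        }
        return found;
    }
}
```

Each TimeMention pairs the normalized time with the offset at which it occurs, which is the information the list or local database needs to hold for every document.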
Students are required to select/specify a particular dataset to test and evaluate their project.
Functional Requirements:
This project has the following basic modules:
- Web Crawler:
Web search engines work by storing information about many web pages, which they retrieve from the HTML itself. These pages are retrieved by a web crawler, an automated web browser that follows every link on a site. The contents of each page are then analyzed to determine how the page should be indexed.
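The link-following behaviour described above could look roughly like the sketch below. It is a breadth-first crawl under simplifying assumptions: links are pulled out with a regular expression on double-quoted href attributes, links are treated as absolute URLs, and the page fetcher is passed in as a function so the example needs no network access. The names SimpleCrawler, extractLinks and crawl are hypothetical; a real crawler would also use an HTML parser, resolve relative links and respect robots.txt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleCrawler {
    // Matches href="..." attributes; assumes double-quoted, absolute URLs.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /**
     * Breadth-first crawl starting at seed. fetch returns the HTML of a URL,
     * or null when the page cannot be retrieved. Returns the URLs that were
     * fetched successfully, in visiting order, stopping at maxPages.
     */
    static List<String> crawl(String seed, Function<String, String> fetch, int maxPages) {
        Set<String> seen = new HashSet<>();      // URLs already queued
        Deque<String> frontier = new ArrayDeque<>();
        List<String> fetched = new ArrayList<>();
        seen.add(seed);
        frontier.add(seed);
        while (!frontier.isEmpty() && fetched.size() < maxPages) {
            String url = frontier.poll();
            String html = fetch.apply(url);
            if (html == null) {
                continue; // unreachable page: skip it
            }
            fetched.add(url);
            // The page contents would be handed to the indexer at this point.
            for (String link : extractLinks(html)) {
                if (seen.add(link)) {
                    frontier.add(link); // queue each link only once
                }
            }
        }
        return fetched;
    }
}
```

Passing the fetcher as a parameter keeps the traversal logic separate from HTTP handling, so the crawl order can be checked against a small in-memory set of pages before it is pointed at the chosen dataset.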
- Front end for query processing and results:
The front end presents a search bar to users; the query processor parses the request and executes the search. The front end then displays the results.
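As an illustration of the parsing step, the sketch below splits a query into keyword terms and an optional time constraint. It assumes the time is typed in 24-hour form ("9:05" or "21:05:10") and normalizes it to hh:mm:ss; the names QueryParser, ParsedQuery and parse are hypothetical, and handling of other time formats is left out.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class QueryParser {
    /** Result of parsing: lower-cased keywords plus an optional normalized time. */
    static final class ParsedQuery {
        final List<String> keywords;
        final String isoTime; // hh:mm:ss, or null when the query names no time

        ParsedQuery(List<String> keywords, String isoTime) {
            this.keywords = keywords;
            this.isoTime = isoTime;
        }
    }

    private static final DateTimeFormatter ISO = DateTimeFormatter.ofPattern("HH:mm:ss");

    static ParsedQuery parse(String query) {
        List<String> keywords = new ArrayList<>();
        String isoTime = null;
        for (String token : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty query yields one empty token
            }
            // Only the first valid time token is used as the time constraint.
            if (isoTime == null && token.matches("\\d{1,2}:\\d{2}(:\\d{2})?")) {
                String[] parts = token.split(":");
                int hour = Integer.parseInt(parts[0]);
                int minute = Integer.parseInt(parts[1]);
                int second = parts.length == 3 ? Integer.parseInt(parts[2]) : 0;
                if (hour < 24 && minute < 60 && second < 60) {
                    isoTime = LocalTime.of(hour, minute, second).format(ISO);
                    continue;
                }
            }
            keywords.add(token); // everything else is treated as a keyword
        }
        return new ParsedQuery(keywords, isoTime);
    }
}
```

For example, parsing "Flight arrivals 9:05" under these assumptions gives the keywords "flight" and "arrivals" with the time "09:05:00", which the search step can then use to prioritize pages mentioning that time.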
- Database:
- Maintaining a list or database for storing time-specific information (the times found in each page and the offsets at which they occur).
- Data about web pages are stored in an index database for use in later queries. The purpose of an index is to allow information to be found as quickly as possible.
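One possible shape for these two stores is sketched below as in-memory maps: an inverted index from terms to documents, and a time index from normalized times to the documents and offsets where they occur. The class name TimeAwareIndex and its methods are hypothetical, and the ranking rule (documents that mention the query time come first) is only one simple way to give priority to times. In the actual project the same structure would be held in SQL Server or MySQL tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TimeAwareIndex {
    // term -> documents containing it (sorted for a stable result order)
    private final Map<String, Set<String>> termToDocs = new HashMap<>();
    // normalized time (hh:mm:ss) -> document -> offsets where it occurs
    private final Map<String, Map<String, List<Integer>>> timeToDocOffsets = new HashMap<>();

    void addTerm(String docId, String term) {
        termToDocs.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                  .add(docId);
    }

    void addTime(String docId, String isoTime, int offset) {
        timeToDocOffsets.computeIfAbsent(isoTime, k -> new HashMap<>())
                        .computeIfAbsent(docId, k -> new ArrayList<>())
                        .add(offset);
    }

    /** Documents containing every keyword; those mentioning isoTime are listed first. */
    List<String> search(List<String> keywords, String isoTime) {
        Set<String> matches = null;
        for (String keyword : keywords) {
            Set<String> docs = termToDocs.get(keyword.toLowerCase(Locale.ROOT));
            if (docs == null) {
                return new ArrayList<>(); // a keyword with no documents means no results
            }
            if (matches == null) {
                matches = new TreeSet<>(docs);
            } else {
                matches.retainAll(docs);
            }
        }
        if (matches == null) {
            return new ArrayList<>(); // no keywords given
        }
        Map<String, List<Integer>> timed = new HashMap<>();
        if (isoTime != null && timeToDocOffsets.containsKey(isoTime)) {
            timed = timeToDocOffsets.get(isoTime);
        }
        List<String> ranked = new ArrayList<>();
        for (String doc : matches) {
            if (timed.containsKey(doc)) ranked.add(doc);  // time match: higher priority
        }
        for (String doc : matches) {
            if (!timed.containsKey(doc)) ranked.add(doc); // keyword match only
        }
        return ranked;
    }
}
```

Looking up a term or a time is a single map access, which reflects the purpose of the index: the pages are analyzed once at crawl time so that queries do not have to scan page contents.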
Tools:
The following tools can be used for developing the above project.
- .NET, SQL Server
- Java, SQL Server/MySQL
Supervisor:
Name: Said Nabi
Email ID: said.nabi@vu.edu.pk
Skype ID: saidnabi115