Introduction to Information Retrieval

Information retrieval is the process of finding documents that satisfy an information need from within a large collection of documents.
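To make the definition concrete, here is a minimal sketch of retrieval by exact term matching. It assumes the collection is a small in-memory dictionary and treats the "information need" as a set of query words that must all appear in a document. The function name `search` and the three sample documents are hypothetical, made up for illustration. Real systems use index structures and ranking rather than a linear scan like this.

```python
def search(documents, query):
    """Return the names of documents that contain every query term."""
    terms = query.lower().split()
    matches = []
    for name, text in documents.items():
        # Split each document into a set of lowercase words for lookup.
        words = set(text.lower().split())
        if all(term in words for term in terms):
            matches.append(name)
    return matches


# Hypothetical three-document collection, made up for illustration.
documents = {
    "doc1": "information retrieval finds relevant documents",
    "doc2": "relational databases store structured data",
    "doc3": "web search is information retrieval at large scale",
}

print(search(documents, "information retrieval"))  # ['doc1', 'doc3']
```

Scanning every document for every query is only workable for tiny collections, which is why the scale of the collection matters so much in practice.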

The data in these documents can be unstructured (like plain text files) or structured (like XML files or relational database tables).

In practice, most data that people use has at least some structure. Even ordinary documents are organized into parts such as a heading, body, and footer.

The field of information retrieval also covers filtering a document collection according to a user's requirements, clustering (grouping) documents by content, and automatically classifying new documents.

Information retrieval systems are classified into three groups based on the scale at which they operate:

    •  Web search
      •  Searching over billions of documents stored on millions of computers.
    •  Personal
      •  Search built into an operating system or an email client. Email programs often also classify messages, for example with spam filtering.
    •  Domain-specific
      •  Searching only a specific collection, such as an organization's internal data or a database of articles.