Wikipedia-Search-Engine

A search engine over a Wikipedia dump, with support for field queries.

Requirements

  • Python 2.6 or above
  • Python libraries:
    • Porter Stemmer
    • XML Parser
    • NLTK

The index can be generated using:

  ./index.sh "path_to_wiki_dump"

To run a search:

  python search.py

Sample Query

  • Plain query
  • Field query: "C:Plane B:Bus T:Air"

Term Field Abbreviations: b:Body, t:Title, e:External Link, c:Category
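
As a rough illustration of how a field query such as the one above might be split into per-field terms, here is a minimal sketch. It is not taken from search.py: the `parse_field_query` helper and the `FIELD_NAMES` mapping are hypothetical. It also assumes that field prefixes are case-insensitive, that terms are lowercased, and that tokens without a known prefix are treated as plain terms.

```python
# Hypothetical helper for illustration only; the real search.py may differ.
# Maps the abbreviations listed above to field names (names are assumptions).
FIELD_NAMES = {"b": "body", "t": "title", "e": "external_link", "c": "category"}


def parse_field_query(query):
    """Group query tokens by field; tokens without a known prefix go under 'plain'."""
    parsed = {}
    for token in query.split():
        prefix, sep, rest = token.partition(":")
        if sep and prefix.lower() in FIELD_NAMES:
            # Assumption: prefixes are matched case-insensitively (c: and C: are the same).
            field, term = FIELD_NAMES[prefix.lower()], rest
        else:
            field, term = "plain", token
        if term:
            # Assumption: terms are lowercased before lookup.
            parsed.setdefault(field, []).append(term.lower())
    return parsed


print(parse_field_query("C:Plane B:Bus T:Air"))
```

Each token is handled independently here, so a multi-word field value would need its prefix repeated on every word (for example "t:air t:india"). That is a simplification of this sketch, not a documented rule of the search engine.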

You can download a small dump for a test run from here.