Inverted Index Algorithm

#### Description of the problem
Searching through text documents is a process that requires lots of computational powers. For a given search keyword, the naive searching algorithm would be to pass through the corpus of documents iteratively and for each document, check whether the word exists in the document. This algorithm will be very slow for a large corpus of documents. If the corpus had ​D​ documents each having ​W​ words each of length ​L​ then the complexity of the naive approach would be O(D * W * L)​ (The complexity of naive search in each document is ​O(W*L)​ ). Searching algorithm will be optimized . The straightforward way is to build an index. For an index, All the words that occurred in a document will be stored. An inverted index works in the reverse way. Instead of storing the words that occurred in each document, The ids of the documents in which each word has occurred will be stored. 

#### References/Other comments
https://www.elastic.co/blog/found-indexing-for-beginners-part3/
http://www.dcs.bbk.ac.uk/~dell/teaching/cc/book/ditp/ditp_ch4.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Inverted Index Algorithm #213

Description of the problem

References/Other comments

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Inverted Index Algorithm #213

Description

Description of the problem

References/Other comments

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions