Document Reading

I'm a bot. Documents are not boring for me!

Neural document reading is a task in which a deep learning model finds the answer to a query in a given document (the context).

Components of Document Reader

Elasticsearch as a Document Store
Documents are preprocessed and stored in Elasticsearch indices.
Neural/Statistical Document Ranker
The document ranker retrieves the top n documents out of m (m >> n) that are most likely to contain an answer to the incoming query. Documents are selected using one of two approaches:
- Neural approach: semantic similarity between the query embedding and the precomputed document embeddings
- Statistical approach: word overlap between the question and the document
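The two ranking approaches above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the helper names (`overlap_score`, `cosine_similarity`, `top_n`), the toy documents, and the hand-written 3-dimensional embeddings, which stand in for vectors that a real encoder would produce. It is not the ranker used by this project.

```python
import math
import re


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, document):
    """Statistical score: fraction of distinct query words found in the document."""
    query_words = set(tokenize(query))
    if not query_words:
        return 0.0
    return len(query_words & set(tokenize(document))) / len(query_words)


def cosine_similarity(u, v):
    """Neural score: cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(scores, n):
    """Return the indices of the n highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n]


documents = [
    "Elasticsearch stores preprocessed documents in indices.",
    "The reader predicts the start and end token of the answer.",
    "Bananas are rich in potassium.",
]
query = "Which token does the reader predict?"

# Statistical ranking: score every document by word overlap with the query.
overlap = [overlap_score(query, d) for d in documents]
statistical_top = top_n(overlap, 2)

# Neural ranking: made-up embeddings (hypothetical; an encoder would supply these).
doc_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.1, 0.9]]
query_embedding = [0.2, 0.8, 0.0]
similarity = [cosine_similarity(query_embedding, e) for e in doc_embeddings]
neural_top = top_n(similarity, 2)

print(statistical_top, neural_top)
```

In both cases the ranker only needs one score per document, so the choice between the neural and the statistical approach can be isolated in the scoring function while the top-n selection stays the same.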
Transformer Reader
- Extractive Reader
  The question and the context (the documents shortlisted by the document ranker) are fed as input to a Transformer. The embeddings produced by the Transformer layers are passed through two separate feed-forward networks: one predicts the start token index and the other predicts the end token index. The resulting probability distributions over the tokens in the documents (one for the start, one for the end) are used to extract the answer span.
- Generative Reader
  Generates a novel answer from the documents; the answer is not necessarily a span of the text.
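The span selection step of the extractive reader can be sketched as follows. This is a minimal illustration under stated assumptions: the start and end logits are made-up numbers standing in for the outputs of the two feed-forward networks, and scoring a span by the product of its start and end probabilities with a maximum answer length (`max_answer_len`, a hypothetical parameter) is a common convention rather than something this page specifies.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=10):
    """Return (start, end, score) maximising P(start) * P(end) with start <= end."""
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        # Only consider end positions at or after the start, within the length cap.
        for j in range(i, min(i + max_answer_len, len(end_probs))):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy context tokens and made-up logits (one value per token).
tokens = ["the", "answer", "is", "New", "York", "City", "."]
start_logits = [0.1, 0.2, 0.1, 4.0, 1.0, 0.5, 0.0]
end_logits = [0.0, 0.1, 0.2, 0.5, 1.0, 4.2, 0.3]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Requiring the end index to be at or after the start index rules out invalid spans that could otherwise be chosen if the two distributions were maximised independently.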
