keras-language-modeling

Some code for doing language modeling with Keras, in particular for question-answering tasks. I wrote a very long blog post that explains how a lot of this works, which can be found here.

Stuff that might be of interest

  • attention_lstm.py: Attentional LSTM, based on one of the papers referenced in the blog post, among others. One application used it for image captioning. The layer is initialized with an attention vector, which supplies the attention component for the network.
  • insurance_qa_eval.py: Evaluation framework for the InsuranceQA dataset. To get this working, clone the data repository and set the INSURANCE_QA environment variable to the path of the cloned repository. Changing config adjusts how the model is trained.
  • keras-language-model.py: The LanguageModel class uses the config settings to generate a training model and a testing model. The model is trained by passing a question vector, a ground truth answer vector, and a bad answer vector to fit. After training, predict calculates the similarity between a question and an answer. To get a trainable model, override the build method with whatever language model you want. Examples are provided at the bottom of the file, including the EmbeddingModel, ConvolutionModel, and RecurrentModel.
  • word_embeddings.py: A Word2Vec layer that uses the embeddings generated by Gensim's word2vec model to provide vectors in place of the Keras Embedding layer. This could help improve convergence, since fewer parameters need to be learned. Note that it requires generating a separate file with the word2vec weights, so it doesn't fit in very nicely with the Keras architecture.
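
To make the question / good answer / bad answer training setup more concrete, here is a minimal plain-Python sketch of a margin-based ranking objective of the kind such triples are typically used with. It is an illustration only, not the code in this repository: the function names, the use of cosine similarity, and the `margin=0.2` default are all assumptions.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm, eps)


def margin_ranking_loss(question, good_answer, bad_answer, margin=0.2):
    """Hypothetical hinge-style loss over one (question, good, bad) triple.

    The loss is zero once the good answer is more similar to the question
    than the bad answer by at least `margin` (an assumed value).
    """
    good = cosine_similarity(question, good_answer)
    bad = cosine_similarity(question, bad_answer)
    return max(0.0, margin - good + bad)
```

With an objective like this, the similarity used at training time is the same quantity a predict-style call would return for a question and a candidate answer, so candidates can be ranked by that score.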

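Since the evaluation script depends on the INSURANCE_QA environment variable, a small check like the following can fail early with a readable message when it is unset. This is a sketch under assumptions: the helper name `insurance_qa_path` is hypothetical and is not part of insurance_qa_eval.py.

```python
import os


def insurance_qa_path(env=None):
    """Return the absolute path stored in INSURANCE_QA, or raise if unset.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    path = env.get('INSURANCE_QA')
    if not path:
        raise RuntimeError(
            'Set the INSURANCE_QA environment variable to the path of the '
            'cloned InsuranceQA data repository.'
        )
    # Expand "~" and make the path absolute so later joins are unambiguous.
    return os.path.abspath(os.path.expanduser(path))
```
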
Data