This repository contains the code and information for:
Knowledge-Enhanced Document Embeddings for IR
Requirements:
- ElasticSearch 6.6
- Python 3
- Numpy
- TensorFlow >= 1.13
- Whoosh
- SQLite3
- Cvangysel
- Pytrec_Eval
- Scikit-Learn
- Tqdm
- QuickUMLS
- Elasticsearch
- Elasticsearch_dsl
- UMLS 2018AA
The server.py file in the QuickUMLS folder needs to be replaced with the server.py from this repository, a modified version that is required to run the knowledge-enhanced models.
The folder structure required to run the experiments can be seen in the example folder. Python files need to be placed in the root folder.
The Qrels file needs to be in .txt format.
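As a rough illustration of reading such a file, the sketch below parses qrels lines into a nested dictionary. It assumes the common four-column TREC layout (query id, iteration, document id, relevance), which this repository does not state explicitly; the function name parse_qrels and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse TREC-style qrels lines into {query_id: {doc_id: relevance}}."""
    qrels = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed column order: query_id, iteration (unused), doc_id, relevance
        query_id, _iteration, doc_id, relevance = line.split()
        qrels[query_id][doc_id] = int(relevance)
    return dict(qrels)


# Hypothetical sample content of a qrels .txt file
sample = """\
q1 0 doc_a 1
q1 0 doc_b 0
q2 0 doc_c 2
"""
qrels = parse_qrels(sample.splitlines())
print(qrels)
```

A nested dictionary of this shape (string ids mapped to integer relevance labels) is the form that Pytrec_Eval's RelevanceEvaluator takes as its qrels argument.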
To perform retrofitting, run retrofit_doc_vecs.py, whereas to train the PV-DM and cDoc2Vec models, run gensim_doc2vec.py.
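For readers unfamiliar with retrofitting, the following is a minimal sketch of the general idea, not the code in retrofit_doc_vecs.py. It assumes the standard iterative update of Faruqui et al. (2015), in which each vector is moved towards the average of its neighbours while staying close to its original value. The function name retrofit, the alpha parameter, and the neighbours mapping (for example, documents linked through shared concepts) are illustrative assumptions.

```python
import numpy as np


def retrofit(vectors, neighbours, alpha=1.0, iterations=10):
    """Iteratively pull each vector towards the mean of its graph neighbours."""
    original = {k: np.asarray(v, dtype=float) for k, v in vectors.items()}
    retrofitted = {k: v.copy() for k, v in original.items()}
    for _ in range(iterations):
        for key, linked in neighbours.items():
            # Only keep neighbours that actually have a vector
            linked = [n for n in linked if n in retrofitted]
            if key not in retrofitted or not linked:
                continue
            beta = 1.0 / len(linked)  # equal weight per neighbour, summing to 1
            neighbour_sum = sum(beta * retrofitted[n] for n in linked)
            # Weighted average of the original vector and the neighbour mean
            retrofitted[key] = (alpha * original[key] + neighbour_sum) / (alpha + 1.0)
    return retrofitted


# Hypothetical toy data: doc1 is linked to doc2, doc2 has no links
vectors = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0]}
neighbours = {"doc1": ["doc2"]}
result = retrofit(vectors, neighbours)
print(result["doc1"], result["doc2"])
```

With alpha set to 1, the original vector and the neighbour average contribute equally; larger values keep the retrofitted vector closer to the original embedding.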
To run BM25, use the Jupyter Notebook file elastic_search.ipynb.
To perform query expansion, run qe_combsum.py.
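The script name suggests that rankings are combined with CombSUM. As a hedged illustration of that fusion technique only (the actual behaviour of qe_combsum.py is not described here), the sketch below sums min-max normalized scores per document across several runs. The function names, the choice of min-max normalization, and the sample scores are assumptions.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}  # all scores equal, avoid dividing by zero
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combsum(runs):
    """Fuse several runs by summing each document's normalized scores."""
    fused = defaultdict(float)
    for run in runs:
        if not run:
            continue  # an empty run contributes nothing
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    # Highest fused score first; ties broken by document id
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for the original query and an expanded query
original_query_run = {"d1": 12.0, "d2": 9.0, "d3": 6.0}
expanded_query_run = {"d2": 4.0, "d3": 1.0, "d4": 3.0}
ranking = combsum([original_query_run, expanded_query_run])
print(ranking)
```

Documents retrieved by both runs accumulate score from each, so they tend to rise above documents that only one run retrieved.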