MedCAT can be used to extract information from Electronic Health Records (EHRs) and link it to biomedical ontologies like SNOMED-CT, UMLS, or HPO (and potentially other ontologies). Original paper for v1 on arXiv.
Discussion Forum discourse
As MedCAT v2 is still in Beta, we do not currently have any models publicly available. You can still use models for v1, however (see the README there). If you wish, you can also convert the v1 models into the v2 format (see tutorial (TODO + link)).
MedCAT v2 is currently in Beta, so we are not yet pushing releases to PyPI. For now, the installation command for core MedCAT v2 (and only the core) is:
pip install "git+https://github.com/CogStack/MedCAT2@v0.1.5#egg=medcat2"
Do note that this installs only the core MedCAT v2.
It does not include the dependencies needed for spaCy-based tokenizing, MetaCATs, or DeID.
However, all of those are supported as well.
You can install them as follows:
pip install "git+https://github.com/CogStack/MedCAT2@v0.1.5#egg=medcat2[spacy]" # for spacy-based tokenizer
pip install "git+https://github.com/CogStack/MedCAT2@v0.1.5#egg=medcat2[meta_cat]" # for MetaCAT
pip install "git+https://github.com/CogStack/MedCAT2@v0.1.5#egg=medcat2[deid]" # for DeID models
pip install "git+https://github.com/CogStack/MedCAT2@v0.1.5#egg=medcat2[spacy,meta_cat,deid,dict_ner]" # for all of the above, plus dict_ner
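To see which of these optional pieces are present in an environment, a small check along the following lines may help. This is only a sketch: the import name `medcat2` is an assumption inferred from the `egg=medcat2` fragment, `is_installed` is a hypothetical helper rather than part of MedCAT, and the list of modules to check is illustrative.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


# "medcat2" is an assumed import name (taken from the egg name);
# "spacy" is the import name of the spaCy package.
for name in ("medcat2", "spacy"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If `spacy` is reported as missing, the core-only install was probably used and the `[spacy]` extra still needs to be installed.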
PS:
The commands above install the MedCAT v2 Beta release v0.1.5.
The README is unlikely to be updated after every new release.
If another version is available or required, substitute the version tag as appropriate.
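As an illustration of substituting the version tag, the requirement string can be assembled from a tag and a list of extras. This is a minimal sketch: `medcat2_requirement` is a hypothetical helper, not part of MedCAT, and it assumes the repository path `CogStack/MedCAT2` and the extras names used in the install commands.

```python
def medcat2_requirement(tag: str, extras: tuple[str, ...] = ()) -> str:
    """Build a pip requirement string for a given git tag and optional extras."""
    # Extras are rendered as "[a,b]"; an empty tuple means core only.
    extras_part = f"[{','.join(extras)}]" if extras else ""
    # Repository path is an assumption based on the install commands.
    return f"git+https://github.com/CogStack/MedCAT2@{tag}#egg=medcat2{extras_part}"


# Core only, then core plus two extras, for the tag mentioned in this README.
print(medcat2_requirement("v0.1.5"))
print(medcat2_requirement("v0.1.5", ("spacy", "meta_cat")))
```

The printed string can be passed, in quotes, to `pip install`. Replace `"v0.1.5"` with whichever tag you need.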
Demo for v2 is upcoming
A guide on how to use MedCAT v2 is available at MedCATv2 Tutorials. However, the tutorials are still a work in progress.
Entity extraction was trained on MedMentions. In total it has ~35K entities from UMLS.
The vocabulary was compiled from Wiktionary. In total it has ~800K unique words.
A big thank you goes to spaCy and Hugging Face - who made life a million times easier.