Uses natural language processing techniques, namely word embeddings (word2vec, BERT, BioWordVec, ...) and clustering, to generate labels for documents and group similar documents together.

lo1gr/medical_document_clustering

DocAI - Medical documents tagging

Data collection

Run the following command (fetching all the abstracts from PubMed takes a few minutes):

python repository/abstracts.py
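
The contents of repository/abstracts.py are not shown here, so the following is only a minimal sketch of how abstracts could be fetched from PubMed, assuming the public NCBI E-utilities endpoints (esearch to find article IDs, efetch to download the abstracts). The function names `esearch_url`, `efetch_url`, and `fetch_abstracts` are hypothetical and are not taken from the repository.

```python
import json
import urllib.parse
import urllib.request

# Base URL of the public NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """Build an esearch URL that returns PubMed IDs matching `term` as JSON."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    )
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(pmids):
    """Build an efetch URL that returns plain-text abstracts for the given IDs."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "id": ",".join(pmids), "rettype": "abstract", "retmode": "text"}
    )
    return f"{EUTILS}/efetch.fcgi?{query}"


def fetch_abstracts(term, retmax=20):
    """Hypothetical helper: search PubMed, then download the matching abstracts.

    Requires network access; returns the abstracts as one text blob.
    """
    with urllib.request.urlopen(esearch_url(term, retmax)) as resp:
        pmids = json.load(resp)["esearchresult"]["idlist"]
    if not pmids:
        return ""
    with urllib.request.urlopen(efetch_url(pmids)) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_abstracts("heart failure", retmax=5)` would issue two HTTP requests. For large downloads, NCBI asks clients to limit their request rate, which may be one reason the real script takes a few minutes.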

Modeling and output

Once the abstracts have been fetched, run the main script:

python main.py

This writes a file named abstracts_labelled.csv to the root folder, containing a label for each document.
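
To illustrate the general idea of labelling documents through embeddings and clustering, here is a minimal, self-contained sketch. It is not the code in main.py. It assumes toy two-dimensional word vectors in place of real word2vec, BERT, or BioWordVec embeddings, uses a plain k-means loop, and names each cluster after its most frequent word. All names (`TOY_VECTORS`, `embed`, `kmeans`, `cluster_labels`, `label_documents`) are hypothetical.

```python
import math
from collections import Counter

# Toy 2-D word vectors: illustrative stand-ins for real pretrained embeddings.
TOY_VECTORS = {
    "heart": (1.0, 0.0),
    "cardiac": (0.9, 0.1),
    "artery": (0.8, 0.2),
    "tumor": (0.0, 1.0),
    "cancer": (0.1, 0.9),
    "chemotherapy": (0.2, 0.8),
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words; return None if none are known."""
    found = [vectors[w] for w in text.lower().split() if w in vectors]
    if not found:
        return None
    return tuple(sum(component) / len(found) for component in zip(*found))


def kmeans(points, k, iterations=20):
    """Plain k-means, seeded with the first k points so the result is deterministic."""
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignments


def cluster_labels(texts, assignments):
    """Name each cluster after its most frequent word (a crude heuristic)."""
    counts = {}
    for text, cluster in zip(texts, assignments):
        counts.setdefault(cluster, Counter()).update(text.lower().split())
    return {c: counter.most_common(1)[0][0] for c, counter in counts.items()}


def label_documents(texts, k=2):
    """Embed, cluster, and return one label per document."""
    points = [embed(t) for t in texts]
    if any(p is None for p in points):
        raise ValueError("every document needs at least one known word")
    assignments = kmeans(points, k)
    names = cluster_labels(texts, assignments)
    return [names[a] for a in assignments]
```

A real pipeline would swap `TOY_VECTORS` for pretrained embeddings and would likely use a library clustering implementation with a better initialisation; the first-k seeding here is chosen only to keep the sketch reproducible.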
