Skip to content

R library for Harmony. R package - open source tool using AI for psychology and mental health. Actively recruiting contributors.

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
Notifications You must be signed in to change notification settings

harmonydata/harmony_r

Repository files navigation

Harmony R library: harmonydata

You can also join our Discord server!
If you found Harmony helpful, you can leave us a review!

🧠 What Does Harmony Do?

Psychologists and social scientists often have to match items in different questionnaires, such as:

“I often feel anxious”
“Feeling nervous, anxious or afraid”

This process is called harmonisation.

🔹 The Problem

  • Harmonisation is a time-consuming and subjective process.
  • Researchers have to manually go through long PDFs of questionnaires.
  • Extracting questions and putting them into Excel is tedious.

🔹 The Solution: Harmony

🚀 Harmony uses natural language processing (NLP) and generative AI models to: - Automatically match similar questionnaire items.

  • Help researchers work across multiple languages.
  • Save time and effort in data harmonisation.

📂 Looking for Examples?

Check out our examples repository for hands-on demonstrations. ## 🌍 The Harmony Project

Harmony is an AI-powered tool that helps researchers compare items from questionnaires and identify similar content.

🔹 Try Harmony: Harmony Web App
🔹 Read our blog: Harmony Blog

📞 Who to Contact?

🔹 Harmony Team: harmonydata.ac.uk
🔹 Thomas Wood: fastdatascience.com

Getting started with the Harmony R library

Installing R library

You can install the development version of harmonydata from GitHub with:

#install.packages("devtools") # If you don't have devtools installed already.
library(devtools)
#> Loading required package: usethis
devtools::install_github("harmonydata/harmony_r")
#> Using GitHub PAT from the git credential store.
#> Skipping install of 'harmonydata' from a github remote, the SHA1 (5104e35b) has not changed since last install.
#>   Use `force = TRUE` to force installation

or you can install it via CRAN:

install.packages("harmonydata")

Setting up domain

Before starting, you can set up the remote API endpoint for harmony using this function. By default it uses the remote Harmony API https://api.harmonydata.ac.uk

harmonydata::set_url()

For example, if you want to use Harmony locally, you can run the Harmony API as a Docker container. By default it runs on localhost at port 8000. In this case you can run this command to run it locally:

docker run -p 8000:8000 harmonydata/harmonylocal

Now in R you can set the R library to point to your local Harmony on Docker.

harmonydata::set_url("http://localhost:8000")

Parsing a raw file into an Instrument

If you want to read in a raw (unstructured) PDF or Excel file, you can do this via a POST request to the REST API. This will convert the file into an Instrument object in JSON. It returns the instrument as a list.

library(harmonydata)
instrument = load_instruments_from_file(path = "examples/GAD-7.pdf")
names(instrument[[1]])
#> [1] "file_id"         "instrument_id"   "instrument_name" "file_name"      
#> [5] "language"        "questions"

You can also input a url containing the questionnaire.

instrument_2 = load_instruments_from_file("https://medfam.umontreal.ca/wp-content/uploads/sites/16/GAD-7-fran%C3%A7ais.pdf") 
names(instrument_2[[1]])
#> [1] "file_id"         "instrument_id"   "instrument_name" "file_name"      
#> [5] "language"        "questions"

Matching instruments

You can get a list containing the results of the match. Here we can see a list of similarity score for each question comapred to all the other questions in th other questionaire.

instruments = append(instrument, instrument_2)
match = match_instruments(instruments)
names(match)
#> [1] "instruments"                          
#> [2] "questions"                            
#> [3] "matches"                              
#> [4] "query_similarity"                     
#> [5] "closest_catalogue_instrument_matches" 
#> [6] "instrument_to_instrument_similarities"
#> [7] "clusters"

Here is how the matches look like.

match$matches
#> [[1]]
#> [[1]][[1]]
#> [1] 1
#> 
#> [[1]][[2]]
#> [1] 0.5830621
#> 
#> [[1]][[3]]
#> [1] 0.6179736
#> 
#> [[1]][[4]]
#> [1] 0.4357673
#> 
#> [[1]][[5]]
#> [1] 0.4945895
#> 
#> [[1]][[6]]
#> [1] 0.5529693
#> 
#> [[1]][[7]]
#> [1] 0.7089151
#> 
#> [[1]][[8]]
#> [1] 0.2380928
#> 
#> [[1]][[9]]
#> [1] 0.2814474
#> 
#> [[1]][[10]]
#> [1] 0.894249
#> 
#> [[1]][[11]]
#> [1] 0.6634801
#> 
#> [[1]][[12]]
#> [1] 0.5109949
#> 
#> [[1]][[13]]
#> [1] 0.5931828
#> 
#> [[1]][[14]]
#> [1] 0.4505574
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [1] 0.5830621
#> 
#> [[2]][[2]]
#> [1] 1
#> 
#> [[2]][[3]]
#> [1] 0.7629658
#> 
#> [[2]][[4]]
#> [1] 0.4594004
#> 
#> [[2]][[5]]
#> [1] 0.4558097
#> 
#> [[2]][[6]]
#> [1] -0.4613766
#> 
#> [[2]][[7]]
#> [1] 0.5173815
#> 
#> [[2]][[8]]
#> [1] -0.2566257
#> 
#> [[2]][[9]]
#> [1] -0.2383574
#> 
#> [[2]][[10]]
#> [1] 0.60493
#> 
#> [[2]][[11]]
#> [1] 0.8852125
#> 
#> [[2]][[12]]
#> [1] 0.5615149
#> 
#> [[2]][[13]]
#> [1] -0.4793222
#> 
#> [[2]][[14]]
#> [1] -0.4719152
#> 
#> 
#> [[3]]
#> [[3]][[1]]
#> [1] 0.6179736
#> 
#> [[3]][[2]]
#> [1] 0.7629658
#> 
#> [[3]][[3]]
#> [1] 1
#> 
#> [[3]][[4]]
#> [1] 0.3895614
#> 
#> [[3]][[5]]
#> [1] 0.3963558
#> 
#> [[3]][[6]]
#> [1] 0.4716267
#> 
#> [[3]][[7]]
#> [1] 0.6041647
#> 
#> [[3]][[8]]
#> [1] 0.2892596
#> 
#> [[3]][[9]]
#> [1] 0.2572643
#> 
#> [[3]][[10]]
#> [1] 0.6280157
#> 
#> [[3]][[11]]
#> [1] 0.7662809
#> 
#> [[3]][[12]]
#> [1] 0.5106027
#> 
#> [[3]][[13]]
#> [1] 0.4637058
#> 
#> [[3]][[14]]
#> [1] 0.5593851
#> 
#> 
#> [[4]]
#> [[4]][[1]]
#> [1] 0.4357673
#> 
#> [[4]][[2]]
#> [1] 0.4594004
#> 
#> [[4]][[3]]
#> [1] 0.3895614
#> 
#> [[4]][[4]]
#> [1] 1
#> 
#> [[4]][[5]]
#> [1] 0.6178267
#> 
#> [[4]][[6]]
#> [1] 0.3250091
#> 
#> [[4]][[7]]
#> [1] 0.3117914
#> 
#> [[4]][[8]]
#> [1] 0.1839352
#> 
#> [[4]][[9]]
#> [1] 0.2985738
#> 
#> [[4]][[10]]
#> [1] 0.4527453
#> 
#> [[4]][[11]]
#> [1] 0.4667662
#> 
#> [[4]][[12]]
#> [1] 0.5440194
#> 
#> [[4]][[13]]
#> [1] 0.3540848
#> 
#> [[4]][[14]]
#> [1] 0.2137841
#> 
#> 
#> [[5]]
#> [[5]][[1]]
#> [1] 0.4945895
#> 
#> [[5]][[2]]
#> [1] 0.4558097
#> 
#> [[5]][[3]]
#> [1] 0.3963558
#> 
#> [[5]][[4]]
#> [1] 0.6178267
#> 
#> [[5]][[5]]
#> [1] 1
#> 
#> [[5]][[6]]
#> [1] 0.3895386
#> 
#> [[5]][[7]]
#> [1] 0.4360376
#> 
#> [[5]][[8]]
#> [1] 0.2008711
#> 
#> [[5]][[9]]
#> [1] 0.2626984
#> 
#> [[5]][[10]]
#> [1] 0.4596621
#> 
#> [[5]][[11]]
#> [1] 0.4473513
#> 
#> [[5]][[12]]
#> [1] 0.6250208
#> 
#> [[5]][[13]]
#> [1] 0.4114662
#> 
#> [[5]][[14]]
#> [1] 0.2880645
#> 
#> 
#> [[6]]
#> [[6]][[1]]
#> [1] 0.5529693
#> 
#> [[6]][[2]]
#> [1] -0.4613766
#> 
#> [[6]][[3]]
#> [1] 0.4716267
#> 
#> [[6]][[4]]
#> [1] 0.3250091
#> 
#> [[6]][[5]]
#> [1] 0.3895386
#> 
#> [[6]][[6]]
#> [1] 1
#> 
#> [[6]][[7]]
#> [1] 0.4438164
#> 
#> [[6]][[8]]
#> [1] 0.3468708
#> 
#> [[6]][[9]]
#> [1] 0.3111583
#> 
#> [[6]][[10]]
#> [1] 0.5644366
#> 
#> [[6]][[11]]
#> [1] 0.5049124
#> 
#> [[6]][[12]]
#> [1] 0.5719854
#> 
#> [[6]][[13]]
#> [1] 0.9502258
#> 
#> [[6]][[14]]
#> [1] 0.3653329
#> 
#> 
#> [[7]]
#> [[7]][[1]]
#> [1] 0.7089151
#> 
#> [[7]][[2]]
#> [1] 0.5173815
#> 
#> [[7]][[3]]
#> [1] 0.6041647
#> 
#> [[7]][[4]]
#> [1] 0.3117914
#> 
#> [[7]][[5]]
#> [1] 0.4360376
#> 
#> [[7]][[6]]
#> [1] 0.4438164
#> 
#> [[7]][[7]]
#> [1] 1
#> 
#> [[7]][[8]]
#> [1] -0.1535627
#> 
#> [[7]][[9]]
#> [1] -0.153154
#> 
#> [[7]][[10]]
#> [1] 0.612879
#> 
#> [[7]][[11]]
#> [1] 0.541166
#> 
#> [[7]][[12]]
#> [1] 0.5295712
#> 
#> [[7]][[13]]
#> [1] 0.5013311
#> 
#> [[7]][[14]]
#> [1] 0.8445888
#> 
#> 
#> [[8]]
#> [[8]][[1]]
#> [1] 0.2380928
#> 
#> [[8]][[2]]
#> [1] -0.2566257
#> 
#> [[8]][[3]]
#> [1] 0.2892596
#> 
#> [[8]][[4]]
#> [1] 0.1839352
#> 
#> [[8]][[5]]
#> [1] 0.2008711
#> 
#> [[8]][[6]]
#> [1] 0.3468708
#> 
#> [[8]][[7]]
#> [1] -0.1535627
#> 
#> [[8]][[8]]
#> [1] 1
#> 
#> [[8]][[9]]
#> [1] 0.5548581
#> 
#> [[8]][[10]]
#> [1] 0.2341754
#> 
#> [[8]][[11]]
#> [1] 0.3289153
#> 
#> [[8]][[12]]
#> [1] 0.3237803
#> 
#> [[8]][[13]]
#> [1] 0.3217046
#> 
#> [[8]][[14]]
#> [1] 0.1625244
#> 
#> 
#> [[9]]
#> [[9]][[1]]
#> [1] 0.2814474
#> 
#> [[9]][[2]]
#> [1] -0.2383574
#> 
#> [[9]][[3]]
#> [1] 0.2572643
#> 
#> [[9]][[4]]
#> [1] 0.2985738
#> 
#> [[9]][[5]]
#> [1] 0.2626984
#> 
#> [[9]][[6]]
#> [1] 0.3111583
#> 
#> [[9]][[7]]
#> [1] -0.153154
#> 
#> [[9]][[8]]
#> [1] 0.5548581
#> 
#> [[9]][[9]]
#> [1] 1
#> 
#> [[9]][[10]]
#> [1] 0.3128226
#> 
#> [[9]][[11]]
#> [1] -0.3486197
#> 
#> [[9]][[12]]
#> [1] 0.2828471
#> 
#> [[9]][[13]]
#> [1] 0.3370971
#> 
#> [[9]][[14]]
#> [1] -0.217787
#> 
#> 
#> [[10]]
#> [[10]][[1]]
#> [1] 0.894249
#> 
#> [[10]][[2]]
#> [1] 0.60493
#> 
#> [[10]][[3]]
#> [1] 0.6280157
#> 
#> [[10]][[4]]
#> [1] 0.4527453
#> 
#> [[10]][[5]]
#> [1] 0.4596621
#> 
#> [[10]][[6]]
#> [1] 0.5644366
#> 
#> [[10]][[7]]
#> [1] 0.612879
#> 
#> [[10]][[8]]
#> [1] 0.2341754
#> 
#> [[10]][[9]]
#> [1] 0.3128226
#> 
#> [[10]][[10]]
#> [1] 1
#> 
#> [[10]][[11]]
#> [1] 0.712629
#> 
#> [[10]][[12]]
#> [1] 0.5177428
#> 
#> [[10]][[13]]
#> [1] 0.6094118
#> 
#> [[10]][[14]]
#> [1] 0.4456488
#> 
#> 
#> [[11]]
#> [[11]][[1]]
#> [1] 0.6634801
#> 
#> [[11]][[2]]
#> [1] 0.8852125
#> 
#> [[11]][[3]]
#> [1] 0.7662809
#> 
#> [[11]][[4]]
#> [1] 0.4667662
#> 
#> [[11]][[5]]
#> [1] 0.4473513
#> 
#> [[11]][[6]]
#> [1] 0.5049124
#> 
#> [[11]][[7]]
#> [1] 0.541166
#> 
#> [[11]][[8]]
#> [1] 0.3289153
#> 
#> [[11]][[9]]
#> [1] -0.3486197
#> 
#> [[11]][[10]]
#> [1] 0.712629
#> 
#> [[11]][[11]]
#> [1] 1
#> 
#> [[11]][[12]]
#> [1] 0.6538957
#> 
#> [[11]][[13]]
#> [1] 0.5488661
#> 
#> [[11]][[14]]
#> [1] 0.539001
#> 
#> 
#> [[12]]
#> [[12]][[1]]
#> [1] 0.5109949
#> 
#> [[12]][[2]]
#> [1] 0.5615149
#> 
#> [[12]][[3]]
#> [1] 0.5106027
#> 
#> [[12]][[4]]
#> [1] 0.5440194
#> 
#> [[12]][[5]]
#> [1] 0.6250208
#> 
#> [[12]][[6]]
#> [1] 0.5719854
#> 
#> [[12]][[7]]
#> [1] 0.5295712
#> 
#> [[12]][[8]]
#> [1] 0.3237803
#> 
#> [[12]][[9]]
#> [1] 0.2828471
#> 
#> [[12]][[10]]
#> [1] 0.5177428
#> 
#> [[12]][[11]]
#> [1] 0.6538957
#> 
#> [[12]][[12]]
#> [1] 1
#> 
#> [[12]][[13]]
#> [1] 0.6412413
#> 
#> [[12]][[14]]
#> [1] 0.4908774
#> 
#> 
#> [[13]]
#> [[13]][[1]]
#> [1] 0.5931828
#> 
#> [[13]][[2]]
#> [1] -0.4793222
#> 
#> [[13]][[3]]
#> [1] 0.4637058
#> 
#> [[13]][[4]]
#> [1] 0.3540848
#> 
#> [[13]][[5]]
#> [1] 0.4114662
#> 
#> [[13]][[6]]
#> [1] 0.9502258
#> 
#> [[13]][[7]]
#> [1] 0.5013311
#> 
#> [[13]][[8]]
#> [1] 0.3217046
#> 
#> [[13]][[9]]
#> [1] 0.3370971
#> 
#> [[13]][[10]]
#> [1] 0.6094118
#> 
#> [[13]][[11]]
#> [1] 0.5488661
#> 
#> [[13]][[12]]
#> [1] 0.6412413
#> 
#> [[13]][[13]]
#> [1] 1
#> 
#> [[13]][[14]]
#> [1] 0.4567534
#> 
#> 
#> [[14]]
#> [[14]][[1]]
#> [1] 0.4505574
#> 
#> [[14]][[2]]
#> [1] -0.4719152
#> 
#> [[14]][[3]]
#> [1] 0.5593851
#> 
#> [[14]][[4]]
#> [1] 0.2137841
#> 
#> [[14]][[5]]
#> [1] 0.2880645
#> 
#> [[14]][[6]]
#> [1] 0.3653329
#> 
#> [[14]][[7]]
#> [1] 0.8445888
#> 
#> [[14]][[8]]
#> [1] 0.1625244
#> 
#> [[14]][[9]]
#> [1] -0.217787
#> 
#> [[14]][[10]]
#> [1] 0.4456488
#> 
#> [[14]][[11]]
#> [1] 0.539001
#> 
#> [[14]][[12]]
#> [1] 0.4908774
#> 
#> [[14]][[13]]
#> [1] 0.4567534
#> 
#> [[14]][[14]]
#> [1] 1

Running harmonydata locally from a docker image

To run harmonydata locally, first you need to pull the docker image using the terminal.

1. Pull docker image

docker pull harmonydata/harmonyapi

2. Run docker image

docker run -p 8000:80 harmonyapi

3. Configure harmonydata to run locally

Set url to use localhost. Don’t forget to expose port 8000:

set_url(harmony_url = "http://localhost:8000")

📜 How do I cite Harmony?

You can cite our validation paper:

McElroy, Wood, Bond, Mulvenna, Shevlin, Ploubidis, Scopel Hoffmann, Moltrecht, Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data. BMC Psychiatry 24, 530 (2024), https://doi.org/10.1186/s12888-024-05954-2

A BibTeX entry for LaTeX users is

@article{mcelroy2024using,
  title={Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data},
  author={McElroy, Eoin and Wood, Thomas and Bond, Raymond and Mulvenna, Maurice and Shevlin, Mark and Ploubidis, George B and Hoffmann, Mauricio Scopel and Moltrecht, Bettina},
  journal={BMC Psychiatry},
  volume={24},
  number={1},
  pages={530},
  year={2024},
  publisher={Springer}
}

About

R library for Harmony. R package - open source tool using AI for psychology and mental health. Actively recruiting contributors.

Topics

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •