Fhir Resources and Data generator #6

Merged
merged 19 commits into from
May 16, 2024

@adamkells adamkells commented May 11, 2024

What is the purpose of this PR?

This PR introduces the concepts of Data Resources and Data Generators.

It adds resources, generators and tests for the Patient, Practitioner and Encounter resources.

What are data resources?

Data resources are pydantic models which encode some relevant data structure (in this case, FHIR V5 resources).

What are data generators?

Data generators are objects which are logged to a registry and can produce data which will pass validation for the corresponding data resource.

These generators will range from producing data that meets the bare minimum requirements (e.g. returning any string for a text field) to more appropriate/realistic data (e.g. returning an actual FHIR code for a particular codeable concept).

These generators are logged to a global generator registry by use of a register_generator decorator.
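As a rough illustration of the registry idea described above, here is a minimal sketch. It is not the code from this PR: the decorator body, the `GENERATOR_REGISTRY` dict, the generator class names and the `PatientStub` dataclass (standing in for a pydantic model so the snippet has no dependencies) are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Dict

# Hypothetical global registry: generator class name -> generator class.
GENERATOR_REGISTRY: Dict[str, type] = {}


def register_generator(cls: type) -> type:
    """Class decorator that records the generator under its class name."""
    GENERATOR_REGISTRY[cls.__name__] = cls
    return cls


@dataclass
class PatientStub:
    """Stand-in for a pydantic Patient resource (assumption, heavily simplified)."""
    id: str
    gender: str


@register_generator
class IdGenerator:
    """Bare-minimum generator: any string satisfies the field."""

    @staticmethod
    def generate() -> str:
        return "".join(random.choices("0123456789abcdef", k=8))


@register_generator
class GenderGenerator:
    """More realistic generator: draws from the FHIR administrative-gender codes."""

    CODES = ("male", "female", "other", "unknown")

    @staticmethod
    def generate() -> str:
        return random.choice(GenderGenerator.CODES)


# Look generators up by name and use them to populate a resource.
patient = PatientStub(
    id=GENERATOR_REGISTRY["IdGenerator"].generate(),
    gender=GENERATOR_REGISTRY["GenderGenerator"].generate(),
)
```

Keying the registry by class name keeps lookup simple; the real registry may well key on something else, such as the target field type.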

Next steps not covered in PR

  • Extension to further commonly used FHIR resources.
  • Opinionated use case of generators.
  • Addition of LLM data generator.

@adamkells adamkells self-assigned this May 11, 2024
@adamkells adamkells linked an issue May 11, 2024 that may be closed by this pull request
Member

@jenniferjiangkells jenniferjiangkells May 14, 2024

To save time we can maybe skip writing tests for all the pydantic models as correct validation behaviour should be under pydantic's concern - I would only write tests if there is additional custom validators we implement in our code.

Contributor Author

truedat-bojack

Member

@jenniferjiangkells jenniferjiangkells left a comment

LGTM. As a next step we should prioritise resources that are relevant to encounter-discharge; we may also need an additional wrapper for specific use cases (see base models for CDSRequest). Also, how flexible is it if users want to fix specific fields? E.g. Encounter type is inpatient, but the rest of it is random. Would we just modify the field post-generation?

For medical concept codes, i.e. specific conditions users want, ask the user to input them themselves (LLMs will make shit up). We will not generate things here on the fly, but we could have some pre-made templates for patients of different levels of complexity (healthy, simple condition, chronic conditions, etc.), picking from a list of common conditions.
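One possible shape for the pre-made templates idea, as a sketch only: the level names follow the comment above, but `COMMON_CONDITIONS`, `pick_conditions` and the condition entries are hypothetical, and the entries are plain-text placeholders rather than real terminology codes (users would supply those themselves).

```python
import random
from typing import Dict, List, Optional

# Hypothetical curated list per complexity level. Entries are plain-text
# placeholders, NOT real terminology codes.
COMMON_CONDITIONS: Dict[str, List[str]] = {
    "healthy": [],
    "simple": ["seasonal allergy", "ankle sprain"],
    "chronic": ["hypertension", "type 2 diabetes", "asthma"],
}


def pick_conditions(level: str, n: int = 1, seed: Optional[int] = None) -> List[str]:
    """Pick up to n distinct conditions from the pre-made list for a level."""
    if level not in COMMON_CONDITIONS:
        raise ValueError(f"unknown complexity level: {level!r}")
    pool = COMMON_CONDITIONS[level]
    rng = random.Random(seed)  # seeded for reproducible test data
    return rng.sample(pool, k=min(n, len(pool)))
```

Nothing is invented on the fly here: every value comes from the fixed list, and a seed makes the selection reproducible.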

Some relevant resources and fields:

Must Haves

Encounter

  • admission and discharge dates
  • the type of encounter (inpatient)
  • the location within the hospital (Optional)
  • providers involved. (Optional)

Condition

  • medical conditions diagnosed during the hospital stay

Procedure

  • procedures performed during the inpatient stay, including surgeries and other significant interventions.

MedicationRequest

  • medications prescribed at discharge

MedicationAdministration

  • medications that were administered during the stay.

Nice to Haves (only look into if low-hanging fruit)

Observation

  • clinical findings and measurements, such as vital signs or laboratory test results

DiagnosticReport

  • summary of diagnostic tests, including imaging and pathology reports

ClinicalImpression

  • summary of the clinical impressions made by the healthcare providers during the encounter, which can include diagnosis, prognosis, and proposed treatment plans.

CarePlan

  • outlines the care and treatment recommendations for after the patient's discharge, including follow-up appointments, ongoing treatment plans, and patient education.

CareTeam

  • identifies all the healthcare providers who participated in the care of the patient during the hospital stay, including their roles and responsibilities.

DischargeSummary (free text)

  • A document resource that provides a comprehensive summary of the patient's hospital stay, including the reason for admission, the treatments provided, the course of care, and detailed discharge instructions.

@adamkells
Contributor Author

Modifying data post-generation is easy. The best way to do this is to define a number of user templates and overlay these on the random data post hoc.
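A minimal sketch of what such an overlay could look like, assuming the generated data and the template are both plain dicts. The `overlay` helper and the example values are hypothetical, and the Encounter dict is deliberately simplified rather than a conformant FHIR structure.

```python
import copy
from typing import Any, Dict


def overlay(generated: Dict[str, Any], template: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `generated` in which `template` values take precedence.

    Nested dicts are merged recursively; any other value is replaced outright.
    """
    result = copy.deepcopy(generated)
    for key, value in template.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = overlay(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Randomly generated data (values invented for illustration, simplified shape).
random_encounter = {
    "resourceType": "Encounter",
    "status": "planned",
    "class": {"code": "AMB", "display": "ambulatory"},
}
# User template pinning the encounter class to inpatient.
inpatient_template = {"class": {"code": "IMP", "display": "inpatient encounter"}}

fixed = overlay(random_encounter, inpatient_template)
```

The overlaid dict would presumably be passed back through the corresponding pydantic model afterwards, so that a template cannot silently produce an invalid resource.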

To avoid further PR bloat, I will branch off and add any additional resource templates in a new PR.

@adamkells adamkells merged commit a2be2d5 into main May 16, 2024
2 checks passed
@adamkells adamkells deleted the feature/data-generator branch May 16, 2024 10:09
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Write data generator v1
2 participants