Scale up step 2 (#142)
* Updated config

* Update TeleQ and queries

* Retrained model and updated requirements

* Removed setup.cfg from manifest

* Fix linting

* Updated workflows

* typo

* forgot keyword

* Add removed subsection to changelog
rubenpeters91 authored Nov 13, 2024
1 parent d0911d8 commit e65cc54
Showing 21 changed files with 186 additions and 78 deletions.
12 changes: 8 additions & 4 deletions .github/workflows/linting.yml
@@ -12,14 +12,18 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

+ - name: Install uv
+ uses: astral-sh/setup-uv@v3

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'

- name: Install dependencies
- run: |
- python -m pip install --upgrade pip
- pip install ruff
+ run: uv sync --only-dev

- name: Analysing the code with ruff
run: |
- ruff check .
+ uv run ruff check .
17 changes: 9 additions & 8 deletions .github/workflows/unit_test.yml
@@ -12,20 +12,21 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

+ - name: Install uv
+ uses: astral-sh/setup-uv@v3

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'

- name: Install dependencies
- run: |
- python -m pip install uv
- uv venv .venv
- if [ -f requirements.txt ]; then uv pip install -r requirements.txt; fi
- uv pip install -e ".[test]"
+ run: uv sync --all-extras --dev

- name: Running Unit tests
- run: |
- source .venv/bin/activate
- pytest tests/ --doctest-modules --cov=src --cov-report=xml
+ run: uv run pytest tests/ --doctest-modules --cov=src --cov-report=xml

- name: Code Coverage Summary Report
uses: irongut/[email protected]
with:
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

+ ## [1.4.8] - 2024-11-13

+ ### Added
+ - New clinics added to the queries and config

+ ### Changed
+ - Updated dependencies

+ ### Removed
+ - Removed deprecated setup.cfg file

## [1.4.7] - 2024-11-04

### Changed
4 changes: 2 additions & 2 deletions data/raw/poliafspraken_no_show.csv.dvc
@@ -1,5 +1,5 @@
outs:
- - md5: 148c83475d3eeac86a809cbc569eba0e
- size: 411770150
+ - md5: 530380d7750f4c1e819d94bee3c2c52a
+ size: 499291413
path: poliafspraken_no_show.csv
hash: md5
21 changes: 20 additions & 1 deletion data/sql/data_export.sql
@@ -95,7 +95,26 @@ WHERE 1=1
'ZH0122', -- Infectieziekten
'ZH0088', -- Diabetologie
'ZH0096', -- Endocrinologie
- 'ZH0413' -- Vasculaire geneeskunde
+ 'ZH0413', -- Vasculaire geneeskunde
+ -- Longfunctie
+ 'ZH0182',
+ -- Neurologie & neurochirurgie
+ 'ZH0005', -- Algemene neurochirurgie
+ 'ZH0006', -- Algemene neurologie
+ 'ZH0035', -- Cerebrovasculaire ziekten
+ 'ZH0194', -- Neuro oncologie
+ -- Neuromusculaire ziekten
+ 'ZH0197', -- Neuromusculaire ziekten
+ 'ZH0105', -- Functionele neurochirurgie
+ -- Zorglijn affectieve en psychotische stoornissen
+ 'ZH0298', -- PSY stemming en psychose
+ -- Zorglijn Diagnostiek en vroege psychose
+ 'ZH0299', -- PSY Diagnostiek en vroege psychose
+ -- Zorglijn Ontwikkeling in perspectief
+ 'ZH0300', -- PSY Ontwikkeling in perspectief
+ -- Zorglijn Acute en intensieve zorg
+ 'ZH0297' -- PSY Acute en intensieve zorg

)
AND APP.identifier_system = 'https://metadata.umcutrecht.nl/ids/HixAgendaAfspraak'
AND APP.created >= '2015-01-01'
40 changes: 38 additions & 2 deletions data/sql/data_prediction.sql
@@ -117,7 +117,25 @@ WHERE 1=1
'ZH0122', -- Infectieziekten
'ZH0088', -- Diabetologie
'ZH0096', -- Endocrinologie
- 'ZH0413' -- Vasculaire geneeskunde
+ 'ZH0413', -- Vasculaire geneeskunde
+ -- Longfunctie
+ 'ZH0182',
+ -- Neurologie & neurochirurgie
+ 'ZH0005', -- Algemene neurochirurgie
+ 'ZH0006', -- Algemene neurologie
+ 'ZH0035', -- Cerebrovasculaire ziekten
+ 'ZH0194', -- Neuro oncologie
+ -- Neuromusculaire ziekten
+ 'ZH0197', -- Neuromusculaire ziekten
+ 'ZH0105', -- Functionele neurochirurgie
+ -- Zorglijn affectieve en psychotische stoornissen
+ 'ZH0298', -- PSY stemming en psychose
+ -- Zorglijn Diagnostiek en vroege psychose
+ 'ZH0299', -- PSY Diagnostiek en vroege psychose
+ -- Zorglijn Ontwikkeling in perspectief
+ 'ZH0300', -- PSY Ontwikkeling in perspectief
+ -- Zorglijn Acute en intensieve zorg
+ 'ZH0297' -- PSY Acute en intensieve zorg
)
AND APP.identifier_system = 'https://metadata.umcutrecht.nl/ids/HixAgendaAfspraak'
AND APP.[created] >= '2015-01-01'
@@ -200,7 +218,25 @@ WHERE 1=1
'ZH0122', -- Infectieziekten
'ZH0088', -- Diabetologie
'ZH0096', -- Endocrinologie
- 'ZH0413' -- Vasculaire geneeskunde
+ 'ZH0413', -- Vasculaire geneeskunde
+ -- Longfunctie
+ 'ZH0182',
+ -- Neurologie & neurochirurgie
+ 'ZH0005', -- Algemene neurochirurgie
+ 'ZH0006', -- Algemene neurologie
+ 'ZH0035', -- Cerebrovasculaire ziekten
+ 'ZH0194', -- Neuro oncologie
+ -- Neuromusculaire ziekten
+ 'ZH0197', -- Neuromusculaire ziekten
+ 'ZH0105', -- Functionele neurochirurgie
+ -- Zorglijn affectieve en psychotische stoornissen
+ 'ZH0298', -- PSY stemming en psychose
+ -- Zorglijn Diagnostiek en vroege psychose
+ 'ZH0299', -- PSY Diagnostiek en vroege psychose
+ -- Zorglijn Ontwikkeling in perspectief
+ 'ZH0300', -- PSY Ontwikkeling in perspectief
+ -- Zorglijn Acute en intensieve zorg
+ 'ZH0297' -- PSY Acute en intensieve zorg
)
AND APP2.identifier_system = 'https://metadata.umcutrecht.nl/ids/HixAgendaAfspraak'
AND CONVERT(DATE, APP2.[start]) = @start_date
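The same clinic list now appears once in data_export.sql and twice in data_prediction.sql, so the three copies have to be kept in sync by hand. A minimal sketch of one way to check that, assuming the queries are available as plain text and that clinic codes always look like `'ZH'` followed by four digits, as they do in this diff. The helpers `extract_clinic_codes` and `missing_codes` are hypothetical and not part of the repository.

```python
import re

# Assumption: clinic codes are quoted literals of the form 'ZH' + four digits.
CODE_PATTERN = re.compile(r"'(ZH\d{4})'")

# Codes added by this commit, copied from the diff.
ADDED_CODES = [
    "ZH0182", "ZH0005", "ZH0006", "ZH0035", "ZH0194", "ZH0197",
    "ZH0105", "ZH0298", "ZH0299", "ZH0300", "ZH0297",
]


def extract_clinic_codes(sql_text: str) -> list[str]:
    """Return every quoted clinic code in the SQL text, in order of appearance."""
    return CODE_PATTERN.findall(sql_text)


def missing_codes(sql_text: str, expected: list[str]) -> list[str]:
    """Return the expected codes that do not occur anywhere in the SQL text."""
    present = set(extract_clinic_codes(sql_text))
    return [code for code in expected if code not in present]


# Small illustrative fragment, not the full query.
example_sql = """
    'ZH0413', -- Vasculaire geneeskunde
    -- Longfunctie
    'ZH0182',
    'ZH0005', -- Algemene neurochirurgie
"""
found = extract_clinic_codes(example_sql)
still_missing = missing_codes(example_sql, ["ZH0182", "ZH0297"])
```

The check is purely textual, so a code that only appears inside a SQL comment would also count as present.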
16 changes: 8 additions & 8 deletions dvc.lock
@@ -5,17 +5,17 @@ stages:
deps:
- path: data/processed/featuretable.parquet
hash: md5
- md5: 1b447d4fc41fb782d93804cde240e002
- size: 76540090
+ md5: 98d6d67a29704828698fcd1fbd039e83
+ size: 93522680
- path: src/noshow/model/train_model.py
hash: md5
md5: 00964a947199825f721ebbbe0bb23da6
size: 3708
outs:
- path: output/models/no_show_model_cv.pickle
hash: md5
- md5: 3d81bdcc860f51a1843593988ac2c1f6
- size: 1591119
+ md5: f30c648b054c4ee8f336638f1982f381
+ size: 1860467
feature_building:
cmd: python src/noshow/features/feature_pipeline.py
deps:
@@ -24,14 +24,14 @@ stages:
size: 279455
- path: data/raw/poliafspraken_no_show.csv
hash: md5
- md5: 148c83475d3eeac86a809cbc569eba0e
- size: 411770150
+ md5: 530380d7750f4c1e819d94bee3c2c52a
+ size: 499291413
- path: src/noshow/features/feature_pipeline.py
hash: md5
md5: 71ffb7a162976bde11e0aed72ea19f98
size: 2889
outs:
- path: data/processed/featuretable.parquet
hash: md5
- md5: 1b447d4fc41fb782d93804cde240e002
- size: 76540090
+ md5: 98d6d67a29704828698fcd1fbd039e83
+ size: 93522680
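Each dvc.lock entry above pairs an `md5` with a `size` in bytes. A minimal sketch of how a local file could be compared against such an entry, assuming the hash is a plain MD5 over the raw file bytes (my reading of `hash: md5`, not something this diff confirms). The helpers `md5_and_size` and `matches_lock_entry` are hypothetical names.

```python
import hashlib
from pathlib import Path


def md5_and_size(path: str, chunk_size: int = 1024 * 1024) -> tuple[str, int]:
    """Return (hex MD5 digest, size in bytes) of a file, read in chunks."""
    digest = hashlib.md5()
    size = 0
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large files such as the raw CSV
        # never have to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def matches_lock_entry(path: str, md5: str, size: int) -> bool:
    """Check a local file against the md5 and size recorded in a lock entry."""
    return md5_and_size(path) == (md5, size)
```

For example, `matches_lock_entry("data/raw/poliafspraken_no_show.csv", "530380d7750f4c1e819d94bee3c2c52a", 499291413)` would report whether a local copy corresponds to the version recorded by this commit, under the assumption stated above.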
3 changes: 0 additions & 3 deletions manifest_admin_dash.json
@@ -86,9 +86,6 @@
"run/config/config.toml.dvc": {
"checksum": "9859113746e147a7efab10481e7a1769"
},
- "setup.cfg": {
- "checksum": "bbbe201f135ed769da1877db5cc56fb6"
- },
"src/noshow/__init__.py": {
"checksum": "d41d8cd98f00b204e9800998ecf8427e"
},
3 changes: 0 additions & 3 deletions manifest_api.json
@@ -89,9 +89,6 @@
"run/config/config.toml.dvc": {
"checksum": "9859113746e147a7efab10481e7a1769"
},
- "setup.cfg": {
- "checksum": "bbbe201f135ed769da1877db5cc56fb6"
- },
"src/noshow/__init__.py": {
"checksum": "d41d8cd98f00b204e9800998ecf8427e"
},
3 changes: 0 additions & 3 deletions manifest_dash.json
@@ -80,9 +80,6 @@
"run/config/config.toml": {
"checksum": "fc166438569074175c386c4a31d0184c"
},
- "setup.cfg": {
- "checksum": "bbbe201f135ed769da1877db5cc56fb6"
- },
"src/noshow/__init__.py": {
"checksum": "d41d8cd98f00b204e9800998ecf8427e"
},
60 changes: 55 additions & 5 deletions notebooks/analyze_rct.ipynb
@@ -110,9 +110,9 @@
"metadata": {},
"outputs": [],
"source": [
- "data_export = pd.read_csv(\"../data/raw/poliafspraken_rct.csv\").drop(\n",
- " columns=[\"specialty_code\", \"name\", \"soort_consult\", \"afspraak_code\", \"start\", \"end\"]\n",
- ")\n",
+ "data_export = pd.read_csv(\n",
+ " \"../data/raw/poliafspraken_rct.csv\", parse_dates=[\"start\", \"end\"]\n",
+ ").drop(columns=[\"specialty_code\", \"name\", \"soort_consult\", \"afspraak_code\"])\n",
"data_export.loc[data_export[\"cancelationReason_code\"] == \"N\", \"outcome\"] = \"No-Show\"\n",
"data_export.loc[data_export[\"status_code_original\"] == \"J\", \"outcome\"] = \"Show\"\n",
"data_export = data_export.drop(\n",
@@ -150,7 +150,7 @@
"\n",
"live_data_rct[\"prediction_id\"] = live_data_rct[\"prediction_id\"].astype(\"int64\")\n",
"live_data_rct = live_data_rct[live_data_rct[\"treatment_group\"] != 2]\n",
- "live_data_rct = live_data_rct[live_data_rct[\"active\"] == 1]\n",
+ "# live_data_rct = live_data_rct[live_data_rct[\"active\"] == 1]\n",
"live_data_rct = live_data_rct.drop(\n",
" columns=[\"active\", \"clinic_reception\", \"request_id\", \"remarks\", \"last_call_date\"]\n",
")"
@@ -184,7 +184,6 @@
" right_on=\"APP_ID\",\n",
" how=\"left\",\n",
")\n",
- "combined_data = combined_data.drop(columns=[\"clinic_name\", \"APP_ID\"])\n",
"combined_data"
]
},
Expand All @@ -204,6 +203,57 @@
"## Analyze No-Show"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Analyse how many appointments with status \"herinnerd\" are actually completed\n",
"combined_data.loc[combined_data[\"call_outcome\"] == \"Herinnerd\"].value_counts(\n",
" \"outcome\", normalize=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"combined_data.loc[combined_data[\"call_outcome\"] == \"Geen\"].value_counts(\n",
" \"outcome\", normalize=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"combined_data[\"call_outcome\"].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# How many appointments with call outcome verzet/geannulleerd are actually changed and\n",
"# subsequently completed\n",
"combined_data.loc[\n",
" combined_data[\"start_time\"] != combined_data[\"start\"], \"app_moved\"\n",
"] = True\n",
"combined_data.loc[\n",
" combined_data[\"start_time\"] == combined_data[\"start\"], \"app_moved\"\n",
"] = False\n",
"combined_data.loc[combined_data[\"call_outcome\"] == \"Verzet/Geannuleerd\"].value_counts(\n",
" [\"app_moved\", \"outcome\"], dropna=False\n",
").unstack()"
]
},
{
"cell_type": "code",
"execution_count": null,
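The last notebook cell added above flags an appointment as moved when the predicted `start_time` differs from the exported `start`, then cross-tabulates that flag against the outcome. A minimal sketch of the same idea on made-up rows, assuming both columns are parsed as datetimes. The helper `flag_moved` and the example data are hypothetical, and the two `.loc` assignments are collapsed into a single comparison.

```python
import pandas as pd


def flag_moved(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a boolean app_moved column (start_time != start)."""
    out = df.copy()
    # A missing `start` (NaT) compares unequal to everything, so an
    # appointment without an exported start is also flagged as moved.
    out["app_moved"] = out["start_time"] != out["start"]
    return out


# Made-up example rows, not data from the trial.
example = pd.DataFrame(
    {
        "call_outcome": ["Verzet/Geannuleerd"] * 3 + ["Herinnerd"],
        "start_time": pd.to_datetime(
            ["2024-11-01 09:00", "2024-11-01 10:00", "2024-11-02 09:00", "2024-11-03 09:00"]
        ),
        "start": pd.to_datetime(
            ["2024-11-01 09:00", "2024-11-04 10:00", None, "2024-11-03 09:00"]
        ),
        "outcome": ["Show", "Show", "No-Show", "No-Show"],
    }
)

flagged = flag_moved(example)

# Rows: app_moved (False/True); columns: outcome; cells: appointment counts.
moved_table = (
    flagged.loc[flagged["call_outcome"] == "Verzet/Geannuleerd"]
    .value_counts(["app_moved", "outcome"], dropna=False)
    .unstack()
)
```

Because of the NaT behaviour noted in the comment, rows without a matching export record end up in the "moved" group. It may be worth separating those out before interpreting the table.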
10 changes: 5 additions & 5 deletions output/dvclive/metrics.json
@@ -1,7 +1,7 @@
{
- "best_score": 0.7272450402128756,
- "mean_roc_auc": 0.7272450402128756,
- "std_roc_auc": 0.002736622346011254,
- "mean_precision": 0.534075517794019,
- "mean_recall": 0.014129280503436722
+ "best_score": 0.7402663469877975,
+ "mean_roc_auc": 0.7402663469877975,
+ "std_roc_auc": 0.007219339746532362,
+ "mean_precision": 0.5441756346300849,
+ "mean_recall": 0.01516288989391171
}
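The old and new values in metrics.json are easier to compare as differences. A minimal sketch, using only the numbers from the diff above. The helper `metric_deltas` is hypothetical and not part of the repository.

```python
# Metric values copied from the diff of output/dvclive/metrics.json.
OLD_METRICS = {
    "mean_roc_auc": 0.7272450402128756,
    "std_roc_auc": 0.002736622346011254,
    "mean_precision": 0.534075517794019,
    "mean_recall": 0.014129280503436722,
}
NEW_METRICS = {
    "mean_roc_auc": 0.7402663469877975,
    "std_roc_auc": 0.007219339746532362,
    "mean_precision": 0.5441756346300849,
    "mean_recall": 0.01516288989391171,
}


def metric_deltas(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    """Return after - before for every metric present in both dictionaries."""
    return {name: after[name] - before[name] for name in before if name in after}


deltas = metric_deltas(OLD_METRICS, NEW_METRICS)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
```

Mean ROC AUC rises by about 0.013, while its standard deviation across folds also rises, from roughly 0.003 to 0.007. Mean recall stays near 1.5%, so at the threshold used for these metrics the retrained model still flags only a small share of no-shows.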
2 changes: 1 addition & 1 deletion output/dvclive/plots/metrics/best_score.tsv
@@ -1,2 +1,2 @@
step best_score
- 0 0.7272450402128756
+ 0 0.7402663469877975
2 changes: 1 addition & 1 deletion output/dvclive/plots/metrics/mean_precision.tsv
@@ -1,2 +1,2 @@
step mean_precision
- 0 0.534075517794019
+ 0 0.5441756346300849
2 changes: 1 addition & 1 deletion output/dvclive/plots/metrics/mean_recall.tsv
@@ -1,2 +1,2 @@
step mean_recall
- 0 0.014129280503436722
+ 0 0.01516288989391171
2 changes: 1 addition & 1 deletion output/dvclive/plots/metrics/mean_roc_auc.tsv
@@ -1,2 +1,2 @@
step mean_roc_auc
- 0 0.7272450402128756
+ 0 0.7402663469877975
2 changes: 1 addition & 1 deletion output/dvclive/plots/metrics/std_roc_auc.tsv
@@ -1,2 +1,2 @@
step std_roc_auc
- 0 0.002736622346011254
+ 0 0.007219339746532362
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "noshow"
- version = "1.4.7"
+ version = "1.4.8"
authors = [
{ name="Ruben Peters", email="[email protected]" },
{ name="Eric Wolters", email="[email protected]" }