Complex geocoding config [MAP-706] [MAP-708] [MAP-707] [MAP-458] #166

Merged
merged 89 commits into from
Jan 7, 2025
89 commits
1454517
Multi-column geocoding script to enrich Britain Elects spreadsheet wi…
janbaykara Dec 10, 2024
fc1d637
Filter council names correctly
janbaykara Dec 10, 2024
32d3667
Better matching
janbaykara Dec 11, 2024
5299521
Start at gen 10
janbaykara Dec 11, 2024
42b671d
Ignore areas without a GSS code
janbaykara Dec 11, 2024
bbdace2
Edge case: remap "Bristol, city of" to "bristol city" for mapit matching
janbaykara Dec 11, 2024
89047e2
Remove logging
janbaykara Dec 11, 2024
e4777aa
Pass in update data
janbaykara Dec 16, 2024
ff72ea6
Merge branch 'main' into feature/map-668-hnh-ward-csv-issues
janbaykara Dec 16, 2024
ca242e9
Rename and reorganise tests for external data source
janbaykara Dec 16, 2024
a60ddd1
Tweaks
janbaykara Dec 16, 2024
b222cb2
Prepare loaders even if EDS has no org
janbaykara Dec 16, 2024
deb4265
Ingest area data even if multi-level geocoding fails
janbaykara Dec 16, 2024
55f62f4
Prepared tests for tricky multi-level geocoding cases
janbaykara Dec 16, 2024
afabd19
Skip geocoding test while areas aren't in the DB
janbaykara Dec 16, 2024
bea8cb4
Add geocoding data for tracing purposes
janbaykara Dec 16, 2024
0bd0119
Tweaks
janbaykara Dec 16, 2024
86f9fd3
Fix tests, use seed data
janbaykara Dec 16, 2024
22b6897
Make coordinates work
janbaykara Dec 16, 2024
673d2ba
Backup postcode geocoding options
janbaykara Dec 17, 2024
333e947
Sharpen test rig
janbaykara Dec 17, 2024
4a93331
Add postgresql to dev container to enable ingestion of seed SQL data
janbaykara Dec 17, 2024
f93e25c
Log geocoding steps to help debugging
janbaykara Dec 17, 2024
1ab63f5
More debugging/testing stuff
janbaykara Dec 17, 2024
05af2bb
Add "Abbey" disambiguation tests
janbaykara Dec 17, 2024
239d071
Handle postcodesIO fail
janbaykara Dec 17, 2024
dfbf902
Merge remote-tracking branch 'origin/feature/eers-and-lads-selectable…
janbaykara Dec 17, 2024
65c3e88
Merge migration
janbaykara Dec 17, 2024
5095f6e
Add trigram matching
janbaykara Dec 17, 2024
5a3806d
Try new area seed
janbaykara Dec 17, 2024
ae1f1ff
Move search term out of resulting data in geocoder statement
janbaykara Dec 17, 2024
b45c20c
Drier code
janbaykara Dec 17, 2024
899941c
Use the correct geographic comparison: intersect is actually what we …
janbaykara Dec 17, 2024
8549417
Debug logging for candidate areas
janbaykara Dec 17, 2024
6e109c9
Got to 100% success rate on present areas!
janbaykara Dec 17, 2024
0ba23ff
Merge
janbaykara Dec 18, 2024
f16b0ba
Fix tests
janbaykara Dec 18, 2024
1be13aa
More fixes
janbaykara Dec 18, 2024
405d339
Lint
janbaykara Dec 18, 2024
5db4d3b
Allow geocoding by mapit type (e.g. for more niche geocoding)
janbaykara Dec 18, 2024
5a17e51
Hide geography column when geocoding_config is set; allow geocoding n…
janbaykara Dec 18, 2024
b30551c
Add more test cases
janbaykara Dec 18, 2024
6f09ce7
Skip geocoding if everything's the same as last time
janbaykara Dec 18, 2024
b218fa9
lint
janbaykara Dec 18, 2024
477a5b0
More cases
janbaykara Dec 18, 2024
efcf7ae
Install postgres in test runner so geocoding test cases can be checked
janbaykara Dec 18, 2024
05ff887
lint
janbaykara Dec 18, 2024
1ba774e
Fix tests
janbaykara Dec 18, 2024
8955153
Consider now-inactive areas if it improves the chance of identifying …
janbaykara Dec 19, 2024
24da4d2
Remove unused code
janbaykara Dec 19, 2024
419e06e
Lint
janbaykara Dec 19, 2024
609d926
Move geocoding_config code to a separate file
janbaykara Dec 19, 2024
acff363
Update fixtures: more areas are now successfully geocodable!
janbaykara Dec 19, 2024
1d8452d
Update test seed to accommodate historical areas
janbaykara Dec 19, 2024
f913db8
Remove logging
janbaykara Dec 19, 2024
ca69e5f
lint
janbaykara Dec 19, 2024
4c94bef
Fix other tests
janbaykara Dec 19, 2024
14b3b00
Disable geocoding test while the test rig is borked
janbaykara Dec 19, 2024
38cb459
Add new geocoding options: coordinates, address with prefix/suffix/co…
janbaykara Dec 19, 2024
b0dbe26
Run geocoding tests separately
janbaykara Dec 19, 2024
0cd9571
Handle edge cases for coordinates
janbaykara Dec 19, 2024
2e09ad3
Attempt fix of tests
janbaykara Dec 19, 2024
eda0018
Add edge cases for address geocoding
janbaykara Dec 19, 2024
6f7db21
Fix tests
janbaykara Dec 19, 2024
c66cd81
Attempt fix tests
janbaykara Dec 19, 2024
ca94c09
Fix tests
janbaykara Dec 19, 2024
a399632
Use a more sensible naming scheme
janbaykara Dec 28, 2024
d76b0a5
Refactor, use type property, need to fix tests
janbaykara Dec 28, 2024
350ac9c
Fix tests
janbaykara Jan 5, 2025
711fda1
Fix tests
janbaykara Jan 5, 2025
03b8beb
Fix test
janbaykara Jan 5, 2025
e44f911
specify category of area types
janbaykara Jan 5, 2025
34b40ba
Standardise metadata for geocoding
janbaykara Jan 5, 2025
365ca86
Admin UI for sources/data
janbaykara Jan 5, 2025
bbb6506
Fix graphql
janbaykara Jan 5, 2025
7152553
Merge branch 'admin-upgrade' into feature/map-668-hnh-ward-csv-issues
janbaykara Jan 5, 2025
3397206
Fix tests
janbaykara Jan 5, 2025
ee90454
Edit geocoding config from the app
janbaykara Jan 5, 2025
a8cfe59
Lint
janbaykara Jan 5, 2025
0cb8dbc
Fix tests
janbaykara Jan 5, 2025
45724cf
Lint
janbaykara Jan 5, 2025
d149c5c
Re-geocode if the geocoder has changed (e.g. bugfix, new version, etc.)
janbaykara Jan 5, 2025
1fb3b06
Add basic geocoding analytics to dashboard, to help with debugging.
janbaykara Jan 5, 2025
bf88f1a
Delete data source queued jobs, stored records from the inspector
janbaykara Jan 6, 2025
46c195e
Remove irrelevant comments
janbaykara Jan 7, 2025
897253a
Remove unnecessary case change
janbaykara Jan 7, 2025
d9c4d94
Allow children to be passed as props in React.
janbaykara Jan 7, 2025
d9cd7e8
Merge remote-tracking branch 'origin/main' into feature/map-668-hnh-w…
janbaykara Jan 7, 2025
4033fce
fix: change test record email addresses to prevent stale data
joaquimds Jan 7, 2025
14 changes: 10 additions & 4 deletions .github/workflows/tests.yml
@@ -12,7 +12,7 @@ jobs:
     runs-on: ubuntu-latest
     environment: testing
     services:
-      postgres:
+      db:
         image: kartoza/postgis:13
         env:
           POSTGRES_USER: postgres
@@ -31,7 +31,7 @@ jobs:
       # https://stackoverflow.com/questions/78593700/langchain-community-langchain-packages-giving-error-missing-1-required-keywor
       image: python:3.12.3
       env:
-        DATABASE_URL: postgis://postgres:password@postgres:5432/local-intelligence
+        DATABASE_URL: postgis://postgres:password@db:5432/local-intelligence
         CACHE_FILE: /tmp/meep
         POETRY_VIRTUALENVS_CREATE: "false"
         GOOGLE_MAPS_API_KEY: ${{ secrets.GOOGLE_MAPS_API_KEY }}
@@ -63,9 +63,11 @@ jobs:
         run: |
           curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null
           echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | tee /etc/apt/sources.list.d/ngrok.list
-          apt-get update && apt-get install -y binutils gdal-bin libproj-dev ngrok less
+          apt-get update && apt-get install -y binutils gdal-bin libproj-dev ngrok less postgresql-client
       - name: Install poetry
-        run: curl -sSL https://install.python-poetry.org | python3 -
+        run: |
+          curl -sSL https://install.python-poetry.org | python3 -
+          ~/.local/bin/poetry self add poetry-plugin-export
       - name: Install python dependencies
         run: ~/.local/bin/poetry export --with dev --without-hashes -f requirements.txt --output requirements.txt && pip install -r requirements.txt
       - name: Start ngrok tunnelling
@@ -86,6 +88,10 @@ jobs:
         run: gunicorn local_intelligence_hub.asgi:application -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 > server.log 2>&1 &
       - name: Run django tests
         run: cat .env && coverage run --source=. --branch manage.py test || (cat server.log && exit 1)
+      - name: Run geocoding tests in isolation
+        run: |
+          echo "RUN_GEOCODING_TESTS=1" >> .env
+          cat .env && python manage.py test hub.tests.test_external_data_source_parsers || (cat server.log && exit 1)
       - name: Generate coverage xml
         run: coverage xml
       - name: Upload coverage.xml
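The new workflow step appends `RUN_GEOCODING_TESTS=1` to `.env` and then runs the parser tests on their own. How the test module reads that flag is not shown in this diff. A minimal sketch of one way to gate a test class on it, assuming the flag ends up in the process environment (the helper `geocoding_tests_enabled` and the class `GeocodingSmokeTest` are hypothetical names, not taken from the PR):

```python
import os
import unittest


def geocoding_tests_enabled(environ=None) -> bool:
    """Return True only when RUN_GEOCODING_TESTS is set to "1".

    `environ` defaults to the process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return env.get("RUN_GEOCODING_TESTS") == "1"


# Hypothetical test class: skipped unless the flag is present, so the
# default test run does not need the seeded area tables.
@unittest.skipUnless(
    geocoding_tests_enabled(),
    "Set RUN_GEOCODING_TESTS=1 to run geocoding tests",
)
class GeocodingSmokeTest(unittest.TestCase):
    def test_flag_is_set(self):
        self.assertTrue(geocoding_tests_enabled())
```

Keeping the check in a small function means the skip condition itself can be unit-tested with a plain dict.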
1 change: 1 addition & 0 deletions .gitignore
@@ -7,6 +7,7 @@ __pycache__
.coverage
.media
data/**/*
!data/areas.psql.zip
!data/.gitkeep
!data/areas.psql.zip
.next
2 changes: 1 addition & 1 deletion Dockerfile
@@ -12,7 +12,7 @@ ENV INSIDE_DOCKER=1 \
     SHELL=/bin/bash
 RUN curl -sL https://deb.nodesource.com/setup_22.x | bash
 RUN apt-get update && apt-get install -y \
-    binutils gdal-bin libproj-dev git nodejs python3-dev \
+    binutils gdal-bin libproj-dev git nodejs python3-dev postgresql-client \
     && rm -rf /var/lib/apt/lists/*
 RUN curl -sSL https://install.python-poetry.org | python -
 ENV PATH="/root/.local/bin:$PATH"
8 changes: 8 additions & 0 deletions bin/import_areas_seed.sh
@@ -0,0 +1,8 @@
+#!/bin/bash
+
+if [ "$ENVIRONMENT" != "production" ]; then
+    unzip -o data/areas.psql.zip -d data
+    PGPASSWORD=password psql -U postgres -h db test_local-intelligence < data/areas.psql
+else
+    echo "This command cannot run in production environments."
+fi
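As written, the production branch of the script only prints a message, so the script still exits with status 0 and a caller cannot tell that the import was refused. A minimal Python sketch of the same guard that fails loudly instead; `build_seed_import_commands` is a hypothetical helper, and it uses `psql -f` where the script uses a stdin redirect:

```python
def build_seed_import_commands(environment: str) -> list[list[str]]:
    """Return the argv lists for unpacking and loading the area seed.

    Raises in production rather than printing, so a wrapper script
    ends with a non-zero exit status.
    """
    if environment == "production":
        raise RuntimeError("This command cannot run in production environments.")
    return [
        # Same unzip invocation as the shell script.
        ["unzip", "-o", "data/areas.psql.zip", "-d", "data"],
        # Assumption: `-f` replaces the script's `< data/areas.psql` redirect.
        # The password would still be supplied via PGPASSWORD in the environment.
        ["psql", "-U", "postgres", "-h", "db", "-f", "data/areas.psql",
         "test_local-intelligence"],
    ]
```

Each returned list could be passed to `subprocess.run(argv, check=True)`, so that a failing `unzip` or `psql` also stops the import.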
Binary file modified data/areas.psql.zip
Binary file not shown.
27 changes: 27 additions & 0 deletions hub/admin.py
@@ -6,6 +6,8 @@
     AreaData,
     DataSet,
     DataType,
+    ExternalDataSource,
+    GenericData,
     Membership,
     Organisation,
     Person,
@@ -262,3 +264,28 @@ class MembershipAdmin(admin.ModelAdmin):
     inline = [
         OrganisationInline,
     ]
+
+
+# External data source
+@admin.register(ExternalDataSource)
+class ExternalDataSourceAdmin(admin.ModelAdmin):
+    list_display = ("name", "orgname")
+
+    search_fields = ("name", "orgname")
+
+    def orgname(self, obj):
+        return obj.organisation.name
+
+    orgname.admin_order_field = "author"  # Allows column order sorting
+    orgname.short_description = "Author Name"  # Renames column head
+
+
+# Generic data
+@admin.register(GenericData)
+class GenericDataAdmin(admin.ModelAdmin):
+    list_display = ("name", "source", "value")
+
+    search_fields = ("name", "source", "value")
+
+    def source(self, obj):
+        return obj.data_type.data_set.external_data_source.name
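Both `orgname` and `source` walk a chain of related objects. An earlier commit in this PR ("Prepare loaders even if EDS has no org") suggests a data source may have no organisation, in which case `obj.organisation.name` would raise `AttributeError` while rendering the list page. A minimal, framework-free sketch of a None-safe lookup that the two admin methods could delegate to; `resolve_attr_chain` is a hypothetical helper, and `SimpleNamespace` stands in for the model instances:

```python
from types import SimpleNamespace


def resolve_attr_chain(obj, path: str, default=None):
    """Follow a dotted attribute path, returning `default` if any link is None or missing."""
    current = obj
    for name in path.split("."):
        current = getattr(current, name, None)
        if current is None:
            return default
    return current


# Stand-ins for an ExternalDataSource with and without an organisation.
with_org = SimpleNamespace(name="Ward CSV", organisation=SimpleNamespace(name="HNH"))
without_org = SimpleNamespace(name="Orphan source", organisation=None)

print(resolve_attr_chain(with_org, "organisation.name", default="-"))     # prints "HNH"
print(resolve_attr_chain(without_org, "organisation.name", default="-"))  # prints "-"
```

Separately, `search_fields` and `admin_order_field` take model field lookups rather than admin method names. The models are not shown in this diff, so this is an assumption, but if `ExternalDataSource` has no `orgname` or `author` field, a lookup such as `organisation__name` is probably what is wanted there.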
16 changes: 16 additions & 0 deletions hub/analytics.py
@@ -24,6 +24,22 @@ class AreaStat(TypedDict):
         gss: Optional[str]
         external_data: dict
 
+    def imported_data_count_located(self) -> int:
+        return (
+            self.get_analytics_queryset().filter(postcode_data__isnull=False).count()
+            or 0
+        )
+
+    def imported_data_count_unlocated(self) -> int:
+        return self.get_analytics_queryset().filter(postcode_data=None).count() or 0
+
+    def imported_data_geocoding_rate(self) -> float:
+        located = self.imported_data_count_located()
+        total = self.imported_data_count()
+        if total == 0:
+            return 0
+        return (located / total) * 100
+
     # TODO: Rename to e.g. row_count_by_political_boundary
     def imported_data_count_by_area(
         self,
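The geocoding rate added above is the share of imported rows that have `postcode_data`, expressed as a percentage, with a guard for empty sources. A standalone sketch of the same arithmetic, assuming the total equals located plus unlocated rows (the function `geocoding_rate` is a hypothetical stand-in for the mixin method, and it returns `0.0` to match the `float` annotation):

```python
def geocoding_rate(located: int, total: int) -> float:
    """Percentage of imported rows that were geocoded; 0.0 when there are no rows."""
    if total == 0:
        # Avoid ZeroDivisionError for a source with no imported data.
        return 0.0
    return (located / total) * 100


located, unlocated = 3, 1
rate = geocoding_rate(located, located + unlocated)
print(f"{rate:.1f}% geocoded")  # prints "75.0% geocoded"
```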
Empty file added hub/data_imports/__init__.py
Empty file.