-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Complex geocoding config [MAP-706] [MAP-708] [MAP-707] [MAP-458] #166
Conversation
🔍 Existing Issues For ReviewYour pull request is modifying functions with the following pre-existing issues: 📄 File: hub/management/commands/import_areas.py
Did you find this useful? React with a 👍 or 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I just had a few minor comments.
if any([job.status == "doing" for job in jobs]): | ||
status = "doing" | ||
elif any([job.status == "failed" for job in jobs]): | ||
status = "failed" | ||
elif all([job.status == "succeeded" for job in jobs]): | ||
status = "succeeded" | ||
elif number_of_jobs_ahead_in_queue <= 0: | ||
status = "succeeded" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: what's the logic here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GraphQL query was responding with status succeeded even if you had cancelled
some of the jobs. So I added this logic: if there are no more jobs, report that the status is done. There wasn't a done
status, so I settled for succeeded
because I'm an optimist starting 2025.
@@ -1836,7 +1841,7 @@ async def update_many(self, mapped_records: list[MappedMember], **kwargs): | |||
"Update many not implemented for this data source type." | |||
) | |||
|
|||
def get_record_id(self, record): | |||
def get_record_id(self, record) -> Optional[Union[str, int]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: lots of things break if get_record_id returns None
. Can we make this non-optional? if so, what do we return instead of None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return a false ID without that causing genuine problems? That seems like a dodgy answer to me. Instead I think we should handle the case.
The case here is that I added a Google Sheet with an empty ID cell, so Django did return a None
. I didn't change the behaviour, I just specified typing that reflected reality.
You might be correct but I think that this is a separate ticket to solve the problem of "what happens when data sources return a None ID?" Because this is already a live problem in the Mapped system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do some heuristic work to generate our own IDs, but then what if the ID in the system changes? Then suddenly we'll have duplicate data in Mapped.
I think the best approach is to keep get_record_id
honest and then deal with None IDs with the principle that we can't use them, basically. New ticket for this: https://linear.app/commonknowledge/issue/MAP-767/gracefully-handle-reject-records-with-a-none-id
"ward": "Credenhill", | ||
"expected_area_type_code": "WD23", | ||
"expected_area_gss": "E05012957", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: have a "X city council" vs "X council" case here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow, can you expand?
This PR adds a new, more capable geocoding option that can involve multiple fields.
geography_config
is a new approach to geocoding. When the field is set (see test cases), it allows three new approaches to geocoding:geography_config
is set, geocoding cannot be set differently by the user in the UI. So there are some JSX changes for that.