Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganise interpolation code #2597

Merged
merged 13 commits into from
Jan 28, 2022

Conversation

lonvia
Copy link
Member

@lonvia lonvia commented Jan 27, 2022

This cleans up large parts of the code around interpolations for both the OSM interpolation ways in location _property_osmline and the TIGER data in location_property_tiger:

  • Updates now replace existing interpolations instead of recreating them, avoiding cases where the interpolation was recreated multiple times in an updates because of changes in the underlying nodes.
  • Replace the interpolation type column with a step column. This simplifies queries because there is no longer a need to distinguish between odd and even as long as the start number is potentially corrected to represent the required odd/even number. It also means that Nominatim now supports numerical values for addr:interpolation although the usage is so rare that it doesn't really matter.
  • osmline now only contains interpolations for the inner part of the interpolations, i.e. all numbers not covered by existing address points. This means we can ignore thousands of 'unnecessary' interpolations where there are no additional numbers between start and end housenumber. These kind of interpolations might have a merit in OSM as a marker that the address is approximate. As Nominatim doesn't have this notion, there is no need to index them.
  • Optimize indexes for osmline so that 'inactive' interpolations are no longer included.

This PR also fixes a long-standing bug in the TIGER data import, where the start and end housenumbers are potentially switch to make sure that the start number is smaller than the end number but the interpolation geometry was not reversed. Thus results on such lines would be in the wrong order. If you make heavy use of TIGER housenumber data (i.e. search full addresses in the US a lot) you might want to reimport the TIGER data to fix the tables.

Raises the minimum required PostgreSQL version to 9.6.

lonvia added 13 commits January 27, 2022 11:14
Use the same update mechanism as for updates on the interpolations
themselves. Updates must solely happen in place_insert as this is
the place where actual changes of the data happen.
Do not index 'inactive' rows (with startnumber is null) where possible.
Nodes on an interpolation now only get the address tags of
interpolations and then compute their own parent from that. They no
longer inherit the parent directly.
This replaces the interpolationtype column.
The new code uses the open-ended array notation which is only
available sind psql 9.6.
@mtmail
Copy link
Collaborator

mtmail commented Jan 27, 2022

the start number is smaller than the end number but the interpolation geometry was not reversed.

Was the error in the data (osm-search/TIGER-data#20) or how Nominatim import processed it?

@lonvia
Copy link
Member Author

lonvia commented Jan 27, 2022

@mtmail
Copy link
Collaborator

mtmail commented Jan 27, 2022

Great. ST_REVERSE is much easier to call than reversing the raw WKT in python. I'll still implement it somehow but first needs more tests.

@lonvia lonvia merged commit c50c534 into osm-search:master Jan 28, 2022
@lonvia lonvia deleted the reorganise-interpolations branch January 28, 2022 07:46
@otbutz
Copy link
Contributor

otbutz commented Jan 31, 2022

Raises the minimum required PostgreSQL version to 9.6.

Postgres 9.6 has reached EOL since November 2021. Shouldn't we aim for a supported version?

@lonvia
Copy link
Member Author

lonvia commented Jan 31, 2022

The rule is to update the requirements as necessary. EOL versions fall out of support when either a) the CI cannot test them anymore or b) they do not offer a function we'd like to use.

In this particular case, PG 9.5 does not have support for open-ended array subscripts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants