Support Wikidata reconciliation by distance #3663
Comments
That's a good suggestion, but it needs to be implemented in a reconciliation service, not OpenRefine. It could be an enhancement to https://github.com/wetneb/openrefine-wikibase or it could be a specialized reconciliation service. OpenRefine would send columns like |
Sorry, I did not realize that the reconciliation service is developed separately. I guess we can close it here and I'll post the issue there?
@wetneb may be able to transfer the issue (not sure if that works between organizations). Why don't we wait for him before you go to the effort of recreating it?
@VojtechDostal - I’m assuming you have a list of buildings and you want to find out which are closest to which. One workaround is to round the latitude/longitude coordinates (to whatever precision you need) and do a self cell.cross() to find which rows fall in the same quadrant, then calculate the distance between matching rows. This has worked well for me in the past. You could do the same cell.cross() against a separate project with a list of candidates to match against. Note that points near each other but on opposite sides of a quadrant edge will not match, so it’s best to do this a second time with the centers of the quadrants skewed.
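To make the quadrant idea above concrete outside OpenRefine, here is a minimal Python sketch. It is not the cell.cross() workflow itself: it buckets points by flooring coordinates to a grid, compares only points that share a cell, and repeats with the grid shifted by half a cell. The function names (`haversine_km`, `cell_key`, `nearest_by_grid`), the `(name, lat, lon)` tuple layout and the 0.01 degree cell size are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def cell_key(lat, lon, size, offset=0.0):
    """Grid cell ("quadrant") containing a point; offset skews the grid."""
    return (math.floor((lat + offset) / size), math.floor((lon + offset) / size))

def nearest_by_grid(points, candidates, size=0.01):
    """For each point, find the closest candidate that shares a grid cell.

    points and candidates are lists of (name, lat, lon) tuples (assumed layout).
    Two passes are made: the plain grid, then a grid shifted by half a cell.
    """
    best = {}
    for offset in (0.0, size / 2):
        index = defaultdict(list)
        for cname, clat, clon in candidates:
            index[cell_key(clat, clon, size, offset)].append((cname, clat, clon))
        for name, lat, lon in points:
            for cname, clat, clon in index.get(cell_key(lat, lon, size, offset), ()):
                d = haversine_km(lat, lon, clat, clon)
                if name not in best or d < best[name][1]:
                    best[name] = (cname, d)
    return best

# Hypothetical data: the two points straddle a cell edge in the first pass
# and only land in the same cell once the grid is shifted.
buildings = [("A", 50.0999, 14.403)]
items = [("X", 50.1001, 14.4031)]
print(nearest_by_grid(buildings, items))
```

The second pass reduces misses at cell edges but does not eliminate them: a pair can be split by a latitude edge in one grid and a longitude edge in the other. Checking the eight neighbouring cells as well would close that gap at the cost of more comparisons.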
@VojtechDostal I've had good luck with Bing Maps API which is pretty generous with a free developer key (< 125,000 transactions) and just constructing the URL that I need in a new OpenRefine column and using Fetch URLs. But if you already do work on Google Cloud, you might already have many credits you can carry over to use on the Google Maps platform as credit: https://developers.google.com/maps/documentation/geocoding/usage-and-billing
We're getting a little off track here. The ask was to look things up in a reconciliation service (e.g. Wikidata), not a mapping service or another OpenRefine project. @VojtechDostal It doesn't look like it's possible to transfer this issue to a repo in another org/user. Sorry for getting your hopes up! I'm going to close this one and you can recreate it in https://github.com/wetneb/openrefine-wikibase (presuming that it's Wikidata or another Wikibase-based database that you're interested in reconciling against).
As you said, this is more a question of implementing such a feature in a reconciliation service. In practical terms you’re going to want to limit the items you’re doing the distance calculation for. My suggestion above was focused on how to do such limiting. It could be used to build such a reconciliation service or, as presented, as a workaround given that no such service has been identified.
I have opened an issue on the reconciliation service: wetneb/openrefine-wikibase#101
Linking related issue: #1966
Reconciliation by string matching is useful in many cases, but it is currently (to my knowledge) impossible to find the items closest to the matched object.
Proposed solution
Use case: I have a list of buildings with coordinates (lat, lon). I'd like to find the item(s) closest to those coordinates. Additionally, I'd like to be able to filter results by class (subclass of: building) and suggest only those. High-confidence matches (very close and with corresponding names) could be auto-matched.
Alternatives considered
I don't know of any alternative way/hack to load the closest item to given coordinates. However, the Wikidata SPARQL service has a distance service and I think there is also a special API call for exactly this.
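For reference, the SPARQL distance service mentioned above is, as far as I know, exposed on the Wikidata Query Service as `wikibase:around`. The sketch below only builds a query string for it and makes no network call. The helper name `build_nearby_query`, its parameters, and the default class `Q41176` (which I believe is "building", but please verify) are assumptions for illustration, not part of OpenRefine or the reconciliation service.

```python
def build_nearby_query(lat, lon, radius_km=1.0, class_qid="Q41176", limit=10):
    """Build a SPARQL query for items near (lat, lon), nearest first.

    Assumes the prefixes wd:, wdt:, wikibase:, bd: and geo: are predeclared,
    as they are on the public Wikidata Query Service.
    class_qid is an assumed default; check it before relying on it.
    """
    return f"""
SELECT ?item ?itemLabel ?dist WHERE {{
  SERVICE wikibase:around {{
    ?item wdt:P625 ?location .
    # WKT points are written longitude first, then latitude
    bd:serviceParam wikibase:center "Point({lon} {lat})"^^geo:wktLiteral .
    bd:serviceParam wikibase:radius "{radius_km}" .
    bd:serviceParam wikibase:distance ?dist .
  }}
  # keep only instances of the class or of any of its subclasses
  ?item wdt:P31/wdt:P279* wd:{class_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY ?dist
LIMIT {limit}
"""

# Hypothetical coordinates, for illustration only
print(build_nearby_query(50.0875, 14.4213, radius_km=0.5))
```

The resulting string could be pasted into the query editor at https://query.wikidata.org/ or sent to its SPARQL endpoint. The radius is in kilometres, and the subclass path can be slow for broad classes, so a small radius and a low limit are sensible starting points.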