Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update spatial join points to poly #71

Open
npr99 opened this issue Jan 13, 2025 · 0 comments
Open

update spatial join points to poly #71

npr99 opened this issue Jan 13, 2025 · 0 comments

Comments

@npr99
Copy link
Owner

npr99 commented Jan 13, 2025

Lidia Mezei updated the code for URSC 645. See if this should replace current code.

`

def spatial_join_points_to_poly(points_gdf,
polygon_gdf,
point_var,
poly_var,
geolevel,
epsg: int = 4326,
join_column_list: list = [],
buffer_dist: int = 0.001):
"""

Function adds polygon variables to block points.
point_var: Variable with WKT Point Geometry for Polygon GDF --> just the geometry column from geodataframe (in my case, property)
poly_var: Variable with WKT Polygon Geometry for Polygon GDF --> just the geometry column from geodataframe (in my case, counties)
geolevel: Geography level of the polygon - example Place or PUMA --> "County"
join_column_list: Variable list to be transferred from Polygon File to Block File --> variable list of columns from polygon to keep (add to points) (e.g., county FIPS code)
buffer_dist: buffer distance in degrees for lat/lon around point bounds --> could be 0 (in my case, may be)
 - https://en.wikipedia.org/wiki/Decimal_degrees
 - example: 0.001 for 1/1000th of a degree or approximately 100 meters
 - example: 0.0001 for 1/10000th of a degree or approximately 10 meters
Original code idea from 
https://geoffboeing.com/2016/10/r-tree-spatial-index-python/

Geopandas has a function sjoin that could be used
however sjoin can be slow and does not allow for selecting columns

future improvement: if there are multiple polygons for a point,
the function could create multiple rows for the point.
"""
# make copies of input df and gdf
copy_point_gdf = points_gdf.copy(deep=True)
copy_polygon_gdf = polygon_gdf.copy(deep=True)

# Ensure both points and polygons have the same CRS
# print(epsg) # debug code
copy_point_gdf = copy_point_gdf.to_crs(epsg=epsg)
copy_polygon_gdf = copy_polygon_gdf.to_crs(epsg=epsg)

# Find the bounds of the point File

minx = copy_point_gdf.bounds.minx.min() - buffer_dist # subtract buffer from minimum values
miny = copy_point_gdf.bounds.miny.min() - buffer_dist
maxx = copy_point_gdf.bounds.maxx.max() + buffer_dist
maxy = copy_point_gdf.bounds.maxy.max() + buffer_dist
copy_point_gdf_bounds = [minx, miny, maxx, maxy]

# Select polygons within Bounds of Study Area
# build the r-tree index - for polygon file
print("Polygon file has",copy_polygon_gdf.shape[0],geolevel,"polygons.")
sindex_poly_gdf = copy_polygon_gdf.sindex
possible_matches_index = list(sindex_poly_gdf.intersection(copy_point_gdf_bounds))
area_poly_gdf = copy_polygon_gdf.iloc[possible_matches_index]
print("Identified",area_poly_gdf.shape[0],geolevel,"polygons to spatially join.")

# build the r-tree index - Using Representative Point
copy_point_gdf['geometry'] = copy_point_gdf[point_var]
sindex_copy_point_gdf = copy_point_gdf.sindex

#Loops for spatial join are time consuming
#Here is a way to know that the loop is running and how long it takes to run
#https://cmdlinetips.com/2018/01/two-ways-to-compute-executing-time-in-python/
# find the points that intersect with each subpolygon and add ID to Point
for index, polygon in area_poly_gdf.iterrows():
    if index%100==0:
        print('.', end ="")

    # find approximate matches with r-tree, then precise matches from those approximate ones
    possible_matches_index = \
        list(sindex_copy_point_gdf.intersection(polygon['geometry'].bounds))
    possible_matches = copy_point_gdf.iloc[possible_matches_index]
    precise_matches = \
        possible_matches[possible_matches.intersects(polygon['geometry'])]
    for col in join_column_list:
        copy_point_gdf.loc[precise_matches.index,geolevel+col] = polygon[col]

# Switch Geometry back to Polygon
copy_point_gdf['geometry'] = copy_point_gdf[poly_var]

return copy_point_gdf

`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant