[feature] changes to sampler/computing classes #39

lachlangrose · 2024-02-21T23:36:24Z

The sampler/compute/sorter classes should be changed so that their main method (sample/sort/compute) is called with only the object it works on.

E.g. for a sampler it would be sampler.sample(shapefile), with any augmentations specified in the constructor of the sampler. This means that the samples can be augmented with additional data without changing the method used within map2loop. It also removes any unused variables and/or wildcard passing of objects through the map_data class.

An example of this for the SamplerDecimator is below, modified from #38.

Another idea is to change the sample/sort/calculate functions to __call__. @RoyThomsonMonash, do you have any reason not to? It does make the task of the class less clear, but the class name should be descriptive enough. For the sampler you could then call sampler(shapefile).

import beartype
import geopandas
import pandas


class SamplerDecimator(Sampler):
    """
    Decimator sampler class which decimates the geo data frame based on the decimation value,
    i.e. decimation = 10 means take every tenth point
    Note: This only works on data frames with lists of points with columns "X" and "Y"
    """

    @beartype.beartype
    def __init__(self, geology: geopandas.GeoDataFrame, decimation: int = 1):
        """
        Initialiser for decimator sampler

        Args:
            geology (GeoDataFrame): the geology shapefile to copy layerID from
            decimation (int, optional): stride of the points to sample. Defaults to 1.
        """
        self.geology = geology
        self.sampler_label = "SamplerDecimator"
        decimation = max(decimation, 1)
        self.decimation = decimation

    @beartype.beartype
    def __call__(
        self, spatial_data: geopandas.GeoDataFrame
    ) -> pandas.DataFrame:
        """
        Takes the full point data, adds the layerID of the intersecting geology polygon and returns the decimated points

        Args:
            spatial_data (geopandas.GeoDataFrame): the data frame to sample

        Returns:
            pandas.DataFrame: the sampled data points
        """
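        # Work on a copy so the caller's data frame is not modified
        data = spatial_data.copy()
        # Requires point geometries: .x and .y are only defined for points
        data["X"] = data.geometry.x
        data["Y"] = data.geometry.y
        # Spatial join against the geology polygons to get the layer ID.
        # 'ID_right' assumes both frames have an "ID" column; the positional
        # .values assignment assumes each point matches at most one polygon.
        data["layerID"] = geopandas.sjoin(
            data,
            self.geology,
            how='left',
        )['ID_right'].values
        data.reset_index(drop=True, inplace=True)
        # Keep every decimation-th row and drop the geometry column
        return pandas.DataFrame(data[:: self.decimation].drop(columns="geometry"))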
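
On the concern that __call__ makes the task of the class less clear: one option might be to keep the named method and have __call__ delegate to it, so both sampler(shapefile) and sampler.sample(shapefile) work. A minimal, dependency-free sketch of that pattern follows. The Sampler base class and ListDecimator here are hypothetical stand-ins written for illustration (plain lists of dicts instead of GeoDataFrames), not the actual map2loop classes.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

# Stand-in for a row of a GeoDataFrame (assumption for this sketch)
Point = Dict[str, Any]


class Sampler(ABC):
    """Hypothetical base class: all configuration lives in the constructor."""

    @abstractmethod
    def sample(self, spatial_data: List[Point]) -> List[Point]:
        """Sample the given data and return the sampled rows."""

    def __call__(self, spatial_data: List[Point]) -> List[Point]:
        # Delegate, so sampler(data) and sampler.sample(data) are equivalent
        return self.sample(spatial_data)


class ListDecimator(Sampler):
    """Toy decimator that keeps every decimation-th row of a plain list."""

    def __init__(self, decimation: int = 1):
        # Clamp to at least 1, mirroring SamplerDecimator above
        self.decimation = max(decimation, 1)

    def sample(self, spatial_data: List[Point]) -> List[Point]:
        return spatial_data[:: self.decimation]


points = [{"X": float(i), "Y": 0.0} for i in range(10)]
sampler = ListDecimator(decimation=5)
decimated = sampler(points)  # same result as sampler.sample(points)
```

With this layout the call site only ever passes the data being worked on, and a subclass only has to implement the named method.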

RoyThomsonMonash mentioned this issue Feb 22, 2024

Add layer id to structural data using a spatial join #38

Merged

AngRodrigues added the "high priority" (on top of our to do) and "medium priority" (we will do this soon) labels and removed the "high priority" (on top of our to do) label on May 23, 2024
