Question on Clustering in Simplified detection of urban types #358
Hello, I deeply appreciate you sharing momepy and Simplified detection of urban types. I have a question about the Clustering part. Could you explain how and why you did the step below? Do you know of any reference book or paper on it?
standardized = (percentiles_joined - percentiles_joined.mean()) / percentiles_joined.std()
Replies: 1 comment 1 reply
Hello, this step standardises the values in every column so that they all fall within roughly the same range. K-Means is based on Euclidean distance, so without this step, characters with large values would overpower those with small values.
Fleischmann M, Feliciotti A, Romice O and Porta S (2021) Methodological Foundation of a Numerical Taxonomy of Urban Form. Environment and Planning B: Urban Analytics and City Science, doi: 10.1177/23998083211059835
If you only want to read about standardisation, Wikipedia is great: https://en.wikipedia.org/wiki/Standard_score
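To illustrate the point about Euclidean distance, here is a minimal sketch on made-up data. The column names `area` and `compactness` and all the values are hypothetical and are not taken from the notebook; only the standardisation line itself comes from the question. Note that pandas `.std()` uses the sample standard deviation (`ddof=1`) by default.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two characters measured on very different scales.
percentiles_joined = pd.DataFrame(
    {
        "area": [100.0, 2000.0, 150.0, 1800.0],     # large values
        "compactness": [0.90, 0.85, 0.30, 0.35],    # values between 0 and 1
    }
)

# Z-score: subtract each column's mean, divide by each column's standard deviation.
standardized = (percentiles_joined - percentiles_joined.mean()) / percentiles_joined.std()


def euclidean(df, i, j):
    """Euclidean distance between rows i and j of a DataFrame."""
    return float(np.linalg.norm(df.iloc[i].to_numpy() - df.iloc[j].to_numpy()))


# Rows 0 and 1 differ mostly in area; rows 0 and 2 differ mostly in compactness.
raw_01 = euclidean(percentiles_joined, 0, 1)  # about 1900, driven by area
raw_02 = euclidean(percentiles_joined, 0, 2)  # about 50, compactness barely counts
std_01 = euclidean(standardized, 0, 1)
std_02 = euclidean(standardized, 0, 2)

print(f"raw:          d(0,1)={raw_01:.2f}  d(0,2)={raw_02:.2f}")
print(f"standardized: d(0,1)={std_01:.2f}  d(0,2)={std_02:.2f}")
```

On the raw values, the distance is almost entirely determined by `area`, so a large difference in `compactness` is effectively invisible to K-Means. After standardisation both distances are of similar magnitude, meaning each character contributes comparably to the clustering.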