Quantization is an optimization problem ~ Minimize MSE/MAE given the number of bins by finding the optimal bin edges. Options:
-
Solve the optimization problem
-
Solve it via CART (which is ~ solving the said problem)
-
Use percentile binning (~ probability density)
-
Use clustering methods like K-Means
-
If the maximum number of bins was not fixed, we could use popular heuristic solutions for inferring k, e.g., Freedman-Diaconis and Sturges, and then we could use an optimization algorithm to find the optimal bin edges. Or we could use DBScan, etc.
In a couple of notebooks, I walk through the options. For #1, #3, #4, and #5, see R nb. For #5, see the python nb (R flakes).