Variable bandwidth for 3 dimensional data. #26
Thanks for the kind words, and for raising this issue @ytarricq.
Hope this helps you. 👍 Making that recipe idiot-proof and implementing it in the main library would be a good task. If you (or anyone reading this) is up for it, that's a PR I would merge.
Thanks for the quick answer! It made the difference between bandwidth matrices and variable bandwidths clearer.
I'm having the same issue with the fixed bandwidth for all dimensions.
@philippeller: Since the kernel functions are radial basis functions, I suppose your suggestion would amount to scaling the input data in each dimension, computing the KDE, then scaling back. However, it would hide how the data is scaled from the user. Some options are: min/max scaling, standardizing with the mean and the standard deviation, quantile transformations, etc. I feel that "simple is better than complex" and the "principle of least surprise" apply here. Doing some implicit scaling scheme might confuse users more than it helps them. Stating that "the multidimensional KDE is isotropic" and letting users handle scaling seems simpler to understand and less likely to produce unexpected results. I'm open to suggestions of course, but I would need some details. A high-level wrapper function, or a ScalingTransformer class, might be sensible.
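For readers following along, here is a minimal plain-Python sketch of the "scale, compute the KDE, scale back" idea described above. It does not use the library's API: `isotropic_gaussian_kde` and `scaled_kde` are hypothetical helper names, and the Gaussian kernel and the choice of dividing each dimension by its own bandwidth are assumptions made for illustration.

```python
import math

def isotropic_gaussian_kde(data, points, bw):
    """Evaluate a Gaussian KDE that uses one bandwidth for all dimensions."""
    d = len(data[0])
    norm = len(data) * (bw * math.sqrt(2 * math.pi)) ** d
    densities = []
    for x in points:
        total = 0.0
        for xi in data:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
            total += math.exp(-0.5 * sq_dist / bw ** 2)
        densities.append(total / norm)
    return densities

def scaled_kde(data, points, bandwidths):
    """Per-dimension bandwidths via explicit rescaling (hypothetical helper)."""
    def scale(row):
        return [v / h for v, h in zip(row, bandwidths)]

    # Step 1: divide every dimension by its bandwidth.
    scaled_data = [scale(row) for row in data]
    scaled_points = [scale(row) for row in points]
    # Step 2: run the isotropic KDE with unit bandwidth in the scaled space.
    densities = isotropic_gaussian_kde(scaled_data, scaled_points, bw=1.0)
    # Step 3: scale back. By the change-of-variables rule the density
    # must be divided by the product of the scale factors.
    jacobian = math.prod(bandwidths)
    return [p / jacobian for p in densities]

# Two-dimensional toy data whose second dimension has a much larger spread.
data = [[0.0, 0.0], [1.0, 10.0], [2.0, -5.0]]
points = [[1.0, 0.0]]
density = scaled_kde(data, points, bandwidths=[0.5, 4.0])
```

The division in step 3 is the part that is easy to forget: without it the result no longer integrates to one in the original coordinates.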
Maybe I'm missing some important point, but I was thinking not an implicit, but an explicit scaling. Let's say the user supplies two dimensional data ( |
That would work. 👍 Thanks for clarifying. I got a little ahead of myself. What you're sketching might be worth implementing. In a different issue #6 we had some discussions about a more general case. It's really an issue of API design. The way I see it: Pros:
Cons:
In conclusion I would merge a PR that implements this. 👍 No promises about when/if I'll find time to do it myself though.
Hello,
First of all, thanks for the great package.
I'm trying to compute density maps of a three-dimensional point distribution. I understood from the documentation that a variable bandwidth method is available, but I couldn't figure out how to set up this option.
Additionally, in the case of a fixed-bandwidth KDE for multidimensional data, I would have expected to be able to use one bandwidth per dimension, as in the statsmodels KDEMultivariate implementation. Instead, it seems that we can either use a single bandwidth value or one bandwidth per data point. Did you implement it this way in order to take the weight of each data point into account?
Thanks in advance.
Cheers
Yoann
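To make the distinction in the question concrete, the following plain-Python sketch shows what "one bandwidth per data point" means for three-dimensional data: each point gets its own isotropic Gaussian kernel width. This is an illustration only, not the library's implementation; `variable_bandwidth_kde` is a hypothetical name and the bandwidth values are made up.

```python
import math

def variable_bandwidth_kde(data, points, bandwidths):
    """Gaussian KDE in which data point i uses its own bandwidth bandwidths[i]."""
    d = len(data[0])
    densities = []
    for x in points:
        total = 0.0
        for xi, h in zip(data, bandwidths):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
            # Each kernel is normalized with its own bandwidth.
            norm = (h * math.sqrt(2 * math.pi)) ** d
            total += math.exp(-0.5 * sq_dist / h ** 2) / norm
        densities.append(total / len(data))
    return densities

# Three 3D points; the isolated third point gets a wider kernel.
data3d = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]]
per_point_bw = [0.5, 0.5, 2.0]
vb_density = variable_bandwidth_kde(data3d, [[0.0, 0.0, 0.0]], per_point_bw)
```

Note that the bandwidth here varies across data points but is still the same in every dimension for a given point, which is different from the per-dimension bandwidth asked about above.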