-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider clusters of segments with similar naive averages as segment candidates in their own right #23
Conversation
…s. All tests pass.
Please rerun notebooks so we have plots in the github pages (need to rerun with plot_is_static = False), but it's not super important |
wise_pizza/cluster.py
Outdated
if power_transform: | ||
if len(X[X > 0] > 1): | ||
X[X > 0] = ( | ||
PowerTransformer(standardize=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I forgot something, but why don't we standardize here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we don't want to mix up the positive and the negative values, we want to transform both separately.
@@ -224,6 +229,7 @@ def explain_changes_in_totals( | |||
force_dim="Change from" if how == "force_dim" else None, | |||
force_add_up=force_add_up, | |||
constrain_signs=constrain_signs, | |||
cluster_values=cluster_values, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding cluster_values parameter also to the explain_changes_in_averages? Of course we everytime call explain_changes_in_totals then, but just to be sure if someone wants to use cluster_values = False parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, will fix
if score > best_score: | ||
best_score = score | ||
best_labels = cluster_labels | ||
best_n = n_clusters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlikely that it will cause problems, but we define best_n only in the if statement, I suggest to add best_n = None before "for" loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
Context
Checklist