Cross-label distances #3
Comments
Caption of figure 4: Features do separate given labels. It is possible to build a model to separate labels using these features.
I am also curious about how "Label separation power of a feature set" is calculated. The paper says it is "inverse of an area under a cross-label-class distances histogram weighted by a function to only select values relatively close to zero. Choice of the weighting function depends on the label and the numbers of feature in the feature space".
Cross-label weight function
Regarding the cross-label distances, we were able to reproduce the plots in figure 3 on page 7. Section "3.2 Fixing feature space parameters" mentions a weight function that selects only values relatively close to zero, but no exact formula is provided. We therefore tried a linear function, a quadratic function and an exponentially decaying function, but with none of them were we able to reproduce the results presented in figure 4 on page 8. Could you please tell us what kind of function you used to obtain the plots in figure 4?
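To make the question concrete, here is a minimal sketch of how we read the definition ("inverse of an area under a cross-label-class distances histogram weighted by a function"). All names (`separation_power`, `linear_weight`, `quadratic_weight`, `exponential_weight`) and the parameters `cutoff`, `scale` and `bins` are our own assumptions for illustration, not taken from the paper, and the three weights are only the candidates we tried.

```python
import numpy as np

def separation_power(distances, weight, bins=50, value_range=None):
    """Inverse of the weighted area under a normalised distance histogram.

    Assumed reading of the paper's definition; the true weighting is unknown.
    """
    distances = np.asarray(distances, dtype=float)
    # density=True normalises the histogram so its unweighted area is 1
    counts, edges = np.histogram(distances, bins=bins,
                                 range=value_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    widths = np.diff(edges)
    area = np.sum(counts * weight(centers) * widths)
    return np.inf if area == 0 else 1.0 / area

# Candidate weights (hypothetical): each is largest at zero distance.
def linear_weight(d, cutoff=3.0):
    return np.clip(1.0 - d / cutoff, 0.0, None)

def quadratic_weight(d, cutoff=3.0):
    return np.clip(1.0 - d / cutoff, 0.0, None) ** 2

def exponential_weight(d, scale=1.0):
    return np.exp(-d / scale)

# Synthetic distances only, to check the behaviour of the sketch.
rng = np.random.default_rng(0)
near = rng.uniform(0.0, 1.0, size=1000)   # many small cross-label distances
far = rng.uniform(4.0, 6.0, size=1000)    # only large cross-label distances

power_near = separation_power(near, exponential_weight, value_range=(0.0, 6.0))
power_far = separation_power(far, exponential_weight, value_range=(0.0, 6.0))
```

Under this reading, a feature set whose cross-label distances lie far from zero gets a higher separation power than one with many small distances, regardless of which of the three weights is used. What we cannot infer is the exact weight and its dependence on the label and the number of features.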
X-axis of figure 3 discussion
For our most promising attempt to reproduce figure 3 on page 7, we used exactly the feature set described in section "3.1 Feature space" on page 5. We therefore expected a histogram of distances between 0 and pi * number_of_features_in_feature_set. Since our feature set has two features, we expected the x-axis to span from 0 to 2*pi. In figure 3, however, the range is from 0 to 6 (which is close to 2*pi, but not exact). Is it a statistical coincidence that there is no cross-label distance between 6 and 2*pi? Or is the x-axis simply trimmed at the value 6.0?
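The expectation above can be checked with a small sketch. It assumes that every feature is an angle in radians, that the per-feature distance is the circular distance in [0, pi], and that per-feature distances are summed; these are our assumptions about the metric, and the function name `cross_label_distances` is hypothetical.

```python
import numpy as np

def cross_label_distances(a, b):
    """All pairwise distances between samples of label A and label B.

    a: array of shape (n_a, n_features), b: array of shape (n_b, n_features).
    Assumes angular features; each feature contributes at most pi.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = np.abs(a[:, None, :] - b[None, :, :]) % (2 * np.pi)
    per_feature = np.minimum(diff, 2 * np.pi - diff)  # circular distance
    return per_feature.sum(axis=-1).ravel()

# Two features: the upper bound 2*pi is reached only by exactly opposite points.
extreme = cross_label_distances([[0.0, 0.0]], [[np.pi, np.pi]])

# Synthetic samples: the observed maximum never exceeds the bound.
rng = np.random.default_rng(1)
a = rng.uniform(-np.pi, np.pi, size=(200, 2))
b = rng.uniform(-np.pi, np.pi, size=(200, 2))
observed_max = cross_label_distances(a, b).max()
```

With a finite sample, `observed_max` can fall short of 2*pi, so an axis that ends at 6 would be consistent with either explanation in the question; the sketch does not tell us which one applies to the paper's figure.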
X-axis of figure 4 discussion
We were not able to reproduce figure 4 on page 8 at all. As described in section "3.3 Chosen market representation", 28 standard price and volume indicators have been used. We would therefore expect the x-axis to range from 0 to 3*28 or something similar. Obviously, we are missing the point. Could you please explain in detail how this figure can be reproduced? If the figure is based on a subset of the 28 features, how should we select those "best" features?
Caption of figure 4
The caption of figure 4 contains the following sentence: "Only 0.005% of cross-label-class distances is below 3 on the same dataset as Fig. 3 histogram." We understand how the 0.005% is computed, but why is it important to report the share of the area below the specific value 3? What information does this number provide?
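For reference, this is how we compute such a percentage: the share of cross-label distances that fall strictly below a threshold. The function name `fraction_below` and the toy numbers are ours and are not data from the paper.

```python
import numpy as np

def fraction_below(distances, threshold=3.0):
    """Share of distances strictly below the threshold (0.0 to 1.0)."""
    distances = np.asarray(distances, dtype=float)
    return float(np.mean(distances < threshold))

# Toy distances only: two of the four values are below 3.
toy = [1.0, 2.0, 3.0, 4.0]
share = fraction_below(toy)          # 0.5
percent = 100.0 * share              # 50.0
```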