Changed prediction to run with multithreading #54
base: stable
Conversation
Hello @skjerns, this is great! I will merge your PR and adapt it to the new major release. Until it's done I will keep this PR open. Best, Darius
Meanwhile I have found another solution that speeds things up to almost real-time predictions: I altered the example for C:

```c
int main(int argc, const char * argv[]) {
    if ((argc - 1) % n_features != 0) {
        printf("Need to supply N x %d features flattened, %d were given", n_features, argc - 1);
        return 1;
    }
    double features[n_features];
    int n_rows = (argc - 1) / n_features;
    for (int row = 0; row < n_rows; row++) {
        printf("row: %d\n", row);
        for (int i = 0; i < n_features; i++) {
            features[i] = atof(argv[i + row * n_features + 1]);
        }
        // calculate outputs for debugging
        int class_idx = predict_class_idx(features);
        // same as calling label = predict(features)
        int label = labels[class_idx];
        // now we print the results
        printf("labels: ");
        for (int i = 0; i < n_classes; i++) {
            printf("%d ", labels[i]);
        }
        printf("\n");
        printf("class_idx: %d\n", class_idx);
        printf("label: %d", label);
        printf("\n\n");
    }
    return 0;
}
```
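Since the binary expects all rows flattened into `argv`, a caller just concatenates the feature values row-major. A minimal Python sketch of that call pattern (here `echo` stands in for the compiled estimator binary, whose name is an assumption):

```python
import subprocess

# Two 4-feature rows, flattened row-major into string arguments for argv
X = [[5.1, 3.5, 1.4, 0.2],
     [6.2, 2.8, 4.8, 1.8]]
args = [str(v) for row in X for v in row]

# "echo" stands in for the compiled estimator binary (e.g. "./estimator")
out = subprocess.check_output(["echo"] + args).decode().strip()
```

With the real binary, `out` would contain the printed labels and class indices for all rows from a single process invocation.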
In the next release all internal predictions will be multiprocessed by default. Here is the relevant part:
Yes, SIMD operations would be nice. But for now I prefer a simple and intuitive starting point where a developer can easily change and extend the generated source code. Nevertheless I see and understand the need, so I would suggest that we create an additional interactive example (something like that) where we demonstrate the customization and the final benefit. The current scaffold of a template is here. What do you think?
Thanks for the note! That sounds great. I removed all checks that are related to the operating system:
Great! Might be handy to include a

edit: Ah, I guess that's done by DEPENDENCIES
Great! Nice.
I'll leave it up to you. Having the source code of individual language templates would be feasible, I guess?
I saw that the `predict` or `integrity_score` is running quite slow. I've added functionality to let it run with `threading`, making it much faster. It adds a dependency on `joblib`; however, this is already a dependency of `sklearn`, so no new dependencies are really added. This makes the code ~8x faster (with 8 threads).

I've changed the call from `Shell.check_output` to `subprocess.check_output`. `Shell` is calling `subprocess.check_output` in the background anyway, but like this we get another speedup of ~3-4x, so a total speedup of ~30x is possible.

Example:

I've also seen that `integrity_score` runs perfectly fine on Windows, given that `gcc` is installed (and the hard-coded blocking of Windows is removed). Do you think we can remove the blocking of the function for Windows platforms?
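A minimal sketch of the combination described above, assuming joblib's threading backend dispatching one `subprocess.check_output` call per row (`echo` stands in for the compiled estimator binary):

```python
import subprocess
from joblib import Parallel, delayed

def run_row(row):
    # one external call per sample; the GIL is released while
    # the thread waits on the subprocess, so threads overlap well here
    return subprocess.check_output(["echo"] + row).decode().strip()

rows = [["5.1", "3.5", "1.4", "0.2"],
        ["6.2", "2.8", "4.8", "1.8"]]
outputs = Parallel(n_jobs=8, backend="threading")(
    delayed(run_row)(r) for r in rows)
```

The threading backend (rather than joblib's default process backend) fits this workload because the heavy lifting happens in the external process, not in Python bytecode.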