Changed prediction to run with multithreading #54

Open · wants to merge 6 commits into base: stable
Conversation

@skjerns commented on May 15, 2019

I noticed that predict and integrity_score run quite slowly.

  1. I've added the option to run the predictions with threading, which makes them much faster.
     This adds a dependency on joblib; however, joblib is already a dependency of sklearn, so effectively no new dependency is introduced. This change makes the code ~8x faster (with 8 threads).

  2. I've changed the call from Shell.check_output to subprocess.check_output. Shell calls subprocess.check_output in the background anyway, but calling it directly gives another ~3-4x speedup.

So a total speedup of ~30x is possible.

Example:

import numpy as np
import sklearn_porter
from sklearn.ensemble import RandomForestClassifier

train_x = np.random.rand(1000, 8)
train_y = np.random.randint(0, 4, 1000)

rfc = RandomForestClassifier(n_estimators=10)
rfc.fit(train_x, train_y)
        
porter = sklearn_porter.Porter(rfc, language='c')
porter.integrity_score(train_x) # ~30 times faster.
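
Roughly, the threaded loop looks like the following sketch (not the exact diff; the binary path ./estimator and the helper names predict_one/predict_parallel are just placeholders, and it assumes the compiled program prints the predicted label):

import subprocess
from joblib import Parallel, delayed

def predict_one(features, binary='./estimator'):
    # One call to the compiled estimator per sample; the work happens in the child process.
    args = [binary] + [str(float(f)) for f in features]
    out = subprocess.check_output(args)
    return int(out.strip())

def predict_parallel(X, binary='./estimator', n_jobs=8):
    # Threads are enough here: each thread mostly waits on its subprocess,
    # so the GIL is not a bottleneck and 8 threads give roughly the ~8x above.
    return Parallel(n_jobs=n_jobs, prefer='threads')(
        delayed(predict_one)(x, binary) for x in X
    )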

I've also seen that integrity_score runs perfectly fine on Windows, given that gcc is installed (and the hard-coded blocking of Windows is removed). Do you think we can remove the blocking of this function on Windows platforms?

@nok (Owner) commented on Jun 25, 2019

Hello @skjerns,

This is great! I will merge your PR and adapt it to the new major release. Until that's done I will keep this PR open.

Best, Darius

@skjerns (Author) commented on Jun 25, 2019

In the meantime I have found another solution that speeds things up to almost real-time predictions:

I altered int main() {...} so that it accepts several data points as input instead of just one. This way I can verify several hundred inputs in a single call. I'll open another PR proposing this soon if you want. However, it's a deeper alteration of the generated code and needs to be done for each language individually, so it might not be preferable.

Example for C:

int main(int argc, const char * argv[]) {
    if ((argc - 1) % n_features != 0) {
        printf("Need to supply N x %d features flattened, %d were given", n_features, argc - 1);
        return 1;
    }
    double features[n_features];
    int n_rows = (argc - 1) / n_features;
    for (int row = 0; row < n_rows; row++) {
        printf("row: %d\n", row);
        for (int i = 0; i < n_features; i++) {
            features[i] = atof(argv[i + row * n_features + 1]);
        }
        // calculate outputs for debugging
        int class_idx = predict_class_idx(features);
        // same as calling label = predict(features)
        int label = labels[class_idx];

        // now we print the results
        printf("labels: ");
        for (int i = 0; i < n_classes; i++) {
            printf("%d ", labels[i]);
        }
        printf("\n");
        printf("class_idx: %d\n", class_idx);
        printf("label: %d", label);
        printf("\n\n");
    }
    return 0;
}
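
Driving the batch version from Python is then a single subprocess call, roughly like this (a sketch; ./estimator is again a placeholder and the parsing assumes the printf format above):

import subprocess
import numpy as np

def predict_batch(X, binary='./estimator'):
    # Flatten N x n_features samples into one argument list -> one process call.
    args = [binary] + [str(float(v)) for v in np.asarray(X).ravel()]
    out = subprocess.check_output(args).decode()
    # Pick up the "label: <int>" line printed per row.
    return [int(line.split(':')[1]) for line in out.splitlines()
            if line.startswith('label:')]

One caveat: the operating system limits the total command-line length, so very large batches would still have to be chunked.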

@nok (Owner) commented on Dec 19, 2019

In the next release all internal predictions will be multiprocessed by default. Here is the relevant part:
https://github.com/nok/sklearn-porter/blob/release/1.0.0/sklearn_porter/Estimator.py#L652-L682

> I altered int main() {...} so that it accepts several data points as input instead of just one. This way I can verify several hundred inputs in a single call. I'll open another PR proposing this soon if you want. However, it's a deeper alteration of the generated code and needs to be done for each language individually, so it might not be preferable.

Yes, SIMD operations would be nice. But for now I prefer a simple and intuitive starting point where a developer can change and extend the generated source code easily. Nevertheless I see and understand the need, so I would suggest that we create an additional interactive example (something like that) where we demonstrate the customization and the final benefit. The current scaffold of a template is here.

What do you think?

@nok (Owner) commented on Dec 19, 2019

> I've also seen that integrity_score runs perfectly fine on Windows, given that gcc is installed (and the hard-coded blocking of Windows is removed). Do you think we can remove the blocking of this function on Windows platforms?

Thanks for the note! That sounds great. I removed all checks that are related to the operating system:
https://github.com/nok/sklearn-porter/blob/release/1.0.0/sklearn_porter/Estimator.py#L701

@skjerns (Author) commented on Dec 21, 2019

>> I've also seen that integrity_score runs perfectly fine on Windows, given that gcc is installed (and the hard-coded blocking of Windows is removed). Do you think we can remove the blocking of this function on Windows platforms?
>
> Thanks for the note! That sounds great. I removed all checks that are related to the operating system:
> https://github.com/nok/sklearn-porter/blob/release/1.0.0/sklearn_porter/Estimator.py#L701

Great! It might be handy to include a gcc_installed() function that prints a warning etc.
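
Something along these lines would do, as a rough sketch (gcc_installed is just a placeholder name here, not an existing sklearn-porter function):

import shutil
import warnings

def gcc_installed():
    # Warn if no gcc executable is available on the PATH.
    if shutil.which('gcc') is None:
        warnings.warn('gcc was not found on the PATH; the generated C code '
                      'cannot be compiled for predict/integrity_score.')
        return False
    return True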

Edit: Ah, I guess that's handled by DEPENDENCIES.

@skjerns (Author) commented on Dec 21, 2019

> In the next release all internal predictions will be multiprocessed by default. Here is the relevant part:
> https://github.com/nok/sklearn-porter/blob/release/1.0.0/sklearn_porter/Estimator.py#L652-L682

Great! Nice.

>> I altered int main() {...} so that it accepts several data points as input instead of just one. This way I can verify several hundred inputs in a single call. I'll open another PR proposing this soon if you want. However, it's a deeper alteration of the generated code and needs to be done for each language individually, so it might not be preferable.
>
> Yes, SIMD operations would be nice. But for now I prefer a simple and intuitive starting point where a developer can change and extend the generated source code easily. Nevertheless I see and understand the need, so I would suggest that we create an additional interactive example (something like that) where we demonstrate the customization and the final benefit. The current scaffold of a template is here.
>
> What do you think?

I'll leave that up to you. Making the source code of the individual language templates available would be feasible, I guess?
