pre-trained embedding features #110
Comments
Also crema... which I think is gonna be my use case. Why would this involve an API change?
Absolutely. Though with crema, there's a circular dependency problem: crema depends on pumpp, so we can't have pumpp depend on crema. Probably the best option here is to add a pumpp feature extractor class to each crema model. This should be fairly easy to do. Extending foreign objects (i.e. from another package) is generally a bad idea, but in this case we're in control of both packages, so it shouldn't be a huge deal.
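The pattern above (attaching a pumpp-style extractor to each crema model) could be sketched roughly as follows. All class names here are simplified stand-ins for illustration, not the real pumpp or crema APIs:

```python
class FeatureExtractor:
    """Minimal stand-in for pumpp's FeatureExtractor base class."""
    def __init__(self, name, sr, hop_length):
        self.name, self.sr, self.hop_length = name, sr, hop_length

    def transform_audio(self, y):
        raise NotImplementedError


class CremaModel:
    """Minimal stand-in for a crema model with fixed analysis parameters."""
    sr = 44100
    hop_length = 4096

    def outputs(self, y, sr):
        # real crema models return a dict of named output arrays;
        # this stand-in just emits one frame per hop
        return {'chord': [0.0] * (len(y) // self.hop_length)}


class CremaExtractor(FeatureExtractor):
    """Wraps a crema model so a Pump can drive it like any other extractor."""
    def __init__(self, model, name='crema'):
        super().__init__(name, sr=model.sr, hop_length=model.hop_length)
        self.model = model

    def transform_audio(self, y):
        return self.model.outputs(y=y, sr=self.sr)


# The "extending foreign objects" part: each crema model class grows a
# method that hands back its pumpp extractor.
CremaModel.pump = lambda self: CremaExtractor(self)
```

Since both packages are under the same maintainers' control, bolting `pump()` onto the model classes avoids any import of crema from inside pumpp.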
Caveat about wrapping pre-trained embeddings in general:
I like the idea of making the integrations their own opt-in submodule:

```python
import pumpp
from pumpp.feature.integrations import vggish, openl3, keras

pump = pumpp.Pump(vggish.VGGish(...), openl3.OpenL3(...), keras.H5Model(...))
```

If we do that, then we don't have to worry about circular dependencies, so we could add crema there as well. (Or alternatively we can have them imported by default and do:

```python
import os

from pumpp.feature.base import FeatureExtractor


class Crema(FeatureExtractor):
    def __init__(self, name='crema', model_dir='./model', *a, **kw):
        from crema.models.base import CremaModel
        self.model = CremaModel()
        self.model._instantiate(os.path.abspath(model_dir))
        super().__init__(name, *a, **kw)

    def transform_audio(self, y):
        return self.model.outputs(y=y, sr=self.sr)
```

)
Actually maybe the best option would be to do what @bmcfee suggested (crema model extending Pump) and then add vggish and openl3 as crema models. |
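A crema model extending Pump might look something like the sketch below. `Pump` and the model internals here are simplified stand-ins for illustration only, not the real pumpp or crema APIs:

```python
class Pump:
    """Minimal stand-in for pumpp.Pump: applies a list of feature ops."""
    def __init__(self, *ops):
        self.ops = list(ops)

    def transform(self, y, sr):
        data = {}
        for op in self.ops:
            data.update(op(y, sr))
        return data


class ChordModel(Pump):
    """A 'crema model' built as a Pump over its own feature ops."""
    def __init__(self):
        super().__init__(self._features)

    def _features(self, y, sr):
        # stand-in for the model's real front-end + network outputs
        return {'chord_pitch': [sum(y) / max(len(y), 1)]}
```

Under this design, vggish and openl3 would just be more model classes of the same shape, and pumpp never has to import any of them.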
Description
pumpp features currently rely on low-level librosa implementations, but we could also have wrappers for pre-trained feature extractors like openl3 and vggish (the latter as implemented by openmic).
There are some details to work out in terms of standardizing the parameters (hop size, etc.).
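One way to standardize hop sizes across wrapped extractors is to resample each extractor's frame sequence onto a common target hop, e.g. by nearest-neighbor lookup. This is only an illustrative sketch; the function and parameter names are hypothetical, not part of pumpp:

```python
def align_frames(frames, hop_src, hop_dst, n_dst):
    """Map frames spaced `hop_src` samples apart onto a grid of `n_dst`
    frames spaced `hop_dst` samples apart, taking the nearest source frame
    for each target position."""
    out = []
    for i in range(n_dst):
        t = i * hop_dst                              # target position (samples)
        j = min(round(t / hop_src), len(frames) - 1) # nearest source frame
        out.append(frames[j])
    return out
```

In practice you'd likely interpolate feature values rather than copy the nearest frame, but the time-grid bookkeeping is the same either way.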