reworking the FFT in power_spectrum #2
After reading over the docs a couple more times, it looks like the output of this particular function isn't getting normalized. Normalization only happens for inverse FFTs (unless I am misinterpreting).
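For context: numpy's default ("backward") convention leaves the forward transform unscaled and divides by N only on the inverse, while rustfft normalizes neither direction, so a port has to apply that 1/N factor itself. A minimal sketch of that round trip, assuming the rustfft crate (this is illustrative, not code from this repo):

```rust
use rustfft::{num_complex::Complex, FftPlanner};

// Forward-then-inverse with rustfft multiplies every element by N,
// because rustfft applies no scaling in either direction. To match
// numpy's default convention, divide by N after the inverse.
fn round_trip(signal: &mut Vec<Complex<f32>>) {
    let n = signal.len();
    let mut planner = FftPlanner::new();
    let forward = planner.plan_fft_forward(n);
    let inverse = planner.plan_fft_inverse(n);

    forward.process(signal);
    inverse.process(signal);

    // Undo the factor of N introduced by the round trip.
    for x in signal.iter_mut() {
        *x /= n as f32;
    }
}
```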
It's happening along the last axis, but the number of elements isn't determined by the number of elements in that vector, and it looks like the output should be a 2D matrix.
There is an example in C, if that helps?
Thanks for the links. It's definitely useful to get some reference to other implementations. I'm going to look at this again sometime Monday.
The 42io/dataset is quite a good example, as dumping to stdout is a good method for creating a pipeline. Also, SpecAugment is now quite commonly used and is an optional part of MFCC creation. @skewballfox there is https://github.com/HEnquist/realfft, but compared to, say, pocketfft or FFTW, I don't think there is a Neon implementation, and Rust itself, from a complete noob perspective, looks at first glance lacking in embedded Arm optimisation.
Still pretty new to this stuff: is this supposed to be handled directly by the library, or more as a compile-time optimization? I think the language has support for (most) Neon instruction sets, going off this issue and the last corresponding PR. Also, a bit off topic, but what are some good resources for learning about this topic?
Dunno, as I am the same with C in that I can hack bits and pieces together. I had a little look out of interest at Rust, but apart from 'let mut' being about bringing the dog in, the only thing it seems to me is that you either create your own crates or you are limited to what Rust supplies. I have been wondering, especially with voice AI being so Arm-centric, whether even the Arm libs might be the way to go, but with FFT it's the usual FFTW, though the new one that TensorFlow and Torch have employed is pocketfft. I doubt I am any further on than you, apart from when you mentioned reverse/iFFT (well, you mentioned real FFT, but reverse got a mention). I still think plain old C is a better option, as it's so much more portable across platforms in a general way (yeah, you will get arguments against), but also the existing codebase and set of examples is much bigger.
Some nuggets of info about SIMD, C, Rust, and embedded. Neither C nor Rust provides any architecture-agnostic SIMD; both rely on libraries that take advantage of it to offer something more comfortable. Not long ago (1.59.0) Rust added stable support (it was nightly-only before that) for some SIMD on AArch64. Overall, C support is much more mature (especially in embedded), while Rust's is improving rapidly. Other than ready-made libs (which are important, but can be wrapped), I can't see a huge impact in Rust vs Clang C embedded-wise. And the funny thing is that Rust has opportunities to be faster than C: ownership enables writing non-allocating code even in complex codebases, and pointer aliasing does not happen in Rust, which is another source of potential optimizations. And on the topic of RustFFT and Neon, there's a PR working exactly on that, now that Neon intrinsics are finally in stable Rust.
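To make the stabilization concrete, here is a tiny illustrative sketch (not from any crate discussed here) using the AArch64 Neon intrinsics that landed in stable Rust 1.59 via std::arch::aarch64:

```rust
// Lane-wise addition of two 4-float vectors with Neon intrinsics.
// Only compiles on aarch64, where Neon is enabled by default.
#[cfg(target_arch = "aarch64")]
fn add_four(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::aarch64::*;
    unsafe {
        let va = vld1q_f32(a.as_ptr()); // load 4 lanes
        let vb = vld1q_f32(b.as_ptr());
        let vc = vaddq_f32(va, vb); // lane-wise add
        let mut out = [0.0f32; 4];
        vst1q_f32(out.as_mut_ptr(), vc); // store back
        out
    }
}
```

The RustFFT Neon PR mentioned above builds its fast paths out of exactly these kinds of intrinsics.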
That is exactly what I was saying: "working on" still doesn't mean it's available, which is what I was pointing out. All the complex math libs such as FFTW, OpenBLAS, and various others are highly optimised after years of development, and the likes of Intel and Arm create API-identical optimised libs. It's likely that a fresh generalised lib such as one in Rust is not as performant as, say, FFTW or pocketfft, and definitely not as performant as the platform-optimised vendor versions, which Rust lacks. I am not bothered what you use, but that is very much the situation. Probably no bother, though, as really it's already part of torchaudio and tf.signal, so you probably don't even need to provide it unless fanning out to multiple models with the same MFCC.
Coming back to this, as it is actually the current source of failure for the MFCC test. My intuition says that zero padding would probably be correct here, since this is a 512-point FFT and, in both the Python version and this implementation, there are only 320 entries along the specified axis of the frames matrix, but I need to confirm.
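If zero padding does turn out to be right, one way it could look in Rust, sketched with the realfft crate linked above (the function name and the 1/NFFT scaling are my assumptions, following the python_speech_features-style periodogram, not this repo's code):

```rust
use realfft::RealFftPlanner;

// Zero-pad a 320-sample frame up to n_fft (e.g. 512) and return the
// power spectrum of that frame.
fn power_spectrum_frame(frame: &[f32], n_fft: usize) -> Vec<f32> {
    let mut planner = RealFftPlanner::<f32>::new();
    let fft = planner.plan_fft_forward(n_fft);

    // make_input_vec() gives a zero-filled buffer of length n_fft, so
    // copying the 320 samples in leaves the tail zero-padded.
    let mut input = fft.make_input_vec();
    input[..frame.len()].copy_from_slice(frame);

    let mut output = fft.make_output_vec(); // n_fft / 2 + 1 complex bins
    fft.process(&mut input, &mut output).unwrap();

    // Periodogram estimate: |X[k]|^2 / NFFT, one common convention
    // (e.g. python_speech_features' powspec).
    output.iter().map(|c| c.norm_sqr() / n_fft as f32).collect()
}
```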
Okay, so I've been tackling translating this this morning, and I realized that the translation for this part is going to be a bit more complicated. `np.fft.rfft` doesn't stand for reverse: it's an FFT calculated for real input, and it outputs complex numbers (see the docs). For more information on the difference between `np.fft.fft` and `np.fft.rfft`, please see this Stack Overflow question. There are only two crates that handle this: realfft and ndrustfft.

Also, it seems that the `fft`s from numpy normalize a portion of the returned matrices, where the third argument (`norm=None`) actually indicates that it should use the default normalization option (backward), which appears to be distinct from an inverse FFT (see the list of real FFTs). `ndrustfft` may be a better fit here, but I'm still trying to figure out if it has options for normalization.
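For reference, a rough sketch of how the ndrustfft route might look for the 2D frames matrix. The API names are taken from ndrustfft's docs, but exact signatures have varied between versions, so treat this as approximate rather than as this repo's code:

```rust
use ndarray::Array2;
use ndrustfft::{ndfft_r2c, Complex, R2cFftHandler};

// Real-to-complex FFT along axis 1 (the within-frame axis), mirroring
// np.fft.rfft: n_fft real inputs become n_fft / 2 + 1 complex bins.
fn rfft_frames(frames: &Array2<f64>) -> Array2<Complex<f64>> {
    let (n_frames, n_fft) = frames.dim();
    let mut handler = R2cFftHandler::<f64>::new(n_fft);
    let mut spectrum = Array2::<Complex<f64>>::zeros((n_frames, n_fft / 2 + 1));
    ndfft_r2c(frames, &mut spectrum, &mut handler, 1);
    spectrum
}
```

If ndrustfft follows rustfft's convention of leaving the forward transform unscaled, any numpy-style normalization would have to be applied by hand afterwards.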