Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
Updated Feb 23, 2021 - Python
Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
End-to-end speech synthesis with recurrent neural networks
This repository contains PyTorch implementations of four different models for speech emotion classification.
Easier audio-based machine learning with TensorFlow.
CNN 1D vs 2D audio classification
A simple audio feature extraction library
A librosa-style STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.
Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.
Urban sound source tagging from an aggregation of four-second noisy audio clips via 1D and 2D CNN (Xception)
Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Attention-based Hybrid CNN-LSTM and Spectral Data Augmentation for COVID-19 Diagnosis from Cough Sound
Signal classification and recognition based on the mel spectrogram
Basic WaveNet and FFTNet vocoder models.
Framework for one-shot multispeaker system based on Deep Learning
Code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features" arXiv:2110.08862, 2021.
Open Source Implementation of Neural Voice Cloning with Few Audio Samples (Baidu Research)
Cough detection with Log Mel Spectrogram, Wavelet Transform, Deep learning and Transfer learning techniques
This study converts piano recordings to mel spectrograms and classifies them with fine-tuned, state-of-the-art pre-trained computer vision backbones. Comparative experiments show that SqueezeNet achieves the best classification accuracy, 92.37%.