ValueError: need at least one array to concatenate #5

Open
javierrodenas opened this issue Jun 11, 2019 · 5 comments
Comments

@javierrodenas

javierrodenas commented Jun 11, 2019

Hi!
I get the following error when training the model:

File ".\ubm.py", line 202, in <module>
ubm.train()
File ".\ubm.py", line 50, in train
iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\mixture.py", line 672, in EM_split
self._init(features_server, feature_list, num_thread)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\mixture.py", line 629, in _init
features = features_server.stack_features_parallel(feature_list, num_thread=num_thread)
File "C:\Users\jrodenas\Desktop\SpeakerRecognition\Speaker-Recognition\sidekit\features_server.py", line 666, in stack_features_parallel
return numpy.concatenate(output, axis=0)
ValueError: need at least one array to concatenate

I ran data_init.py first and its output was created correctly. Then I ran extract_features.py and, finally, ubm.py to train the model.
How can I solve this?
Thank you in advance!

@yangxiaokang

Maybe you can try it on Linux.

@Anwarvic
Owner

Anwarvic commented Jun 13, 2019

I believe that the features_server is empty and didn't load any features. The most probable cause is the location that is supposed to contain the features. To make sure everything is as expected, do the following:

  • First, open the configuration file conf.yaml and tell me the values of the YAML keys outpath and sampling_rate.
  • Then, go to the outpath, where you should find at least the following folders: audio, feat, and task.
  • Inside {outpath}/feat, you should find at least two folders: enroll and test. I need to know how many files are inside each.
  • Also, I need to see the code calling ubm.EM_split() in ubm.py.
  • Finally, I need to know the line responsible for creating the FeatureServer inside ubm.py. I'm expecting something like server = self.createFeatureServer("enroll"). Is that right?
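
The folder checks in the list above can also be scripted. The following is a minimal sketch, assuming the layout described in this thread (audio and feat under outpath, each with enroll and test subfolders). count_files is a hypothetical helper written for illustration; it is not part of this repository or of sidekit.

```python
from pathlib import Path


def count_files(outpath):
    """Return {relative_folder: file_count} for the folders the scripts should fill.

    Hypothetical helper: the folder names are taken from this thread,
    not read from conf.yaml.
    """
    root = Path(outpath)
    report = {}
    for rel in ("audio/enroll", "audio/test", "feat/enroll", "feat/test"):
        folder = root / rel
        if folder.is_dir():
            # Count regular files only; subfolders are ignored.
            report[rel] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            # None marks a folder that does not exist at all.
            report[rel] = None
    return report


if __name__ == "__main__":
    import sys

    # Pass the outpath value from conf.yaml as the first argument.
    outpath = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel, n in count_files(outpath).items():
        print(f"{rel}: {'missing' if n is None else n}")
```

A folder reported as missing or with 0 files would be consistent with an empty feature list reaching numpy.concatenate, which is where the traceback ends.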

@javierrodenas
Author

javierrodenas commented Jun 14, 2019

@Anwarvic first of all, thanks for your answer.

Answering your questions:

  • As outpath I have ./SpeakerRecognition/Speaker-Recognition/Merged_Arabic_Corpus_of_Isolated_Words/ and as sample_rate I have 44100 (default value).

  • In outpath I can find the audio, feat and task folders, but inside feat I can only see the enroll folder; there is no test folder. Inside the enroll folder there are 6 files:
    • enroll_idmap.h5
    • plda_idmap.h5
    • test_idmap.h5
    • test_ndx.h5
    • test_trials.txt
    • tv_idmap.h5

  • Also, ubm.EM_split() has the default structure:

        ubm.EM_split(
            features_server=server,  # sidekit.FeaturesServer used to load data
            feature_list=train_list,  # list of feature files to train the model
            distrib_nb=self.NUM_GAUSSIANS,  # number of Gaussian distributions
            num_thread=self.NUM_THREADS,  # number of parallel processes
            save_partial=False,  # if False, it only saves the last model
            iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8)
        )

    and the script is run with:

        if __name__ == "__main__":
            conf_filename = "conf.yaml"
            ubm = UBM(conf_filename)
            ubm.train()
            ubm.evaluate()
            ubm.plotDETcurve()
            print("Accuracy: {}%".format(ubm.getAccuracy()))

  • Finally, you are right. The creation of the FeatureServer is done with server = self.createFeatureServer("enroll")

Also, inside the audio folder I can find the data, enroll and test folders, but they are all empty. Besides this, the task folder has the same 6 files as feat/enroll.

Thank you in advance.

@Anwarvic
Owner

Now, the problem is that you haven't extracted the features from the data yet. So, follow these steps:

  • First, download the data from here.
  • Then, run data_init.py. After running it, you will find that two folders have been created in {outpath}:
    • {outpath}\audio: which will contain at least two folders, enroll and test. Inside each folder you will find audio files that you can listen to.
    • {outpath}\task: which will contain the files that you listed above.
  • Then, run the extract_features.py script, which will create another directory in {outpath} called feat. Inside this folder you should find at least two other folders: enroll and test.
  • After running these two scripts, you can run ubm.py without problems.

If you need more information, please check this README.md file, where I explained as many details as I could.
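
To see which of the steps above still has to be done, a rough check like the following may help. It is only a sketch under one assumption: a step counts as done when the folders it is supposed to fill contain at least one file. has_files and next_script are hypothetical names used for illustration and do not exist in the repository.

```python
from pathlib import Path


def has_files(folder):
    """True if `folder` exists and contains at least one regular file."""
    folder = Path(folder)
    return folder.is_dir() and any(p.is_file() for p in folder.iterdir())


def next_script(outpath):
    """Guess which script should be run next, in the order described above.

    Heuristic only: it looks at whether the expected folders contain files,
    not at whether those files are valid.
    """
    root = Path(outpath)
    # data_init.py is expected to put audio files in audio/enroll and audio/test.
    if not (has_files(root / "audio" / "enroll") and has_files(root / "audio" / "test")):
        return "data_init.py"
    # extract_features.py is expected to create and fill feat/enroll and feat/test.
    if not (has_files(root / "feat" / "enroll") and has_files(root / "feat" / "test")):
        return "extract_features.py"
    return "ubm.py"
```

With audio/enroll and audio/test empty, as reported earlier in this thread, next_script would return "data_init.py", which suggests that the first step did not finish producing audio.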

@TeppieC

TeppieC commented Nov 10, 2020

Hi, maybe it's too late, but I believe that you forgot to install sox, so convert_wav() did not work as expected.
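
A quick way to check for this is to ask Python whether a sox executable is on the PATH. This is a small sketch that assumes convert_wav() relies on the sox command-line tool being found through PATH, which I have not verified in the code; sox_available is a hypothetical helper name.

```python
import shutil


def sox_available():
    """Return the full path of the `sox` executable found on PATH, or None."""
    return shutil.which("sox")


if __name__ == "__main__":
    path = sox_available()
    if path is None:
        # Assumption: the audio conversion step calls the sox command-line tool.
        print("sox was not found on PATH; audio conversion that relies on it may fail.")
    else:
        print(f"sox found at {path}")
```

If sox is reported as missing, installing it and rerunning data_init.py before extracting features would be the first thing to try.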
