AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names' #19

Open
shelbywhite opened this issue Mar 22, 2023 · 7 comments


@shelbywhite

I'm trying to run this code on Google Colab and am now seeing an error. I'm only using the demo provided in this repo, but it throws the following error:


AttributeError Traceback (most recent call last)
in
3 # Fit the Concept model to the images and vocabulary
4 concept_model = ConceptModel()
----> 5 concepts = concept_model.fit_transform(img_names, docs=selected_nouns)
6
7 # Get the predicted probabilities for each concept cluster for each image

1 frames
/usr/local/lib/python3.9/dist-packages/concept/_model.py in _extract_textual_representation(self, docs)
400 # Extract vocabulary from the documents
401 self.vectorizer_model.fit(docs)
--> 402 words = self.vectorizer_model.get_feature_names()
403
404 # Embed the documents and extract similarity between concept clusters and words

AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

@MaartenGr
Owner

Ah, I believe that is an issue with the scikit-learn version. If you install a scikit-learn version older than 1.0, it should work.
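To check whether an installed version is affected, a small version test can help. The sketch below assumes the commonly documented timeline in which `get_feature_names` was deprecated in scikit-learn 1.0 and removed in 1.2; the helper name `has_legacy_get_feature_names` is hypothetical and not part of Concept or scikit-learn.

```python
import re


def has_legacy_get_feature_names(sklearn_version: str) -> bool:
    """Return True if this scikit-learn version still ships get_feature_names().

    Assumption: the method was removed in scikit-learn 1.2, so any
    version below 1.2 still has it (with a deprecation warning from 1.0 on).
    """
    # Read only the leading "major.minor" part, so strings such as
    # "1.2.dev0" or "1.1.3" are handled the same way.
    match = re.match(r"(\d+)\.(\d+)", sklearn_version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {sklearn_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (1, 2)
```

In a notebook, this could be called as `has_legacy_get_feature_names(sklearn.__version__)` after `import sklearn`.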

@renswilderom

Hello Maarten, I had the same issue. Installing a sklearn version older than 1.0 will probably work indeed.

What I understand from this SO post is that get_feature_names is deprecated and replaced by get_feature_names_out() from sklearn version 1.0 and higher.
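For code that has to run on both older and newer scikit-learn versions, one option is a small wrapper that prefers the new method and falls back to the old one. This is a minimal sketch, not the fix applied in this repo; the helper name `feature_names` is hypothetical, and it only assumes the vectorizer exposes one of the two methods mentioned above.

```python
def feature_names(vectorizer):
    """Return the fitted vocabulary as a list, using whichever method exists."""
    # Newer scikit-learn versions (1.0 and higher) provide get_feature_names_out().
    if hasattr(vectorizer, "get_feature_names_out"):
        return list(vectorizer.get_feature_names_out())
    # Older versions only provide get_feature_names().
    return list(vectorizer.get_feature_names())
```

With a fitted `CountVectorizer`, `feature_names(vectorizer)` could stand in for the direct `vectorizer.get_feature_names()` call shown in the traceback.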

@MaartenGr
Owner

Also, I would advise using BERTopic instead, as it has more options for multi-modal topic modeling.

@renswilderom

OK - thanks for the tip. I was already using BERTopic for text, but didn't know it had this multimodal feature. Great!

@BingBing20230401

Thanks! It solved my problem too!

> What I understand from this SO post is that get_feature_names is deprecated and replaced by get_feature_names_out() from sklearn version 1.0 and higher.

@bjornekstrom
Contributor

bjornekstrom commented Oct 29, 2024

Would it be possible to make this work with a newer version of scikit-learn?

Edit: I changed get_feature_names() to get_feature_names_out() so that passing nouns to Concept works with newer versions of scikit-learn. I take it that the released version on pip still needs to be updated, but the _model.py version in the repo now works. A pull request has been created: #24.

@bjornekstrom
Contributor

This has been solved through #24.

5 participants