Add discussion of pandas (in lieu of numpy?) #2
Comments
Anyone who is interested in this tutorial is probably ready for pandas. Upon reflection, I don't think it adds too much complexity.
I also think including pandas is a good idea. DataFrames and label-based slicing are very useful in our context and actually make things a lot more intuitive.
How should I weave it in? Should there be a separate tutorial showing slicing by label, etc.?
Not sure about this. Maybe a brief section on dealing with the tabular data output by MALLET or NMF could be added somewhere and then referenced in the relevant places: reading such data into a pandas DataFrame, slicing by label, etc. It is a kind of bridge between TM/NMF and the visualization part, so it could also fit at the beginning of the Visualization chapter.
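A minimal sketch of what such a section might show, assuming the document-topic shares have already been written out as a tab-separated table with one row per document. The file contents, document names, and topic column names below are made up for illustration and are not real MALLET or NMF output.

```python
import io

import pandas as pd

# Hypothetical document-topic shares (illustrative values only).
# In practice this would be a file path instead of an in-memory string.
raw = io.StringIO(
    "doc\ttopic0\ttopic1\ttopic2\n"
    "austen_emma\t0.70\t0.20\t0.10\n"
    "austen_pride\t0.60\t0.30\t0.10\n"
    "bronte_jane\t0.10\t0.30\t0.60\n"
)

# Read the table, using the document name as the row label.
doctopic = pd.read_csv(raw, sep="\t", index_col="doc")

# Label-based slicing: one document's topic shares (a Series).
emma = doctopic.loc["austen_emma"]

# Label-based slicing on both axes: selected documents and topics.
subset = doctopic.loc[["austen_emma", "bronte_jane"], ["topic0", "topic2"]]

# Column label of the largest share in each row.
dominant = doctopic.idxmax(axis=1)

# The underlying NumPy array is still available for array-based code.
values = doctopic.to_numpy()
```

Because `to_numpy()` returns a plain array, code written against the basic numpy/scipy stack can keep working on `values` while the labels stay available on `doctopic`.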
Pandas does make many operations much easier. We need to find sensible ways of integrating mentions of its uses. In principle, I think the tutorials should only require familiarity with the "basic" numpy/scipy stack.