Section 08: Text Mining

Text mining converts a collection of documents into a meaningful numeric representation (i.e., a data set). That data set can be used on its own or joined with conventional structured data, and statistical or machine learning analysis is then applied to it for inferential or predictive purposes.
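As a rough illustration of that conversion, the sketch below builds a simple document-term count matrix using only the Python standard library and then appends one structured column to it. The toy documents, the `priority` column, and the helper names `tokenize` and `document_term_matrix` are all hypothetical and are not taken from the class materials; term counts are only one of several possible numeric representations.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct term seen in any document, sorted
    # so that the column order is reproducible.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical toy corpus (illustration only).
documents = [
    "The meeting is moved to Friday.",
    "Please review the contract before the meeting.",
    "Contract signed; no meeting needed.",
]

# Hypothetical structured variable, one value per document.
priority = [0, 1, 1]

vocabulary, matrix = document_term_matrix(documents)

# Join the text-derived columns with the structured column, row by row.
columns = vocabulary + ["priority"]
dataset = [row + [p] for row, p in zip(matrix, priority)]

for row in dataset:
    print(dict(zip(columns, row)))
```

Each row of `dataset` is one document, expressed as term counts plus the structured value, which is the form a statistical or machine learning procedure would consume. In practice the counts are often weighted or reduced in dimension before modeling.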

Class notes

Overview of text mining techniques
Text Mining with SAS Text Miner - Blackboard electronic reserves
SAS Code Basic Text Manipulation example
(Begins at line 140; also available in the SODA environment, in the 'SAS_Workshop' folder.)
Python Code Basic Text Manipulation example
- Part 1
- Part 2
EM Text Miner example
Enron Sample Data - Blackboard electronic reserves

Sample Quiz

Quiz Key

Supplementary References

Text Analytics Using SAS Enterprise Miner - Blackboard electronic reserves
Term Embedding References
- GloVe
  by Jeffrey Pennington, Richard Socher, and Christopher D. Manning
- Word2Vec
  - Efficient Estimation of Word Representations in Vector Space
    by Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean
  - Linguistic Regularities in Continuous Space Word Representations
    by Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig
  - Word2Vec software
