Text mining essentially means converting a collection of documents into a meaningful numeric representation (i.e., a data set). This data set can be joined with conventional structured data or used on its own, and statistical or machine learning analysis is then conducted on it for inferential or predictive purposes.
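As a rough illustration of what "numeric representation" can mean, the sketch below builds a simple document-term count matrix using only the Python standard library. It is a minimal example under stated assumptions, not the method used in the course materials: the function names `tokenize` and `document_term_matrix` and the two sample sentences are hypothetical, and real workflows usually add steps such as stop-word removal, stemming, and term weighting.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct term seen in any document, sorted.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for terms that a document does not contain.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical sample documents, made up for this illustration.
docs = [
    "The meeting is moved to Monday.",
    "Please review the contract before the meeting.",
]

vocab, dtm = document_term_matrix(docs)
print(vocab)
for row in dtm:
    print(row)
```

Each row is one document and each column is one term, so the result can be treated like any other data set: for example, joined to structured fields by a document identifier and passed to a statistical or machine learning model.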
- Text Mining with SAS Text Miner - Blackboard electronic reserves
- SAS Code Basic Text Manipulation example (beginning at Line 140; also available in the SODA environment in the 'SAS_Workshop' folder)
- Python Code Basic Text Manipulation example
- Enron Sample Data - Blackboard electronic reserves
- Text Analytics Using SAS Enterprise Miner - Blackboard electronic reserves
- Term Embedding References
- GloVe, by Jeffrey Pennington, Richard Socher, and Christopher D. Manning
- Word2Vec
- Efficient Estimation of Word Representations in Vector Space, by Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean
- Linguistic Regularities in Continuous Space Word Representations, by Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig