C program that analyzes the frequency of a set of keywords in a given text document. It reports textual analysis metrics, including:
- total word count
- frequency of each keyword per 1000 words
- total number of occurrences of each keyword
- average word length
- average sentence length (in words).
These metrics, when assessed across many documents by the same author, can be used to generate that author's stylistic thumbprint. Such a thumbprint is a powerful research tool in many applications of textual analysis, for example when determining the likely authorship of an anonymous text.
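As a rough illustration of how the metrics listed above could be computed, here is a minimal sketch in C. It is not the program's actual source. The keyword list, the values given to KEYS and KEYLEN, the Metrics struct, and the function names are all assumptions made for this example. A word is taken to be a run of letters, and a sentence is taken to end at '.', '!' or '?', which is a simplification (contractions such as "don't" are split in two).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define KEYS   3    /* number of keywords (illustrative value) */
#define KEYLEN 16   /* assumed here: buffer size per keyword, including '\0' */

/* Hypothetical keyword list, lowercase so matching is case-insensitive. */
static const char keywords[KEYS][KEYLEN] = { "the", "and", "of" };

/* Raw counts gathered in one pass; the averages are derived from these. */
typedef struct {
    long words;        /* total word count */
    long letters;      /* total letters inside words */
    long sentences;    /* number of '.', '!' and '?' seen */
    long hits[KEYS];   /* occurrences of each keyword */
} Metrics;

/* Scan a NUL-terminated text and fill in the raw counts. */
static void analyze(const char *text, Metrics *m)
{
    char word[64];     /* letters beyond 63 in one word are ignored */
    size_t len = 0;

    assert(text != NULL && m != NULL);
    memset(m, 0, sizeof *m);

    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalpha(c)) {
            if (len < sizeof word - 1)
                word[len++] = (char)tolower(c);
            continue;
        }
        if (len > 0) {                       /* a word just ended */
            word[len] = '\0';
            m->words++;
            m->letters += (long)len;
            for (int k = 0; k < KEYS; k++)
                if (strcmp(word, keywords[k]) == 0)
                    m->hits[k]++;
            len = 0;
        }
        if (c == '.' || c == '!' || c == '?')
            m->sentences++;
        if (c == '\0')
            break;
    }
}

/* Keyword frequency per 1000 words. */
static double per_thousand(long hits, long words)
{
    return words ? 1000.0 * (double)hits / (double)words : 0.0;
}

/* Average word length in letters. */
static double avg_word_length(const Metrics *m)
{
    return m->words ? (double)m->letters / (double)m->words : 0.0;
}

/* Average sentence length in words. */
static double avg_sentence_length(const Metrics *m)
{
    return m->sentences ? (double)m->words / (double)m->sentences : 0.0;
}
```

A caller would read the document into a buffer, pass it to analyze(), and then print m.words, m.hits[k], per_thousand(m.hits[k], m.words), avg_word_length(&m) and avg_sentence_length(&m).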
Notes:
- The number of keywords and the length of each keyword are hardcoded in the KEYS and KEYLEN macros, respectively.
- Unless the text file is in the same directory as the program, include the full file path when prompted for the document's file name.
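The file name note above can be illustrated with a short sketch of prompting for and opening a document. This is an assumption about how such a prompt might work, not the program's actual code. The function names, the prompt text, and the NAME_MAX_LEN buffer size are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 512   /* hypothetical limit on the file name length */

/* Cut the string at the first newline or carriage return, if any. */
static void strip_newline(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\r\n")] = '\0';
}

/* Open the document for reading. A bare name such as "essay.txt" is
 * resolved relative to the current working directory, so a file stored
 * elsewhere needs its full path. Returns NULL if it cannot be opened. */
static FILE *open_document(const char *name)
{
    FILE *fp = fopen(name, "r");
    if (fp == NULL)
        perror(name);      /* report why the open failed */
    return fp;
}

/* Prompt on stdout, read one line from stdin, and try to open it. */
static FILE *prompt_for_document(void)
{
    char name[NAME_MAX_LEN];

    printf("Document file name: ");
    fflush(stdout);
    if (fgets(name, sizeof name, stdin) == NULL)
        return NULL;       /* end of input or read error */
    strip_newline(name);
    return open_document(name);
}
```

Strictly speaking, a relative name is looked up in the directory the program is run from, which is usually but not always the directory that contains the executable.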
Questions? Contact [email protected]
- Xander Leatherwood
Special thanks to Ed Crotty.