HathiTrust Research Center Part I: Text Analysis with Extracted Features
Learn how to access and work with the HathiTrust Extracted Features datasets, including Word Frequencies in English-Language Literature, 1700-1922 and Geographic Locations in English-Language Literature, 1701-2011. These datasets, derived from the full text of over 17 million volumes, allow researchers to analyze a large body of both copyrighted and public-domain text through volume-level metadata, page-level metadata, part-of-speech-tagged tokens, and token counts. No prior knowledge of or experience with the HathiTrust Research Center or text analysis is necessary.