• Installing and introducing the Natural Language Toolkit (NLTK):
-Introducing the Python programming language with an emphasis on Natural Language Processing (NLP);
-Using Python string datatypes for the processing of words and texts.
• Accessing text corpora (i.e., e-text repositories) and processing raw text:
-Introducing the Gutenberg, Reuters, Inaugural Address and annotated text corpora, as well as Web and Chat text;
-Looking at text corpus structure.
• Categorizing words and classifying text:
-Tagging corpora;
-Mapping words to properties through Python dictionaries;
-Transformation-based tagging;
-Determining the category of a word;
-Supervised, decision trees, Naive Bayes and ent