Use Java to programmatically analyse the grammar of plain English text (Natural Language Processing), create text in English (Natural Language Generation), or classify pieces of text according to some criteria (Text Categorisation).

All text samples used here are Star Trek quotes (in English, not Klingon) sourced from either Wikipedia or Memory Alpha. After an entirely unscientific thought process, I decided such quotes are grammatically correct, adequately diverse and of average complexity.

Won’t bore you with grammar or linguistic theory, as this is written from a code monkey’s perspective rather than a PhD student’s. Think Scotty instead of Spock. There are links to more theory on each page if you absolutely must.

Disclaimer: No tribbles were harmed in the making of this repo.

Tutorials

  1. CoreNLP Introduction