Building Machine Learning Systems with Python by Willi Richert and Luis Pedro Coelho

Author: Willi Richert, Luis Pedro Coelho
Language: eng
Format: mobi

Determining the word types

Determining the word types is what part-of-speech (POS) tagging is all about. A POS tagger parses a full sentence with the goal of arranging it into a dependency tree, where each node corresponds to a word and the parent-child relationship determines which word it depends on. With this tree, it can then make more informed decisions, for example, whether the word "book" is a noun ("This is a good book.") or a verb ("Could you please book the flight?").

You might have already guessed that NLTK also plays a role in this area. Indeed, it comes packaged with all sorts of parsers and taggers. The POS tagger we will use, nltk.pos_tag(), is actually a full-blown classifier trained on manually annotated sentences from the Penn Treebank Project (http://www.cis.upenn.edu/~treebank). It takes a list of word tokens as input and outputs a list of tuples, each containing a token from the original sentence and its part-of-speech tag.
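The input and output shapes described above can be sketched as follows. This is a minimal illustration, not the book's own listing: the helper names coarse_type, summarize, and tag_tokens are hypothetical, the tag-prefix table covers only a small subset of the Penn Treebank tag set, and the code assumes that NLTK is installed and that its tagger model has already been fetched with nltk.download() (the resource name varies between NLTK versions). If either is missing, the sketch reports that instead of failing.

```python
# Coarse word types keyed by the first two letters of a Penn Treebank tag.
# Partial on purpose: NN* = nouns, VB* = verbs, JJ* = adjectives, RB* = adverbs.
COARSE_TYPES = {"NN": "noun", "VB": "verb", "JJ": "adjective", "RB": "adverb"}


def coarse_type(tag):
    """Map a Penn Treebank tag such as 'NNS' or 'VBD' to a coarse word type."""
    return COARSE_TYPES.get(tag[:2], "other")


def summarize(tagged):
    """Turn [(word, tag), ...] into [(word, tag, coarse type), ...]."""
    return [(word, tag, coarse_type(tag)) for word, tag in tagged]


def tag_tokens(tokens):
    """Tag a list of word tokens with nltk.pos_tag().

    Returns a list of (token, tag) tuples, or None when NLTK or its
    tagger model is not available on this machine.
    """
    try:
        import nltk
        return nltk.pos_tag(tokens)
    except (ImportError, LookupError):
        # NLTK raises LookupError when a required resource is not downloaded.
        return None


if __name__ == "__main__":
    sentences = [
        ["This", "is", "a", "good", "book", "."],
        ["Could", "you", "please", "book", "the", "flight", "?"],
    ]
    for tokens in sentences:
        tagged = tag_tokens(tokens)
        if tagged is None:
            print("NLTK or its tagger model is not available.")
        else:
            print(summarize(tagged))
```

The tokens are passed in already split, because nltk.pos_tag() expects a list of tokens rather than a raw string. Which tag "book" receives in each sentence depends on the installed model, so no expected output is shown here.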
