README



This module is designed to: 1) pull out all of the ngrams (multi-word
phrases) in a given text, and 2) list these phrases according to their
frequency. Using this module, it is possible to create lists of the most
common phrases in a text as well as rank them by their probability of
occurrence, which suggests their significance. This process is useful
for the purposes of textual analysis and "distant reading".
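
The two steps above (extract the ngrams, then list them by frequency)
can be sketched as follows. This is only an illustration of the idea in
Python, not this module's Perl interface; the function names tokenize
and ngram_counts, the sample text, and the simple tokenizing rule are
all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text and treat runs of letters and
    # apostrophes as words. A real tokenizer may be more careful.
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    # Slide a window of n tokens across the list and count each phrase.
    # If there are fewer than n tokens, the result is simply empty.
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Hypothetical sample text, used only for this illustration.
text = "the cat sat on the mat and the cat slept"
tokens = tokenize(text)
bigrams = ngram_counts(tokens, 2)

# List the phrases from most to least frequent.
for phrase, count in bigrams.most_common():
    print(count, phrase)
```

Changing the second argument of ngram_counts to 3 or 4 would give
three-word or four-word phrases from the same token list.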

The two-word phrases (bi-grams) are also listable by their T-Score. The
T-Score, as well as a number of the module's other methods, is
calculated as per Nugues, P. M. (2006). An introduction to language
processing with Perl and Prolog: An outline of theories, implementation,
and application with special consideration of English, French, and
German. Cognitive technologies. Berlin: Springer.
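
As a rough illustration of what a T-Score measures, the sketch below
uses the common textbook approximation, which compares how often a
bi-gram was observed with how often it would be expected if its two
words were independent. This README does not spell out the exact
calculation the module performs, so the formula, the function name
t_scores, and the sample tokens are assumptions; the code is Python and
does not use the module's Perl interface.

```python
import math
from collections import Counter


def t_scores(tokens):
    # Assumed formula (textbook approximation):
    #   t(w1, w2) = (C(w1 w2) - C(w1) * C(w2) / N) / sqrt(C(w1 w2))
    # where C is a count and N is the total number of tokens.
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        # Count expected by chance if w1 and w2 were independent.
        expected = unigrams[w1] * unigrams[w2] / n
        scores[(w1, w2)] = (observed - expected) / math.sqrt(observed)
    return scores


# Hypothetical sample tokens, used only for this illustration.
tokens = "the cat sat on the mat and the cat slept".split()
scores = t_scores(tokens)

# List the bi-grams from highest to lowest score.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(round(score, 3), " ".join(pair))
```

A higher score means the pair occurs together more often than chance
alone would predict, which is why such a list can surface phrases that
a plain frequency count would rank lower.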

-- 
Eric Lease Morgan <eric_morgan@infomotions.com>
August 23, 2010