[Corpora-List] Simple POS tagger

From: David L. Hoover (david.hoover@verizon.net)
Date: Mon Oct 18 2004 - 20:10:37 MET DST

    I need a simple POS tagger (preferably freeware) for a modest corpus of
    contemporary American poetry. The total corpus is about 1,500,000 words,
    but the samples are mostly under 100,000 words, and I would be happy with
    a program that could handle only much smaller samples, say 10,000 words.

    I am mainly interested in noun and verb statistics, and do not need to
    process the tags further or to use the tagged text in any other way.
    Basically, I want to determine the percentage of the text tokens that
    are in the various word classes.
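
    To make the calculation concrete, here is a minimal sketch of the
    percentage count. It assumes the tagger writes whitespace-separated
    word/TAG pairs with Penn Treebank style tags; that output format, the
    coarse class mapping, and the function names are illustrative
    assumptions only, since no particular tagger has been chosen.

```python
from collections import Counter

# Hypothetical mapping from Penn Treebank style tag prefixes to coarse
# word classes; adjust it to whatever tagset the chosen tagger emits.
COARSE_PREFIXES = (
    ("NN", "noun"),
    ("VB", "verb"),
    ("JJ", "adjective"),
    ("RB", "adverb"),
)


def coarse_class(tag):
    """Map a fine-grained tag to a coarse word class by its prefix."""
    for prefix, name in COARSE_PREFIXES:
        if tag.startswith(prefix):
            return name
    return "other"


def word_class_percentages(tagged_text):
    """Return {word class: percentage of tokens} for 'word/TAG' input."""
    counts = Counter()
    for item in tagged_text.split():
        # Split on the last slash so words that contain '/' still parse.
        word, sep, tag = item.rpartition("/")
        if not sep or not word or not tag:
            continue  # skip anything that is not a word/TAG pair
        counts[coarse_class(tag)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    sample = "The/DT dog/NN bites/VBZ the/DT man/NN"
    for name, pct in word_class_percentages(sample).items():
        print(f"{name}: {pct:.1f}%")
```

    Punctuation tags fall into "other" here and so count toward the token
    total; whether they should is a decision to make before comparing
    samples.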

    I'm working with a fairly robust Windows XP Professional computer, and
    would prefer something that won't take a lot of extra
    installation/configuration work.

    I have done some research, but there are so many choices it is difficult
    to know where to start.

    Any favorites?

    Thanks,
    David Hoover

    -- 
    David L. Hoover, Director of Undergraduate Studies & Webmaster 
               NYU English Department, 212-998-8832
              http://www.nyu.edu/gsas/dept/english/ 
    

    "If you pick up a starving dog and make him prosperous, he will not bite you. This is the principal difference between a dog and a man." -- Mark Twain, Pudd'nhead Wilson's Calendar


