[Corpora-List] Simple POS tagger

From: David L. Hoover (david.hoover@verizon.net)
Date: Mon Oct 18 2004 - 20:10:37 MET DST

Next message: Lou Burnard: "[Corpora-List] BNC Baby : new xml corpora and software"

Previous message: Luisa Bentivogli: "[Corpora-List] MEANING-05 workshop. Second announcement and call for papers"

I need a simple POS tagger (preferably freeware) for a modest corpus of
contemporary American poetry. The total corpus is about 1,500,000 words, but
the samples are mostly under 100,000 words, and I would be happy with a
program that could handle only much smaller samples, say 10,000 words.

I am mainly interested in noun and verb statistics, and do not need to
process the tags further or to use the tagged text in any other way.
Basically, I want to determine the percentage of the text tokens that
are in the various word classes.
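
As a rough illustration of the calculation described above, here is a minimal
Python sketch. It assumes the text has already been tagged by some tagger and
is available as (word, tag) pairs, and it assumes Penn Treebank-style tags, in
which noun tags begin with "NN" and verb tags begin with "VB". The function
name and the sample data are hypothetical and not tied to any particular tool.

```python
from collections import Counter


def word_class_percentages(tagged_tokens):
    """Return {coarse_class: percentage of tokens} for (word, tag) pairs."""

    def coarse(tag):
        # Assumption: Penn Treebank-style tags (NN, NNS, ... and VB, VBD, ...).
        if tag.startswith("NN"):
            return "noun"
        if tag.startswith("VB"):
            return "verb"
        return "other"

    counts = Counter(coarse(tag) for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}  # empty input: nothing to report
    return {cls: 100.0 * n / total for cls, n in counts.items()}


# Hypothetical hand-tagged sample, only to show the input shape.
sample = [
    ("The", "DT"),
    ("river", "NN"),
    ("runs", "VBZ"),
    ("past", "IN"),
    ("pines", "NNS"),
]

for word_class, pct in sorted(word_class_percentages(sample).items()):
    print(f"{word_class}: {pct:.1f}%")
```

With a different tag set, only the prefix checks in coarse() would need to
change. Punctuation tokens, if the tagger emits them, are counted as "other"
here and may need to be filtered out first.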

I'm working with a fairly robust Windows XP Professional computer, and
would prefer something that won't take a lot of extra
installation/configuration work.

I have done some research, but there are so many choices it is difficult
to know where to start.

Any favorites?

Thanks,
David Hoover

-- David L. Hoover, Director of Undergraduate Studies & Webmaster
NYU English Department, 212-998-8832
http://www.nyu.edu/gsas/dept/english/

"If you pick up a starving dog and make him prosperous, he will not bite you. This is the principal difference between a dog and a man." -- Mark Twain, Pudd'nhead Wilson's Calendar


This archive was generated by hypermail 2b29 : Mon Oct 18 2004 - 20:20:13 MET DST