Re: sentence parsed corpus

Ted Dunning (T.Dunning@dcs.shef.ac.uk)
Wed, 28 Jun 95 13:45:31 BST

Is a sentence parsed corpus available somewhere?

this is rather a general question. you don't mention what language
you want, nor what sort of parsing you want. since you didn't send
this message from the US (where i really come from, btw), we
have to assume that you are interested in other languages in addition
to English.

assuming that you want English texts, and that you don't care about
what sort of parsing is done, then the standard answer is the
Penn TreeBank. it can be had from the Linguistic Data
Consortium (LDC) whose current contact address will almost certainly
be posted here within a few moments.

if you want any language other than English, i don't think that there
is much available data. i have heard rumors about a Japanese parsed
corpus being available from EDR (220,000 sentences apparently). i
don't have any details, a contact address or price for the EDR.

it would be *wonderful* if somebody were to go to the trouble of
creating such a parsed corpus for virtually any language