All,
Here is a summary of the responses that I received to my
post regarding corpora in Uzbek. Many thanks for
everyone's help with this, and my apologies for being so
late in getting this out.
Kevin McTait said:
Try the ECI (European Corpus Initiative) CD-ROM. On it
there is an English-Uzbek corpus in the form of a
novel (cannot remember which one). The Uzbek is
transliterated into the Latin script, though.
Ramesh recommended the TELRI TRACTOR archive:
http://www.tractor.de or http://www.telri.de
Trond Trosterud recommended the U. of Helsinki:
http://www.ling.helsinki.fi/uhlcs/
A little bit of poking around in directions suggested
by Tomaz Erjavec turned up the following sites:
Two links at the University of Leiden that can be found
here:
http://iias.leidenuniv.nl/kreeft/IIASNONLINE/Newsletters/Newsletter10/Regional/Contents.html#AnchorCA
The Central Asian Languages Corpora
"The Uzbek corpus was completed in 1996. It contains
1,100,000 tokens approximately in 23 corpus texts from
388 different modern published sources." (from the
site)
http://www.let.uu.nl/oosters/CALC1.html
The LDC may have some data, though I didn't find it in
my very quick search and didn't try hard after that.
http://www.ldc.upenn.edu/
Neal Audenaert
neal_audenaert@acm.org
This archive was generated by hypermail 2b29 : Thu Jul 05 2001 - 15:48:06 MET DST