We are releasing Totalita, a diachronic corpus of written Czech from the communist regime period (1948–1989). The corpus served as a material base for the dictionary published in 2010.
Together with our colleagues from JÚĽŠ, we have launched the Czechoslovak Word of the Week series. A new word appears every Monday morning; the columns are then reprinted in the Friday editions of both the Czech Deník N and the Slovak Denník N.
An update of Treq, the online tool for looking up translation equivalents, is out! Its database has been updated to release 15 of the InterCorp parallel corpus.
The second generation of ONLINE corpora has been published. As a continuation of the first generation, and thanks to its daily updates, ONLINE2 is a perfect source of data for examining current trends in public discourse.
We have made APIs for querying KonText and Treq publicly available. The number of applications with an open API will grow in the future.
In cooperation with ICL, we have created a new corpus of contemporary Czech poetry (KSP). It contains poems published in 1990–2020 either in print or on online literary forums. At 35 million words, KSP ranks among the largest corpora of its kind in the world.
We are pleased to announce that our dear colleague Václav Cvrček has been appointed Professor of Czech Language. Congratulations!
SYN release 10 was published as another update of the SYN corpus of contemporary written Czech. With journalistic texts from 2020, its size reached almost 4.9 billion words.
Release 14 of the InterCorp parallel corpus was published at the end of January. An overview of all the enhancements can be found in the version history on the CNC wiki.
The DIALEKT corpus has more than doubled in size to 223 thousand words in its newly published release 2. It is complemented by the Mapka application, which has gained new features, e.g. downloadable custom map layers with user-defined points.