On the occasion of 25 years since the foundation of the Institute of the Czech National Corpus, a new web application has been released. Word at a Glance presents a quick and user-friendly way to get word profiles based entirely on corpus data.
We are proud to announce that the Czech National Corpus has been officially recognized as a CLARIN K-centre in the area of corpus linguistics, with emphasis on the empirical research of Czech.
Taming the Corpus: From Inflection and Lexis to Interpretation – a new book on empirical research based on Czech data has just been published by Springer.
Two new corpora were published in November 2018: Koditex, a specialized corpus compiled for multi-dimensional analysis of Czech registers, and nkjp_1m, a manually annotated one-million-word subcorpus of the National Corpus of Polish.
An updated version of the KonText interface is now available with new functions, such as translation equivalents from Treq displayed directly in KonText (for parallel corpora) and a CQL editor with syntax highlighting.
Conference and panel presentations have been made available online.
Release 6 of SYN, the corpus of contemporary written Czech, was published on 18 December 2017. The size of SYN release 6 has exceeded 4 billion words.
We are proud to announce that three new spoken corpora are out now: the brand-new ORTOFON corpus and the dialectal DIALEKT corpus, both with a two-level transcription, as well as ORAL, the unification of the ORAL-series corpora. All the corpora feature lemmatisation and morphological tagging.
A new version of Treq, the online tool for looking up translation equivalents based on the InterCorp parallel corpus, is out! You can now search for multiword units, use regular expressions in queries, and search in translations from and to English (in addition to Czech).
The LINDSEI_CZ learner corpus of spontaneous spoken English by advanced speakers whose L1 is Czech was published in January 2017. The corpus was compiled by Tomáš Gráf within the framework of the LINDSEI project.