Grants, Sponsors
Awarded grants
Czech National Corpus (LM2011023; 2012-2016), Ministry of Education, Youth and Sports, Large Research, Development and Innovation Infrastructures
Within this project, ICNC strives for extensive and continuous data coverage of the Czech language (and of other languages in comparison with Czech), thereby building a foundation for basic and applied research. The main activities include:
- continuous development and building of language corpora of various types as representative, linguistically processed textual bases for empirical and exact research of the Czech language; these are primarily corpora covering Czech in its present state (synchronic corpora of written and spoken language), in its historical development (diachronic corpus), and in translation comparison with other languages (parallel corpora);
- continuous development and enhancement of structural and specialized linguistic annotation of language corpora;
- comprehensive processing of corpora compiled by other research groups in the Czech Republic and abroad;
- a free and open public service providing online access to all corpora by means of specialized corpus tools;
- provision of data packages (i.e. processed and annotated collections of language data) to other research groups in the Czech Republic and abroad, in various formats according to their needs, suitable especially for linguistic research and natural language processing.
Completed projects
- Research project of the Ministry of Education, Youth and Sports entitled "The Czech National Corpus and Corpora of Other Languages", VZ MSM 0021620823 (2005-2011)
- Large Language Corpora and their Automatic Analysis, Czech Science Foundation (2003-2005)
- Research project of the Ministry of Education, Youth and Sports "Czech National Corpus and Corpora of Other Languages" (1999-2004)
- Corpus of Czech written texts (V. Petkevič, Czech Science Foundation)
- Program Tools for Computer Processing of Czech Texts (J. Peregrin, Czech Science Foundation)
- Czech Phraseology, its Importance and Lexicographic Processing (F. Čermák, GAUK)
- Computer Processed Corpus of Spoken Czech (F. Čermák, GAUK)
- Czech in the Age of Computers (A complex project, Czech Science Foundation), 1996-2001
- Electronic Corpus of the Czech Language (An enhancement of the research at universities, MŠMT ČR), 1996-2000
- Electronisation of Diachronic Lexicography Techniques (P. Nejedlý, R. Blatná, Czech Science Foundation)



