How to Cite the CNC Corpora
Corpus material should be cited in the bibliography as follows:
Corpus SYN2010: Czech National Corpus - SYN2010. Institute of the Czech National Corpus, Praha 2010. Accessible at WWW: <http://www.korpus.cz>.
Corpus SYN2009PUB: Czech National Corpus - SYN2009PUB. Institute of the Czech National Corpus FF UK, Praha 2010. Accessible at WWW: <http://www.korpus.cz>.
Corpus SYN2006PUB: Czech National Corpus - SYN2006PUB. Institute of the Czech National Corpus FF UK, Praha 2006. Accessible at WWW: <http://www.korpus.cz>.
Corpus SYN2005: Czech National Corpus - SYN2005. Institute of the Czech National Corpus, Praha 2005. Accessible at WWW: <http://www.korpus.cz>.
Corpus SYN2000: Czech National Corpus - SYN2000. Institute of the Czech National Corpus, Praha 2000. Accessible at WWW: <http://www.korpus.cz>.
Corpus ORAL2008: Czech National Corpus - ORAL2008. Institute of the Czech National Corpus, Praha 2008. Accessible at WWW: <http://www.korpus.cz>.
Corpus PMK: Czech National Corpus - PMK. Institute of the Czech National Corpus, Praha 2001. Accessible at WWW: <http://www.korpus.cz>.
- When citing other corpora, refer to them in a similar manner.
The exceptions are the InterCorp corpora, corpora of the SYN series,
DIAKORP and other non-reference corpora.
Unlike the corpora listed above, these are not fixed sources
of reference, so the citation must include the date of access,
just as when citing web sites.
SYN: Czech National Corpus - SYN. Institute of the Czech National Corpus, Praha. 23.05.2013, accessible at WWW: <http://www.korpus.cz>.
InterCorp: Czech National Corpus - InterCorp. Institute of the Czech National Corpus, Praha. 23.05.2013, accessible at WWW: <http://www.korpus.cz>.
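The two citation patterns above (a fixed publication year for reference corpora, a date of access for non-reference corpora) can be sketched as a small helper. This is only an illustration, not a CNC tool: the function name `format_cnc_citation` and its parameters are hypothetical, and the exact punctuation around the access date is an assumption based on the examples above.

```python
from datetime import date
from typing import Optional


def format_cnc_citation(corpus: str,
                        institute: str,
                        year: Optional[int] = None,
                        accessed: Optional[date] = None,
                        url: str = "http://www.korpus.cz") -> str:
    """Build a bibliography entry for a CNC corpus (hypothetical helper).

    Pass `year` for a reference corpus (e.g. SYN2010) or `accessed`
    for a non-reference corpus (e.g. InterCorp), but not both.
    """
    if (year is None) == (accessed is None):
        raise ValueError("give either a publication year or an access date")

    head = f"Czech National Corpus - {corpus}. {institute}, Praha"
    if year is not None:
        # Reference corpus: fixed content, so the year of release is enough.
        return f"{head} {year}. Accessible at WWW: <{url}>."
    # Non-reference corpus: content changes, so record the day of access
    # in DD.MM.YYYY form (punctuation here is an assumption).
    return f"{head}. {accessed.strftime('%d.%m.%Y')}, accessible at WWW: <{url}>."


institute = "Institute of the Czech National Corpus"
print(format_cnc_citation("SYN2010", institute, year=2010))
print(format_cnc_citation("InterCorp", institute, accessed=date(2013, 5, 23)))
```

Requiring exactly one of `year` and `accessed` mirrors the rule that only non-reference corpora need a date of access.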
- When citing a particular work, use the list of sources of the corpus SYN2000, SYN2005 or SYN2010.
- If you use the lemmatisation or morphological tags (attributes lemma or tag in the SYN-series corpora), also cite the following works:
Jan Hajič: Disambiguation of Rich Inflection (Computational Morphology of Czech). Vol. 1. Karolinum Charles University Press, Praha 2004.
Tomáš Jelínek (2008): Nové značkování v Českém národním korpusu. In: Naše řeč, 91, 1, pp. 13-20.
Drahomíra Spoustová, Jan Hajič, Jan Votrubec, Pavel Krbec, Pavel Květoň: The Best of Two Worlds: Cooperation of Statistical and Rule-Based Taggers for Czech. In: Proceedings of the Workshop on Balto-Slavonic Natural Language Processing. ACL 2007, Praha. pp. 67-74.
Vladimír Petkevič (2006): Reliable Morphological Disambiguation of Czech: Rule-Based Approach is Necessary. In: Insight into the Slovak and Czech Corpus Linguistics (Šimková M. ed.). Veda, Bratislava, pp. 26-44.
SyD:
Cvrček, V. - Vondřička, P.: SyD - Korpusový průzkum variant. FF UK. Praha 2011. Accessible at WWW: <http://syd.korpus.cz>.
Cvrček, V. - Vondřička, P.: Výzkum variability v korpusech češtiny. In: F. Čermák (ed.): Korpusová lingvistika Praha 2011. 2. Výzkum a výstavba korpusů. NLN. Praha, pp. 184-195.
Corpus frWaC:
A. Ferraresi, S. Bernardini, G. Picci and M. Baroni (2010) “Web Corpora for Bilingual Lexicography: A Pilot Study of English/French Collocation Extraction and Translation”. In Xiao, R. (ed.) Using Corpora in Contrastive and Translation Studies. Newcastle: Cambridge Scholars Publishing.
Corpora deWaC, itWaC, ukWaC:
M. Baroni, S. Bernardini, A. Ferraresi and E. Zanchetta. 2009. The WaCky Wide Web: A Collection of Very Large Linguistically Processed Web-Crawled Corpora. Language Resources and Evaluation 43(3): 209-226.



