Formalized contrastive lexical description: a framework for bilingual dictionaries

Pavel Vondřička

The goal of the study is to design a framework for the formalized representation of lexical knowledge as presented in bilingual dictionaries. Little research has been done on the possibilities of representing and storing the knowledge acquired in the process of lexicographical analysis and used in the synthesis of dictionary entries. Separating content from a particular form would allow the data to be re-used for several purposes (including NLP) and dictionaries to be flexibly customized for different users.

In the first part, general abstract principles for the representation of lexical knowledge are sought. The structure of different dictionary entries is analyzed. Modern technical approaches that may contribute to an efficient representation of the knowledge are summarized, and a generic abstract model for its representation is defined in terms of objects and relations, together with a proposal for a modular implementation separating the language-specific and dictionary-specific components.

The second part demonstrates the use of the model for one particular task: a detailed description of a group of Norwegian nouns in contrast with their Czech equivalents. The nouns are analyzed and a possible representation of the knowledge is presented using the proposed generic model and task-specific specifications.

Vondřička, P.: Formalized contrastive lexical description: a framework for bilingual dictionaries. LINCOM Studies in Computational Linguistics, 2014.
ISBN 978-3-86288-428-5


Proverbs: Their Lexical and Semantic Features

František Čermák

For centuries, the habit of collecting proverbs has been a domain of interest and subsequent study, attracting primarily ethnographers and historians, though later a number of scholars from other disciplines have made the field truly interdisciplinary. Only rather recently has linguistics, notably lexicology and phraseology, started to offer linguistic insights into proverbs, too. However, unlike other disciplines, the corpus linguistics approaches used here enable more general insights based on large amounts of data, also in paremiology. In general, the lexical studies published here aim to point to what no proverb can do without, namely the words it is built on, its meaning and its use.

This volume, devoted to a number of languages, their proverbs and lexicon, is an attempt along these lines, trying to bring together what has been published in many places. The twelve contributions offered here are based on modified and improved versions of what has come out elsewhere before. Thus, the reader may inspect and compare them, collected side by side here for the first time. Broadly, the book may be viewed as made up of items dealing with General Aspects (I, the first three items), contributions to Lexicon and Pragmatics proper (II, the following five items), supplemented by studies on Specific Topics (III, the last four contributions), where paremiological minima are also to be found.

Čermák, F.: Proverbs: Their Lexical and Semantic Features. Proverbium in cooperation with the Institute of the Czech National Corpus, Supplement Series, Vol. 37, ed. W. Mieder. The University of Vermont, Burlington, Vermont 2014.
ISBN: 978-0-9846456-1-9

A Frequency Dictionary of Czech: Core Vocabulary for Learners

František Čermák, Michal Křen (eds)

Following the lines established by the Routledge Frequency Dictionaries series, the dictionary is aimed at learners and all other students of Czech. It is the first Czech frequency dictionary based on a balanced selection of both written and authentic spoken Czech (corpora SYN2005, ORAL2006, ORAL2008). It provides the 5,000 most frequently used words in the language listed in a detailed frequency-based index, as well as in alphabetical and part-of-speech indexes. All entries in the rank frequency list feature the English equivalent, a sample sentence with English translation and an indication of register variation.

Čermák, F., Křen, M. (eds): A Frequency Dictionary of Czech: Core Vocabulary for Learners. Routledge, London 2011.
ISBN 978-0-415-57661-1 (hardback)
ISBN 978-0-415-57662-8 (paperback)
ISBN 978-0-415-57663-5 (data CD)

InterCorp: Exploring a Multilingual Corpus

František Čermák, Patrick Corness, Aleš Klégr (eds)

Grammar, lexis, translations, applications, and methodological issues are studied and illustrated on language pairs or on larger groups of languages. This is supplemented by broad and general contributions delineating the field of comparative multilingual corpus linguistics and showing possible directions of comparative research based on a multilingual parallel corpus.

Čermák, F., Klégr, A., Corness, P. (eds): InterCorp: Exploring a Multilingual Corpus. Nakladatelství Lidové noviny. Praha 2010.
ISBN 978-80-7422-042-5
