Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: Acquisition of lexical information from a large textual Italian corpus
Publication year: 2003
Format: -
Author(s): Calzolari N., Bindi R.
Author affiliations: -
CNR authors and affiliations
  • REMO BINDI
  • NICOLETTA ZAMORANI
Language(s): -
Abstract: Information other than that usually found in machine-readable dictionaries or manually encoded by lexicographers is urgently needed. Different sources must be exploited if we want to overcome the “lexical bottleneck” of Natural Language Processing. Very interesting data can be found by processing large textual corpora, where the actual usage of the language can be truly investigated. These data typically refer to various kinds of syntagmatic relations, which are particularly problematic in many NLP applications. The paper describes how these data can be at least partially extracted by processing and analysing large text corpora with quantitative/statistical methods. We describe two types of quantitative analyses whose aim is to extract information on the strength of association between two words, and on fixed phrases and idioms. We observe how the measure of the association ratio provides quantitative evidence for a number of lexical, syntactic and semantic relationships between word pairs. One of the claims is that the linguistic information embodied in all these quite different types of lexical collocations can be helpful for lexical disambiguation in analysis and crucial for lexical selection in generation. This is a step towards a more objective lexicography and a more “data-based” linguistics.
Abstract language: -
Other abstract: -
Other abstract language: -
Pages from: 117
Pages to: 131
Total pages: -
Journal: -
Journal volume number: 16-17
Journal issue: -
DOI: -
Refereed: Yes (international)
Publication status: -
Indexing (in curated databases): -
Keywords: -
Link (URL, URI): -
Parallel title: -
Acceptance date: -
Notes/Other information: -
CNR structures
  • ILC — Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects: -
European projects: -
Attachments

Historical data
Historical data cannot be modified; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Journal: Linguistica Computazionale
Notes: In A. Zampolli, N. Calzolari, L. Cignoni (eds.), Computational Linguistics in Pisa - Linguistica Computazionale a Pisa, IEPI, Pisa-Roma.
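
Illustrative sketch (not part of the catalogue record): the abstract refers to an association ratio that measures the strength of association between two words, but it does not give the formula. The Python sketch below assumes the commonly used pointwise mutual-information form, log2(P(x, y) / (P(x) P(y))), with probabilities estimated from corpus frequencies and co-occurrence counted when y follows x within a fixed window. The function name `association_ratios`, the `window` and `min_count` parameters, and the normalisation by corpus size are assumptions made for illustration and may differ from what the paper actually does.

```python
import math
from collections import Counter


def association_ratios(tokens, window=5, min_count=1):
    """Score ordered word pairs (x, y) where y follows x within `window` tokens.

    Assumed form: log2(P(x, y) / (P(x) * P(y))), with every probability
    estimated as a raw frequency divided by the corpus size.
    """
    n = len(tokens)
    word_freq = Counter(tokens)

    # Count ordered co-occurrences: y appears among the `window` tokens after x.
    pair_freq = Counter()
    for i, x in enumerate(tokens):
        for y in tokens[i + 1 : i + 1 + window]:
            pair_freq[(x, y)] += 1

    scores = {}
    for (x, y), f_xy in pair_freq.items():
        if f_xy < min_count:
            continue  # rare pairs give unreliable estimates
        p_xy = f_xy / n
        p_x = word_freq[x] / n
        p_y = word_freq[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores
```

A high positive score suggests the pair co-occurs more often than chance would predict, which is the kind of evidence the abstract associates with collocations, fixed phrases and idioms. Scores for low-frequency pairs are unstable, so raising `min_count` is a reasonable precaution on a real corpus.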