Consiglio Nazionale delle Ricerche

Publication type: Conference proceedings paper
Title: Creation of a bottom-up corpus-based ontology for Italian Linguistics
Year of publication: 2012
Format: Electronic
Author(s): Elisa Bianchi, Mirko Tavosanis, Emiliano Giovannetti
Author affiliations: Università di Pisa, Istituto di Linguistica Computazionale "A. Zampolli" - CNR
CNR authors and affiliations
  • EMILIANO GIOVANNETTI
Language(s)
  • English
Abstract: This paper describes the construction of a shallow lexical ontology of Italian Linguistics, written in Italian and intended for use by a meta-search engine for query refinement. The ontology was built with Protégé 4.0.2 and encoded in OWL, following the steps of the well-known Ontology Learning From Text (OLFT) layer cake. The starting point was automatic term extraction from a corpus of web documents on the domain of interest (304,000 words). For corpus construction, we describe the main criteria for selecting web documents and the critical points of that selection, which concern the definition of the user profile and of degrees of specialisation. We then describe term validation and the construction of a glossary of Italian Linguistics terms. Afterwards, we outline the identification of synonymic chains and the main criteria of ontology design: the top classes of the ontology are Concept (containing the taxonomy of concepts) and Term (containing the glossary terms as instances), and concepts are linked through part-whole and involved-role relations, both borrowed from WordNet. Finally, we show some examples of applying the ontology to query refinement.
Abstract language: English
Other abstract: -
Other abstract language: -
First page: 2641
Last page: 2647
Total pages: 7
Journal: -
Journal volume number: -
Series: -
Volume title: Language Resources and Evaluation
Series volume number: -
Volume editor(s): -
ISBN: -
DOI: -
Publisher
  • European Language Resources Association (ELRA), Paris (France)
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in controlled databases)
  • ISI Web of Science (WOS) (Code: 000323927702118)
Keywords: Ontologies, Italian Linguistics, Query refinement
Link (URL, URI): -
Conference title: LREC 2012 - Eighth International Conference on Language Resources and Evaluation
Conference location: Istanbul
Conference date(s): 23-25 May 2012
Relevance: International
Presentation type: Contributed paper
Parallel title: -
Notes/Other information: -
CNR structures
  • ILC - Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.013.001: Tecniche linguistico-semantiche per il supporto alla traduzione e alla consultazione di testi (Linguistic-semantic techniques to support the translation and consultation of texts)
European projects: -
Attachments
Creation of a bottom-up corpus-based ontology for Italian Linguistics (private document)
Document type: application/pdf