Consiglio Nazionale delle Ricerche

Product type: Contribution in conference proceedings
Title: Using LMF to Shape a Lexicon for the Biomedical Domain
Year of publication: 2008
Format: -
Author(s): Monachini M.; Quochi V.; Del Gratta R.; Calzolari N.
Author affiliations: ILC-CNR
CNR authors and affiliations
  • VALERIA QUOCHI
  • RICCARDO DEL GRATTA
  • MONICA MONACHINI
  • NICOLETTA ZAMORANI
Language(s): -
Abstract: This paper describes the design, implementation and population of the BioLexicon in the framework of BootStrep, an FP6 project. The BioLexicon (BL) is a lexical resource designed for text mining in the bio-domain. It has been conceived to meet both domain requirements and upcoming ISO standards for lexical representation. The data model and data categories comply with the ISO Lexical Markup Framework and the Data Category Registry. The BioLexicon integrates features of lexicons and terminologies: term entries (and variants) derived from existing resources are enriched with linguistic features, including sub-categorization and predicate-argument information, extracted from texts. It is thus an extendable resource. Furthermore, the lexical entries will be aligned to concepts in the BioOntology, the ontological resource of the project. The BL is implemented as an extensible relational database with automatic population procedures. Population relies on a dedicated input data structure that allows terms and their linguistic properties to be uploaded and then "pulled and pushed" into the database. The BioLexicon shows that the state of the art is mature enough to aim at setting up a standard in this domain. Because it conforms to lexical standards, the BioLexicon is interoperable and portable to other areas.
Abstract language: English
Other abstract: -
Other abstract language: -
First page: 153
Last page: 157
Total pages: -
Journal: -
Journal volume number: -
Series: -
Volume title: -
Series volume number: -
Volume editor(s): C. Delogu; M. Falcone (eds.)
ISBN: -
DOI: -
Publisher: -
Peer reviewed: Yes (international)
Publication status: -
Indexing (in curated databases): -
Keywords: Domain terminologies, Computational lexicons, Lexical standards, Lexical architectures
Link (URL, URI): -
Conference title: LangTech 2008 - Tecnologia applicata alla linguistica
Conference location: Rome
Conference date(s): 28-29 February 2008
Relevance: International
Presentation type: Contribution
Parallel title: -
Notes/Other information: -
CNR units
  • ILC — Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.005.001 : Language Resources and Technologies: models, development methods, applications, design of international strategies
European projects: -
Attachments

Historical data
Historical data cannot be modified; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Disciplinary area: Language & Linguistics
CIVR evaluation area: Sciences of antiquity, philology and literature, and art history
Notes: In: LangTech 2008 (Rome, 28-29 February 2008). Proceedings, pp. 153-157. Cristina Delogu, Mauro Falcone (eds.), 2008.