Consiglio Nazionale delle Ricerche

Product type: Contribution to conference proceedings
Title: Bootstrapping a Verb Lexicon for Biomedical Information Extraction
Publication year: 2009
Format: -
Author(s): Venturi G.; Montemagni S.; Marchi S.; Sasaki Y.; Thompson P.; McNaught J.; Ananiadou S.
Author affiliations: CNR-ILC, Pisa; National Centre for Text Mining, University of Manchester, UK
CNR authors and affiliations
  • GIULIA VENTURI
  • SIMONETTA MONTEMAGNI
  • SIMONE MARCHI
Language(s)
  • English
Abstract: The extraction of information from texts requires resources that contain both syntactic and semantic properties of lexical units. As the use of language in specialized domains, such as biology, can differ considerably from its use in the general domain, there is a need for domain-specific resources to ensure that the information extracted is as accurate as possible. We are building a large-scale lexical resource for the biology domain, providing information about predicate-argument structure that has been bootstrapped from a biomedical corpus on the subject of E. coli. The lexicon is currently focussed on verbs, and includes both automatically extracted syntactic subcategorization frames and semantic event frames that are based on annotation by domain experts. In addition, the lexicon contains manually added explicit links between semantic and syntactic slots in corresponding frames. To our knowledge, this lexicon currently represents a unique resource within the biomedical domain.
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: 137
Pages to: 148
Total pages: 12
Journal: -
Journal volume number: -
Series: -
Volume title: -
Series volume number: -
Volume editor(s): -
ISBN: 9783642003813
DOI: 10.1007/978-3-642-00382-0_11
Publisher
  • Springer-Verlag, Berlin Heidelberg (Germany)
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in curated databases): -
Keywords: domain-specific lexical resources, Biological Language Processing, syntax-semantic linking
Link (URL, URI): -
Conference title: 10th International Conference on Intelligent Text Processing and Computational Linguistics
Conference location: Mexico City, Mexico
Conference date(s): 1-7 March 2009
Relevance: International
Presentation: Contributed
Parallel title: -
License: -
Embargo expiry: -
Notes/Other information: -
CNR units
  • ILC - Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.004.002 : Architetture bio-computazionali del lessico mentale
European projects: -
Attachments

Historical data
Historical data cannot be edited; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Disciplinary area: Language & Linguistics
CIVR evaluation area: Antiquity, philological-literary and historical-artistic sciences
Notes: In: CICLing-2009 - 10th International Conference on Intelligent Text Processing and Computational Linguistics (Mexico City, Mexico, March 1-7, 2009). Proceedings, LNCS vol. 5449, pp. 137-148. Springer-Verlag, 2009.
Brief description of the product: ABSTRACT: The accurate extraction of information from texts requires both syntactic and semantic resources. We are developing a verb dictionary for use in the processing of biomedical texts that includes both syntactic subcategorisation frames and semantic event frames, and links them together. In this paper, we describe the acquisition of syntactic subcategorisation frames from a large corpus of abstracts on the subject of E. coli, together with the extraction of linguistic event frames from a subset of this corpus, in which the biological process of E. coli gene regulation has been linguistically annotated by a group of biologists. Finally, we report on work carried out to link the syntactic and semantic information together, by mapping syntactic arguments of subcategorisation frames to semantic arguments of the event frames.
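
Illustrative sketch (not taken from the paper): the linking of syntactic and semantic slots described above could be represented roughly as follows. This record does not specify the lexicon's actual format, so every class name, slot label (`subj`, `obj`), role label (`AGENT`, `THEME`) and the example entry below is a hypothetical assumption chosen only to show the idea of a verb entry whose subcategorisation slots are mapped to event-frame roles.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class SubcatFrame:
    """Syntactic subcategorisation frame: the syntactic slots a verb takes."""
    verb: str
    slots: tuple[str, ...]  # hypothetical labels, e.g. ("subj", "obj")


@dataclass
class EventFrame:
    """Semantic event frame: the semantic roles of the event a verb denotes."""
    verb: str
    roles: tuple[str, ...]  # hypothetical labels, e.g. ("AGENT", "THEME")


@dataclass
class LexiconEntry:
    """One verb entry holding both frames plus explicit slot-to-role links."""
    subcat: SubcatFrame
    event: EventFrame
    links: dict[str, str] = field(default_factory=dict)

    def link(self, slot: str, role: str) -> None:
        # Only allow links between slots and roles that exist in the frames.
        if slot not in self.subcat.slots:
            raise ValueError(f"unknown syntactic slot: {slot}")
        if role not in self.event.roles:
            raise ValueError(f"unknown semantic role: {role}")
        self.links[slot] = role

    def roles_for(self, parsed_args: dict[str, str]) -> dict[str, str]:
        # Translate parsed syntactic arguments into semantic roles,
        # skipping any slot that has no link.
        return {
            self.links[slot]: text
            for slot, text in parsed_args.items()
            if slot in self.links
        }


# Hypothetical entry: a transitive verb whose subject is the AGENT
# and whose object is the THEME of the event.
entry = LexiconEntry(
    subcat=SubcatFrame(verb="activate", slots=("subj", "obj")),
    event=EventFrame(verb="activate", roles=("AGENT", "THEME")),
)
entry.link("subj", "AGENT")
entry.link("obj", "THEME")

# Placeholder argument strings stand in for real parser output.
result = entry.roles_for({"subj": "protein X", "obj": "gene Y"})
print(result)
```

In this sketch the links are stored per entry rather than globally, reflecting the description of links between slots "in corresponding frames"; a real resource may organise them differently.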