Istituto di linguistica computazionale "Antonio Zampolli"

Conference proceedings paper

Type: Conference proceedings paper

Title: Flexible Acquisition of Subcategorization Frames in Italian

Year of publication: 2012

Format: Electronic

Authors: Caselli, Tommaso; Frontini, Francesca; Quochi, Valeria; Rubino, Francesco; Russo, Irene

Author affiliations: Istituto di Linguistica Computazionale "A. Zampolli", CNR, Italy

CNR authors:

  • TOMMASO CASELLI
  • FRANCESCA FRONTINI
  • VALERIA QUOCHI
  • FRANCESCO RUBINO
  • IRENE RUSSO

Language: English

Abstract: Lexica of predicate-argument structures constitute a useful tool for several tasks in NLP. This paper describes a web-service system for automatic acquisition of verb subcategorization frames (SCFs) from parsed data in Italian. The system acquires SCFs in an unsupervised manner. We created two gold standards for the evaluation of the system: the first by combining information from two lexica (one manually created, the other automatically acquired) with manual exploration of corpus data, and the second by annotating data extracted from a specialized corpus (environmental domain). Data filtering is accomplished by means of the maximum likelihood estimate (MLE). The evaluation phase has allowed us to identify the best empirical MLE threshold for the creation of a lexicon (P=0.653, R=0.557, F1=0.601). In addition, we assigned to the extracted entries of the lexicon a confidence score based on relative frequency, and evaluated the extractor on domain-specific data. The confidence score allows end users to select lexicon entries according to their reliability: one of the most interesting features of this work is that end users can customize the results of the SCF extractor, obtaining SCF lexica that differ in size and accuracy.
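
The filtering step described in the abstract can be illustrated with a minimal Python sketch. It assumes that the MLE of a frame given a verb is the plain relative frequency count(verb, frame) / count(verb), and that a frame is kept when this value reaches a threshold. The function names, the frame labels, the toy counts and the threshold value are hypothetical and are not taken from the paper; only the P and R figures passed to `f1_score` come from the abstract.

```python
from collections import Counter, defaultdict


def mle_filter(observations, threshold):
    """Keep (verb, frame) pairs whose relative frequency reaches the threshold.

    observations: iterable of (verb, frame) pairs, one per parsed occurrence.
    Returns {verb: {frame: relative_frequency}}; the relative frequency can
    double as a confidence score for the entry (an assumption of this sketch).
    """
    pair_counts = Counter(observations)

    # Total number of occurrences of each verb, over all of its frames.
    verb_counts = Counter()
    for (verb, _frame), n in pair_counts.items():
        verb_counts[verb] += n

    lexicon = defaultdict(dict)
    for (verb, frame), n in pair_counts.items():
        rel_freq = n / verb_counts[verb]  # MLE of P(frame | verb)
        if rel_freq >= threshold:
            lexicon[verb][frame] = rel_freq
    return dict(lexicon)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy data with hypothetical frame labels: 8 + 1 + 1 occurrences of one verb.
observations = (
    [("mangiare", "NP")] * 8
    + [("mangiare", "PP_a")]
    + [("mangiare", "INTRANS")]
)
lexicon = mle_filter(observations, threshold=0.2)  # hypothetical threshold

# Consistency check on the figures reported in the abstract.
reported_f1 = round(f1_score(0.653, 0.557), 3)
```

Raising the threshold in such a scheme yields a smaller lexicon with fewer low-frequency frames, while lowering it admits more entries; this is the kind of size versus accuracy trade-off the abstract says end users can control.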

Abstract language: English

First page: 2842

Last page: 2848

Total pages: 7

Volume title: Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12)

Volume editor(s): Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis

ISBN: 9782951740877

Publisher: European Language Resources Association (ELRA), Paris (FRA)

Peer-reviewed: Yes: International

Indexed by:

  • DBLP [conf/lrec/CaselliRFRQ12]
  • ISI Web of Science (WOS) [000323927702149]

Keywords:

  • lexicon
  • automatic acquisition
  • subcategorisation frames

URL: http://www.lrec-conf.org/proceedings/lrec2012/summaries/390.html

Conference name: Eight International Conference on Language Resources and Evaluation (LREC'12)

Conference location: Istanbul, Turkey

Conference date: 23-25 May 2012

Conference scope: International

Conference contribution type: Paper

CNR structures:

Modules:
