Consiglio Nazionale delle Ricerche

Product type: Contribution in conference proceedings
Title: Capturing Coercions in Texts: a First Annotation Exercise
Publication year: 2010
Format: Electronic
Author(s): Jezek E.; Quochi V.
Author affiliations: Department of Theoretical and Applied Linguistics, University of Pavia; ILC-CNR, Pisa
CNR authors and affiliations
  • VALERIA QUOCHI
Language(s)
  • English
Abstract: In this paper we report the first results of an annotation exercise of argument coercion phenomena performed on Italian texts. Our corpus consists of ca. 4000 sentences from the PAROLE sottoinsieme corpus (Bindi et al., 2000) annotated with Selection and Coercion relations among verb-noun pairs formatted in XML according to the Generative Lexicon Mark-up Language (GLML) format (Pustejovsky et al., 2008). For the purposes of coercion annotation, we selected 26 Italian verbs that impose semantic typing on their arguments in either Subject, Direct Object or Complement position. Every sentence of the corpus is annotated with the source type for the noun arguments by two annotators plus a judge. An overall agreement of 0.87 kappa indicates that the annotation methodology is reliable. A qualitative analysis of the results allows us to outline some suggestions for improvement of the task: 1) a different account of complex types for nouns has to be devised and 2) a more comprehensive account of coercion mechanisms requires annotation of the deeper meaning dimensions that are targeted in coercion operations, such as those captured by Qualia relations.
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: 1464
Pages to: 1471
Total pages: -
Journal: -
Journal volume number: -
Series: -
Volume title: Proceedings of the Seventh International Conference on Language Resources and Evaluation - LREC'10
Series volume number: -
Volume editor(s): Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
ISBN: 2-9517408-6-7
DOI: -
Publisher
  • European Language Resources Association (ELRA), Paris (France)
Peer reviewed: Yes: International
Publication status: Published version
Indexing (in controlled databases)
  • ISI Web of Science (WOS) (Code: 000356879506038)
Keywords: Corpus (creation, annotation, etc.), Knowledge Discovery/Representation, Semantics
Link (URL, URI): http://www.lrec-conf.org/proceedings/lrec2010/summaries/713.html
Conference title: Seventh International Conference on Language Resources and Evaluation
Conference location: Valletta, Malta
Conference date(s): 17-23 May 2010
Relevance: International
Presentation type: Contribution
Parallel title: -
Notes/Other information: -
CNR structures
  • ILC — Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.005.001: Language Resources and Technologies: models, development methods, applications, design of international strategies
European projects: -
Attachments

Historical data
Historical data cannot be modified; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Subject area: Language & Linguistics
CIVR evaluation area: Sciences of antiquity, philology, literature and art history
Notes: In: LREC'10 - Seventh International Conference on Language Resources and Evaluation (Valletta, Malta, 17-23 May 2010). Proceedings, pp. 1464-1471. Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias (eds.). European Language Resources Association (ELRA), 2010.
Brief product description: ABSTRACT: In this paper we report the first results of an annotation exercise of argument coercion phenomena performed on Italian texts. Our corpus consists of ca. 4000 sentences from the PAROLE sottoinsieme corpus (Bindi et al., 2000) annotated with Selection and Coercion relations among verb-noun pairs formatted in XML according to the Generative Lexicon Mark-up Language (GLML) format (Pustejovsky et al., 2008). For the purposes of coercion annotation, we selected 26 Italian verbs that impose semantic typing on their arguments in either Subject, Direct Object or Complement position. Every sentence of the corpus contains information about corpus-derived typed selectional preferences for verbs in the targeted argument slots and is annotated with the source type for the noun arguments by two annotators plus a judge. An overall agreement of 0.87 kappa indicates that the annotation methodology is reliable. A qualitative analysis of the results allows us to outline some suggestions for impro