Product type: Contribution in conference proceedings
Title: Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser
Year of publication: 2012
Author(s): Rubino, Francesco; Frontini, Francesca; Quochi, Valeria
Author affiliations: Istituto di Linguistica Computazionale "A. Zampolli", CNR, Pisa
CNR authors and affiliations
  • English
Abstract: The present paper tackles the issue of PoS tag conversion within the framework of a distributed web service platform for the automatic creation of language resources. PoS tagging is now considered a "solved problem"; yet, because of the differences in the tagsets, interchange of the various PoS taggers available is still hampered. In this paper we describe the implementation of a PoS-tagged-corpus converter, which is needed for chaining together in a workflow the FreeLing PoS tagger for Italian and the DESR dependency parser, given that these two tools have been developed independently. The conversion problems experienced during the implementation, related to the properties of the different tagsets and of tagset conversion in general, are discussed together with the solutions adopted. Finally, the converter is evaluated by assessing the impact of conversion on the performance of the dependency parser by comparing with the outcome of the native pipeline. From this we learn that in most cases parsing errors are due to actual tagging errors, and not to conversion itself. Besides, information on accuracy loss is an important feature in a distributed environment of (NLP) services, where users need to decide which services best suit their needs.
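The abstract does not give the converter's mapping rules, so the following is only a minimal sketch of table-driven tag conversion. The tag names (`SRC-...`, `TGT-...`), the `Token` type, and the `convert` function are hypothetical placeholders for illustration; they are not the actual FreeLing or DESR tagsets, nor the authors' implementation.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    form: str
    tag: str


# Hypothetical mapping from a fine-grained source tagset to a coarser
# target tagset. Placeholder names only, not real FreeLing or DESR tags.
TAG_MAP = {
    "SRC-NOUN-SG": "TGT-N",
    "SRC-NOUN-PL": "TGT-N",   # many-to-one: number distinction is lost
    "SRC-VERB-FIN": "TGT-V",
    "SRC-DET": "TGT-D",
}
FALLBACK_TAG = "TGT-X"  # assumed catch-all for source tags with no mapping


def convert(tokens: Iterable[Token],
            mapping: dict = TAG_MAP,
            fallback: str = FALLBACK_TAG):
    """Rewrite each token's tag and count the source tags that had no mapping."""
    unmapped = Counter()
    converted = []
    for tok in tokens:
        target = mapping.get(tok.tag)
        if target is None:
            unmapped[tok.tag] += 1
            target = fallback
        converted.append(Token(tok.form, target))
    return converted, unmapped


if __name__ == "__main__":
    sentence = [Token("le", "SRC-DET"), Token("case", "SRC-NOUN-PL"),
                Token("boh", "SRC-INTJ")]
    tagged, missing = convert(sentence)
    print(tagged)
    print(dict(missing))
```

Returning the unmapped-tag counts alongside the converted tokens is one simple way to expose how much information a conversion step loses, in the spirit of the accuracy-loss reporting the abstract mentions.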
Abstract language: English
First page: 2125
Last page: 2131
Total pages: 7
Volume title: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Volume editor(s): Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis
  • European Language Resources Association (ELRA), Paris (France)
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in curated databases)
  • ISI Web of Science (WOS) (Code: 000323927702032)
  • PUMA (Code: /cnr.ilc/2012-A3-006)
Keywords: PoS tag conversion, interoperability, NLP pipelines
Link (URL, URI)
Conference title: Language Resources and Evaluation Conference 2012
Conference location: Istanbul, Turkey
Conference date(s): 23-25 May 2012
CNR structures
  • ILC — Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.005.002 : Infrastrutture per l'interoperabilità e l'integrazione di Risorse e Tecnologie Linguistiche
European projects
