Consiglio Nazionale delle Ricerche

Product type: Contribution in volume
Title: Bootstrapping and Collaboratively Enriching the Italian Domain WordNet through the WiKyoto Knowledge Editor
Year of publication: 2010
Format: -
Type of contribution in volume: -
Author(s): Ronzano F.; Monachini M.; Marchetti A.; Tesconi M.; Calzolari N.
Author affiliations: CNR-IIT, Pisa; CNR-ILC, Pisa
CNR authors and affiliations
  • MAURIZIO TESCONI
  • FRANCESCO RONZANO
  • ANDREA MARCHETTI
  • MONICA MONACHINI
  • NICOLETTA ZAMORANI
Language(s)
  • English
Abstract: Enhancing the development of multilingual resources is of utmost importance for use in computer applications. The need for ever-growing resources for effective multilingual content processing has given impetus to a radical change in the perspective of language resource (LR) creation, structuring, exploitation and maintenance. The Web has played a key role in this process: the possibility to access growing amounts of structured and unstructured data, as well as the ease of creating and sharing content among distributed communities of users, have strongly affected the methodologies and techniques used to bootstrap, enrich and access LRs. From static knowledge bases usually created and maintained by groups of experts and tailored to specific exploitation contexts, LRs have turned into dynamic repositories of linguistic knowledge. Their content is usually easily accessible over the Web and often exploited, aggregated and optimized on-the-fly by on-line information mining services. In this context, the adoption of standardized data formats to facilitate interoperability and data exchange is essential. Moreover, the creation and maintenance of these resources has benefited greatly from the possibility to harvest Web data in order to bootstrap or enrich them. Several new frameworks have been proposed to support access, search, integration and interoperability of "new generation" LRs. Widely distributed communities of Web users are increasingly involved, directly or indirectly, in keeping language resources updated or in extending them. After a brief description of modern LRs, we focus our attention on two essential issues involving them: the need for standard formats that support interoperability in a distributed Web context and the possibility for Web communities to collaboratively maintain and enrich these resources.
In particular, we present the Italian WordNet (IWN) and its exploitation in the context of the KYOTO Project, as a real-world scenario where standardization, interlinking, enrichment as well as collaborative editing are put into practice.
Abstract language: eng
Other abstract: -
Other abstract language: -
First page: 181
Last page: 208
Total pages: 27
Series: -
Volume title: Multilinguality and Interoperability in Language Processing with Emphasis on Romanian
Volume number in series: -
Volume editor(s): Tufis D.; Forascu C.
ISBN: 978-973-27-1972-5
Edition/Version: -
DOI: -
Publisher
  • Romanian Academy Publishing House, Bucharest (Romania)
Refereed: Yes (international)
Publication status: -
Indexing (in controlled databases)
  • PUMA (Code: 2010-A1-010)
Keywords: NLP, collaborative editing, wordnet, knowledge representation, wiki
Link (URL, URI): http://www.racai.ro/Multilinguality%20and%20Interoperability/TOC.html
Parallel title: -
Notes/Other information: The KYOTO Project is a complex knowledge-driven environment built to enable communities of users to mine information from textual documents and share the collected facts across cultures, languages and domains. The semantic ground supporting all the information mining tasks of KYOTO is the Multilingual Knowledge Base, composed of a collection of WordNets encoding language-specific lexical patterns for each language covered by KYOTO. All of them are mapped to the language-independent entities of the KYOTO Central Ontology. In the context of KYOTO, we describe the process followed to define WordNet-LMF (WN-LMF), the standard format tailored to represent lexical resources adhering to the WordNet model of lexical knowledge, which makes it easy to integrate general and domain lexicons in KYOTO. We present the conversion of the IWN to the WN-LMF standard, a necessary prerequisite for integrating IWN into the Multilingual Knowledge Base. We present a (semi-)automatic procedure that allows IWN to upgrade its ILI connections to the latest available version of the Princeton English WordNet, 3.0. We also consider the Species2000 SKOS thesaurus, a knowledge resource with a data structure different from WordNet, and present its conversion to WN-LMF. To enable the multilingual and multicultural community of KYOTO users to maintain and extend KYOTO knowledge resources, we introduce the Wikyoto Knowledge Editor, a Web-based wiki environment for navigating and collaboratively enriching the Multilingual Knowledge Base. We describe its Web interface through a practical use case concerning the extension of the Italian Domain WordNet.
CNR units
  • IIT — Istituto di informatica e telematica
  • ILC — Istituto di linguistica computazionale "Antonio Zampolli"
CNR modules/activities/subprojects
  • IC.P02.005.001 : Risorse e Tecnologie Linguistiche: modelli, metodi di sviluppo, applicazioni, disegno di strategie internazionali
  • INT.P01.007.003 : XML Technologies For Semantic Web Applications and Secure Workflows
European projects: -
Attachments
Bootstrapping.pdf (private document)
Document type: application/pdf

Historical data
Historical data cannot be modified; they were inherited from other systems (e.g. Gestione Istituti, PUMA, ...) and have historical value only.
Disciplinary area
  • Information Technology & Communications Systems
  • Language & Linguistics
CIVR evaluation area
  • Antiquity, philological-literary and art-historical sciences
  • Sciences and technologies for an information and communication society
City: Bucharest
Series: Multilinguality and Interoperability in Language Processing with Emphasis on Romanian
Publisher: Romanian Academy Publishing House
Note: In: Multilinguality and Interoperability in Language Processing with Emphasis on Romanian. pp. 181-208. Dan Tufis and Corina Forascu (eds.). Bucharest: Romanian Academy Publishing House, 2010.
Brief description of the product: Enhancing the development of multilingual resources is of utmost importance for use in computer applications. The need for ever-growing resources for effective multilingual content processing has given impetus to a radical change in the perspective of language resource (LR) creation, structuring, exploitation and maintenance. The Web has played a key role in this process: the possibility to access growing amounts of structured and unstructured data, as well as the ease of creating and sharing content among distributed communities of users, have strongly affected the methodologies and techniques used to bootstrap, enrich and access LRs. From static knowledge bases usually created and maintained by groups of experts and tailored to specific exploitation contexts, LRs have turned into dynamic repositories of linguistic knowledge. Their content is usually easily accessible over the Web and often exploited, aggregated and optimized on-the-fly by on-line information mining services.