Consiglio Nazionale delle Ricerche

Product type: Journal article
Title: Use of "off-the-shelf" information extraction algorithms in clinical informatics: A feasibility study of MetaMap annotation of Italian medical notes
Publication year: 2016
Format: Electronic
Author(s): Emma Chiaramello (a), Francesco Pinciroli (a,b), Alberico Bonalumi (c), Angelo Caroli (c), Gabriella Tognola (a)
Author affiliations: (a) Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni (IEIIT), Consiglio Nazionale delle Ricerche (CNR), Piazza L. da Vinci, 32, 20133 Milano, Italy (b) e-HealthLAB, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Piazza L. da Vinci, 32, 20133 Milan, Italy (c) UOC Sistemi Informativi e Informatici, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico Milano, Via Francesco Sforza, 35, 20122 Milano, Italy
CNR authors and affiliations
  • EMMA CHIARAMELLO
  • GABRIELLA TOGNOLA
Language(s)
  • English
Abstract: Information extraction from narrative clinical notes is useful for patient care, as well as for secondary use of medical data for research or clinical purposes. Many studies have focused on information extraction from English clinical texts, but fewer have dealt with clinical notes in languages other than English. This study tested the feasibility of using "off-the-shelf" information extraction algorithms to identify medical concepts in Italian clinical notes. Among the available and well-established information extraction algorithms, we used MetaMap to map medical concepts to the Unified Medical Language System (UMLS). The study addressed two questions: (Q1) whether medical terms found in clinical notes and belonging to the "Disorders" semantic group can be properly mapped to the Italian UMLS resources; (Q2) whether it is feasible to use MetaMap as it is to extract these medical concepts from Italian clinical notes. We performed three experiments: in EXP1, we investigated how many medical concepts of the "Disorders" semantic group found in a set of clinical notes written in Italian could be mapped to the UMLS Italian medical sources; in EXP2, we assessed how the different processing steps used by MetaMap, which are English dependent, could be applied to Italian texts to map the original clinical notes to the Italian UMLS sources; in EXP3, we automatically translated the clinical notes from Italian to English using Google Translator and then used MetaMap to map the translated texts.
Results in EXP1 showed that the Italian UMLS Metathesaurus sources covered 91% of the medical terms of the "Disorders" semantic group found in the studied dataset. We observed that, even though MetaMap was built to analyze texts written in English, most of its processing steps also worked properly with texts written in Italian. MetaMap correctly identified about half of the concepts in the Italian clinical notes. Using MetaMap's annotation on Italian clinical notes instead of a simple text search improved our results by about 15 percentage points. MetaMap's annotation of Italian clinical notes showed recall, precision and F-measure equal to 0.53, 0.98 and 0.69, respectively. Most of the failures were due to MetaMap's inability to generate meaningful variants for the Italian language, suggesting that modifying MetaMap to generate Italian variants could improve performance. MetaMap's performance in annotating automatically translated English clinical notes was in line with findings in the literature, with similar recall (0.75) and F-measure (0.83) and even higher precision (0.95). Most of the failures were due to poor Italian-to-English translation of medical terms, suggesting that an automatic translation tool specialized in medical concepts might yield better performance. In conclusion, the performance obtained using MetaMap on the fully automatic translation of the Italian text is good enough to allow MetaMap to be used "as it is" in clinical practice.
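The abstract reports recall, precision and F-measure but does not state which F-measure is meant. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall) and recomputes it from the rounded figures quoted in the abstract. The function name `f_measure` and the variable names are illustrative and do not come from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Rounded values as quoted in the abstract.
italian = f_measure(precision=0.98, recall=0.53)     # MetaMap on Italian notes
translated = f_measure(precision=0.95, recall=0.75)  # MetaMap on translated notes

print(f"Italian notes:    F = {italian:.2f}")
print(f"Translated notes: F = {translated:.2f}")
```

Under this assumption the Italian-notes figure reproduces the reported 0.69. The translated-notes figure comes out at about 0.84 rather than the reported 0.83; the small gap is presumably because the published precision and recall are themselves rounded.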
Abstract language: English
Other abstract: -
Other abstract language: -
Pages from: 22
Pages to: 32
Total pages: 11
Journal: Journal of biomedical informatics
Active since 2001
Publisher: Academic Press, San Diego, CA
Country of publication: United States of America
Language: English
ISSN: 1532-0464
Key title: Journal of biomedical informatics
Title proper: Journal of biomedical informatics.
Abbreviated title: J Biomed Inform
Alternative title: Biomedical informatics
Journal volume: 63
Journal issue: -
DOI: 10.1016/j.jbi.2016.07.017
Peer reviewed: Yes (international)
Publication status: Published version
Indexing (in controlled databases): -
Keywords: Information extraction; Unstructured clinical notes; Italian language; MetaMap; Failure analysis; EHR data reuse
Link (URL, URI): -
Parallel title: -
Acceptance date: -
Notes/Other information: -
CNR structures
  • IEIIT — Istituto di elettronica e di ingegneria dell'informazione e delle telecomunicazioni
CNR modules
  • INT.P02.002.004: Innovative approaches to tissue engineering
European projects: -
Attachments
Use of "off-the-shelf" information extraction algorithms in clinical informatics: A feasibility study of MetaMap annotation of Italian medical notes (private document)
Document type: application/pdf