CLINEXTRACT: A structured lexicon for the automated extraction of clinical concepts from the multisource medical records of aged people with hearing disabilities

Hearing impairment is very common in older adults: an estimated 360 million people worldwide have disabling hearing loss, and about one third of people over 65 are affected. It is well recognized that a successful plan for the hearing rehabilitation of older people must take into account not only the technological aspects, such as the type of hearing aid or implanted hearing device used, but also other aspects broadly related to auditory disability. In this view, key factors include the hearing difficulties perceived in real life, the impact of the hearing disability on the subject's quality of life, and speech perception abilities in real environments.
Many valuable clinical instruments exist to measure these different aspects of hearing disability. Last but not least, clinicians can now make profitable use of innovative tools and services derived from the application of information and communication technologies (ICT) to healthcare, i.e., the so-called 'eHealth' paradigm.
Unfortunately, most clinicians cannot benefit from this wealth of existing information about their patients, because it is dispersed across different repositories of the patient record, with almost no sustainable way to collate and analyze the data in a multidimensional yet targeted manner. Moreover, most of this information is available only as unstructured text that still has to be extracted from clinical notes.
CLINEXTRACT is the first attempt, specific to the hearing healthcare domain, to design and develop an easy-to-use, multi-source and multi-dimensional architecture for extracting and collating audiological clinical information from diverse sources of the patient health record. It manages two data types relevant to the planning, management and outcome measurement of audiological treatment: i) textual narrative information related to past medical history, current complaints, etiology and audiological diagnosis, risk factors for hearing loss, and the surgical procedure to implant hearing devices; and ii) numerical information extracted from audiometric tests, from the technical setup of the hearing devices, and from the scores of questionnaires that measure the perceived impact of hearing loss on daily life. All the extracted data are placed on a purpose-designed graphical timeline that allows the clinician to monitor the treatment and adjust it according to the patient's ongoing outcomes. By using state-of-the-art ICT techniques ranging from Natural Language Processing to Information Extraction and data and text mining, and by leveraging a lexicon developed ad hoc for the hearing healthcare domain, CLINEXTRACT identifies which information contained in unstructured medical notes is useful for planning the patient's rehabilitation and places the extracted information in its proper textual and temporal context. All these processes are performed directly on the original medical documents, as written by the clinician in natural language. The process is therefore straightforward: clinicians are not required to change the routines they follow to write their notes, nor to have particular ICT skills. The textual narrative information and numerical information extracted from the multiple sources are then saved in a centralized repository, where they can be analyzed in a targeted manner.
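The following Python fragment is a minimal sketch of the general idea described above (lexicon-driven extraction from free-text notes, followed by chronological ordering for a timeline). It is not the CLINEXTRACT implementation: the lexicon entries, the category names, the `PTA ... dB` pattern, and the names `Extraction`, `extract` and `build_timeline` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical mini-lexicon (assumption): surface term -> concept category.
LEXICON = {
    "tinnitus": "complaint",
    "presbycusis": "diagnosis",
    "noise exposure": "risk_factor",
    "cochlear implant": "device",
}

# Hypothetical pattern (assumption) for one numeric measure, e.g. "PTA 55 dB".
PTA_PATTERN = re.compile(
    r"\bPTA\s*(?:of\s*)?(\d+(?:\.\d+)?)\s*dB\b", re.IGNORECASE
)


@dataclass(frozen=True)
class Extraction:
    when: date      # date of the note the item came from
    category: str   # concept category from the lexicon, or "pta_db"
    value: str      # matched lexicon term or numeric value as text


def extract(note_date, text):
    """Return lexicon terms and numeric values found in one note."""
    found = []
    lowered = text.lower()
    # Textual concepts: whole-word match of each lexicon term.
    for term, category in LEXICON.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(Extraction(note_date, category, term))
    # Numerical concepts: values captured by the regular expression.
    for match in PTA_PATTERN.finditer(text):
        found.append(Extraction(note_date, "pta_db", match.group(1)))
    return found


def build_timeline(notes):
    """Collate extractions from all notes, ordered by note date."""
    items = [e for note_date, text in notes for e in extract(note_date, text)]
    return sorted(items, key=lambda e: e.when)


# Invented example notes, deliberately given out of chronological order.
notes = [
    (date(2015, 6, 1), "Follow-up after cochlear implant. PTA 40 dB."),
    (date(2014, 3, 10),
     "Patient reports tinnitus; history of noise exposure. PTA 65 dB."),
]
timeline = build_timeline(notes)
for item in timeline:
    print(item.when.isoformat(), item.category, item.value)
```

This sketch matches terms literally and ignores negation, abbreviations and spelling variants, all of which a real clinical extraction pipeline would need to handle.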
