Optimizing the adaptability of clinical information extraction systems: Deep Learning techniques and user-feedback propagation
Project information

Large amounts of medical data are available today, offering tremendous opportunities to improve healthcare quality and patient safety. However, despite this large and growing number of medical documents, the relevant information is not always readily accessible and can be difficult to extract from the mass of data. Moreover, even though most medical documents are now digitized, their free-text content is not always mined for relevant, valuable structured information for care, financial or epidemiological purposes. Although Natural Language Processing (NLP) technology already offers powerful tools and solutions for automating the processing of medical documents, the performance of NLP engines often drops when the extraction context (language, medical specialty, hospital, physician’s writing style) changes. According to Patrick and Li (2010), information extraction systems should reach at least 90% (perhaps 95%) accuracy to be used effectively in hospitals. So far, however, the performance of state-of-the-art systems on NLP tasks such as Named Entity Recognition (NER) remains at least 5 to 10% below this milestone.

This project will study the feasibility of a scalable NLP engine that adapts easily to such new contexts and comes as close as possible to this 90-95% accuracy milestone. To reach this goal, on the one hand, the system extracting medical information should achieve high performance regardless of the language, medical specialty, hospital or physician, with only reasonable adjustments. On the other hand, the system should continuously improve its performance by leveraging corrections from human users. For this purpose, we will explore and combine approaches such as dynamic piping, neural networks, the human-in-the-loop paradigm and persistent learning.
The project is a collaboration with LynxCare Clinical Informatics, a medical IT company focusing on promoting access to medical information and reducing administrative costs in many hospitals.


Project Leader(s): 
Walter Daelemans