Researchers at New York University and Princeton have developed a framework that evaluates clinical notes and autonomously assigns a risk score indicating whether patients will be readmitted within 30 days. They claim that the code and model parameters, which are publicly available on Github, handily outperform baselines. VentureBeat reports: As the researchers point out in a preprint paper on Arxiv.org, clinical notes use abbreviations and jargon, and they’re often lengthy, which poses an AI system design challenge. To overcome it, they used a natural language processing method — Google’s bidirectional encoder representations from transformers, or BERT — that captures interactions between distant words in sentences by incorporating global, long-range information. Each clinical note is represented as a collection of tokens, or subword units extracted from text in a preprocessing step. From multiple sequences of these, ClinicalBERT identifies which tokens are associated with which sequence. It also learns the position of tokens from variables corresponding to the sequences, and inserts a special token used in classification tasks in front of every sequence.
To train ClinicalBERT, the team sourced a corpus of clinical notes and masked 15 percent of the input tokens, forcing the model to predict the concealed tokens and whether any two given two sentences were in consecutive order. Then, drawing on the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC-III), an electronic health records data set comprising over two million notes from 58,976 hospital admissions of 38,597 patients, the researchers fine-tuned the system for clinical forecasting tasks. Tested on a sample set consisting of 30 pairs of medical terms designed to assess medical term similarity, the authors report, ClinicalBERT achieved a high correlation score, indicating that its tokens captured similarity between medical concepts terms. Heart-related concepts like myocardial infarction, atrial fibrillation, and myocardium were close together, they say, and renal failure and kidney failure were also close.