LeafAI: query generator for clinical cohort discovery rivaling a human programmer

by Nicholas J. Dobbins, et al.

Objective: Identifying study-eligible patients within clinical databases is a critical step in clinical research. However, accurate query design typically requires extensive technical and biomedical expertise. We sought to create a system capable of generating data model-agnostic queries while also providing novel logical reasoning capabilities for complex clinical trial eligibility criteria. Materials and Methods: The task of query creation from eligibility criteria requires solving several text-processing problems, including named entity recognition and relation extraction, sequence-to-sequence transformation, normalization, and reasoning. We incorporated hybrid deep learning and rule-based modules for these tasks, as well as a knowledge base of the Unified Medical Language System (UMLS) and linked ontologies. To enable data model-agnostic query creation, we introduce a novel method for tagging database schema elements using UMLS concepts. To evaluate our system, called LeafAI, we compared its ability to identify patients who had been enrolled in 8 clinical trials conducted at our institution with that of a human database programmer. We measured performance by the number of actual enrolled patients matched by generated queries. Results: LeafAI matched a mean 43% of enrolled patients across 8 clinical trials, compared to 27% matched by queries written by a human database programmer. The human programmer spent 26 total hours crafting queries, compared to several minutes for LeafAI. Conclusions: Our work contributes a state-of-the-art data model-agnostic query generation system capable of conditional reasoning using a knowledge base. We demonstrate that LeafAI can rival a human programmer in finding patients eligible for clinical trials.
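
The idea of tagging database schema elements with UMLS concepts can be illustrated with a minimal sketch. This is not LeafAI's implementation: the `TaggedColumn` structure, the `build_query` helper, the two example schemas, and the concept identifier `CUI_CONDITION` are all hypothetical placeholders chosen for illustration. The sketch only shows why such tags could make query generation independent of any one data model: the generator looks up a concept tag instead of hard-coding table and column names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedColumn:
    """A schema element annotated with the concept it stores (hypothetical structure)."""
    table: str
    code_column: str
    patient_column: str
    concept_tag: str  # placeholder standing in for a UMLS concept identifier


# Two hypothetical data models that store diagnoses under different names.
SCHEMA_A = [TaggedColumn("diagnosis", "icd10", "person_id", "CUI_CONDITION")]
SCHEMA_B = [TaggedColumn("condition_occurrence", "condition_code", "patient_id", "CUI_CONDITION")]


def build_query(schema, concept_tag, codes):
    """Return (sql, params) selecting patients with any of `codes`.

    The table and column names come from whichever schema element carries
    `concept_tag`, so the same call works against differently named schemas.
    """
    for col in schema:
        if col.concept_tag == concept_tag:
            placeholders = ", ".join("?" for _ in codes)
            sql = (
                f"SELECT DISTINCT {col.patient_column} FROM {col.table} "
                f"WHERE {col.code_column} IN ({placeholders})"
            )
            return sql, list(codes)
    raise LookupError(f"no schema element tagged with {concept_tag!r}")


# The same logical request, rendered for two different data models.
sql_a, params_a = build_query(SCHEMA_A, "CUI_CONDITION", ["E11.9"])
sql_b, params_b = build_query(SCHEMA_B, "CUI_CONDITION", ["E11.9"])
```

Here the diagnosis code `E11.9` is passed as a bound parameter, not interpolated into the SQL text; only identifiers taken from the trusted schema description are formatted into the string.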




The Leaf Clinical Trials Corpus: a new resource for query generation from clinical trial eligibility criteria

Identifying cohorts of patients based on eligibility criteria such as me...

Information Extraction of Clinical Trial Eligibility Criteria

Clinical trials predicate subject eligibility on a diversity of criteria...

Knowledge-guided Text Structuring in Clinical Trials

Clinical trial records are variable resources for the analysis of patient...

An Information Extraction Approach to Prescreen Heart Failure Patients for Clinical Trials

To reduce the large amount of time spent screening, identifying, and rec...

Effective Matching of Patients to Clinical Trials using Entity Extraction and Neural Re-ranking

Clinical trials (CTs) often fail due to inadequate patient recruitment. ...

Hybrid Approaches for our Participation to the n2c2 Challenge on Cohort Selection for Clinical Trials

Objective: Natural language processing can help minimize human intervent...
