arXiv 2409.10576

Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports

By Mohamed Sobhi Jabal, Pranav Warman, et al.

Published 2024-09-15

Purpose: To develop and evaluate an automated system for extracting structured clinical information from unstructured radiology and pathology reports using open-weights large language models (LMs) and retrieval augmented generation (RAG), and to assess the effects of model configuration variables on extraction performance.

Methods and Materials: The study utilized two datasets: 7,294 radiology reports annotated for…
