Image Text Extraction and Natural Language Processing of Unstructured Data from Medical Reports

Description

Publication type: journal article

Year of publication: 2024

DOI: 10.3390/make6020064

Keywords: machine learning, knowledge extraction

Abstract: This study presents an integrated approach for automatically extracting and structuring information from medical reports, captured as scanned documents or photographs, through a combination of image recognition and natural language processing (NLP) techniques like named entity recognition (NER). The primary aim was to develop an adaptive model for efficient text extraction from medical report images. This involved utilizing a genetic algorithm (GA) to fine-tune optical character recognition (OCR) hyperparameters, ensuring maximal text extraction length, followed by NER processing to categorize the extracted information into required entities, adjusting parameters if entities were not correctly extracted based on manual annotations. Despite the diverse formats of medical report images in the dataset, all in Russian, this serves as a conceptual example of information extraction (IE) that can be easily extended to other languages.

Publication

Journal: Machine Learning and Knowledge Extraction

Journal issue: Vol. 6, No. 2

Pages: 1361-1377

Journal ISSN: 2504-4990

Publisher: MDPI

Authors

Malashin Ivan (Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia)
Masich Igor (Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia)
Tynchenko Vadim (Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia)
Gantimurov Andrei (Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia)
Nelyub Vladimir (Scientific Department, Far Eastern Federal University, 690922 Vladivostok, Russia)
Borodulin Aleksei (Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State Technical University, 105005 Moscow, Russia)

Indexed in databases

Core of the Russian Index of Science Citation (Ядро РИНЦ, eLIBRARY.RU)
