Automated Coding of Medical Diagnostics from Free-Text: the Role of Parameters Optimization and Imbalanced Classes.

Formal Metadata

Title
Automated Coding of Medical Diagnostics from Free-Text: the Role of Parameters Optimization and Imbalanced Classes.
Title of Series
Number of Parts
12
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The extraction of codes from Electronic Health Records (EHR) data is an important task because the extracted codes serve purposes such as billing and reimbursement, quality control, epidemiological studies, and cohort identification for clinical trials. The codes are based on standardized vocabularies. Diagnoses, for example, are frequently coded using the International Classification of Diseases (ICD), a taxonomy of diagnosis codes organized in a hierarchical structure. Extracting codes from free-text medical notes in an EHR, such as the discharge summary, requires reviewing patient data for information that can be coded in a standardized manner. Manual code assignment is a complex and time-consuming process. Machine learning and natural language processing approaches have therefore received increasing attention as a way to automate ICD coding. In this article, we investigate the use of Support Vector Machines (SVM) and the binary relevance method for multi-label classification in the task of automatic ICD coding from free-text discharge summaries. In particular, we explored the role of SVM parameter optimization and of class weighting for addressing class imbalance. Experiments conducted with the Medical Information Mart for Intensive Care III (MIMIC III) database reached a macro-averaged F1 score of 49.86% for the 100 most frequent diagnoses. Our findings indicated that optimizing the SVM parameters and using class weighting can improve the effectiveness of the classifier.
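As a rough illustration of three ideas named in the abstract (binary relevance, class weighting, and macro-averaged F1), the following standard-library Python sketch may help. It is not the authors' implementation: the function names, the toy notes, and the example ICD codes are hypothetical, and the "balanced" weighting formula is an assumption, since the abstract does not state which weighting scheme was used. No SVM is trained here; the sketch only shows how the multi-label problem is decomposed, how per-class weights could be derived, and how the reported metric is computed.

```python
from collections import Counter


def binary_relevance_targets(label_sets, codes):
    """Binary relevance: turn one multi-label problem into one 0/1 target per code."""
    return {c: [1 if c in labels else 0 for labels in label_sets] for c in codes}


def balanced_class_weights(y):
    """Assumed weighting heuristic: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so errors on them cost more.
    """
    counts = Counter(y)
    n_samples = len(y)
    return {cls: n_samples / (len(counts) * cnt) for cls, cnt in counts.items()}


def f1_binary(y_true, y_pred):
    """F1 for a single binary label; defined as 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(true_by_code, pred_by_code):
    """Macro-averaged F1: unweighted mean of the per-code F1 scores."""
    scores = [f1_binary(true_by_code[c], pred_by_code[c]) for c in true_by_code]
    return sum(scores) / len(scores)


# Hypothetical toy data: the set of ICD codes assigned to each of four notes.
notes = [{"401.9", "428.0"}, {"401.9"}, {"250.00"}, {"401.9"}]
codes = ["401.9", "428.0", "250.00"]

targets = binary_relevance_targets(notes, codes)
weights_rare = balanced_class_weights(targets["428.0"])

# A hypothetical classifier that never predicts the rare code 428.0.
predictions = dict(targets)
predictions["428.0"] = [0, 0, 0, 0]

print(weights_rare)
print(macro_f1(targets, predictions))
```

Because macro averaging gives every code the same weight regardless of its frequency, a classifier that ignores a rare code is penalized as heavily as one that misses a common code, which is one reason class weighting matters for this metric.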
Keywords