Impact of Medical Data Imprecision on Learning Results
Test data measured by medical instruments often carry imprecise ranges that include the true values. The latter are not obtainable in virtually all cases. Most learning algorithms, however, carry out arithmetical calculations that are subject to uncertain influence in both the learning process to obtain models and applications of the learned models in, e.g. prediction. In this paper, we initiate a study on the impact of imprecision on prediction results in a healthcare application where a pre-trained model is used to predict future state of hyperthyroidism for patients. We formulate a model for data imprecisions. Using parameters to control the degree of imprecision, imprecise samples for comparison experiments can be generated using this model. Further, a group of measures are defined to evaluate the different impacts quantitatively. More specifically, the statistics to measure the inconsistent prediction for individual patients are defined. We perform experimental evaluations to compare prediction results based on the data from the original dataset and the corresponding ones generated from the proposed precision model using the long-short-term memories (LSTM) network. The results against a real world hyperthyroidism dataset provide insights into how small imprecisions can cause large ranges of predicted results, which could cause mis-labeling and inappropriate actions (treatments or no treatments) for individual patients.
READ FULL TEXT