Journal List > J Korean Soc Med Inform > v.14(1) > 1102980

Lee: Comparisons of predictive modeling techniques for breast cancer in Korean women

Abstract

OBJECTIVE

To develop breast cancer prediction models and to compare their predictive performance by using Bayesian Networks (BN), Naive Bayes (NB), Classification and Regression Trees (CART), and Logistic Regression (LR).

METHODS

A dataset consisting of 109 breast cancer patients and 100 healthy women was used. Hugin Researcher(TM) 6.7 and Poulin-Hugin 1.5, both of which are NB modeling software, were used. For the LR model and CART, ECMiner was used.

RESULTS

The highest area under the receiver operating characteristic curve (AUC), .90, was obtained with the Tree Augmented NB model. The lowest AUC, .48, was obtained with CART; that of the LR model was .86. The two BN models, with and without prior knowledge, showed virtually no difference (.64 vs. .65). The lifts of four models (Simple NB, Tree Augmented NB, Hierarchical NB, LR) were each 1.9. The AUCs of both the NB and LR models were higher than those of previously published models built with LR methods.
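As an illustration of the two performance measures reported above, the following minimal sketch computes an AUC and a lift from predicted scores. It is not the procedure used in the study: the function names, the toy data, and the top-fraction cutoff for lift are all assumptions made for this example, since the abstract does not state at which cutoff the lift was measured.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift(labels: Sequence[int], scores: Sequence[float],
         fraction: float = 0.1) -> float:
    """Case rate among the top-scoring `fraction` of subjects,
    divided by the case rate in the whole sample.
    The default fraction is an assumption, not taken from the study."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical toy data (not from the study): 4 cases, 4 controls.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(auc(toy_labels, toy_scores))         # 0.75
print(lift(toy_labels, toy_scores, 0.25))  # 2.0

# With 109 cases among 209 women, the base rate is 109/209, so a top
# segment made up only of cases would give the largest possible lift.
max_lift = 1.0 / (109 / 209)
print(round(max_lift, 2))                  # 1.92
```

Under this definition, a lift near 1.9 in a sample of 109 cases and 100 controls is close to the ceiling of about 1.92, which is reached when the top-scoring segment contains only cases. Whether the study defined lift in this way is an assumption.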

CONCLUSION

NB, which is more or less free from statistical assumptions and limitations, could be preferred to LR in the development of a predictive model to promote regular screening tests and early detection.
