1. Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. New York: Springer;2009.
2. D'Agostino RB Sr, Vasan RS, Pencina MJ, Wolf PA, Cobain M, Massaro JM, et al. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation. 2008; 117:743–753.
3. Yang HI, Yuen MF, Chan HL, Han KH, Chen PJ, Kim DY, et al. Risk estimation for hepatocellular carcinoma in chronic hepatitis B (REACH-B): development and validation of a predictive score. Lancet Oncol. 2011; 12:568–574.
4. Kwak JY, Jung I, Baek JH, Baek SM, Choi N, Choi YJ, et al. Image reporting and characterization system for ultrasound features of thyroid nodules: multicentric Korean retrospective study. Korean J Radiol. 2013; 14:110–117.
5. Kim SY, Lee HJ, Kim YJ, Hur J, Hong YJ, Yoo KJ, et al. Coronary computed tomography angiography for selecting coronary artery bypass graft surgery candidates. Ann Thorac Surg. 2013; 95:1340–1346.
6. Yoon YE, Lim TH. Current roles and future applications of cardiac CT: risk stratification of coronary artery disease. Korean J Radiol. 2014; 15:4–11.
7. Shaw LJ, Giambrone AE, Blaha MJ, Knapper JT, Berman DS, Bellam N, et al. Long-term prognosis after coronary artery calcification testing in asymptomatic patients: a cohort study. Ann Intern Med. 2015; 163:14–21.
8. Lee K, Hur J, Hong SR, Suh YJ, Im DJ, Kim YJ, et al. Predictors of recurrent stroke in patients with ischemic stroke: comparison study between transesophageal echocardiography and cardiac CT. Radiology. 2015; 276:381–389.
9. Suh YJ, Hong YJ, Lee HJ, Hur J, Kim YJ, Lee HS, et al. Prognostic value of SYNTAX score based on coronary computed tomography angiography. Int J Cardiol. 2015; 199:460–466.
10. Sunshine JH, Applegate KE. Technology assessment for radiologists. Radiology. 2004; 230:309–314.
11. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015; 162:55–63.
12. Bossuyt PM, Leeflang MM. Chapter 6: Developing Criteria for Including Studies. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 0.4 [updated September 2008]. Oxford: The Cochrane Collaboration;2008.
13. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996; 49:1373–1379.
14. Peduzzi P, Concato J, Feinstein AR, Holford TR. Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. J Clin Epidemiol. 1995; 48:1503–1510.
15. Vittinghoff E, McCulloch CE. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. 2007; 165:710–718.
16. Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001; 54:774–781.
17. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006; 25:127–141.
18. Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ. 1995; 311:485.
19. Austin PC, Tu JV. Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality. J Clin Epidemiol. 2004; 57:1138–1146.
20. Austin PC, Tu JV. Bootstrap methods for developing predictive models. Am Stat. 2004; 58:131–137.
21. Austin PC. Bootstrap model selection had similar performance for selecting authentic and noise variables compared to backward variable elimination: a simulation study. J Clin Epidemiol. 2008; 61:1009–1017.e1.
22. Little RJA, Rubin DB. Statistical analysis with missing data. 2nd ed. New York: John Wiley & Sons;2014.
23. Pepe MS, Janes H, Longton G, Leisenring W, Newcomb P. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. Am J Epidemiol. 2004; 159:882–890.
24. Nagelkerke NJ. A note on a general definition of the coefficient of determination. Biometrika. 1991; 78:691–692.
25. Tjur T. Coefficients of determination in logistic regression models—A new proposal: the coefficient of discrimination. Am Stat. 2009; 63:366–372.
26. Rufibach K. Use of Brier score to assess binary predictions. J Clin Epidemiol. 2010; 63:938–939. author reply 939.
27. Hosmer DW Jr, Lemeshow S. Applied logistic regression. New York: John Wiley & Sons;2004.
28. D'Agostino R, Nam BH. Evaluation of the performance of survival analysis models: discrimination and calibration measures. In: Balakrishnan N, Rao CR, editors. Handbook of statistics: advances in survival analysis. Vol 23. Amsterdam: Elsevier;2004. p. 1–25.
29. Hosmer DW, Hosmer T, Le Cessie S, Lemeshow S. A comparison of goodness-of-fit tests for the logistic regression model. Stat Med. 1997; 16:965–980.
30. Park SH, Goo JM, Jo CH. Receiver operating characteristic (ROC) curve: practical review for radiologists. Korean J Radiol. 2004; 5:11–18.
31. Harrell FE. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. New York: Springer;2001.
32. Pencina MJ, D'Agostino RB Sr, Song L. Quantifying discrimination of Framingham risk functions with different survival C statistics. Stat Med. 2012; 31:1543–1553.
33. Van Calster B, Van Belle V, Vergouwe Y, Timmerman D, Van Huffel S, Steyerberg EW. Extending the c-statistic to nominal polytomous outcomes: the Polytomous Discrimination Index. Stat Med. 2012; 31:2610–2626.
34. Van Oirbeek R, Lesaffre E. An application of Harrell's C-index to PH frailty models. Stat Med. 2010; 29:3160–3171.
35. Wolbers M, Blanche P, Koller MT, Witteman JC, Gerds TA. Concordance for prognostic models with competing risks. Biostatistics. 2014; 15:526–539.
36. Vergouwe Y, Steyerberg EW, Eijkemans MJ, Habbema JD. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005; 58:475–483.
37. Collins GS, Ogundimu EO, Altman DG. Sample size considerations for the external validation of a multivariable prognostic model: a resampling study. Stat Med. 2016; 35:214–226.
38. Collins GS, de Groot JA, Dutton S, Omar O, Shanyinde M, Tajar A, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol. 2014; 14:40.
39. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988; 44:837–845.
40. Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007; 115:928–935.
41. Demler OV, Pencina MJ, D'Agostino RB Sr. Misuse of DeLong test to compare AUCs for nested models. Stat Med. 2012; 31:2577–2587.
42. Ware JH. The limitations of risk factors as prognostic tools. N Engl J Med. 2006; 355:2615–2617.
43. Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27:157–172. discussion 207–212.
44. Pepe MS. Problems with risk reclassification methods for evaluating prediction models. Am J Epidemiol. 2011; 173:1327–1335.
45. Leening MJ, Vedder MM, Witteman JC, Pencina MJ, Steyerberg EW. Net reclassification improvement: computation, interpretation, and controversies: a literature review and clinician's guide. Ann Intern Med. 2014; 160:122–131.
46. Pepe MS, Janes H. Commentary: reporting standards are needed for evaluations of risk reclassification. Int J Epidemiol. 2011; 40:1106–1108.
47. Widera C, Pencina MJ, Bobadilla M, Reimann I, Guba-Quint A, Marquardt I, et al. Incremental prognostic value of biomarkers beyond the GRACE (Global Registry of Acute Coronary Events) score and high-sensitivity cardiac troponin T in non-ST-elevation acute coronary syndrome. Clin Chem. 2013; 59:1497–1505.
48. Pencina MJ, D'Agostino RB Sr, Steyerberg EW. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med. 2011; 30:11–21.
49. Pepe MS, Kerr KF, Longton G, Wang Z. Testing for improvement in prediction model performance. Stat Med. 2013; 32:1467–1482.
50. Pepe MS, Janes H, Li CI. Net risk reclassification p values: valid or misleading? J Natl Cancer Inst. 2014; 106:dju041.
51. Kerr KF, McClelland RL, Brown ER, Lumley T. Evaluating the incremental value of new biomarkers with integrated discrimination improvement. Am J Epidemiol. 2011; 174:364–374.
52. Pencina MJ, D'Agostino RB, Pencina KM, Janssens AC, Greenland P. Interpreting incremental value of markers added to risk prediction models. Am J Epidemiol. 2012; 176:473–481.
53. Sullivan LM, Massaro JM, D'Agostino RB Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med. 2004; 23:1631–1660.
54. Imperiale TF, Monahan PO, Stump TE, Glowinski EA, Ransohoff DF. Derivation and validation of a scoring system to stratify risk for advanced colorectal neoplasia in asymptomatic adults: a cross-sectional study. Ann Intern Med. 2015; 163:339–346.
55. Schnabel RB, Sullivan LM, Levy D, Pencina MJ, Massaro JM, D'Agostino RB Sr, et al. Development of a risk score for atrial fibrillation (Framingham Heart Study): a community-based cohort study. Lancet. 2009; 373:739–745.
56. Janssen KJ, Moons KG, Kalkman CJ, Grobbee DE, Vergouwe Y. Updating methods improved the performance of a clinical prediction model in new patients. J Clin Epidemiol. 2008; 61:76–86.
57. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Radiology. 2015; 277:826–832.
58. Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015; 162:W1–W73.