Keywords: Comments, PET/CT, Thyroid lesions
To the Editor:
We read with great interest the recent paper by Shi et al. (1), titled “Diagnostic value of volume-based fluorine-18-fluorodeoxyglucose PET/CT parameters for characterizing thyroid incidentaloma.” Diagnostic and prognostic models are typically evaluated using measures of accuracy that do not address clinical consequences (2). The receiver operating characteristic (ROC) curve, which measures discrimination, is constructed by varying the cut-off point used to determine which values of the observed variable are considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates (3). In their paper, Shi et al. (1) constructed several logistic regression models (metabolic tumor volume [MTV] 4.0, MTV 3.5, MTV 3.0, MTV 2.5, total lesion glycolysis [TLG] 4.0, TLG 3.5, TLG 3.0, TLG 2.5, etc.) to assess the clinical value of fluorine-18-fluorodeoxyglucose positron emission tomography/computed tomography (PET/CT) for differentiating malignant from benign focal thyroid incidentaloma. However, the area under the curve (AUC) represents only predictive accuracy (4). In certain clinical scenarios, the AUC may be a poor measure of the performance of risk prediction models: 1) the models need not be accurate at extreme ranges, and 2) there may be situations in which a model with a higher AUC is not desirable (5).
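To make the ROC construction described above concrete, the following is a minimal Python sketch under stated assumptions. The function names `roc_points` and `auc` and the example values are hypothetical and introduced here for illustration only; they are not taken from the analysis by Shi et al. The sketch assumes that higher values of the observed variable indicate abnormality and that a value equal to the cut-off counts as abnormal.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs, one per cut-off.

    scores: observed values (assumed: higher means more likely abnormal).
    labels: 1 for diseased, 0 for non-diseased.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing called abnormal
    # Each distinct score is used as a cut-off; values >= cut-off are abnormal.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Hypothetical values for illustration only (not patient data).
    example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    example_labels = [1, 1, 0, 1, 0, 0]
    curve = roc_points(example_scores, example_labels)
    print(curve)
    print(auc(curve))
```

The AUC obtained this way summarizes discrimination across all cut-offs at once, which is why it does not by itself indicate whether any particular cut-off is clinically useful.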
Decision curve analysis (DCA) is an increasingly used method for evaluating diagnostic tests and predictive models that integrates the clinical consequences of false positives and false negatives (2). Plotting the net benefit against the threshold probability yields the “decision curve.” In addition to its many other advantages, this method takes into consideration the patient's preference regarding the risk of false negatives versus false positives (6). The decision curve shows the range of threshold probabilities over which the prediction model would be of value (7). Therefore, the TLG 4.0 threshold of 2.475 should be interpreted with caution. We recommend DCA to find the optimal threshold for each model (7).
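The net benefit plotted in a decision curve can be sketched as follows, assuming the commonly used definition: net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability, TP and FP are the numbers of true and false positives at that threshold, and n is the sample size. The function names, the convention that a predicted probability equal to the threshold counts as positive, and the example values are assumptions made here for illustration; they do not reproduce the models of Shi et al.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on predicted probabilities >= threshold.

    probs: predicted probabilities of malignancy from a model.
    labels: 1 for malignant, 0 for benign.
    threshold: threshold probability, strictly between 0 and 1.
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def decision_curve(probs, labels, thresholds):
    """Return (threshold, model net benefit, treat-all net benefit) rows.

    The treat-none strategy has a net benefit of 0 at every threshold.
    """
    return [
        (t, net_benefit(probs, labels, t), net_benefit_treat_all(labels, t))
        for t in thresholds
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only (not patient data).
    example_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    example_labels = [1, 1, 0, 1, 0, 0]
    for row in decision_curve(example_probs, example_labels, [0.25, 0.5, 0.75]):
        print(row)
```

A model would be considered of value over the range of threshold probabilities at which its net benefit exceeds that of both the treat-all and treat-none strategies.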