Misra, Wakhlu, Agarwal, Sharma, and Negi: Letter to the Editor: Appropriate Statistical Analysis and Research Reporting
We read with interest the article on presenting the results of statistical analysis by Farrokh Habibzadeh (1), the Past President of the World Association of Medical Editors and one of the leading eastern Mediterranean editors. We appreciate the efforts of the editors of the Journal of Korean Medical Science, who have initiated discussion on such an important issue with the involvement of highly skilled editors. We would like to share our own thoughts as authors, reviewers, and editors.
Farrokh Habibzadeh rightly mentions that P values are overemphasized by scientists when reporting studies. It is paramount to understand that the P value is simply an estimate of the probability of rejecting the null hypothesis when it is actually true. This means that a P value of less than 0.05 (the golden egg, as thought of by many scientists) indicates that, although the obtained results are statistically significant, the null hypothesis could still be falsely rejected up to 5% of the time (2). Understanding the essence of statistical significance helps us look beyond P values. Early-career scientists often discuss with their mentors the demerits of P values and the importance of confidence intervals (CIs), but often fail to look beyond that.
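This point is easy to demonstrate by simulation. The following minimal sketch (assuming Python with NumPy and SciPy; the parameters are illustrative, not drawn from any real study) repeatedly compares two groups drawn from the same distribution, so the null hypothesis is true by construction, and counts how often P < 0.05 occurs anyway:

```python
# Minimal sketch of the Type I error rate: when the null hypothesis is
# true, roughly 5% of comparisons still yield P < 0.05 by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 10_000
false_rejections = 0

for _ in range(n_experiments):
    # Both groups come from the SAME distribution: the null is true.
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_rejections += 1

# Expected to be close to 0.05, the nominal 5% Type I error rate.
print(f"False rejection rate: {false_rejections / n_experiments:.3f}")
```

The printed rate hovers around 0.05, which is exactly the "falsely rejected up to 5% of the time" described above.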
The CI is essentially a measure of precision. Simply speaking, it is an estimate of how well the study results would hold if the study were repeated in different populations. When looking at a CI, it is essential to consider its width: the narrower the CI, the greater the validity of the presented results in the context of the general population. A statistically significant result (P < 0.05) with a broad CI may not necessarily reflect the true value of a particular outcome in the population.
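The relationship between CI width and precision can be made concrete with a short sketch (again assuming Python with NumPy and SciPy; the sample values are simulated for illustration). It computes a 95% CI for a mean at increasing sample sizes, showing the interval narrowing as precision improves:

```python
# Minimal sketch: the 95% CI for a mean narrows as sample size grows,
# i.e., the estimate becomes more precise.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
for n in (10, 50, 500):
    sample = rng.normal(loc=100.0, scale=15.0, size=n)
    mean = sample.mean()
    sem = stats.sem(sample)  # standard error of the mean
    # 95% CI from the t-distribution with n - 1 degrees of freedom.
    low, high = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)
    print(f"n={n:4d}: mean={mean:6.2f}, "
          f"95% CI=({low:6.2f}, {high:6.2f}), width={high - low:5.2f}")
```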
The author of the index article mentions how inexperienced researchers often run a large number of comparisons and report those with significant P values, even when the objective of that particular study was never to examine these eventually “statistically significant” comparisons. This inappropriate practice is also referred to as “P hacking.” When one runs a large number of statistical comparisons, some are bound to be statistically significant by chance alone, although their clinical relevance will be uncertain.
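A small simulation makes the hazard of unplanned multiple comparisons explicit (a minimal sketch assuming Python with NumPy and SciPy; the 20 "outcomes" are pure noise, invented for illustration):

```python
# Minimal sketch of why unplanned multiple comparisons produce spurious
# "significant" findings: with 20 unrelated outcomes and no true effect,
# roughly one will hit P < 0.05 by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_outcomes = 20
group_a = rng.normal(size=(n_outcomes, 40))  # 20 outcomes, no real effect
group_b = rng.normal(size=(n_outcomes, 40))

p_values = [stats.ttest_ind(a, b).pvalue for a, b in zip(group_a, group_b)]
significant = [i for i, p in enumerate(p_values) if p < 0.05]
print(f"'Significant' outcomes by chance alone: {significant}")

# A Bonferroni correction (divide the threshold by the number of tests)
# is one simple guard against this inflation of false positives.
bonferroni = [i for i, p in enumerate(p_values) if p < 0.05 / n_outcomes]
print(f"After Bonferroni correction: {bonferroni}")
```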
Whenever a dataset is normally distributed, we should present the mean with standard deviation (SD); otherwise, the median (with range or interquartile range [IQR]) should be reported. There are some caveats to this rule. A relatively small dataset, although normally distributed, may still be susceptible to skewing by extreme results; hence, the median (with range or IQR) is more appropriate in this particular case. If in doubt whether to use parametric or non-parametric tests, it is better to use non-parametric tests (e.g., the Mann-Whitney U test or the Wilcoxon signed-rank test) and report accordingly. When reporting scores, such as disease activity scores in rheumatoid arthritis or spondyloarthropathy, the dataset may be normally distributed; however, one should still use non-parametric tests for analysis, since scores are essentially non-continuous variables.
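One possible workflow along these lines, as a minimal sketch (assuming Python with NumPy and SciPy; the skewed "score" data are simulated and purely illustrative): check normality with the Shapiro-Wilk test, report the median with IQR when normality is doubtful, and fall back to the non-parametric comparison.

```python
# Minimal sketch: normality check, descriptive reporting, and a
# non-parametric comparison when in doubt.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Skewed (log-normal) data, e.g., hypothetical disease activity scores.
group_a = rng.lognormal(mean=1.0, sigma=0.6, size=25)
group_b = rng.lognormal(mean=1.3, sigma=0.6, size=25)

for name, g in (("A", group_a), ("B", group_b)):
    _, p_norm = stats.shapiro(g)  # Shapiro-Wilk normality test
    verdict = "looks normal" if p_norm > 0.05 else "non-normal"
    print(f"Group {name}: Shapiro-Wilk P={p_norm:.3f} ({verdict})")
    # Report median with IQR for skewed data, per the rule of thumb above.
    q1, med, q3 = np.percentile(g, [25, 50, 75])
    print(f"  median={med:.2f}, IQR=({q1:.2f}, {q3:.2f})")

# When in doubt, prefer the non-parametric comparison.
_, p = stats.mannwhitneyu(group_a, group_b)
print(f"Mann-Whitney U: P={p:.3f}")
```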
Finally, inexperienced researchers desperate for publishable results may be tempted to report statistically non-significant results as statistically significant. Reviewers and editors are likely to be active researchers themselves and well versed in statistical analytical tools. When the results do not seem significant but are still accompanied by significant P values, evaluators should examine the datasets and judge the significance themselves to avoid publishing erroneous reports. We must never resort to altering results or their statistical significance in order to improve the marketability of a particular study. With rapidly developing post-publication review, such mistakes are likely to be identified, leading to corrections or even retractions.
Statistical analysis is a critical component of scientific papers; however, it must be employed and reported appropriately, particularly in the context of clinical relevance.

Notes

DISCLOSURE The authors have no potential conflicts of interest to disclose.

Author Contributions

  • Conceptualization: Misra DP, Wakhlu A, Agarwal V, Sharma A, Negi VS.

  • Data curation: Misra DP, Wakhlu A, Agarwal V, Sharma A, Negi VS.

  • Formal analysis: Misra DP, Wakhlu A, Agarwal V, Sharma A, Negi VS.

  • Investigation: Misra DP, Wakhlu A, Agarwal V, Sharma A, Negi VS.

  • Writing - original draft: Misra DP, Agarwal V.

  • Writing - review & editing: Wakhlu A, Sharma A, Negi VS.

References

1. Habibzadeh F. Statistical data editing in scientific articles. J Korean Med Sci. 2017; 32:1072–1076.
2. Greenhalgh T. How to read a paper. Statistics for the non-statistician. II: “Significant” relations and their pitfalls. BMJ. 1997; 315:422–425.
ORCID iDs

Durga Prasanna Misra
https://orcid.org/0000-0002-5035-7396

Anupam Wakhlu
https://orcid.org/0000-0003-4342-9547

Vikas Agarwal
https://orcid.org/0000-0002-4508-1233

Aman Sharma
https://orcid.org/0000-0003-0813-1243

Vir Singh Negi
https://orcid.org/0000-0003-1518-6031
