Assessing the Quality of Randomized Controlled Trials Published in the Journal of Korean Medical Science from 1986 to 2011

Jae Hoon Chung; Dong Hyuk Kang; Jung Ki Jo; Seung Wook Lee

doi:10.3346/jkms.2012.27.9.973

Abstract

Low-quality clinical trials are prone to errors in deriving their results and can therefore yield distorted conclusions. Quality assessment of clinical trials is necessary to prevent erroneous results from being applied clinically. The randomized controlled trial (RCT) is a study design for evaluating the effectiveness of medical procedures. This study extracted the RCTs from the original articles published in the Journal of Korean Medical Science (JKMS) from 1986 to 2011 and assessed their quality using three tools: the Jadad scale, the van Tulder scale, and the Cochrane Collaboration risk of bias tool. For comparison with the JKMS, the quality of RCTs published in Yonsei Medical Journal (YMJ) and the Korean Journal of Internal Medicine was also assessed. In the JKMS, YMJ, and Korean Journal of Internal Medicine, the number of RCTs increased over time, but no improvement in their quality was observed. These results indicate that researchers need to plan and perform higher quality studies.

INTRODUCTION

Randomized controlled trials (RCTs) have increasingly become acknowledged as important standards of evidence-based medicine. RCTs follow a study design that can reduce bias and produce the most valuable data among study methods, making them the most reliable assessments of the effectiveness of medical treatments (1). However, even the RCT study design cannot eliminate all bias, which can occur at the designing, conducting, reporting, or application phase and can lead to derivation of incorrect results (2). Peer review within journals is an indispensable prevention method against error and can assess the validity of studies so as to prevent incorrect information being used as the basis of clinical application (3). Moreover, objective assessment of articles on methodological quality is essential because it can heighten the quality of medical care (4). An article's methodological quality can represent its overall quality and should therefore be assessed at the design, conduct, and analysis levels (5, 6). In addition, throughout the complete assessment process, any unnecessary or erroneous data can be identified, thereby eliminating it from clinical relevance and saving medical expenses (7).

There are several methods to assess the methodological quality of clinical trials, including scales, individual markers, and checklists. Scales allow for easy inter-study comparison and are more advantageous than other methods when the quality of a clinical trial is assessed quantitatively. Randomization, double blinding, and reporting of drop outs are the three scale items that relate directly to reducing bias. The Jadad quality assessment scale (Jadad scale) is a representative quality assessment tool consisting of these three items (8). The Jadad scale has been widely used because its questions are simple and easy to apply, but it does not include an item for allocation concealment. Allocation concealment, which keeps the allocation sequence hidden from those enrolling participants so as to prevent selection bias in the allocation of patients to treatment, can be assessed with an individual marker method (9). The van Tulder scale and the Cochrane Collaboration risk of bias tool (CCRBT) both include an assessment item for allocation concealment. Recent studies have examined issues in the quality assessment of RCTs. For example, Kim et al. analyzed all RCTs published in five Korean medical journals (10). The Journal of Korean Medical Science (JKMS) is the flagship journal of the Korean Academy of Medical Sciences and is the only medical journal in Korea listed in the Science Citation Index. It was launched in 1986 and includes evidence-based, scientifically written articles aimed at introducing Korean medical sciences to the world and facilitating international exchange of medical information. However, there has been no quality assessment of RCTs published in the JKMS. Yonsei Medical Journal (YMJ) is a clinical journal listed in the Science Citation Index Expanded; it has been published since 1960 by the Yonsei University College of Medicine and covers all subjects related to medicine, based on either clinical or basic research. The Korean Journal of Internal Medicine is an international medical journal published by the Korean Association of Internal Medicine. To suggest the direction of future studies and to improve medical practice in Korea, this study assessed the quality of RCTs published in the JKMS, YMJ, and Korean Journal of Internal Medicine using the Jadad scale, van Tulder scale, and CCRBT.

MATERIALS AND METHODS

Study cohort

A total of 2,257 original articles published in the JKMS over 26 yr from 1986 (volume 1) to 2011 (volume 26) were manually searched. A total of 2,117 and 626 original articles published in YMJ and the Korean Journal of Internal Medicine, respectively, were also searched.

Selection of RCTs

Two reviewers independently identified all RCT reports published in the JKMS, YMJ, and Korean Journal of Internal Medicine using the PubMed MEDLINE database and KoreaMed. They applied search limits and searched for terms such as "random", "randomized", and "randomly" in the methods sections of these reports. A third reviewer made the final selection by reconciling the data collected by the two reviewers.

Assessment of the quality of RCTs

Quality assessment was conducted using the Jadad scale; the van Tulder scale and CCRBT were used as individual indices. All assessments were performed by two specialists in urology, and when their outcomes differed, they resolved the discrepancy through discussion. Starting from 1986, the quality analysis of RCTs was conducted in 5-yr units. Quality was also assessed according to the type of intervention, the presence of funding, and whether the study had been reviewed by an institutional review board (IRB).

Jadad scale

The Jadad scale is also known as the Oxford quality scoring system and assesses RCT-related literature. It is composed of five points in total; two in relation to randomization, two in relation to blinding, and one in relation to the drop out rate (8). When the report includes only general comments with no detailed description of randomization and blinding, one point in each category is given. One point is added when there is a detailed description of the appropriate method. However, when the description method is inappropriate, one point is deducted. When the specified number and reasons for drop outs by each subject group are provided, one point is given. Even if there are no drop outs, this should be described in the statement. When the total is ≥ 3 points, it is assessed as high quality but when it is ≤ 2 points, it is assessed as low quality. However, if it was not possible for the design of the study to be double blinded, it is assessed as high quality when the total score is ≥ 2 points.
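The scoring rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure the reviewers used: the function names, the argument names, and the string labels "appropriate" and "inappropriate" are assumptions introduced here.

```python
from typing import Optional


def jadad_item(mentioned: bool, method: Optional[str]) -> int:
    """Score one two-point item (randomization or blinding).

    method is "appropriate", "inappropriate", or None when the report
    gives no detailed description of the method (hypothetical labels).
    """
    if not mentioned:
        return 0
    score = 1  # general comment only
    if method == "appropriate":
        score += 1  # detailed description of an appropriate method
    elif method == "inappropriate":
        score -= 1  # described method is inappropriate
    return score


def jadad_score(randomized: bool, rand_method: Optional[str],
                blinded: bool, blind_method: Optional[str],
                dropouts_described: bool) -> int:
    """Total Jadad score, from 0 to 5 points."""
    return (jadad_item(randomized, rand_method)
            + jadad_item(blinded, blind_method)
            + int(dropouts_described))


def is_high_quality(score: int, double_blinding_feasible: bool = True) -> bool:
    """High quality at >= 3 points, or >= 2 when double blinding is not possible."""
    threshold = 3 if double_blinding_feasible else 2
    return score >= threshold


# A trial that mentions randomization without details and reports drop outs
example = jadad_score(True, None, False, None, True)  # 1 + 0 + 1 = 2 points
```

With these rules, the example trial scores 2 points and counts as high quality only if double blinding was not feasible for its design.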

van Tulder scale

The van Tulder scale is designed to make assessments on 11 components including randomization, allocation concealment, baseline characteristics, patient blinding, caregiver blinding, observer blinding, co-intervention, compliance, drop out rate, end-point assessment time point, and intention-to-treat analysis (11). Its assessment method is to select 'yes', 'no', or 'don't know' for each item, and when ≥ 5 items are satisfied (≥ 5 points), the quality of the report is deemed high.
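As a rough illustration of this counting rule, the sketch below scores a set of answers and applies the threshold of five points. The item keys and function names are hypothetical labels chosen for this example; treating a missing answer as 'don't know' is also an assumption.

```python
VAN_TULDER_ITEMS = (
    "randomization",
    "allocation_concealment",
    "baseline_characteristics",
    "patient_blinding",
    "caregiver_blinding",
    "observer_blinding",
    "co_intervention",
    "compliance",
    "drop_out_rate",
    "end_point_assessment_time_point",
    "intention_to_treat_analysis",
)


def van_tulder_score(answers: dict) -> int:
    """Count the items answered 'yes'; 'no' and "don't know" score nothing."""
    unknown = set(answers) - set(VAN_TULDER_ITEMS)
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    # Assumption: an item with no recorded answer is treated as "don't know".
    return sum(1 for item in VAN_TULDER_ITEMS
               if answers.get(item, "don't know") == "yes")


def is_high_quality_van_tulder(score: int) -> bool:
    """A report is deemed high quality when at least 5 items are satisfied."""
    return score >= 5


answers = {item: "don't know" for item in VAN_TULDER_ITEMS}
for item in ("randomization", "baseline_characteristics", "compliance",
             "drop_out_rate", "intention_to_treat_analysis"):
    answers[item] = "yes"
score = van_tulder_score(answers)  # five items satisfied
```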

CCRBT

The CCRBT assesses the quality of RCTs in six domains: sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, and other potential threats to validity. Each domain is rated 'yes', 'no', or 'unclear', indicating a low, high, or unclear risk of bias, respectively. An article is classified as having a low risk of bias when the first three questions are answered 'yes' and no important concerns are identified in the last three domains. It is classified as having a moderate risk of bias when ≤ 2 domains are rated 'unclear' or 'no', and as having a high risk of bias when ≥ 3 domains are rated 'unclear' or 'no'.
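The classification rule can be sketched as follows. The text does not say how "no important concerns" in the last three domains is recorded, so this sketch assumes it corresponds to a 'yes' rating in those domains; the domain keys and the function name are likewise hypothetical.

```python
CCRBT_DOMAINS = (
    "sequence_generation",
    "allocation_concealment",
    "blinding",
    "incomplete_outcome_data",
    "selective_outcome_reporting",
    "other_threats_to_validity",
)


def ccrbt_risk(ratings: dict) -> str:
    """Classify overall risk of bias from six 'yes'/'no'/'unclear' ratings."""
    for domain in CCRBT_DOMAINS:
        if ratings[domain] not in ("yes", "no", "unclear"):
            raise ValueError(f"invalid rating for {domain}: {ratings[domain]!r}")
    # Assumption: "no important concerns" is treated as a 'yes' rating,
    # so low risk means that no domain is rated 'no' or 'unclear'.
    not_yes = sum(ratings[domain] != "yes" for domain in CCRBT_DOMAINS)
    if not_yes == 0:
        return "low"
    if not_yes <= 2:
        return "moderate"
    return "high"


ratings = {domain: "yes" for domain in CCRBT_DOMAINS}
ratings["allocation_concealment"] = "unclear"
risk = ccrbt_risk(ratings)  # one domain not rated 'yes'
```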

Statistical analysis

One-way ANOVA and the Kruskal-Wallis test were used to compare the scores obtained with each assessment tool, while the chi-square test was used to compare the proportions of high quality articles and the quality assessment outcomes from the CCRBT. The quality of RCTs according to publication year was analyzed by one-way ANOVA. SPSS v.18.0 was used for all statistical analyses, and a P value of < 0.05 was considered statistically significant.
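To make the comparison of proportions concrete, the sketch below computes a Pearson chi-square statistic by hand for a small contingency table. It is not the SPSS procedure used in the study, and the function name is an assumption. The counts are the van Tulder high quality counts reported for the JKMS in two periods (one of three RCTs in 1991-1995 and five of six in 2011); the study's own test covered all periods, so this two-period example will not reproduce its P values.

```python
def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom.

    table is a list of rows of observed counts; every expected count
    must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: 1991-1995 and 2011; columns: high quality, not high quality
table = [[1, 2],
         [5, 1]]
statistic, df = chi_square_statistic(table)
```

With expected counts this small, the chi-square approximation is unreliable, so the statistic here only illustrates the arithmetic.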

RESULTS

Quantitative variation of RCTs over time

From 1986 to 2011, 44 RCT-based articles were published among 2,275 original articles in the JKMS. Among them, no articles were published from 1986-1990, three articles (6.82%) from 1991-1995, two articles (4.55%) from 1996-2000, seven articles (15.91%) from 2001-2005, 26 articles (59.09%) from 2006-2010, and six articles (2.96%) in 2011 (P = 0.025). The number of RCTs in YMJ also increased over time (P < 0.001). However, the increase in the number of RCTs in the Korean Journal of Internal Medicine was not statistically significant (P = 0.276; Table 1).

Qualitative variation of RCTs over time

Jadad quality assessment scale

The results of quality assessment are presented in 5-yr units starting in 1986. The mean Jadad scale score for RCTs published in the JKMS from 1991 to 1995 was 1.33 ± 0.58 and showed a slight increase to 3.00 ± 1.27 in 2011 (P = 0.120). In addition, there were no high quality articles published from 1991 to 1995, but this number increased to four (80.00%) of six RCTs in 2011 (P = 0.111). In all three journals, the quality of RCTs showed no statistically significant difference according to publication year in 1-yr units (Fig. 1).

van Tulder assessment scale

The mean van Tulder scale score of RCTs reported in the JKMS from 1991 to 1995 was 3.33 ± 2.31, while that of RCTs reported from 2006 to 2010 was 5.12 ± 2.22 and that of RCTs reported in 2011 was 6.83 ± 1.84 (P = 0.048). The number of high quality articles in the JKMS was one (33.33%) from 1991 to 1995 but increased to five (83.33%) of six RCT articles published in 2011 (P = 0.169). No statistically significant difference in the quality of RCTs according to publication year in 1-yr units was seen in the JKMS, YMJ, or Korean Journal of Internal Medicine (Fig. 1).

Cochrane collaboration risk of bias tool

There were no low risk of bias articles among RCTs published in the JKMS up to 2005 by CCRBT assessment, but this increased to three low risk of bias articles (11.54%) presented from 2006 to 2010. In addition, in 2011, there were two articles with a low risk of bias (33.33%; P = 0.106). The CCRBT assessment according to publication time of YMJ and Korean Journal of Internal Medicine showed no statistical difference (P = 0.797, P = 0.110, respectively; Table 2).

Analysis of RCT quality by medical subject

Among the RCT-based articles published in the JKMS during 26 yr (1986-2011), 16 articles were in internal medicine, seven in surgery, five in basic medicine, and 16 in other specialties including pediatrics, anesthesiology, and rehabilitation medicine. In the quality assessment by subject, no statistically significant differences were observed between groups. By CCRBT, two articles (40%) from basic medicine and three (18.75%) from internal medicine were identified as having a low risk of bias. In addition, the quality of RCTs in the three journals overall showed no statistically significant difference according to medical subject (Table 3).

Analysis of factors related to the quality of the articles

There was no statistically significant difference in quality between drug and non-drug studies in the JKMS. However, in YMJ the quality of RCTs differed significantly between drug and non-drug studies (Jadad scale, P = 0.001; van Tulder scale, P < 0.001). No significant differences were found between studies with and without funding in any of the journals. However, when the RCTs of the three journals were assessed together, the van Tulder scale showed a significant difference according to the presence of funding (P = 0.018). By the Jadad scale, the proportion of high quality articles was higher among studies reviewed by an IRB than among studies without such review (P = 0.041). In the assessment of all RCTs in the three journals, studies reviewed by an IRB scored higher than studies without IRB review (Table 4).

DISCUSSION

In this study, the quality of original RCT reports published in the JKMS, YMJ and Korean Journal of Internal Medicine was assessed. While the number of RCTs published has gradually increased, no statistically significant increase in the quality of the articles was observed. The small number of double blinded reports and the absence of methodological details described for concealment of allocation have hindered high quality assessment.

There have been three previous analyses of the quality of RCTs published in Korean journals. Kim et al. analyzed RCTs published in five national academic journals: the Korean Journal of Internal Medicine, the Journal of the Korean Surgical Society, the Korean Journal of Obstetrics and Gynecology, the Korean Journal of Pediatrics, and the Korean Journal of Family Medicine (10). According to that report, the number of RCTs with a Jadad score of ≥ 2 points increased in the 1990s compared to the 1980s. In addition, the number of high quality articles increased from two to seven during the same time period. In another study, Chung et al. analyzed RCTs published in the Korean Journal of Family Medicine from 1980 to 2005 (4). Among the 1,290 original articles published, there were 23 RCT articles (1.8%), and their mean Jadad scale score was 1.87. The authors reported that the proportion of RCTs increased from 1.09% of the original articles in the 1980s to 2.63% in the 2000s, with a corresponding 1.17-point increase in the Jadad scale score from 1.00 in the 1980s to 2.17 in the 2000s. These two studies indicate that the number and quality of RCT articles increased over time, which is considered a result of the growing influence of evidence-based medicine. Accordingly, the overall number of RCTs that provide a high level of evidence is increasing (12). In the present study, the number of RCTs increased as in these previous reports. However, the quality of the RCTs analyzed here did not change significantly over time. Specifically, the quality scores did rise numerically over time, but a significant difference was observed only in the van Tulder scale assessment of the JKMS. Moreover, when the quality of articles was compared in 1-yr units rather than 5-yr units, no statistically significant differences were observed.

Recently, Lee et al. (13) analyzed the quality of RCTs published in the Korean Journal of Urology using the Jadad scale. Their study showed that there were 28 RCT articles (0.89%) among 3,156 original articles published over the 20 yr from 1991. The mean Jadad scale score of those RCT articles was 1.75, and eight of the 28 articles were of high quality. The number of RCT articles, which totaled five before 2000, increased to 23 after 2000, and their quality improved over time. Moreover, Lee et al. (13) showed that only one article in the Korean Journal of Urology had adequate allocation concealment. In the present study, the majority of the RCTs published in the JKMS did not even mention concealment of allocation. An article first addressed concealment of allocation in 2006, but only eight articles (18.18%) in total provided details. Only one article in the Korean Journal of Internal Medicine addressed concealment of allocation. Schulz et al. (14) explained that without concealment of allocation, randomization tends to be compromised while the study is carried out, even if the randomization itself is well conducted. This omission could distort the effects of intervention by ≥ 40%. Hewitt et al. (15) reported that cases showing inappropriate or uncertain concealment of allocation constituted almost 46% of RCTs published in four major medical journals (British Medical Journal, the Journal of the American Medical Association, Lancet, and New England Journal of Medicine) in 2000. If medical researchers in Korea plan appropriately for double blinding and concealment of allocation in RCTs, the quality of such reports should improve.

There have been few studies conducted worldwide on the quality of RCTs. Uetani et al. (1) analyzed whether RCTs conducted in Japan and published in medical journals met the Consolidated Standards of Reporting Trials (CONSORT) statement, which provides guidance on reporting RCTs and comprises a checklist, flow diagram, explanations, and extensions (16). They showed that 98 RCT articles had been published in various academic journals from January to March 2004, and only 11 of them complied with the CONSORT statement. However, since the CONSORT statement is not a quality assessment tool, it was not possible to quantify the quality of the trials.

There are various quality assessment tools for RCTs, including Campell, Moher, Chalmers, Jadad, van Tulder, Newell's, and Cochrane. An interesting point of this study is that the three tools produced different quality assessment outcomes for the same RCTs. The assessment of the quality of trials remains controversial, and there is no consensus on a highly accurate and valid tool (17). In this study, efforts were made to overcome this limitation by using three different tools: the Jadad scale, van Tulder scale, and CCRBT. These are the representative assessment tools most commonly used in Korea and worldwide. In particular, the Jadad scale has the advantages of simple questions and easy application, but it does not include an item for the most important aspect of RCT assessment: concealment of allocation. Therefore, additional analyses were performed using the van Tulder scale and CCRBT to address this gap.

Furthermore, in this study, articles that were reviewed and approved by an institutional review board (IRB) were, on average, of higher quality. To the best of our knowledge, no previous studies have examined the association between IRB review and the quality of articles. IRB review serves to confirm the feasibility of the study design and conduct described in the study protocol. The recent system of validating study protocols in RCTs in order to obtain IRB approval has played an important role in raising the quality of articles, since IRB review is considered an international quality standard. Studies of drug products, in which a placebo makes double blinding easy, were expected to be of higher quality than non-drug studies. However, no significant intergroup differences were observed in the JKMS in this study. In the overall assessment of RCTs in the three journals, by contrast, quality differed significantly according to intervention type. Among studies that analyzed RCTs previously published by the Korean Journal of Urology and the Journal of Korean Academy of Family Medicine, those receiving financial support were able to establish well-organized study designs and perform orderly research, resulting in many high quality articles (4, 13). By contrast, Clifford et al. analyzed 100 RCT articles published in five peer-reviewed, high impact, general medical journals (Annals of Internal Medicine, the British Medical Journal, the Journal of the American Medical Association, Lancet, and New England Journal of Medicine) and found no association between the funding source and the quality of the article (18). The present study also did not reveal significant differences in the quality of articles depending on the presence or absence of funding, except in van Tulder scale scoring. This result cannot be considered conclusive, however, because the number of analyzed articles was small. Therefore, reassessment through analyses of more RCTs is required.

The limitations of this study include the possibility that the subjective judgment of the researchers influenced the extraction of RCTs and their quality assessment. To limit this, two medical doctors independently extracted the RCTs, the assessments were performed independently by two reviewers, and the outcomes were then reconciled to secure objectivity and reliability. This study is the first to assess the RCTs published in the JKMS, YMJ, and Korean Journal of Internal Medicine. Its suggestions for improving the quality of medical research in the Republic of Korea are a significant contribution.

While the number of RCTs published in the JKMS, YMJ, and Korean Journal of Internal Medicine has gradually increased over time, the quality of these reports has remained unchanged. Therefore, medical researchers in Korea should devote more effort to performing high quality studies, ensuring appropriate randomization, IRB review, financial support, and allocation concealment during the conduct of the study.