Journal List > J Korean Med Sci > v.31(5) > 1023272

Chang, Kim, Shin, Jang, Yeon, and Lee: Methodological Quality Appraisal of 27 Korean Guidelines Using a Scoring Guide Based on the AGREE II Instrument and a Web-based Evaluation

Abstract

This study evaluated the methodological quality of CPGs using the Korean AGREE II scoring guide and a web-based appraisal system and was conducted by qualified appraisers. A total of 27 Korean CPGs were assessed under 6 domains and 23 items on the AGREE II instrument using the Korean scoring guide. The domain scores of the 27 guidelines were as follows: the mean domain score was 82.7% (median 84.7%, ranging from 55.6% to 97.2%) for domain 1 (scope and purpose); 53.4% (median 56.9%, ranging from 11.1% to 95.8%) for domain 2 (stakeholder involvement); 63.0% (median 71.4%, ranging from 13.5% to 90.6%) for domain 3 (rigor of development); 88.9% (median 91.7%, ranging from 58.3% to 100.0%) for domain 4 (clarity of presentation); 30.1% (median 27.1%, ranging from 3.1% to 67.7%) for domain 5 (applicability); and 50.2% (median 58.3%, ranging from 0.0% to 93.8%) for domain 6 (editorial independence). Three domains (scope and purpose, rigor of development, and clarity of presentation) had mean scaled domain scores of more than 60%. Three domains (stakeholder involvement, applicability, and editorial independence) had mean scaled domain scores of less than 60%. Finally, of the 27 guidelines, 18 (66.7%) were rated at more than 60% of the scaled domain score for rigor of development and were categorized as high-quality guidelines.

INTRODUCTION

Many Korean academic societies and clinical research centers have made efforts to develop and disseminate high-quality clinical practice guidelines (CPGs). Guideline developers are mainly interested physicians, an increasing number of whom participate in workshops to learn guideline development methodologies. However, a quality assessment of 66 Korean CPGs developed from 2004 to 2009, performed with the AGREE Instrument (1, 2, 3, 4), reported poor results (5), and the unsatisfactory quality of Korean guidelines has also been reported elsewhere (6).
In this study, 27 Korean CPGs, all of which were developed between 2013 and 2014 with external funding (National Strategic Coordinating Center for Clinical Research, NSCR) and a required evidence-based development methodology, were assessed, and the domain scores were compared to differentiate between high- and poor-quality guidelines. To make correct judgements and increase reliability, guideline appraisers learned how to apply a Korean AGREE II scoring guide (7, 8) and how to use practical implementation tools, such as a worksheet, to identify explicit elements in detail and to easily find the elements on which appraisers disagreed. A web-based appraisal system, developed in 2012, was applied to maximize reliability.

MATERIALS AND METHODS

Guidelines for appraisal

A total of 27 Korean CPGs were assessed under 6 domains and 23 items on the Korean AGREE II instrument using the scoring guide. Of the 27 guidelines, 20 (74.1%) were developed by 7 clinical research centers for rheumatoid arthritis, stroke, depression, end-stage renal disease, ischemic heart disease, chronic obstructive pulmonary disease, and type 2 diabetes over a 2-year period (2013-2014). Of the remaining guidelines, 6 (22.2%) were developed by 4 university hospitals, including Korea, Yonsei, Cha, and Chonnam National University Hwasun. One guideline was developed by a Hospital Nurses Association (Appendix 1).

Guideline appraisal process

The appraisal process had two steps. First, to increase the reliability of the assessment, each guideline was assessed by 4 appraisers who were qualified guideline evaluation members of the Executive Committee for CPGs, Korean Academy of Medical Sciences (KAMS). Second, 1 senior appraiser made summary statements on the 23 items after adjusting for disagreement (Fig. 1). Disagreement was defined as a difference of more than 4 points between appraisers' judgement scores on the same item on the 7-point rating scale. A score of 1 (strongly disagree) should be given when there is no information or if the concept is very poorly reported, and a score of 7 (strongly agree) should be given when the full criteria for ‘how to rate’ and/or ‘further considerations’ articulated in the user’s manual are met. For the level of judgement between 2 and 6, a score is assigned depending on the completeness and quality of the reporting. Scores increase as more criteria are met and considerations are addressed.
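The disagreement rule can be sketched in a few lines of Python. This is only an illustration, not the logic of the actual web-based system: the function name, the data layout, and the reading of the rule as "highest score minus lowest score is greater than 4" are assumptions, and the scores are made up.

```python
from typing import Dict, List

# Assumed reading of the rule: an item is in disagreement when the gap between
# the highest and lowest appraiser score exceeds 4 points on the 7-point scale.
DISAGREEMENT_GAP = 4


def items_in_disagreement(scores_by_item: Dict[int, List[int]]) -> List[int]:
    """Return the item numbers whose appraiser scores differ by more than the gap."""
    flagged = []
    for item, scores in sorted(scores_by_item.items()):
        if any(not 1 <= s <= 7 for s in scores):
            raise ValueError(f"item {item}: scores must be between 1 and 7")
        if max(scores) - min(scores) > DISAGREEMENT_GAP:
            flagged.append(item)
    return flagged


# Hypothetical scores from 4 appraisers on 3 items (not data from the study).
example = {
    1: [6, 7, 6, 5],  # gap 2: agreement
    2: [1, 6, 7, 2],  # gap 6: disagreement
    3: [2, 6, 4, 3],  # gap 4: not "more than 4", so not flagged
}
print(items_in_disagreement(example))
```

Under this reading, only item 2 would be passed to the senior appraiser for adjustment.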
Fig. 1
A framework for the systematic web-based quality appraisal of CPGs in Korea.
A total of 37 qualified appraisers participated in the first-step evaluation of 27 CPGs over 3 months (September to November 2014) and an average of 2.9 (range 1 to 5) guidelines were assessed by each appraiser. During the second-step evaluation, 11 senior appraisers assessed an average of 2.5 (range 1 to 4) guidelines each over another 2-month period (December 2014 to January 2015).
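The reported workloads follow from the counts given above. The short check below assumes that every guideline received exactly 4 first-step assessments and exactly 1 second-step summary, as the appraisal process describes; the variable names are illustrative.

```python
guidelines = 27
appraisers_per_guideline = 4   # first step: 4 appraisers per guideline
first_step_appraisers = 37
senior_appraisers = 11         # second step: 1 senior appraiser per guideline

# Total first-step assessments and the average load per appraiser.
first_step_assessments = guidelines * appraisers_per_guideline
avg_first_step = first_step_assessments / first_step_appraisers

# In the second step, each guideline is summarized once.
avg_second_step = guidelines / senior_appraisers

print(first_step_assessments, round(avg_first_step, 1), round(avg_second_step, 1))
```

The 108 first-step assessments spread over 37 appraisers give the reported average of 2.9, and 27 guidelines over 11 senior appraisers give 2.5.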

AGREE II scoring guide

To reduce differences among evaluators on the assessment of the quality of the CPGs, a scoring guide, consisting of ninety-two criteria for anchor points 1, 3, 5, and 7 relating to 23 items on the AGREE II instrument, was developed by the Executive Committee for CPGs, Korean Academy of Medical Sciences, Korea. It was based on the ‘user’s manual’ and ‘how to rate’ description in the AGREE II instrument and was rated through a Delphi consensus process.

Providing an education program for appraisers

An education program was provided to 58 physician participants to assess the quality of CPGs developed over a 2-year period (2013 to 2014). The courses covered how to rate guidelines and how to apply the AGREE II scoring guide developed in Korea.

Web-based appraisal system

A web-based appraisal system from the Korean Medical Guideline Information Center (http://www.guideline.or.kr) was applied. During development, the system was pre-tested by evaluating 3 CPGs with it. After the pre-test, revisions were made based on the comments from the internal review panels.

Analysis

Domain scores were calculated by summing all of the scores of individual items in a domain and by scaling the total as a percentage of the maximum possible score for that domain. Guidelines that were rated as at least 60% of the scaled domain score for rigor of development were categorized as high-quality guidelines.
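As a minimal sketch of this calculation, the function below follows the scaling formula in the AGREE II user's manual, (obtained score - minimum possible score) / (maximum possible score - minimum possible score) x 100. That the study used exactly this formula is an assumption, although the 0.0% values in Table 1 are consistent with it. The function names and the example ratings are hypothetical.

```python
from typing import Sequence

MIN_RATING, MAX_RATING = 1, 7  # AGREE II 7-point scale


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Scaled domain score (%) for one domain.

    ratings[a][i] is the score given by appraiser a to item i of the domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every appraiser must rate every item in the domain")
    obtained = sum(sum(row) for row in ratings)
    min_possible = MIN_RATING * n_items * n_appraisers
    max_possible = MAX_RATING * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


def is_high_quality(rigor_score: float, threshold: float = 60.0) -> bool:
    """High quality = rigor-of-development score of at least the threshold."""
    return rigor_score >= threshold


# Hypothetical ratings: 4 appraisers x 3 items (a 3-item domain such as scope and purpose).
ratings = [
    [6, 6, 6],
    [6, 6, 5],
    [6, 6, 6],
    [6, 6, 5],
]
print(round(scaled_domain_score(ratings), 1))
```

With 4 appraisers and 3 items, the minimum possible total is 12 and the maximum is 84, so a total of 70 scales to (70 - 12) / 72, or about 80.6%.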

RESULTS

The domain scores of the 27 guidelines were as follows (Table 1): the mean domain score was 82.7% (median 84.7%, ranging from 55.6% to 97.2%) for domain 1 (scope and purpose); 53.4% (median 56.9%, ranging from 11.1% to 95.8%) for domain 2 (stakeholder involvement); 63.0% (median 71.4%, ranging from 13.5% to 90.6%) for domain 3 (rigor of development); 88.9% (median 91.7%, ranging from 58.3% to 100.0%) for domain 4 (clarity of presentation); 30.1% (median 27.1%, ranging from 3.1% to 67.7%) for domain 5 (applicability); and 50.2% (median 58.3%, ranging from 0.0% to 93.8%) for domain 6 (editorial independence).
Table 1

Scaled domain percentages of 27 Korean CPGs

Mean scaled domain percentages of 4 appraisers (%)

| CPG No. | Scope and purpose | Stakeholder involvement | Rigor of development | Clarity of presentation | Applicability | Editorial independence |
|---|---|---|---|---|---|---|
| 1 | 80.6 | 58.3 | 37.0 | 72.2 | 44.8 | 70.8 |
| 2 | 88.9 | 59.7 | 76.0 | 91.7 | 27.1 | 58.3 |
| 3 | 80.6 | 62.5 | 77.1 | 84.7 | 36.5 | 58.3 |
| 4 | 93.1 | 48.6 | 72.4 | 91.7 | 21.9 | 62.5 |
| 5 | 93.1 | 58.3 | 90.6 | 93.1 | 47.9 | 93.8 |
| 6 | 83.3 | 76.4 | 76.0 | 98.6 | 24.0 | 66.7 |
| 7 | 81.9 | 55.6 | 71.4 | 88.9 | 26.0 | 79.2 |
| 8 | 69.4 | 66.7 | 76.0 | 91.7 | 12.5 | 64.6 |
| 9 | 63.9 | 44.4 | 38.0 | 88.9 | 20.8 | 47.9 |
| 10 | 86.1 | 62.5 | 45.3 | 80.6 | 22.9 | 68.8 |
| 11 | 69.4 | 55.6 | 71.4 | 84.7 | 14.6 | 77.1 |
| 12 | 94.4 | 56.9 | 69.8 | 75.0 | 39.6 | 58.3 |
| 13 | 95.8 | 70.8 | 67.2 | 98.6 | 27.1 | 22.9 |
| 14 | 91.7 | 51.4 | 72.9 | 90.3 | 22.9 | 66.7 |
| 15 | 97.2 | 51.4 | 82.3 | 97.2 | 28.1 | 93.8 |
| 16 | 84.7 | 58.3 | 62.5 | 86.1 | 18.8 | 35.4 |
| 17 | 69.4 | 15.3 | 15.6 | 83.3 | 3.1 | 0.0 |
| 18 | 61.1 | 11.1 | 13.5 | 58.3 | 4.2 | 0.0 |
| 19 | 84.7 | 52.8 | 41.1 | 90.3 | 55.2 | 20.8 |
| 20 | 91.7 | 56.9 | 49.5 | 91.7 | 54.2 | 18.8 |
| 21 | 97.2 | 65.3 | 79.7 | 95.8 | 33.3 | 33.3 |
| 22 | 94.4 | 63.9 | 90.1 | 98.6 | 40.6 | 50.0 |
| 23 | 66.7 | 48.6 | 61.5 | 94.4 | 47.9 | 20.8 |
| 24 | 97.2 | 55.6 | 90.6 | 100.0 | 67.7 | 83.3 |
| 25 | 94.4 | 95.8 | 90.6 | 95.8 | 27.1 | 83.3 |
| 26 | 55.6 | 16.7 | 29.7 | 91.7 | 16.7 | 6.3 |
| 27 | 66.7 | 22.2 | 54.2 | 86.1 | 26.0 | 12.5 |
CPGs, clinical practice guidelines.
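The summary statistics reported for each domain can be reproduced from the columns of Table 1. The sketch below does this for the rigor of development column using Python's standard `statistics` module; the variable names are illustrative, and the use of the sample standard deviation is an assumption.

```python
import statistics

# Rigor of development column of Table 1, CPG No. 1 to 27 (%).
rigor = [
    37.0, 76.0, 77.1, 72.4, 90.6, 76.0, 71.4, 76.0, 38.0,
    45.3, 71.4, 69.8, 67.2, 72.9, 82.3, 62.5, 15.6, 13.5,
    41.1, 49.5, 79.7, 90.1, 61.5, 90.6, 90.6, 29.7, 54.2,
]

mean = statistics.mean(rigor)
median = statistics.median(rigor)
sd = statistics.stdev(rigor)  # sample standard deviation (n - 1 denominator)

print(f"mean {mean:.1f}, median {median:.1f}, SD {sd:.2f}, "
      f"range {min(rigor)} to {max(rigor)}")
```

The mean, median, and range match the values reported above for domain 3 (63.0%, 71.4%, and 13.5% to 90.6%).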
Three domains (scope and purpose, rigor of development, and clarity of presentation) had mean scaled domain scores of more than 60%. Three domains (stakeholder involvement, applicability, and editorial independence) had mean scaled domain scores of less than 60%. Finally, of the 27 guidelines, 18 (66.7%) were rated at more than 60% of the scaled domain score for rigor of development and were categorized as high-quality guidelines (Table 2).
Table 2

Quality appraisal of 27 Korean CPGs using AGREE II

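| AGREE II domain | ≥ 60%, No. | ≥ 60%, % | < 60%, No. | < 60%, % | Total, No. | Total, % |
|---|---|---|---|---|---|---|
| Scope and purpose | 26 | 96.3 | 1 | 3.7 | 27 | 100 |
| Stakeholder involvement | 8 | 29.6 | 19 | 70.4 | 27 | 100 |
| Rigor of development | 18 | 66.7 | 9 | 33.3 | 27 | 100 |
| Clarity of presentation | 26 | 96.3 | 1 | 3.7 | 27 | 100 |
| Applicability | 1 | 3.7 | 26 | 96.3 | 27 | 100 |
| Editorial independence | 12 | 44.4 | 15 | 55.6 | 27 | 100 |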
CPGs, clinical practice guidelines; AGREE II, Appraisal of Guidelines for Research & Evaluation II.
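The counts in Table 2 can be derived from the per-guideline scores in Table 1 by applying the 60% cut-off. The sketch below does this for two domains; the helper name is hypothetical. No score in these two columns equals exactly 60.0%, so "at least 60%" and "more than 60%" give the same counts here.

```python
from typing import List, Tuple

THRESHOLD = 60.0  # scaled domain score cut-off (%)


def count_at_or_above(scores: List[float], threshold: float = THRESHOLD) -> Tuple[int, float]:
    """Return (number of guidelines at or above the threshold, percentage of all guidelines)."""
    n = sum(1 for s in scores if s >= threshold)
    return n, round(100.0 * n / len(scores), 1)


# Columns copied from Table 1, CPG No. 1 to 27 (%).
rigor_of_development = [
    37.0, 76.0, 77.1, 72.4, 90.6, 76.0, 71.4, 76.0, 38.0,
    45.3, 71.4, 69.8, 67.2, 72.9, 82.3, 62.5, 15.6, 13.5,
    41.1, 49.5, 79.7, 90.1, 61.5, 90.6, 90.6, 29.7, 54.2,
]
applicability = [
    44.8, 27.1, 36.5, 21.9, 47.9, 24.0, 26.0, 12.5, 20.8,
    22.9, 14.6, 39.6, 27.1, 22.9, 28.1, 18.8, 3.1, 4.2,
    55.2, 54.2, 33.3, 40.6, 47.9, 67.7, 27.1, 16.7, 26.0,
]

print("Rigor of development:", count_at_or_above(rigor_of_development))
print("Applicability:", count_at_or_above(applicability))
```

This reproduces the 18 of 27 (66.7%) high-quality guidelines for rigor of development and the single guideline (3.7%) reaching 60% for applicability.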

DISCUSSION

An appraisal of 27 Korean CPGs was performed using the AGREE II. Overall, the guidelines scored highest on clarity of presentation, with a mean score of 88.9% (median 91.7%), and lowest on applicability, with a mean score of 30.1% (median 27.1%). The post-GRC mean scores of CPGs developed by WHO were higher than those of Korean CPGs for the domains of stakeholder involvement, rigor of development, applicability, and editorial independence (Table 3) (9).
Table 3

Comparison of the mean scaled domain percentages between WHO and Korean CPGs

| AGREE II domain | WHO guidelines*, pre GRC (n = 10), mean (%) | WHO guidelines*, post GRC (n = 10), mean (%) | 27 Korean CPGs, mean ± SD (median) (%) | 27 Korean CPGs, range (%) |
|---|---|---|---|---|
| Scope and purpose | 62.2 | 80.4 | 82.7 ± 12.79 (84.7) | 55.6 to 97.2 |
| Stakeholder involvement | 49.8 | 61.2 | 53.4 ± 18.68 (56.9) | 11.1 to 95.8 |
| Rigor of development | 30.7 | 68.3 | 63.0 ± 22.30 (71.4) | 13.5 to 90.6 |
| Clarity of presentation | 60.9 | 78.2 | 88.9 ± 9.24 (91.7) | 58.3 to 100.0 |
| Applicability | 49.1 | 61.6 | 30.1 ± 15.62 (27.1) | 3.1 to 67.7 |
| Editorial independence | 20.9 | 73.6 | 50.2 ± 28.95 (58.3) | 0.0 to 93.8 |
CPGs, clinical practice guidelines; AGREE II, Appraisal of Guidelines for Research & Evaluation II.
*Source: Sinclair D et al. (9), GRC, guideline review committee.
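The domain-by-domain comparison in Table 3 can be checked directly. The sketch below lists the domains in which the WHO mean exceeds the Korean mean, separately for the pre-GRC and post-GRC columns; the dictionary and function names are illustrative.

```python
# Mean scaled domain percentages copied from Table 3.
who_pre_grc = {
    "Scope and purpose": 62.2,
    "Stakeholder involvement": 49.8,
    "Rigor of development": 30.7,
    "Clarity of presentation": 60.9,
    "Applicability": 49.1,
    "Editorial independence": 20.9,
}
who_post_grc = {
    "Scope and purpose": 80.4,
    "Stakeholder involvement": 61.2,
    "Rigor of development": 68.3,
    "Clarity of presentation": 78.2,
    "Applicability": 61.6,
    "Editorial independence": 73.6,
}
korean_mean = {
    "Scope and purpose": 82.7,
    "Stakeholder involvement": 53.4,
    "Rigor of development": 63.0,
    "Clarity of presentation": 88.9,
    "Applicability": 30.1,
    "Editorial independence": 50.2,
}


def domains_where_who_higher(who: dict, korean: dict) -> list:
    """Domains in which the WHO mean is strictly higher than the Korean mean."""
    return [domain for domain in who if who[domain] > korean[domain]]


print("Post GRC:", domains_where_who_higher(who_post_grc, korean_mean))
print("Pre GRC:", domains_where_who_higher(who_pre_grc, korean_mean))
```

The post-GRC WHO means are higher in four domains (stakeholder involvement, rigor of development, applicability, and editorial independence), whereas the pre-GRC means are higher only for applicability.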
Three domains (stakeholder involvement, applicability, and editorial independence) had mean scaled domain scores of less than 60%. The mean score for stakeholder involvement in our study was 53.4% (median 56.9%). Possible reasons for this mean score of less than 60% include the absence of qualified data regarding the views and preferences of patients and/or the public in Korea; the absence of guidelines that consider patients' values and preferences in Korea; appraisers who did not accurately understand the definition of the views and preferences of the target population (patients, public, etc.); and appraisers who did not identify the proper elements. Tudor et al. (10) reported the lowest score in stakeholder involvement (19%).
Our findings showed that the mean score on the applicability domain was low. Korea has a relatively short history in the development and use of CPGs (7). Gagliardi and Brouwers (11) reported that applicability scored lower than all other domains, and the mean and median domain scores for applicability across 137 guidelines published in 2008 or later were 43.6% and 42.0%, respectively. Some reports have given median applicability scores as low as 10.5% (12) and 20.8% (13). Burgers et al. (14) reported that differences in the applicability of the guidelines could not be explained by the variables studied, including care level, scope, type of guideline, year of publication, type of agency, and whether the guideline was produced within a structured and coordinated program.
To increase the applicability score, guideline developers should meet and address the following criteria: describe and consider facilitators of and barriers to the guideline's application; provide advice and/or tools on how the recommendations can be put into practice; consider the potential resource implications of applying the recommendations; and present monitoring and/or auditing criteria.
The mean score for rigor of development on the 27 Korean CPGs was 63.0% (median 71.4%), and 66.7% of the evaluated guidelines were rated at more than 60% of the scaled domain score for rigor of development and were categorized as high-quality guidelines. Korean guideline developers have learned the elements of high-quality guidelines through workshops. In contrast, Al-Ansary et al. (15) reported low scores for rigor of development (ranging from 8.3% to 30% for 9 CPGs).
In general, tailored education programs on the stages of guideline development and practical training on software packages such as RevMan 5 and GRADEpro will be needed for guideline developers in Korea.
The web-based appraisal system is an easy-to-use e-tool that presents the details of the assessment criteria alongside aids such as the Korean AGREE II scoring guide, and qualified appraisers who have received appropriate scoring education can use it to distinguish good-quality from poor-quality CPGs accurately. Furthermore, standardizing and clarifying the evaluation process using the e-tool and the Korean scoring guide will be beneficial to guideline appraisers.

Appendix

Appendix 1

27 Korean CPGs that were developed with external funding (NSCR)


Notes

Funding This study was supported by the National Strategic Coordinating Center for Clinical Research (NSCR), Republic of Korea.

DISCLOSURE The authors have no potential conflicts of interest to disclose.

AUTHOR CONTRIBUTION Design of the study: Chang SG, Kim DI, Shin ES, Lee YS. Data collection and analysis: Shin ES, Jang JE, Yeon JY. Writing manuscript: Shin ES, Chang SG, Kim DI. Revision: Shin ES, Chang SG, Kim DI, Lee YS. Approval of final version of this manuscript: all authors.

References

1. Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Hanna SE, Makarski J. Development of the AGREE II, part 1: performance, usefulness and areas for improvement. CMAJ. 2010; 182:1045–1052.
2. Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Hanna SE, Makarski J. Development of the AGREE II, part 2: assessment of validity of items and tools to support application. CMAJ. 2010; 182:E472–8.
3. Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, et al. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010; 182:E839–42.
4. Makarski J, Brouwers MC; AGREE Enterprise. The AGREE Enterprise: a decade of advancing clinical practice guidelines. Implement Sci. 2014; 9:103.
5. Jo MW, Lee JY, Kim NS, Kim SY, Sheen S, Kim SH, Lee SI. Assessment of the quality of clinical practice guidelines in Korea using the AGREE Instrument. J Korean Med Sci. 2013; 28:357–365.
6. Ahn HS, Kim HJ. Development and implementation of clinical practice guidelines: current status in Korea. J Korean Med Sci. 2012; 27:Suppl. S55–60.
7. Lee YK, Shin ES, Shim JY, Min KJ, Kim JM, Lee SH; Executive Committee for CPGs. Korean Academy of Medical Sciences. Developing a scoring guide for the Appraisal of Guidelines for Research and Evaluation II instrument in Korea: a modified Delphi consensus process. J Korean Med Sci. 2013; 28:190–194.
8. Oh MK, Jo H, Lee YK. Improving the reliability of clinical practice guideline appraisals: effects of the Korean AGREE II scoring guide. J Korean Med Sci. 2014; 29:771–775.
9. Sinclair D, Isba R, Kredo T, Zani B, Smith H, Garner P. World Health Organization guideline development: an evaluation. PLoS One. 2013; 8:e63715.
10. Tudor KI, Kozina PN, Marušić A. Methodological rigour and transparency of clinical practice guidelines developed by neurology professional societies in Croatia. PLoS One. 2013; 8:e69877.
11. Gagliardi AR, Brouwers MC. Do guidelines offer implementation advice to target users? A systematic review of guideline applicability. BMJ Open. 2015; 5:e007047.
12. Jokhan S, Whitworth MK, Jones F, Saunder A, Heazell AE. Evaluation of the quality of guidelines for the management of reduced fetal movements in UK maternity units. BMC Pregnancy Childbirth. 2015; 15:54.
13. Sabharwal S, Patel V, Nijjer SS, Kirresh A, Darzi A, Chambers JC, Malik I, Kooner JS, Athanasiou T. Guidelines in cardiac clinical practice: evaluation of their methodological quality using the AGREE II instrument. J R Soc Med. 2013; 106:315–322.
14. Burgers JS, Cluzeau FA, Hanna SE, Hunt C, Grol R. Characteristics of high-quality guidelines: evaluation of 86 clinical guidelines developed in ten European countries and Canada. Int J Technol Assess Health Care. 2003; 19:148–157.
15. Al-Ansary LA, Tricco AC, Adi Y, Bawazeer G, Perrier L, Al-Ghonaim M, AlYousefi N, Tashkandi M, Straus SE. A systematic review of recent clinical practice guidelines on the diagnosis, assessment and management of hypertension. PLoS One. 2013; 8:e53744.
ORCID iDs

Sung-Goo Chang
https://orcid.org/0000-0003-0733-7266

Dong-Ik Kim
https://orcid.org/0000-0001-7527-3829

Ein-Soon Shin
https://orcid.org/0000-0002-5086-6086

Ji-Eun Jang
https://orcid.org/0000-0002-0691-0289

Ji-Yun Yeon
https://orcid.org/0000-0002-6029-910X

Yoon-Seong Lee
https://orcid.org/0000-0002-1899-3632
