Original Article
Adding a “GRADE” to the quality appraisal of rheumatoid arthritis guidelines identifies limitations beyond AGREE-II

https://doi.org/10.1016/j.jclinepi.2014.07.005

Abstract

Objectives

To assess how well treatment recommendations for rheumatoid arthritis (RA) address Grading of Recommendations Assessment, Development and Evaluation (GRADE) steps and determine whether these steps can be adequately appraised using Appraisal of Guidelines Research & Evaluation II (AGREE-II).

Study Design and Setting

We systematically reviewed English-language treatment recommendations for the pharmacologic management of RA since 2000, assessed how well GRADE steps were addressed, rated AGREE-II quality, and compared the findings.

Results

GRADE steps were poorly addressed by the 44 included guidelines. Few guidelines discussed study limitations and/or risk of bias (23%), inconsistency (50%), indirectness (39%), imprecision (23%), or potential for publication bias (0%). Observational evidence was cited in 96% but rarely evaluated systematically. Only one guideline considered evidence on patients' preferences for health outcomes, and few provided an explicit justification for the strength of evidence or recommendation. The five GRADE steps that overlapped with AGREE-II questions were addressed more frequently (by 54–100% of guidelines) than the 13 GRADE steps not directly assessed by AGREE-II (0–50%). Among the nine guidelines rated as “Recommended for use” by AGREE-II, 8 of 13 GRADE steps were not addressed consistently by any guideline.

Conclusion

GRADE's steps are poorly addressed by RA recommendations. AGREE-II provides a broad assessment of quality but lacks sufficient granularity to assess how well a guideline addresses GRADE's steps.

Introduction

What is new?

Key findings

  1. There was a broad-based failure of treatment recommendations for rheumatoid arthritis (RA) to address the steps that GRADE proposes to appraise evidence and develop high-quality transparent treatment recommendations.

  2. AGREE-II quality ratings were also low, with only nine guidelines rated as “Recommended for use.” Among these nine guidelines, GRADE steps were still poorly addressed.

What this adds to what was known?

  1. Gaps in guideline quality have been demonstrated across many fields using AGREE-II. This study demonstrates that the steps recommended by GRADE are poorly addressed by RA guidelines and that although AGREE-II provides a broad assessment of quality, it lacks sufficient granularity to assess how well a guideline addresses GRADE's steps.

What is the implication and what should change now?

  1. Neither AGREE-II nor GRADE provides a complete picture of guideline quality and both should be considered when developing or appraising the quality of guidelines.

  2. Further collaboration between AGREE and GRADE should be encouraged to develop harmonized standards for guideline development and quality appraisal.

At their best, clinical practice guidelines (CPG) provide an efficient mechanism for translating evidence and evidence-based consensus opinion to end users in an easily accessible format [1]. Unfortunately, the quality of many guidelines is poor [2]. Guideline development has been hampered by a myriad of methodological approaches, inconsistent reporting, and heterogeneous evidence-grading systems. In response, several groups have focused on obtaining international consensus on key aspects of guideline development and reporting. One such group, Grading of Recommendations Assessment, Development and Evaluation (GRADE), was formed in the year 2000. The group focused its efforts on developing a broadly applicable systematic method for appraising and grading the strength of evidence and formulating and grading recommendations. The product of these efforts, the GRADE framework, was introduced in 2003–2004 [3], [4]. GRADE is more than just a grading system [5]. It outlines specific steps that should be taken before appraising the evidence, when appraising and summarizing the evidence, and when translating evidence summaries into treatment recommendations. GRADE has been adopted by the World Health Organization (WHO) and other guideline agencies and is becoming an international standard for the development of treatment recommendations.

Appraising evidence and developing treatment recommendations, however, are just part of formulating a clinical practice guideline. The Appraisal of Guidelines Research and Evaluation (AGREE) collaboration, also through international consensus, has developed a tool to allow users to assess the quality of the entire process of guideline development. The AGREE instrument was recently updated to AGREE-II [6]. In total, AGREE-II includes 23 questions, grouped into six quality domains: Scope and Purpose; Stakeholder Involvement; Rigor of Development; Clarity of Presentation; Applicability; and Editorial Independence. Guidelines are also classified into an overall quality category and rated on a 7-point scale. AGREE-II is also intended to provide a methodological strategy for guideline development and to inform the reporting of information within guidelines [6]. It is designed to be applicable to multiple systems for developing guidelines and therefore, in contrast to GRADE, does not propose a specific series of steps that should be taken when developing treatment recommendations. AGREE-II has been widely adopted and is the standard tool to appraise guideline quality.
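The article does not describe how AGREE-II domain scores are calculated. As a minimal sketch, assuming the scaled domain score rule from the AGREE-II user's manual (obtained score minus minimum possible, divided by maximum possible minus minimum possible), the calculation might look like the following. The per-domain item counts and the function name `scaled_domain_score` are assumptions taken from the instrument or introduced for illustration, not from this article, and the ratings in the usage note are hypothetical.

```python
from typing import Sequence

# Items per AGREE-II domain. These counts come from the AGREE-II instrument
# itself (an assumption relative to this article); they sum to 23 questions.
AGREE_II_DOMAINS = {
    "Scope and Purpose": 3,
    "Stakeholder Involvement": 3,
    "Rigor of Development": 8,
    "Clarity of Presentation": 3,
    "Applicability": 4,
    "Editorial Independence": 2,
}


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the scaled score (0-100) for one domain.

    ratings[a][i] is appraiser a's rating (1-7) for item i of the domain.
    """
    if not ratings or not ratings[0]:
        raise ValueError("at least one appraiser and one item are required")
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every appraiser must rate the same number of items")
    if any(not 1 <= value <= 7 for row in ratings for value in row):
        raise ValueError("ratings must be on the 7-point scale (1-7)")

    obtained = sum(sum(row) for row in ratings)
    min_possible = 1 * n_items * n_appraisers  # every item rated 1
    max_possible = 7 * n_items * n_appraisers  # every item rated 7
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)
```

For example, two hypothetical appraisers rating a three-item domain as `[5, 6, 6]` and `[6, 7, 5]` give an obtained score of 35 against a possible range of 6 to 42, so `scaled_domain_score([[5, 6, 6], [6, 7, 5]])` is about 80.6.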

GRADE, in contrast to AGREE-II, is not intended as a tool to appraise guideline quality. The steps that GRADE outlines, however, should be highly relevant to an assessment of guideline quality, as they have been identified through a rigorous process and international consensus as key factors to consider when appraising evidence and developing treatment recommendations. The objective of this study was to determine how well current guidelines addressed each GRADE step and whether this could be adequately appraised by AGREE-II. Rheumatoid arthritis (RA) is an excellent candidate disease group as there are multiple and continuously evolving treatment choices with competing risks and benefits that need to be evaluated through a large volume of heterogeneous studies. GRADE outlines steps that are designed to assist in making these difficult judgments.

Data sources and searches

A systematic search was performed in the MEDLINE, EMBASE, and CINAHL databases, combining keywords and major subject headings for RA, class and specific drug names for traditional disease-modifying antirheumatic drugs (DMARDs) and biologic agents, and guidelines and consensus statements (CS) published between January 2000 and December 2012. We supplemented this with an extensive gray literature search of guideline clearinghouses, rheumatology and guideline societies, and bibliographic hand searches.

Search results

The literature search yielded 5,408 records through database searches and an additional 59 articles from the gray literature (Fig. 1). After duplicates were removed and titles and/or abstracts and full texts were screened, 45 articles were included. One article did not cite any evidence and was excluded from the GRADE assessment [10].
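The record counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the function and variable names are hypothetical, and the numbers of duplicates and of records excluded at each screening stage are not given here, so they are not modeled.

```python
def record_flow(database_records: int,
                gray_literature_records: int,
                included_articles: int,
                excluded_from_grade: int) -> dict:
    """Summarize the search yield and the number of guidelines GRADE-assessed."""
    total_identified = database_records + gray_literature_records
    grade_assessed = included_articles - excluded_from_grade
    return {
        "total_identified": total_identified,
        "included_articles": included_articles,
        "grade_assessed": grade_assessed,
    }


# Figures as reported in the search results paragraph.
flow = record_flow(
    database_records=5408,       # MEDLINE, EMBASE, and CINAHL
    gray_literature_records=59,  # gray literature sources
    included_articles=45,        # after deduplication and screening
    excluded_from_grade=1,       # the article that cited no evidence
)
```

Here `flow["total_identified"]` is 5,467 and `flow["grade_assessed"]` is 44, which matches the 44 guidelines reported in the abstract's GRADE results.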

Guideline characteristics

The guidelines covered a range of topics, came from all continents, and received endorsement and funding from multiple different sources (Table 1). Most considered randomized

Discussion

Through a comprehensive comparison of GRADE and AGREE-II, we found that although AGREE-II provides a broad assessment of quality over the entire process of guideline development, it lacks the granularity to adequately appraise how well a guideline addresses steps recommended by GRADE. Although several GRADE steps were addressed more often by guidelines with higher AGREE-II ratings, even guidelines rated as “Recommended for use” by AGREE-II failed to address most GRADE steps. Guidelines often performed systematic

Acknowledgments

The authors thank Dr. Gordon Guyatt and Dr. Ignacio Neumann for their review of the GRADE assessment and their helpful comments.

References (73)

  • G.H. Guyatt et al. GRADE guidelines: 7. Rating the quality of evidence—inconsistency. J Clin Epidemiol (2011)
  • G.H. Guyatt et al. GRADE guidelines: 8. Rating the quality of evidence—indirectness. J Clin Epidemiol (2011)
  • G.H. Guyatt et al. GRADE guidelines: 5. Rating the quality of evidence—publication bias. J Clin Epidemiol (2011)
  • J.C. Andrews et al. GRADE guidelines: 15. Going from evidence to recommendation—determinants of a recommendation's direction and strength. J Clin Epidemiol (2013)
  • G. Guyatt et al. GRADE guidelines: 11. Making an overall rating of confidence in effect estimates for a single outcome and for all outcomes. J Clin Epidemiol (2013)
  • M. Brunetti et al. GRADE guidelines: 10. Considering resource use and rating the quality of economic evidence. J Clin Epidemiol (2013)
  • J. Thornton et al. Introducing GRADE across the NICE clinical guideline program. J Clin Epidemiol (2013)
  • D. Davis et al. Canadian Medical Association Handbook on clinical practice guidelines. CMAJ (2007)
  • T.M. Shaneyfelt et al. Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA (1999)
  • H.J. Schunemann et al. Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. CMAJ (2003)
  • D. Atkins et al. Grading quality of evidence and strength of recommendations. BMJ (2004)
  • M.C. Brouwers et al. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ (2010)
  • V.P. Bykerk et al. Canadian Rheumatology Association recommendations for pharmacological management of rheumatoid arthritis with traditional and biologic disease-modifying antirheumatic drugs. J Rheumatol (2012)
  • P.E. Shrout et al. Intraclass correlations: uses in assessing rater reliability. Psychol Bull (1979)
  • Australian Rheumatology Association (ARA). Updated recommendations for the use of biological agents for the treatment...
  • M. Bukhari et al. BSR and BHPR guidelines on the use of rituximab in rheumatoid arthritis. Rheumatology (Oxford) (2011)
  • C. Deighton et al. BSR and BHPR rheumatoid arthritis guidelines on eligibility criteria for the first biological therapy. Rheumatology (Oxford) (2010)
  • R. Luqmani et al. British Society for Rheumatology and British Health Professionals in Rheumatology guideline for the management of rheumatoid arthritis (after the first 2 years). Rheumatology (Oxford) (2009)
  • R. Luqmani et al. British Society for Rheumatology and British Health Professionals in Rheumatology guideline for the management of rheumatoid arthritis (the first two years). Rheumatology (Oxford) (2006)
  • National Collaborating Centre for Chronic Conditions. Rheumatoid arthritis: national clinical guideline for management...
  • Scottish Intercollegiate Guidelines Network (SIGN). Management of early rheumatoid arthritis. 2011. Available at...
  • J.A. Singh et al. 2012 update of the 2008 American College of Rheumatology recommendations for the use of disease-modifying antirheumatic drugs and biologic agents in the treatment of rheumatoid arthritis. Arthritis Care Res (Hoboken) (2012)
  • J.S. Smolen et al. EULAR recommendations for the management of rheumatoid arthritis with synthetic and biological disease-modifying antirheumatic drugs. Ann Rheum Dis (2010)
  • Spanish Society of Rheumatology. Update of the Clinical Practise Guideline for the Management of Rheumatoid Arthritis...
  • M.H. Buch et al. Updated consensus statement on the use of rituximab in patients with rheumatoid arthritis. Ann Rheum Dis (2011)
  • R. Caporali et al. Recommendations for the use of biologic therapy in rheumatoid arthritis: update from the Italian Society for Rheumatology I. Efficacy. Clin Exp Rheumatol (2011)
Funding: G.S.H. is supported by an Alberta Heritage Foundation for Medical Research Clinical Fellowship. C.B. holds a Pfizer Chair and a Canada Research Chair in Knowledge Transfer for Musculoskeletal Care. D.M. holds a Canada Research Chair in Health Systems and Services Research and Arthur J.E. Child Chair in Rheumatology Research.

Conflict of interest: None.
