Original article
Endoscopic assessment of the oesophageal features of eosinophilic oesophagitis: validation of a novel classification and grading system
Ikuo Hirano (1), Nelson Moy (1), Michael G Heckman (2), Colleen S Thomas (2), Nirmala Gonsalves (1), Sami R Achem (3)

  1. Division of Gastroenterology, Northwestern University, Feinberg School of Medicine, Chicago, Illinois, USA
  2. Biostatistics Unit, Mayo Clinic, Jacksonville, Florida, USA
  3. Division of Gastroenterology, Mayo Clinic, Jacksonville, Florida, USA

Correspondence to Dr Ikuo Hirano, Professor of Medicine, Division of Gastroenterology, Northwestern University Feinberg School of Medicine, 676 North Saint Clair, Suite 1400, Chicago, IL 60611, USA; i-hirano{at}northwestern.edu

Abstract

Objective: Abnormalities are commonly identified during endoscopy in eosinophilic oesophagitis (EoE). There is no standardised classification to describe these features. This study aimed to evaluate the interobserver agreement of a grading system for the oesophageal features of EoE.

Method: The proposed system incorporated the grading of four major oesophageal features (rings, furrows, exudates, oedema) and the presence of additional features of narrow calibre oesophagus, feline oesophagus, stricture and crepe paper oesophagus. Endoscopic videos from 25 patients with EoE and controls were reviewed by 21 gastroenterologists. Interobserver agreement was assessed by estimating multi-rater κ and the proportion of pairwise agreement.

Results: Using the original grading system, agreement for rings, furrows and exudates was moderate (κ=0.38–0.46, 56–65% agreement) but poor for oedema (κ=0.23, 51% agreement). Identification of narrow calibre oesophagus had fair agreement (κ=0.30, 74% agreement) while feline oesophagus had poor agreement (κ=0.15, 68% agreement). After collapsing the severity grading for oedema and furrows and eliminating poorly performing features of feline oesophagus and narrow calibre oesophagus, a modified grading system demonstrated good agreement for the four major features of EoE (κ=0.40–0.54, 71–81% agreement) and additional features of stricture and crepe paper oesophagus (κ=0.52 and 0.58, 79% and 92% agreement).

Conclusions: The proposed system for endoscopically-identified oesophageal features of EoE defines common nomenclature and severity scores for the assessment of EoE disease activity. The system has good interobserver agreement among practising and academic gastroenterologists.

Significance of this study

What is already known on this subject?

  • Oesophageal abnormalities are detected on upper endoscopic examination in most children and adults with eosinophilic oesophagitis.

  • Characteristic oesophageal features of eosinophilic oesophagitis include rings, longitudinal furrows, exudates (plaques), oedema (pallor), strictures, crepe paper oesophagus (mucosal fragility) and narrow calibre oesophagus.

What are the new findings?

  • A system for the endoscopically-detected oesophageal features of eosinophilic oesophagitis is proposed incorporating standardised classification and grading of severity.

  • The proposed classification and grading system demonstrated good interobserver agreement among paediatric and adult gastroenterologists with varying degrees of clinical experience with eosinophilic oesophagitis.

How might it impact on clinical practice in the foreseeable future?

  • An assessment instrument for the endoscopically-detected oesophageal features will facilitate comparisons of clinical phenotypes of patients with eosinophilic oesophagitis among gastroenterologists.

  • Further studies are needed to assess if the proposed system for endoscopically-detected oesophageal features predicts clinical severity and is an important outcome in determining the response to medical or dietary treatment of eosinophilic oesophagitis.

Background

Endoscopy has an important role in the evaluation of eosinophilic oesophagitis (EoE) in both children and adults. Characteristic endoscopically-identified oesophageal findings associated with the diagnosis of EoE include fixed oesophageal rings (trachealisation), longitudinal furrows, exudates (plaques), strictures and oedema (decreased vascular markings).1–4 In adult series, reported sensitivities of these EoE-associated findings range from 50% to 90%.5–7 Higher sensitivities (>90%) are reported in both retrospective and prospective studies using only observations from investigators with greater experience with EoE.8–12 A prospective study that examined the prevalence of EoE among adults undergoing oesophagogastroduodenoscopy for any indication reported specificities for characteristic features of EoE of >90%.6 Additional oesophageal features described in EoE include fragility of the mucosa (referred to as a ‘crepe paper oesophagus’) and narrow calibre oesophagus.2,13 While not a necessary criterion for the diagnosis of EoE, endoscopically-identified oesophageal changes are commonly used by clinicians as supportive evidence of the disease.1,14 The features are not specific for EoE but are uncommonly identified in disorders such as gastro-oesophageal reflux disease (GERD), where oesophageal erosions are characteristic. Endoscopy also has an important role in the exclusion of oesophageal disorders associated with secondary oesophageal eosinophilia, including achalasia, infectious oesophagitis and GERD. Endoscopy is, of course, required for the diagnosis of EoE since endoscopy-directed tissue biopsies are necessary to establish the histopathological criteria of eosinophil-predominant oesophageal inflammation.

Endoscopic assessment has a potential role in assessment of the treatment response in EoE. Prospective and randomised controlled trials have demonstrated significant improvement in the oesophageal features of EoE following topical steroids.8,9,15 These reports support the use of endoscopic features of EoE as an important and objective outcome of treatment, analogous to the established role in GERD and increasing use in inflammatory bowel disease. A validated instrument for assessment of the endoscopically-identified oesophageal features was recognised as a major unmet need in a recent consensus recommendation on EoE in children and adults.1

Currently, there is limited consensus on how best to characterise the endoscopically-identified oesophageal features in EoE. Most studies have reported the primary oesophageal features as either present or absent. A recent study using white light and narrow band endoscopic images from patients with EoE demonstrated fair interobserver agreement for the findings of rings, furrows and exudates.16 The lack of standardised terminology and grading criteria hinders accurate descriptions of the clinical findings of individual patients among clinicians and limits the interpretation of data from epidemiological studies and clinical trials. Given this need, the purpose of this study was to develop and evaluate the interobserver agreement for a novel grading system for the primary endoscopically-identified oesophageal features of EoE.

Material and methods

Based on a review of the medical literature, eight abnormalities were identified as features of EoE: fixed rings (also referred to as trachealisation), exudates (also referred to as plaques or white spots), furrows (also referred to as vertical lines and longitudinal furrows), oedema (also referred to as mucosal pallor), stricture, feline oesophagus (also referred to as transient mucosal plications), narrow calibre oesophagus (also referred to as small calibre oesophagus) and crepe paper oesophagus (also referred to as mucosal fragility). A grading scheme was developed for each of the eight abnormalities based on consensus opinion of three gastroenterologists (NG, SRA, IH) (box 1). The primary features of rings, furrows, exudates and oedema were subcategorised into 2–3 grades. Strictures, narrow calibre oesophagus, feline oesophagus and crepe paper oesophagus were classified as present or absent.

Box 1

Original classification and grading system for the endoscopic assessment of the oesophageal features of eosinophilic oesophagitis

Major features

  • Fixed rings (also referred to as concentric rings, corrugated oesophagus, corrugated rings, ringed oesophagus, trachealisation)

    • Grade 0: none

    • Grade 1: mild (subtle circumferential ridges)

    • Grade 2: moderate (distinct rings that do not impair passage of a standard diagnostic adult endoscope (outer diameter 8–9.5 mm))

    • Grade 3: severe (distinct rings that do not permit passage of a diagnostic endoscope)

  • Exudates (also referred to as white spots, plaques)

    • Grade 0: none

    • Grade 1: mild (lesions involving <10% of the oesophageal surface area)

    • Grade 2: severe (lesions involving >10% of the oesophageal surface area)

  • Furrows (also referred to as vertical lines, longitudinal furrows)

    • Grade 0: absent

    • Grade 1: mild (vertical lines present without visible depth)

    • Grade 2: severe (vertical lines with mucosal depth (indentation))

  • Oedema (also referred to as decreased vascular pattern, mucosal pallor)

    • Grade 0: absent (distinct vascularity present)

    • Grade 1: mild (loss of clarity of vascular markings)

    • Grade 2: severe (absence of vascular markings)

  • Stricture

    • Grade 0: absent

    • Grade 1: present

Minor features

  • Feline oesophagus (transient, concentric mucosal rings observed spontaneously or during belching, retching or swallowing that disappear with air insufflation)

    • Grade 0: absent

    • Grade 1: present

  • Narrow calibre oesophagus (reduced luminal diameter of the majority of the tubular oesophagus)

    • Grade 0: absent

    • Grade 1: present

  • Crepe paper oesophagus (mucosal fragility or laceration upon passage of diagnostic endoscope but not after oesophageal dilation)

    • Grade 0: absent

    • Grade 1: present

A series of videos were selected from adult patients with EoE to include a minimum of four examples of each grade of each endoscopic characteristic, with the exception of the crepe paper oesophagus, where only two examples were available. The video recordings were obtained and reviewed by three gastroenterologists (NG, SRA, IH). In each case, EoE was defined in accordance with consensus recommendations to include symptoms of oesophageal dysfunction and histological evidence of ≥15 eosinophils per high power field (eos/hpf) in spite of double-dose proton pump inhibitors.17 Normal endoscopy videos were obtained from three control patients who underwent upper endoscopy for other indications and had normal oesophageal mucosal biopsies; these videos were randomly intermixed with the EoE videos for analysis. One control patient had dysphagia that was later identified as originating from a Zenker's diverticulum. A second control patient had dyspepsia and a third had primary laryngeal complaints. In addition, individual endoscopic features of EoE were not present in every patient with EoE.

Thirty-two endoscopists were invited to participate in a study designed to validate the interobserver agreement for the proposed classification and grading system of the oesophageal features of EoE using a series of endoscopic videos. Paediatric and adult gastroenterologists from settings that included academic medical centres, community practice and gastroenterology fellowship were included. Diversity in experience with EoE was encouraged, although the invited reviewers had clinical familiarity with the disease. Twenty-one endoscopists completed the prospective survey. Each endoscopist reviewed a DVD containing video clips from 25 patients demonstrating examples of the eight oesophageal features intermixed with normal videos. Before viewing the videos, each reviewer was provided with a colour pictorial atlas with representative images and written descriptions of the proposed grading scheme for EoE (figure 1). The reviewers were blinded to the clinical and histological status of the patients corresponding with the videos. A survey form was provided to each reviewer for each video in order to document perceived mucosal abnormalities. In addition, a demographic survey was administered to each reviewer in order to obtain information regarding gender, age, level of training, practice population, years of experience after fellowship, approximate number of endoscopies performed per week and approximate number of patients with eosinophilic oesophagitis seen for consultative purposes. Expert reviewers were arbitrarily defined as endoscopists who had personally evaluated over 150 patients with EoE.

Figure 1

Reference pictorial atlas used for the grading system for the endoscopic assessment of the oesophageal features of eosinophilic oesophagitis. Categories for each feature are listed in box 1. (A) Fixed oesophageal ring (trachealisation, ringed oesophagus, corrugated oesophagus). (B) Exudates (plaques). (C) Furrows (vertical lines). (D) Oedema (decreased vascular markings). (E) Transient oesophageal rings (feline oesophagus). (F) Crepe paper oesophagus (mucosal fragility).

Statistical analysis

Numerical variables were summarised with the sample median, minimum and maximum. Agreement between endoscopists regarding assessment of endoscopic oesophageal abnormalities was assessed in two different ways. Multi-rater κ18 was estimated for each endoscopic abnormality along with the 95% CI. Owing to well-documented problems with the estimation and interpretation of κ19—for instance, with it being highly influenced by marginal rates—the proportion of pairwise agreements between endoscopists for each endoscopic abnormality was also estimated along with the 95% CI. For a given endoscopic abnormality, the proportion of pairwise agreement results from a comparison of the grading of the endoscopic abnormality between each endoscopist and the remaining 20 endoscopists (210 total pairwise comparisons between endoscopists for one video) for each of the 25 videos (5250 total pairwise comparisons between endoscopists across all videos); the proportion of all pairwise comparisons where the grading of the endoscopic abnormality was exactly equal for the two endoscopists is reported. Interobserver agreement was interpreted based on a combination of estimates of κ and the proportion of pairwise agreements. Statistical analyses were performed using SAS V.9.2.
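
The two agreement measures described above can be illustrated with a minimal Python sketch. It assumes that the multi-rater κ follows Fleiss' formulation, which the text does not confirm; the function names and the toy ratings are hypothetical and are not study data. With the same number of raters for every video, the observed-agreement term of Fleiss' κ equals the proportion of pairwise agreement.

```python
from itertools import combinations
from math import comb

# Comparison counts quoted in the text: 21 raters, 25 videos.
pairs_per_video = comb(21, 2)        # 210 rater pairs per video
total_pairs = pairs_per_video * 25   # 5250 pairs across all videos


def pairwise_agreement(ratings):
    """ratings[v][r] is the grade rater r gave to video v.

    Returns (number of exactly agreeing rater pairs, total rater pairs),
    pooled over all videos.
    """
    agree = total = 0
    for video in ratings:
        for a, b in combinations(video, 2):
            total += 1
            agree += (a == b)
    return agree, total


def fleiss_kappa(ratings, categories):
    """Fleiss' multi-rater kappa (assumed formulation, see lead-in)."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j]: number of raters who assigned video i to category j
    counts = [[video.count(c) for c in categories] for video in ratings]
    # Observed agreement per video, then averaged over videos
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall (marginal) category proportions
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: 3 videos graded 0-2 by 4 raters (not study data).
toy = [[0, 0, 0, 0], [1, 1, 2, 2], [2, 2, 2, 1]]
agree, total = pairwise_agreement(toy)
print(agree / total, fleiss_kappa(toy, [0, 1, 2]))
```

Because κ subtracts chance agreement derived from the marginal grade frequencies, it is lower than the raw pairwise proportion in this toy example.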

Results

A summary of the characteristics of the 21 endoscopists is provided in table 1; 86% were men and their median age was 46 years. Most endoscopists were attending physicians (86%) and gastroenterologists with adult-only practices (81%), with a median of 11 years of experience after fellowship (range 0–30). The participating endoscopists had evaluated a median of 75 patients with EoE (range 15–600). An overall summary of the endoscopically-identified oesophageal abnormalities as identified by the 21 endoscopists in the 25 videos is presented in table 2, where the frequency of each grade for each abnormality is shown across all endoscopists and all videos, resulting in a total of 525 ratings (21 endoscopists × 25 videos) for each endoscopic abnormality. Endoscopist-specific summaries of each endoscopic abnormality are presented in tables 1A–C in the online supplement.

Table 1

Demographics and characteristics of 21 endoscopists involved in grading of the endoscopically-detected oesophageal features of eosinophilic oesophagitis

Table 2

Endoscopic abnormalities detected by 21 endoscopists in 25 videos (N=525)

An evaluation of interobserver agreement regarding endoscopically-identified oesophageal abnormalities between the 21 endoscopists is shown in table 3. When using the scoring system given in box 1, interobserver agreement for rings, exudates and furrows was moderate (κ=0.40, 0.46 and 0.38, respectively; pairwise agreement 56%, 65% and 61%, respectively) while interobserver agreement for oedema under the original scoring was poor (κ=0.23, pairwise agreement 51%). To eliminate poorly performing categories and assess the agreement with a simplified version of the scoring system, specific categories were collapsed for these four endoscopic abnormalities. More specifically, grade 1 (mild) and grade 2 (moderate) were collapsed for rings, while grade 1 (mild) and grade 2 (severe) were collapsed for exudates, furrows and oedema (table 3). Under the collapsed categorisations, interobserver agreement improved at least moderately for each of the four features (rings, exudates, furrows and oedema) and can reasonably be described as good (κ=0.50, 0.51, 0.54 and 0.43, respectively; pairwise agreement 71%, 76%, 80% and 81%, respectively). Interobserver agreement was good for stricture (κ=0.52, pairwise agreement 79%) and crepe paper oesophagus (κ=0.58, pairwise agreement 92%), poor for feline oesophagus (κ=0.15, pairwise agreement 68%) and fair for narrow calibre oesophagus (κ=0.30, pairwise agreement 74%) (table 3).
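
The effect of collapsing grades on pairwise agreement can be sketched in Python as follows. The grades are hypothetical values for a single video from six raters, not study data, and the helper names `collapse` and `pairwise_agreement` are illustrative only.

```python
from itertools import combinations


def pairwise_agreement(video_ratings):
    """Proportion of rater pairs giving exactly the same grade to one video."""
    pairs = list(combinations(video_ratings, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


def collapse(grade, merged=(1, 2)):
    """Map every grade in `merged` to the lowest grade of that set.

    With the default, grades 1 and 2 both become 1 while grades 0 and 3
    are left unchanged.
    """
    return min(merged) if grade in merged else grade


# Hypothetical oedema grades from six raters for one video (not study data).
oedema = [1, 2, 1, 2, 2, 0]
before = pairwise_agreement(oedema)
after = pairwise_agreement([collapse(g) for g in oedema])
print(before, after)
```

Merging categories can never lower the raw pairwise agreement, because ratings that matched before merging still match afterwards. κ does not necessarily rise, since the chance-expected agreement also increases when there are fewer categories.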

Table 3

Interobserver agreement for the endoscopic assessment of the oesophageal abnormalities of eosinophilic oesophagitis between reviewers

Based on the results of the collapsed categories for rings, exudates, furrows and oedema, a modified grading system incorporating categories with improved interobserver agreement is proposed in box 2. Owing to more substantial improvements in interobserver agreement for furrows and oedema under the collapsed categorisations, the modified system incorporated the collapsed grading for these two features while the more detailed categorisations were retained for rings and exudates. The features of feline oesophagus and narrow calibre oesophagus, where interobserver agreement was less than ideal, were eliminated.

Box 2

Modified classification and grading system for the endoscopic assessment of the oesophageal features of eosinophilic oesophagitis

Major features

  • Fixed rings (also referred to as concentric rings, corrugated oesophagus, corrugated rings, ringed oesophagus, trachealisation)

    • Grade 0: none

    • Grade 1: mild (subtle circumferential ridges)

    • Grade 2: moderate (distinct rings that do not impair passage of a standard diagnostic adult endoscope (outer diameter 8–9.5 mm))

    • Grade 3: severe (distinct rings that do not permit passage of a diagnostic endoscope)

  • Exudates (also referred to as white spots, plaques)

    • Grade 0: none

    • Grade 1: mild (lesions involving <10% of the oesophageal surface area)

    • Grade 2: severe (lesions involving >10% of the oesophageal surface area)

  • Furrows (also referred to as vertical lines, longitudinal furrows)

    • Grade 0: absent

    • Grade 1: present

  • Oedema (also referred to as decreased vascular markings, mucosal pallor)

    • Grade 0: absent (distinct vascularity present)

    • Grade 1: loss of clarity or absence of vascular markings

  • Stricture

    • Grade 0: absent

    • Grade 1: present

Minor features

  • Crepe paper oesophagus (mucosal fragility or laceration upon passage of diagnostic endoscope but not after oesophageal dilation)

    • Grade 0: absent

    • Grade 1: present
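
For data collection, the modified system in box 2 could be encoded as a simple lookup of permitted grades. The following Python sketch is an illustration only: the dictionary keys, the `validate_record` helper and the record layout are hypothetical and are not part of the published instrument.

```python
# Permitted grades per feature, transcribed from box 2.
# The key names are hypothetical identifiers chosen for this sketch.
MODIFIED_GRADES = {
    "rings": {0: "none", 1: "mild", 2: "moderate", 3: "severe"},
    "exudates": {0: "none", 1: "mild", 2: "severe"},
    "furrows": {0: "absent", 1: "present"},
    "oedema": {0: "absent", 1: "loss of clarity or absence of vascular markings"},
    "stricture": {0: "absent", 1: "present"},
    "crepe_paper": {0: "absent", 1: "present"},
}


def validate_record(record):
    """Return a list of problems in one reviewer's record for one video."""
    problems = []
    for feature, grades in MODIFIED_GRADES.items():
        if feature not in record:
            problems.append(f"missing feature: {feature}")
        elif record[feature] not in grades:
            problems.append(f"invalid grade {record[feature]} for {feature}")
    for feature in record:
        if feature not in MODIFIED_GRADES:
            problems.append(f"unknown feature: {feature}")
    return problems


# Hypothetical record: furrows has only grades 0 and 1 in the modified system.
record = {"rings": 2, "exudates": 1, "furrows": 2,
          "oedema": 1, "stricture": 0, "crepe_paper": 0}
print(validate_record(record))
```

A check of this kind rejects grades that existed only in the original system (box 1), such as grade 2 furrows, and features that were eliminated, such as feline oesophagus.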

In a secondary analysis we also evaluated interobserver agreement separately for expert endoscopists and non-expert endoscopists, and these results are presented in table 4. Under both the original scoring and the collapsed scoring systems, interobserver agreement was moderately better for experts than for non-experts with regard to the assessment of exudates, oedema, narrow calibre oesophagus and crepe paper oesophagus. For instance, when focusing on the collapsed scoring, the proportion of pairwise agreement for experts and non-experts was 81% and 74% for exudates, 90% and 77% for oedema, 80% and 73% for narrow calibre oesophagus, and 98% and 90% for crepe paper oesophagus. There was no noticeable difference in interobserver agreement between the expert and non-expert endoscopists for assessment of rings, furrows, stricture or feline oesophagus.

Table 4

Interobserver agreement for the endoscopic assessment of the oesophageal abnormalities of eosinophilic oesophagitis between expert and non-expert endoscopists

Discussion

This study demonstrated moderate to good interobserver agreement among gastroenterologists with varied clinical experience for a classification scheme that graded the primary endoscopically-identified oesophageal features of eosinophilic oesophagitis. Evaluation of the original classification and grading system (box 1) revealed limited agreement for sub-categorisation of oedema and furrows as well as assessment of feline oesophagus and narrow calibre oesophagus. After elimination of these low performing elements, the modified classification and grading system (box 2) demonstrated κ scores from 0.4 to 0.58 and pairwise agreement from 56% to 81% for the primary features of rings, furrows, exudates and oedema. The interobserver agreement was on a par with or better than that reported for the Los Angeles classification system for GERD.20 The additional features of crepe paper oesophagus and oesophageal stricture showed good interobserver agreement.

While endoscopy has a recognised role in the initial evaluation of EoE, there is no standardisation of terminology for describing the oesophageal features. A recent study by Peery et al reported fair to good interobserver agreement for rings and furrows but poor agreement for plaques,16 with similar κ values for rings and furrows as those obtained in the current study. In addition, Peery et al found that agreement did not improve with the addition of narrow band imaging to white light imaging. In contrast to the current study, Peery et al did not use grading of specific features but only asked reviewers to note the presence or absence of features. Another limitation to this previous study was the use of still images rather than video clips, as used in the current study. The use of a colour pictorial atlas depicting examples of each category for the features of the proposed grading system may also have improved the interobserver agreement obtained in the current study.

A validated classification system for the endoscopically-identified oesophageal features of EoE has several potential uses. Oesophageal features are often used by endoscopists to support a diagnosis of EoE, yet the features may be variably defined and characterised by different endoscopists. Uniform nomenclature will facilitate communication between clinicians and comparison of findings between clinical studies performed at different medical centres. Furthermore, endoscopically-identified oesophageal features may be an important determinant of disease severity in EoE. Oesophageal remodelling in EoE is thought to be the consequence of lamina propria fibrosis that has been identified in up to 90% of patients with EoE.21–24 Such remodelling can be manifest in endoscopically-visualised features of oesophageal rings, narrow calibre oesophagus and focal oesophageal strictures. It should be noted that oesophageal strictures, while associated with dysphagia severity, are not a specific manifestation of EoE. Since biopsies do not routinely obtain adequate sampling of the subepithelial tissue, endoscopy offers the capability of detecting the consequences of oesophageal remodelling that are largely ignored by routine histopathology. This observation may explain the discordance between symptom severity and histopathology in EoE.25 It is worth noting that recent clinical trials in Crohn's disease have identified endoscopic signs of inflammation as a potentially more important measure of disease activity than symptom-based or laboratory-based indices.26

In the current study, interobserver agreement was assessed by calculations of both κ and pairwise agreement statistics. Studies have advised caution in the interpretation of κ estimates.19 The conversion of a value of κ into a category such as ‘fair agreement’ or ‘moderate agreement’ based on a predefined set of rules (eg, ≤0.20=slight agreement, 0.21–0.40=fair agreement, 0.41–0.60=moderate agreement) has limitations. For instance, it is possible to have very high agreement but a low estimate of κ, and also possible for two studies with identical rates of agreement to have very different κ values. Thus, assessments of interobserver agreement should be made with this in mind, and the estimates of κ and the proportion of pairwise agreements should be used in conjunction in data interpretation.
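
The caution about κ can be made concrete with a small Python sketch. For simplicity it uses the two-rater (Cohen) κ rather than the multi-rater κ estimated in this study, and both 2×2 tables are hypothetical. The two tables have identical raw agreement (90%), but the second has highly skewed marginal rates.

```python
def cohen_kappa(table):
    """Two-rater kappa from a square contingency table.

    table[i][j] is the number of subjects graded i by rater A and j by
    rater B. Returns (observed agreement, kappa).
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Chance agreement: product of the two raters' marginal rates per grade
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical tables, 100 subjects each, both with 90 agreements.
balanced = [[45, 5], [5, 45]]   # feature present in about half of subjects
skewed = [[88, 5], [5, 2]]      # feature rarely graded as present

print(cohen_kappa(balanced))
print(cohen_kappa(skewed))
```

In the balanced table the chance-expected agreement is 0.50, so κ is high. In the skewed table the chance-expected agreement is already about 0.87, so the same 90% raw agreement yields a κ in the 'fair' range.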

Observations regarding the poor interobserver agreement for the features of feline oesophagus and narrow calibre oesophagus probably reflect problems in the definitions of these entities. Feline oesophagus generally refers to transient plications or ring-like deformity of the oesophagus that occurs with shortening of the oesophagus during retching, transient lower oesophageal sphincter relaxation and occasionally during deglutition.27 The feline pattern typically disappears within seconds and upon adequate distension of the oesophageal lumen. In contrast, the fixed oesophageal rings that characterise EoE persist and are sometimes more apparent upon oesophageal distension. It is our belief that the feline pattern is a normal phenomenon, not indicative of underlying oesophageal pathology. The fixed ring pattern, however, has been referred to as a feline pattern and may hence account for the poor agreement among endoscopists detected in this study.13 Narrow calibre oesophagus has both technical limitations and problems in terms of definition. Technically, the reduced calibre of the oesophagus over long distances can be difficult to appreciate endoscopically and is better identified radiographically or by physiological measurement of oesophageal distensibility.28 Operationally, the narrow calibre oesophagus has not been well defined. It remains unclear if this morphological description should apply only when the entire or a specific proportion of the oesophageal lumen is compromised. Similarly, it is uncertain to what degree the lumen needs to be restricted to define a narrow calibre.

Limitations of this study include the small number of reviewers included in specific categories of fellow, clinical practice and paediatric gastroenterology, which limits the interpretation of the generalisability of the proposed grading system. One-third of the reviewers had significant experience with EoE, having cared for over 150 patients with the disease. Greater familiarity with EoE might increase the detection of more subtle oesophageal abnormalities. However, the comparison of ‘expert’ and ‘non-expert’ reviewers did not demonstrate substantial overall differences in agreement. The use of the grading system may be affected when analysed by endoscopists in real time as the videos may have focused on specific regions of interest that may not have been apparent to a different endoscopist. Reviewer participation in a study on EoE may have influenced the recognition of features of EoE. In addition, the DVD video recording playback system allowed reviewers to pause or replay the video recordings and thereby permit additional time for inspection that may not be available when done in real time. The grading system might be improved by assessment of specific regions of the oesophagus (ie, proximal, mid, distal). However, the limitations of a video recording without simultaneous notation of the location of the endoscope tip made such assessment difficult. Intraobserver agreement was not assessed and is an important consideration in the validation of the proposed system. Finally, replication of our results in an independent group of endoscopists using the modified classification and grading system would be important to substantiate validation of the proposed system.

In summary, the proposed classification and grading system for the endoscopic assessment of the oesophageal features of eosinophilic oesophagitis demonstrated reasonable interobserver agreement, on a par with or better than that reported for the widely used Los Angeles classification for GERD. The use of a classification system that incorporates the key features of EoE and defines specific grades for severity of changes has potential value for comparison of patients between clinicians and investigators and for the assessment of response to medical or dietary treatment for EoE.

Acknowledgments

The authors wish to acknowledge the video reviewers for the study: Prakash Chandra, Mirna Chihade, Ranjan Dohil, Eric Gaumitz, Sandeep Gupta, Christian Jackson, Amir Kagalwalla, David Katzka, Kumar Krishnan, Trevor Lissoos, Thomas Nealis, Kathryn Peterson, John Pandolfino, Neehar Parikh, Alain Schoepfer, Stuart Spechler and Alex Straumann.

Footnotes

  • Funding: Campaign Urging Research for Eosinophilic Disease Foundation (CURED).

  • Competing interests: None.

  • Ethics approval: Approval was obtained from the Northwestern University Institutional Review Board.

  • Provenance and peer review: Not commissioned; externally peer reviewed.