Follow-up of pediatric celiac disease: value of antibodies in predicting mucosal healing, a prospective cohort study

Background In diagnosing celiac disease (CD), serological tests are highly valuable. However, their role in following up children with CD after prescription of a gluten-free diet is unclear. This study aimed to compare the performance of antibody tests in predicting small-intestinal mucosal status in diagnosis vs. follow-up of pediatric CD. Methods We conducted a prospective cohort study at a tertiary-care center. A total of 148 children underwent esophagogastroduodenoscopy with biopsies either for symptoms ± positive CD antibodies (group A; n = 95) or for follow-up of CD diagnosed ≥ 1 year before study enrollment (group B; n = 53). Using biopsy (Marsh ≥ 2) as the criterion standard, areas under ROC curves (AUCs) and likelihood ratios were calculated to estimate the performance of antibody tests against tissue transglutaminase (TG2), deamidated gliadin peptide (DGP) and endomysium (EMA). Results AUCs were higher when tests were used for CD diagnosis vs. follow-up: 1 vs. 0.86 (P = 0.100) for TG2-IgA, 0.85 vs. 0.74 (P = 0.421) for TG2-IgG, 0.97 vs. 0.61 (P = 0.004) for DGP-IgA, and 0.99 vs. 0.88 (P = 0.053) for DGP-IgG, respectively. Empirical power was 85% for the DGP-IgA comparison, and on average 33% (range 13–43) for the non-significant comparisons. Among group B children, 88.7% showed mucosal healing (median 2.2 years after primary diagnosis). Only the negative likelihood ratio of EMA was low enough (0.097) to effectively rule out persistent mucosal injury. However, out of 12 EMA-positive children with mucosal healing, 9 subsequently turned EMA-negative. Conclusions Among the CD antibodies examined, negative EMA most reliably predicts mucosal healing. In general, however, antibody tests, especially DGP-IgA, are of limited value in predicting the mucosal status in the early years post-diagnosis but may be sufficient after a longer period of time.


Background
Celiac disease (CD) is a multi-systemic autoimmune disease triggered by exposure to dietary gluten in genetically predisposed individuals. CD causes small-intestinal mucosal injury of varying severity [1]. The gluten-free diet (GFD) is an effective treatment that allows mucosal healing. The goals of treatment are not only symptomatic improvement but also the avoidance of complications, which can arise even in patients who have become asymptomatic on a GFD [2,3]. Furthermore, achieving mucosal healing might be crucial because of an increased risk of lymphoproliferative malignancy among patients with persistent villous atrophy [4].
International CD guidelines propose regular follow-up of CD patients [5][6][7]. Among the follow-up modalities, re-biopsy may be undertaken to prove mucosal healing, which children achieve more often than adults [8]. However, its invasiveness, discomfort and possible complications limit the use of re-biopsy in routine follow-up [5,6]. Therefore, reliable non-invasive surrogate markers of mucosal healing are highly desirable. Whereas antibody tests are of irreplaceable value in diagnosing untreated CD [6], controversy exists over whether these tests can reliably indicate mucosal healing [8][9][10][11].
Concerning the correlation between follow-up histology and non-invasive biomarkers, children with CD are an understudied population. Specifically, there is a lack of prospective pediatric studies evaluating current biomarkers used in clinical practice for monitoring purposes.
The purpose of this study was to prospectively compare the performance of up-to-date antibody tests in predicting mucosal status in children with untreated CD vs. in children after prescription of a GFD.

Study design and subjects
Between July 1, 2009, and December 31, 2010, a prospective, cross-sectional cohort study was performed at St. Anna Children's Hospital. Following written informed parental consent, all consecutively enrolled children (n = 148) underwent esophagogastroduodenoscopy with biopsies (EGD). The participating children were divided into groups according to whether EGD was performed for diagnostic or follow-up purposes (Figure 1).
Group A comprised 95 children on a gluten-containing diet, of whom 32 were diagnosed with CD (group A1) and 63 were referred for EGD because of non-celiac dyspepsia (group A2). The predominant complaints in group A1 children were abdominal pain (31.3%), failure to thrive or short stature (18.8%), chronic diarrhea (6.3%), flatulence (6.3%), recurrent headache (6.3%) and constipation (3.1%). A first-degree relative with CD (18.8%), IgA-deficiency (3.1%), autoimmune thyroiditis (3.1%), and iron deficiency anemia (3.1%) were the remaining reasons for CD screening in group A1. Diagnosis of CD was based on positive IgA antibodies against endomysium (EMA) in IgA-competent children or IgG antibodies against deamidated gliadin peptides (DGP-IgG) in children with IgA-deficiency, along with biopsy results consistent with CD (Marsh ≥ 2) and positivity of HLA-DQ2 and/or HLA-DQ8. CD was ruled out by negative biopsy results.
[Figure 1: flow chart of study enrollment. Of 174 eligible children, 32 presented with serologically suspected untreated CD, 63 with dyspepsia for esophagogastroduodenoscopy with biopsies, and 79 for re-biopsy of CD ≥ 1 year after prescription of a GFD; 26 children presenting for CD follow-up refused to undergo re-biopsy. Group A1 (CD before prescription of a GFD), n = 32; group A2 (non-celiac dyspepsia), n = 63.]
Group B comprised 53 children with CD after prescription of a GFD ≥ 1 year before study enrollment (median 2.2 years, range 1 to 12.9). CD had been proven by positive EMA or IgA antibodies against tissue transglutaminase (TG2-IgA), biopsy evidence and positivity of HLA-DQ2 and/or HLA-DQ8. Group B children had received regular follow-up according to the recommendations then in force [7]. Within the 18-month study period, a total of 79 children presented for routine CD follow-up. All of these children were invited to participate in the study independent of the presence of symptoms or their adherence to the GFD according to dietary interview. As such, they were unselected and chosen only by their willingness to undergo follow-up endoscopy. In this context, 26 of 79 eligible children opted out of the study. The predominant complaints in group B were abdominal pain (15.1%), constipation (1.9%) and aphthous stomatitis (1.9%). Within group B, 79.2% of children were symptom-free.

Endoscopy, biopsies and histology
All EGDs were performed under anesthesiologist-controlled deep sedation with propofol. Four biopsies were taken from the second part and two from the bulb of the duodenum. Biopsies were staged by two experienced pathologists (GA and AC) who were blinded to subject identity and indication for biopsy. Intestinal histological findings were classified according to a modified Marsh classification [1] using ≥ 30 lymphocytes/100 epithelial cells as cut-off for pathological intraepithelial lymphocytosis [12]. In cases of initial disagreement, a consensus diagnosis was reached using a multihead microscope. Mucosal healing in group B was defined as Marsh < 2.

CD serology
Blood for serology was taken in the week before EGD. In all children, total IgA levels were determined. IgA-deficiency was defined as serum IgA < 0.07 g/L [13]. All IgA-based tests were evaluated only after exclusion of IgA-deficient children. Four commercial enzyme-linked immunosorbent assays were used for detection of TG2-IgA, TG2-IgG, DGP-IgG, and DGP-IgA (Table 1). Sera were also tested for EMA by indirect immunofluorescence using monkey esophagus (Table 1), at the initial dilution of 1:5 and, when positive, titrated up to the end point. A single experienced technician assessed all slides. CD serology kits from different companies were used because Eurospital TG2-IgA and Orgentec EMA had been our routine test kits since 2005 and, on request, Werfen Austria, Diagnostic Divisions, agreed to complete the panel of current CD antibody tests by providing Inova antibody kits free of charge during the study.

Statistics
Continuous variables were reported as median and interquartile range (IQR), and categorical variables as counts and percentages. Categorical data were compared using the chi-square and Fisher exact tests, and continuous data using the Mann-Whitney U and Kruskal-Wallis tests. Unless otherwise specified, Bonferroni correction was applied for multiple comparisons in post-hoc tests. Kappa coefficients were used to examine agreement among pathologists in classifying histological findings. Performances of non-invasive tests were evaluated by receiver operating characteristic (ROC) curve analysis. Areas under ROC curves (AUC), summary measures of overall diagnostic performance, were reported with their 95% confidence intervals (CIs); AUC > 0.9 was considered a high diagnostic test performance [14]. To rank the performance of tests when used for diagnosing CD, AUCs of ROC curves derived from group A were compared [15]. To rank the performance of tests when used for follow-up monitoring of CD, AUCs of ROC curves derived from group B cases were compared. Furthermore, we compared the performance of each single test when used for diagnosis vs. for follow-up monitoring of CD [16]. We also used ROC curve analysis to determine the optimal cut-off point for each test and calculated further performance measures such as sensitivity, specificity, and positive and negative likelihood ratios (LR+ and LR-) with 95% CIs. Where a 2x2 table contained an empty cell, 0.5 was added to each cell. Tests with either an LR+ > 10 or an LR- < 0.1 were considered informative and clinically useful [17]. Statistical calculations were performed with SPSS, version 20.0 (SPSS, Chicago, Ill), and the free software R [18]. For all statistical analyses, Bonferroni-corrected two-tailed P-values < 0.05 were considered significant.
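The likelihood-ratio calculation described above can be illustrated with a short sketch. It is written in Python for illustration only and is not the SPSS or R code used in the study; the function names and the example counts are hypothetical, and the 0.5 correction is applied only when a cell of the 2x2 table is empty, as stated in the text.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, LR+ and LR- from a 2x2 table.

    tp/fn: children with Marsh >= 2 who test positive/negative.
    fp/tn: children with Marsh < 2 who test positive/negative.
    If any cell is empty, 0.5 is added to every cell (continuity correction).
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


def is_informative(measures):
    """Apply the thresholds used in the text: LR+ > 10 or LR- < 0.1."""
    return measures["lr_pos"] > 10 or measures["lr_neg"] < 0.1


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    m = diagnostic_measures(tp=10, fp=0, fn=0, tn=10)
    print(m, is_informative(m))
```

Because the correction guarantees that no cell is zero, neither likelihood ratio can involve a division by zero, which is presumably the reason for adding 0.5 in the first place.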

Ethics approval
The

Performance of antibody tests in group A
In predicting mucosal status in group A, TG2-IgA, DGP-IgA and DGP-IgG assays showed high diagnostic performance, exhibiting AUCs ≥ 0.96 (Figures 2 and 3, Table 2). TG2-IgG performed less well, exhibiting an AUC of about 0.85. Within group A1, all children had positive EMA, positive EMA being an inclusion criterion. Therefore, no performance evaluation for EMA using ROC curve analysis was done in group A.
Within group A2, positive EMA were found in 3 children (specificity 0.94; 95% CI 0.86 to 0.99), all of whom had Marsh 0 as the result of the small intestinal biopsy. Of these EMA-positive group A2 children, 2 tested positive for TG2-IgA as well. In 2 children, all antibody titres normalized within 6 months of follow-up on a gluten-containing diet. One of these two children still had Marsh 2 in a follow-up biopsy 28 months later. The third EMA-positive group A2 child was placed on a GFD by her parents for 2.5 years before presenting for a follow-up visit. At this visit, she was seronegative. She has been undergoing a gluten challenge since November 2012. At her last visit in October 2013 she was still seronegative.

Performance of antibody tests in group B
Comparing the performance of antibody tests in predicting mucosal status in group A vs. B, all tests performed less well in group B (Figures 2 and 3, Table 2). This performance loss was significant at an uncorrected level of alpha = 0.05 in the case of DGP-IgA (P uncorr. = 0.004). Empirical power (calculated using 1000 bootstrap samples) was 85% for this comparison, and on average 33% (range 13-43) for the non-significant comparisons. Within group B, EMA performed best, followed by DGP-IgG and TG2-IgA.
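The text states only that empirical power was calculated from 1000 bootstrap samples. The following Python sketch shows one way such an estimate could be obtained and is not the authors' procedure. As assumptions, it computes the AUC as the Mann-Whitney probability, uses the Hanley-McNeil standard error with a z-test to compare two independent AUCs (a stand-in for the test of reference [16]), and defines power as the fraction of bootstrap replicates in which that test rejects. All function names and titres are hypothetical.

```python
import math
import random


def auc(pos, neg):
    """AUC as the probability that a diseased child's titre exceeds a
    healthy child's titre, with ties counted as one half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(a, n_pos, n_neg):
    """Hanley-McNeil approximation of the standard error of an AUC."""
    q1 = a / (2 - a)
    q2 = 2 * a * a / (1 + a)
    var = (a * (1 - a) + (n_pos - 1) * (q1 - a * a)
           + (n_neg - 1) * (q2 - a * a)) / (n_pos * n_neg)
    return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding


def auc_difference_significant(group1, group2, z_crit=1.96):
    """Two-sided z-test for two independent AUCs (assumed test)."""
    (pos1, neg1), (pos2, neg2) = group1, group2
    a1, a2 = auc(pos1, neg1), auc(pos2, neg2)
    se = math.hypot(hanley_mcneil_se(a1, len(pos1), len(neg1)),
                    hanley_mcneil_se(a2, len(pos2), len(neg2)))
    if se == 0.0:
        return a1 != a2
    return abs(a1 - a2) / se > z_crit


def empirical_power(group1, group2, n_boot=1000, seed=0):
    """Fraction of bootstrap replicates in which the AUC difference is
    declared significant. Each group is a (pos, neg) pair of titre lists."""
    rng = random.Random(seed)

    def resample(values):
        return [rng.choice(values) for _ in values]

    hits = 0
    for _ in range(n_boot):
        b1 = (resample(group1[0]), resample(group1[1]))
        b2 = (resample(group2[0]), resample(group2[1]))
        hits += auc_difference_significant(b1, b2)
    return hits / n_boot


if __name__ == "__main__":
    # Invented titres: (Marsh >= 2, Marsh < 2) for diagnosis and follow-up.
    diagnosis = ([40, 55, 62, 80, 95, 120], [2, 3, 5, 6, 8, 9])
    follow_up = ([4, 9, 15, 30, 7, 12], [3, 5, 8, 10, 14, 6])
    print(empirical_power(diagnosis, follow_up))
```

Low power in such a calculation means that a true difference of the observed size would often go undetected with samples of this size, which is how the 33% figure for the non-significant comparisons should be read.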
Among the LRs obtained in group B, only the negative LR of EMA was low enough (0.097) to effectively rule out persistent mucosal injury (Table 2).
However, within group B, positive EMA were detected in 18 children, 12 of whom exhibited mucosal healing. Of these children with EMA-positivity despite mucosal healing, 9 became EMA-negative on further follow-up.

Histology
In group A1, where all children had histology consistent with CD, severe mucosal injury (Marsh 3b or 3c) was found in 28 children (87.5%) while the remaining 4 children (12.5%) showed Marsh 3a. In group A2, histology showed gastritis in 39 children, esophagogastritis in 8, esophagitis in 1, gastric ulcer in 1 and normal mucosa in 14.
Inter-rater reliability among the pathologists was good: kappa coefficients were 0.81 and 0.74 for the biopsies from the bulb and from the descending part of the duodenum, respectively. Before reaching consensus, histological classification differed in 19 children (12.8%); however, in only 2 of these children did the classification differences pertain to the presence or absence of mucosal injury (Marsh ≥ 2). The pathologists finally agreed on Marsh ≥ 2 in both children.
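For readers unfamiliar with the kappa coefficient, the sketch below shows how chance-corrected agreement between two raters can be computed. It assumes the unweighted Cohen's kappa; the text does not specify which kappa variant was used, and the Marsh gradings in the example are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same biopsies.

    rater_a, rater_b: equal-length sequences of category labels
    (for example Marsh grades), one entry per biopsy.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of biopsies graded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical Marsh grades from two pathologists.
    pathologist_1 = ["0", "0", "1", "3a", "3b", "3c", "0", "3b"]
    pathologist_2 = ["0", "0", "0", "3a", "3b", "3b", "0", "3b"]
    print(cohens_kappa(pathologist_1, pathologist_2))
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the reported values of 0.81 and 0.74 sit in the upper part of that range.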
No adverse events of endoscopy including biopsies were encountered.

IgA-deficient children
In total, four children (2.7%), all girls, were IgA-deficient. Three girls belonged to group A1, one to group A2.

Discussion
In this study, we provide evidence that antibody tests predict mucosal status more reliably in children with CD before than after prescription of a GFD.
In our patients, according to AUCs (Figure 2), TG2-IgA, DGP-IgG and DGP-IgA antibody tests performed very well in diagnosing CD in group A, similar to the performances reported elsewhere [19]. However, when used for monitoring mucosal status in CD at a median of 2.2 years after primary diagnosis (group B), all tests suffered a performance loss, which was significant in the case of DGP-IgA. According to LRs, additional parameters quantifying the non-invasive tests' performance, TG2-IgA, DGP-IgG and DGP-IgA antibodies were most informative and clinically useful with respect to diagnosing CD in group A. Conversely, the limited ability to detect mucosal injury in group B was reflected by LR+ values < 10 in all tests. Regarding LR-, only negative EMA had an LR- < 0.1, thus being an informative and clinically useful marker of mucosal healing in CD [17].
Positive EMA, however, were detected in 18 children from group B. Twelve of these EMA-positive children showed mucosal healing, a finding that suggests mucosal recovery is faster than EMA seroconversion. Indeed, EMA-positive children exhibiting mucosal healing had been on the GFD for a significantly shorter period of time than the EMA-negative children. Moreover, 9 of 12 developed EMA-negativity on further follow-up. This delayed seroconversion might partially explain positive EMA in subjects showing mucosal healing [10]. However, adherence to the dietary treatment was not evaluated in this study, nor was small intestinal mucosa examined for IgA deposits [20]. Therefore, we cannot rule out that serum EMA-positivity is more sensitive than gross histological damage in detecting minor dietary transgressions. Positive EMA without histological evidence of CD were also detected in 3 children from group A2. EMA normalized on a normal gluten-containing diet in 2 of the EMA-positive group A2 children, while the third became seronegative on a self-prescribed GFD and stayed seronegative even after an 11-month gluten challenge. However, since EMA is a very strong predictor of a subsequent CD diagnosis [21], these children need further follow-up including endoscopy.
All the children participating in the study underwent EGD as criterion standard for the evaluation of the diagnostic reliability of the antibody tests. In the light of the possibility to diagnose CD without biopsies [6], an important finding of our study was that experiences with the diagnostic biopsy did not deter two thirds of the children eligible for group B from undergoing re-biopsy. We found mucosal healing in the majority of the re-biopsied children; nearly 90% of them had Marsh < 2. All group B children with mucosal injury were ≥ 9 years old and belonged to the subgroup that had been on a GFD < 2 years. In children, the long-term mucosal healing rate was reported to be 100%, and histological recovery might even occur more than 2 years after primary diagnosis [22]. In contrast, it has been shown that children diagnosed after the age of 4 tend to follow the GFD less strictly and therefore are expected to have a higher prevalence of mucosal injury [23]. Surprisingly, a very low frequency (1.8%) of isolated increase of intraepithelial lymphocyte count (Marsh 1) was found among group B children. This finding could be related to the somewhat high cut-off for an increased intraepithelial lymphocyte count used in the study (≥ 30 lymphocytes/100 epithelial cells).
This study has several strengths. In the first place, it was conducted in children. Most of the studies investigating the correlation between follow-up histology and serology were conducted in adults [2,3,8,9,11,24,25,26,27]. In one pediatric study on the value of DGP antibodies in the follow-up of CD, only 13 children had both re-endoscopy and follow-up serology [28]. Another study with children is of limited value in that only seroconverted children were included [29]. Moreover, in contrast to the current study, most of the studies reporting on correlations between follow-up histology and serology are retrospective [2,26,28,29]. Another strength of the current study was that the cut-off point of every antibody test was adjusted for the study population according to ROC analysis (Table 1). This adjustment is especially important in children since manufacturers' cut-offs are usually based on data from adults. A further advantage was the use of AUCs. AUCs, as effective single indicators of the agreement between a test and a reference standard, facilitate the comparison of the overall performance between different non-invasive diagnostic tests [15,16]. Additionally, in order to increase the reliability of the reference standard, two pathologists, before reaching a consensus diagnosis, independently reviewed all biopsy samples, which had been taken according to guidelines [7,30,31]. Furthermore, we were able to systematically examine all recommended serologic tests. In contrast to recent recommendations for the use of DGP-IgA in monitoring treated CD [5], we found that DGP-IgA suffered a significant performance loss when used for follow-up in children. We therefore consider DGP-IgA less reliable for follow-up purposes compared with EMA, TG2-IgA and DGP-IgG.
For the comparisons regarding the performance loss of the serologic tests other than DGP-IgA, mean empirical power was rather low, a finding we consider an important limitation of this study. Therefore, future research is needed to further clarify the correlation of EMA, TG2-IgA and DGP-IgG with follow-up histology or identify other reliable non-invasive follow-up tests in CD.

Conclusion
In conclusion, this study demonstrates the limited value of serologic testing in the follow-up of pediatric CD with respect to the mucosal status. Only the normalization of EMA indicates mucosal healing with acceptable accuracy. As long as more reliable tools for non-invasive follow-up are lacking, EMA should be used as the follow-up tool of first choice. However, more reliable non-invasive follow-up tools would be of great clinical and research utility with respect to the individualization of GFD strictness and to upcoming studies evaluating the efficacy of new CD treatment modalities, respectively.