Systematic review and meta-analysis: real-world data rates of deep remission with anti-TNFα in inflammatory bowel disease

Background Deep remission (DR) is a treatment target in IBD associated with reduced hospitalization and improved outcomes. Randomized controlled trial (RCT) data demonstrate the efficacy of anti-TNFα agents in achieving DR; however, real-world data (RWD) can provide information complementary to RCTs, specifically regarding treatment duration. In this systematic review with meta-analysis, we use RWD to determine rates of DR in IBD treated with anti-TNFα. Methods We completed a systematic search of MEDLINE and EMBASE on July 8, 2019 with review of major gastrointestinal conference abstracts from 2012 to 2019. Studies utilizing RWD (data not from phase I-III RCTs) of adult IBD patients treated with anti-TNFα agents were included. DR was defined by clinical and endoscopic remission at minimum. DR was assessed at 8 weeks, 6 months, 1 year, and 2 years. Risk of bias was assessed with the Newcastle Ottawa Scale. Results 29,033 publications were identified. Fifteen publications, nine manuscripts and six conference abstracts, were included encompassing 1212 patients (769 Crohn’s disease-CD, 443 ulcerative colitis-UC), and analyzed using Comprehensive Meta-Analysis. Rate of DR was 36.4% (95% CI 12.6–69.4%) at 8 weeks, 39.1% (95% CI 10.4–78%) at 6 months, 44.4% (95% CI 34.6–54.6%) at 1 year, and 36% (95% CI 18.7–58%) at 2 years. DR in CD at 1 year was 48.6% (95% CI 32.8–64.7%) and in UC was 43.6% (95% CI 32.8–55.1%). Conclusions The rate of DR was highest after 1 year of therapy, in nearly 45% of IBD patients treated with anti-TNFα. Similar rates were achieved between patients with UC and CD. The findings highlight the efficacy of anti-TNFα in a real-world setting. Future studies using RWD can determine efficacy of newer IBD therapeutics in routine clinical practice. Supplementary Information The online version contains supplementary material available at 10.1186/s12876-021-01883-6.


Background
Deep remission (DR) is a proposed treatment target in inflammatory bowel disease (IBD) that is increasingly being used as a benchmark in efficacy studies and randomized controlled trials (RCT) [1]. The most common definition for DR is concurrent clinical remission (CR) and endoscopic remission (ER) or mucosal healing (MH) [2]. DR is associated with longer periods of durable remission, improvement in quality of life, reduced hospitalization, and a decreased rate of surgical complications [3][4][5][6]. Therefore, there is great interest in determining the rate of achieving DR with various treatment strategies.
Recent meta-analyses have examined the rate of achieving DR with anti-TNFα agents in randomized controlled trials (RCTs) among ulcerative colitis (UC) patients [7], but none have evaluated DR in a real-world environment or in patients with Crohn's disease (CD). Differences between a drug's efficacy during a clinical trial and its effectiveness in everyday clinical practice have been described as the "efficacy-effectiveness gap" [8]. RCTs, though the ideal study design to demonstrate effectiveness and safety of a medication, are carried out in a selective and controlled manner, leading to high internal validity but leaving uncertainty about their generalizability to an ethnically diverse and heterogeneous population [9]. This possible lack of generalizability has also been demonstrated within the IBD population [10], and therefore creates a role for real-world data (RWD) to fill [11].
In this systematic review with meta-analysis, we aim to provide complementary information by using RWD to determine rates of deep remission in IBD with anti-TNFα agents in clinical practice. Additionally, we perform sub-analyses to provide the rates of DR with anti-TNFα separately in patients with CD and UC. Furthermore, we explored the treatment duration at which DR is most likely to be seen, and the rate of DR in patients not previously treated with anti-TNFα.

Methods
The current study, including abstract and manuscript content, was completed in accordance with the PRISMA statement and checklist (Additional file 1: Tables S1, S2) [12].

Data sources and searches
We completed a systematic search of MEDLINE and EMBASE up to July 8, 2019 (see Additional file 2: Text/ Appendix 1 for search strategy), using the following search terms: ("inflammatory bowel disease" OR "IBD" OR "crohn*" OR "ulcerative colitis" OR "UC" or "colitis") AND ("mucosal healing" OR "deep remission" OR "complete remission" OR "full remission" OR "endoscopic remission"). This search was conducted without restrictions on year or language. We manually searched through abstracts presented at major national and international gastrointestinal conferences from 2012 to 2019 (Digestive Disease Week, United European Gastroenterology Week, European Crohn's and Colitis Organization, the American College of Gastroenterology Annual Scientific Meeting, Advances in Inflammatory Bowel Diseases, and the Crohn's and Colitis Congress). The reference sections of manuscripts included were also reviewed for additional studies to be evaluated for inclusion. Two authors (OA and AG) independently conducted this review. A third author (BZ) reviewed studies not agreed upon for inclusion. A cursory updated search of MEDLINE and EMBASE was performed by one author (BZ) from July 8, 2019 to April 25, 2021 (see Additional file 2: Text/Appendix 1). This systematic review was not pre-registered and a prior review protocol was not prepared.

Selection criteria
We included studies that presented real-world data (RWD)/real-world evidence (RWE), defined as all health data except those collected in a conventional phase I, II, or III RCT setting, including non-randomized controlled group studies. We included studies examining adults (18 years or older) with inflammatory bowel disease treated with anti-TNFα agents until the achievement of "deep remission" (DR), defined as at least a combination of clinical remission and mucosal healing/endoscopic remission [2]. Search results were carefully reviewed to identify remission targets consistent with common definitions of deep remission, given that many publications did not explicitly state the term "deep remission" as an end point.
Case reports, case series, randomized trials, and non-English studies were excluded. Studies that did not define DR or did not identify components of DR to include at least clinical and endoscopic remission were excluded. Studies with a pediatric population were excluded to maintain a focus on adult patients.
The primary outcome was real-world rates of DR with anti-TNFα agents for the treatment of IBD at intervals of 8 weeks, 6 months, 1 year and 2 years after starting anti-TNFα. Secondary outcomes included rates of DR among UC and CD at 1 year after starting anti-TNFα, the rates of DR in patients naïve to, or not previously treated with, anti-TNFα, and the rates of DR with infliximab.

Data extraction and risk of bias assessment
Two authors (OA and AG) independently extracted the following data onto a data collection form: first author's name, last author's name, publication year, country, single or multiple institutions, study design, type of IBD, type of anti-TNFα used, concomitant or maintenance therapy, definition of deep remission, definition of mucosal healing/endoscopic remission, definition of clinical remission, and the number of participants who achieved deep remission at pre-determined time points (8 weeks, 6 months, 1 year, and 2 years).
All studies were deemed cohort studies based on the intervention of interest (treatment with anti-TNFα agents). Risk of bias was assessed independently by two authors (OA and AG) using the Newcastle-Ottawa Scale [13]. Any inconsistencies between the authors' scores were discussed and resolved. Out of nine possible stars, studies were considered at high risk of bias if they received 0-3 stars, intermediate risk if 4-6 stars, and low risk if 7-9 stars. The quality of evidence was determined based on the GRADE (Grading of Recommendations Assessment, Development, and Evaluation) system [14]. Quality of evidence ranges from "high" to "moderate" to "low" and "very low" based on the effect future research is expected to have and the certainty of the findings.
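The star thresholds described above can be written as a small lookup. This is only an illustrative sketch in Python; the function name `nos_risk_of_bias` is hypothetical and is not part of the Newcastle-Ottawa Scale or of the authors' workflow.

```python
def nos_risk_of_bias(stars: int) -> str:
    """Map a Newcastle-Ottawa Scale star count (0-9) to the risk-of-bias
    category used in this review: 0-3 high, 4-6 intermediate, 7-9 low."""
    if not 0 <= stars <= 9:
        raise ValueError("NOS star count must be between 0 and 9")
    if stars <= 3:
        return "high"
    if stars <= 6:
        return "intermediate"
    return "low"


if __name__ == "__main__":
    # Every star count on the scale, with its category.
    for s in range(10):
        print(s, nos_risk_of_bias(s))
```

Under these thresholds, a study scoring five or six stars falls in the intermediate category.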

Data synthesis
To account for anticipated inherent heterogeneity in the designs of the included studies (for example, retrospective versus prospective, definitions of deep remission, anti-TNFα agents used, patient populations, etc.), pooled event rates and corresponding 95% confidence intervals (95% CI) were calculated using the random-effects model per DerSimonian and Laird and the inverse variance method for dichotomous outcomes [15]. Between-study heterogeneity was assessed with the chi-square test, with significance defined as p < 0.1, and the I² statistic at > 50% [16]. Publication bias was assessed with funnel plots and Egger's test (Additional file 4: Figure S2). All analyses were performed using Comprehensive Meta-Analysis (version 3; Biostat, Englewood, NJ, USA, 2013).
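For readers unfamiliar with the pooling step, the following is a minimal Python sketch of DerSimonian and Laird random-effects pooling of event rates with inverse-variance weights, together with Cochran's Q and I². It is not the Comprehensive Meta-Analysis implementation. The logit transformation, the 0.5 continuity correction, and the example counts are assumptions made for illustration only.

```python
import math


def pool_random_effects(events, totals):
    """DerSimonian-Laird pooling of proportions on the logit scale (assumed)."""
    y, v = [], []
    for e, n in zip(events, totals):
        # Assumption: apply a 0.5 continuity correction only when a study
        # has zero events or zero non-events, so the logit is defined.
        if e == 0 or e == n:
            e, n = e + 0.5, n + 1.0
        y.append(math.log(e / (n - e)))       # logit event rate
        v.append(1.0 / e + 1.0 / (n - e))     # approximate variance of the logit

    k = len(y)
    w = [1.0 / vi for vi in v]                # inverse-variance (fixed-effect) weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran's Q and the I^2 statistic (percentage of variation due to heterogeneity).
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird method-of-moments estimate of between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1.0 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "rate": expit(y_re),
        "ci_low": expit(y_re - 1.96 * se_re),
        "ci_high": expit(y_re + 1.96 * se_re),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical study counts, not taken from the included studies.
    result = pool_random_effects(events=[12, 30, 18], totals=[40, 55, 60])
    print(result)
```

Because each study's weight is 1 / (within-study variance + tau²), a large between-study variance pulls the weights toward equality, which is why random-effects intervals are wider than fixed-effect intervals when I² is high.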

Search results
The search strategy identified 29,033 publications. After a review of titles, abstracts, and exclusion of duplicates, 756 articles underwent thorough review (Fig. 1). Application of the exclusion criteria yielded fifteen studies (9 manuscripts, 6 conference abstracts), encompassing a total of 1212 patients (Table 1) [17][18][19][20][21][22][23][24][25][26][27][28][29][30][31]. A diagnosis of CD was captured for 769 patients, and a diagnosis of UC was provided for 443 patients. A cursory updated search of MEDLINE and EMBASE using the same strategy as above from July 8, 2019 to April 25, 2021 yielded 1722 new publications (596 MEDLINE, 1126 EMBASE). 93 publications underwent thorough review; none included data meeting inclusion criteria. Most excluded studies were not eligible for inclusion because they did not meet the minimum criteria for deep remission, length of follow up, or sample size.

Quality of studies and risk of bias
The Newcastle Ottawa Scale (NOS) was used to evaluate and assign a point value to each study for quality and risk of bias (Additional file 1: Table S4) [13]. Studies received a point for "adequacy of follow up of cohorts" if their reported outcomes accounted for attrition. All included studies received five or six points on the NOS, suggesting that they carried an intermediate risk of bias. Two studies, De Vos 2013 and Zhang 2016, included patients already in clinical remission; additional sensitivity analyses were therefore run with these studies excluded (Additional file 3: Figure S1) [17,22].
Funnel plots (Additional file 4: Figure S2) and Egger's test for both 8 weeks and 6 months did not detect publication bias (8 weeks: Egger's t-value = 0.056, p = 0.480; 6 months: Egger's t-value = 2.002, p = 0.091). Heterogeneity in these analyses, reflected as I² values, was 94% and 93.7%, respectively, suggesting considerable heterogeneity of included studies (Additional file 1: Table S3). Nine studies reported the rate of DR at 1-year follow-up [17, 19, 20, 22, 24-26, 28, 31], with 44.4% (95% CI 34.6-54.6%) of patients (285/616) achieving DR. Five studies reported rates of DR at 2 years, with 36% (95% CI 18.7-58%) of patients (182/490) in DR (Fig. 2) [18,20,22,23,27]. The only two studies with five points on the NOS were in the 2-year DR analysis, introducing higher risk of bias and uncertainty in this analysis compared to the 8-week, 6-month, and 1-year analyses. For 1 year, De Vos 2013 and Zhang 2016 only included patients already in clinical remission [17,22]. Sensitivity analysis with Zhang 2016 and De Vos 2013 removed demonstrated a 42.3% rate of deep remission at 1 year, and sensitivity analysis with Zhang 2016 removed from the 2-year analysis yielded a deep remission rate of 27.8% (Additional file 3: Figure S1).
Fig. 2 Rates of deep remission in IBD at 8 weeks, 6 months, 1 year, and 2 years
Funnel plots (Additional file 4: Figure S2) and Egger's test at 1 year and 2 years did not demonstrate publication bias (1 year: Egger's t-value = 0.703, p = 0.252; 2 years: Egger's t-value = 0.673, p = 0.275). Heterogeneity within these analyses, reflected as I² values, was 80.6% and 92.6%, respectively, suggesting considerable heterogeneity of included studies (Additional file 1: Table S3). The GRADE quality of evidence for this analysis is 'low'.
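The Egger's t-values reported above come from a regression of each study's standardized effect on its precision, where an intercept far from zero suggests funnel-plot asymmetry. The following Python sketch shows that calculation under stated assumptions: the effects are on a scale such as the logit, the inputs are hypothetical, and the code is not the Comprehensive Meta-Analysis routine. A p-value would additionally require a t-distribution with k - 2 degrees of freedom, which is omitted here.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger's regression: ordinary least squares of (effect / SE) on (1 / SE).

    Returns the intercept, its standard error, and the t-value
    (intercept / standard error) with k - 2 degrees of freedom.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")

    x = [1.0 / se for se in std_errors]                    # precision
    z = [y / se for y, se in zip(effects, std_errors)]     # standardized effect

    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")

    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and the standard error of the intercept.
    sse = sum((zi - (intercept + slope * xi)) ** 2 for xi, zi in zip(x, z))
    s2 = sse / (k - 2)
    se_intercept = math.sqrt(s2 * (1.0 / k + x_bar ** 2 / sxx))
    if se_intercept == 0:
        raise ValueError("perfect fit; t-value is undefined")

    return {
        "intercept": intercept,
        "se": se_intercept,
        "t": intercept / se_intercept,
        "df": k - 2,
    }


if __name__ == "__main__":
    # Hypothetical effects and standard errors, not data from this review.
    print(egger_intercept([1.0, 0.5, 0.5, 0.4], [1.0, 0.5, 0.25, 0.2]))
```

With few studies per analysis, the degrees of freedom are small, so the test has limited power to detect asymmetry.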

Deep remission in patients treated with infliximab
The majority of studies primarily included patients treated with infliximab (IFX); therefore, additional analyses excluding studies which did not utilize infliximab [...] 1 year (Fig. 5). Sensitivity analysis at 2 years with the non-IFX study Echarri 2015 removed, and excluding non-IFX cases from Molander 2013, resulted in a deep remission rate of 36.8% (Fig. 5). Sensitivity analysis with Echarri 2015 removed found a 51.5% rate of deep remission in CD at 1 year in patients receiving infliximab (Fig. 5). Analysis of deep remission in UC at 1 year demonstrated a rate of 39.9% with non-infliximab studies removed (Fig. 5). The I² value for DR with IFX at 1 and 2 years was 82.4% and 94.8%, respectively. The heterogeneity value for DR with IFX only in CD and UC at 1 year was 82.8% and 72.1%, respectively. The GRADE quality of evidence was determined to be 'low' for this analysis.

Discussion
The ongoing development of novel targeted therapeutics has improved our ability to achieve clinical and endoscopic remission. While the efficacy of anti-TNFα agents in achieving clinical remission has been established, evidence suggests that deep remission (DR) provides more durable remission [3][4][5]. Newer guidelines provided by the American College of Gastroenterology (ACG) and the International Organization for the Study of Inflammatory Bowel Disease (IOIBD) recommend mucosal healing with clinical remission as preferred treatment targets in UC and CD [32][33][34]. With the introduction of newer therapies such as ustekinumab, vedolizumab, and tofacitinib, in addition to anti-TNFα agents, patients and gastroenterologists have more personalized treatment options suitable for long-term use. Therapeutics should be continued despite achieving deep remission, as withdrawal of therapy after achieving DR is associated with a high rate of relapse [2]. Therefore, while efficacy of an agent is important, other factors including side-effect profile, cost, clinician experience, patient preference and comorbidities, and availability should be considered [35].
Anti-TNFα agents, the oldest and most well-studied biologic class in the treatment of IBD, carry multiple advantages over alternative biologics. In addition to their superior clinical efficacy, long-term outcomes and side-effect profiles are well described, and systemic effect enables the concurrent treatment of rheumatologic diseases. Furthermore, infliximab is available as biosimilars and adalimumab can be administered by injection [36,37]. With regard to efficacy, a 2018 meta-analysis of RCTs estimated the efficacy of infliximab and adalimumab in achieving remission in CD [38]. Furthermore, a more recent 2020 review and network meta-analysis of RCTs estimated outcomes consistent with deep remission in UC using infliximab, adalimumab, and ustekinumab [7].
Real-world data (RWD), though acquired via cohort studies rather than randomized controlled trials, offer complementary information, providing generalizable clinical effectiveness estimates that can be compared to results reported by RCTs. Although RWD are considered to provide lower-quality evidence, their utility has recently been demonstrated by the VICTORY consortium, established to evaluate the efficacy of vedolizumab in CD and UC patients based on RWD gathered retrospectively from multiple institutions [39,40]. GEMINI 1 reported a 41.8% to 44.8% rate of remission (Mayo score ≤ 2, no subscore > 1) at 52 weeks, similar to the 41% rate of endoscopic remission (Mayo subscore = 0), clinical remission rate of 51%, and deep remission rate of 30% at 1-year follow-up reported by the VICTORY Consortium [40,41]. While significant differences in study design and patient enrollment exist between GEMINI and VICTORY, precluding direct comparison, the findings highlight the relevance of RWD for clinical decision-making and for directing future therapeutic research.
RWD has even been incorporated into recent guidelines published by the British Society of Gastroenterology and the United Arab Emirates consensus paper on diagnoses and management of IBD. These guidelines describe similar rates of clinical remission in UC treated with golimumab in both RWD sources and RCTs. There were further examples of similar outcomes derived from both data sources with regards to the efficacy of vedolizumab in UC, and separately the efficacy of adalimumab in UC [42,43].
Our meta-analysis of fifteen real-world studies of anti-TNFα use in CD and UC demonstrates that RWD DR rates supplement rates reported in existing phase III trial data and provide data in the setting of a potential efficacy-effectiveness gap. Though no clinically significant difference can be derived from the data, we observed a modestly higher rate of DR in UC in real-world studies. We report a DR rate of 48.6% at 1 year in CD using RWD, providing similar results compared to a previous meta-analysis of RCTs [38]. These results corroborate the findings of prior RCTs with regard to the efficacy of anti-TNFα agents. An additional observation was that the rate of DR after 1 year of treatment was higher than at earlier time points; following this peak, DR rates diminished by 2 years, suggesting that the greatest therapeutic benefit from anti-TNFα may be realized within the first 12 months. In sub-analysis, the rate of DR in anti-TNFα naïve patients at 1 year was 47.2% (95% CI 34.5-60.4%), similar to the DR rate at 1 year in all patients. Finally, we observed a small increase in the rate of deep remission when only including studies that evaluated response to infliximab.
This meta-analysis with systematic review is the first to comprehensively report DR with anti-TNFα agents based on RWD, using a strictly predefined definition of DR as clinical remission combined with endoscopic remission. We thoroughly reviewed the literature by incorporating results from PubMed and EMBASE in addition to conference abstracts and review of references from publications. We additionally report remission rates at predefined time points. The inclusion of only RWD provides clinical effectiveness data in clinical practice settings, complementary and comparable to results reported by RCTs [11]. We anticipate the findings will help guide clinical decision making and elucidate the generalizability of these treatments to diverse and heterogeneous populations.
There are several limitations. Constrained by available studies, we could not directly compare differences in DR rates between CD and UC. The limited number of available studies also precluded analysis of CD and UC at the 8-week, 6-month, and 2-year time points. Most studies utilized infliximab; therefore, we were unable to provide a head-to-head comparison of biologic agents. We attempted to account for heterogeneity of biologics with additional analyses including only studies conducted with infliximab. Furthermore, the paucity of available publications precluded the inclusion of newer therapeutic options. Adverse events were poorly reported in the included studies and could not be addressed in this analysis. We limited our search to English language publications, potentially introducing language bias into our results. Additionally, we recognize that there are varying definitions and sources of RWD, and therefore elected to use definitions and sources similar to those used in recent meta-analyses of RWD [39,40]. Heterogeneity attributable to study design, use of cohort studies rather than RCTs, varying severity of disease in included patients, variations in concomitant medication usage, and differences in defining DR and endoscopic remission was expected given the utilization of RWD. The retrospective nature of some included studies also poses risk of bias, in particular with retrospective calculation of CDAI in patients with CD. Given the non-randomized nature of the included studies, there is significant risk of selection bias and potential confounders within individual studies. Our assessment is that the certainty of our findings is consistent with a low GRADE certainty rating, given uncertainty about how biases may have influenced our results. This is due to the observational nature of the included studies, which generated real-world data.