Symptom or faecal immunochemical test based referral criteria for colorectal cancer detection in symptomatic patients: a diagnostic tests study

Background Symptom-based referral criteria for colorectal cancer (CRC) detection are the cornerstone of the strategy to improve prognosis in CRC. In 2017, the National Institute for Health and Care Excellence (NICE) updated its referral criteria (2017 NG12). Recently, several studies have evaluated the faecal haemoglobin (f-Hb) concentration in this setting. The aim of this study is to evaluate the diagnostic accuracy of the 2017 NG12 referral criteria and to compare them with the CG27 referral criteria, the f-Hb concentration and two f-Hb based prediction models: the COLONPREDICT and FAST Scores.
Methods This is a post hoc diagnostic test study performed within the COLONPREDICT study database (1572 patients, CRC prevalence 13.6%). We assessed symptoms and the 2017 NG12 and CG27 referral criteria, and determined the f-Hb before performing a colonoscopy. We compared the discriminatory ability using the area under the curve (AUC), and the sensitivity and specificity at pre-established thresholds using McNemar's test.
Results The discriminatory ability of the 2017 NG12 referral criteria (AUC 0.53; 95% confidence interval [CI] 0.49–0.57) was inferior to that of the CG27 version (AUC 0.59; 95% CI 0.55–0.63; p = 0.01), the f-Hb concentration (AUC 0.86; 95% CI 0.84–0.89; p < 0.001), the COLONPREDICT Score (AUC 0.92; 95% CI 0.91–0.94; p < 0.001) and the FAST Score (AUC 0.87; 95% CI 0.85–0.89; p < 0.001). The proportions of patients meeting each criterion were as follows: 2017 NG12 and CG27 = 94.1% and 52.2%; f-Hb ≥20 and ≥10 μg/g faeces = 38.6% and 44.3%; COLONPREDICT Score ≥5.6 and ≥3.5 = 29.4% and 63.2%; and FAST Score ≥4.50 and ≥2.12 = 37.1% and 87.0%. The 2017 NG12 criteria were more sensitive (100%) than the CG27 criteria (68.2%), the f-Hb (≥20 μg/g) (91.2%), the f-Hb (≥10 μg/g) (93.5%), the COLONPREDICT Score (≥5.6) (90.1%) and the FAST Score (≥4.50) (89.8%) (p ≤ 0.001), and equivalent to the COLONPREDICT Score (≥3.5) (99.5%) and the FAST Score (≥2.12) (100.0%) (p = 1). However, their specificity (6.8%) was significantly lower than that of any of the other evaluated criteria (50.3%, 69.6%, 63.4%, 78.7%, 45.8%, 71.3%, 13.9%; p < 0.001).
Conclusion Referral criteria based on f-Hb measurement, either as a single test or within prediction models, are more accurate than symptom-based referral criteria for CRC detection in symptomatic patients.


Background
Colorectal cancer (CRC) is the third most common cancer worldwide and the second leading cause of cancer-related death [1]. Two strategies are widely used to detect the disease at an early stage and, thus, improve the prognosis: CRC screening and early diagnosis strategies in symptomatic patients [2,3]. Although screening programmes have been progressively implemented, most CRCs are still detected when symptoms become apparent [4]. In addition, although gastrointestinal symptoms are extremely common in the population, the probability of CRC detection associated with any one symptom is low [5-7]. Thus, risk classification scores based on symptoms have been developed to determine which patients are most at risk of CRC, with the aim of reducing the interval between the initial consultation and diagnostic colonoscopy [8,9].
In this regard, one of the best known sets of referral criteria for CRC detection is the National Institute for Health and Care Excellence (NICE) referral guideline for suspected cancer (CG27) [3]. This referral system has been extensively evaluated, showing a low specificity and a variable sensitivity for CRC detection [6,10-12]. In order to improve these results, the updated version of 2015 (NG12) introduced two significant changes. First, it recommended referral for symptoms with a positive predictive value of 3% instead of the previous 5%. Second, for the first time, testing for occult blood in faeces was recommended in several symptom scenarios with a positive predictive value below 3% [13]. However, the guideline did not recommend any particular method to determine occult blood in faeces.
Faecal immunochemical tests for haemoglobin (FIT) allow for quantitation of faecal haemoglobin concentration (f-Hb). FIT has proven to be the best currently available non-invasive test for CRC screening in asymptomatic individuals and an excellent test for rule-in of CRC and rule-out of significant colonic lesions (SCL) in patients presenting with lower gastrointestinal symptoms [14-21]. On the basis of the available evidence [22], the NICE diagnostic guidance (DG30) recommends the use of FIT with a threshold of 10 μg Hb/g faeces to guide referral for colorectal cancer in primary care [23]. However, the effect of the NG12 is not well understood and only one study has evaluated the diagnostic accuracy of this guidance [24]. In July 2017, NG12 was amended and testing for occult blood in faeces was recommended in patients without rectal bleeding but with unexplained symptoms that do not meet the criteria for a suspected cancer pathway [13].
We have recently developed and validated two f-Hb based prediction models for CRC detection: COLONPREDICT and FAST. The database of the COLONPREDICT Score derivation cohort [25,26] is an excellent platform to compare the most widely used symptom-based referral criteria with the f-Hb concentration based strategies. This database includes an extensive collection of information regarding symptoms as well as several blood and faecal determinations. This information allowed us to perform a post hoc analysis in order to evaluate the diagnostic accuracy of the 2017 NG12 criteria and to compare them with the CG27 criteria, the f-Hb concentration and two CRC prediction models based on the f-Hb concentration: the COLONPREDICT and FAST Scores [25,26].

Study design
The current study is a post hoc analysis performed within the COLONPREDICT study: a multicentre, cross-sectional, blinded study of diagnostic tests. The study aimed to create and validate a CRC prediction index based on available biomarkers and on clinical and demographic data. We performed this post hoc analysis in the 1572 patients included in the derivation cohort previously described [25].

Brief description of the COLONPREDICT study
The details of the study have been described extensively elsewhere and are summarized here [25,26]. We used the Colonoscopy Research into Symptom Prediction questionnaire (CRISP) to record symptoms and demographic data [27]. Based on this questionnaire, we determined whether patients met the CG27 referral criteria for CRC detection [3]. The f-Hb concentration was assessed using the automated OC-SENSOR MICRO analyser (Eiken Chemical Co., Ltd., Tokyo, Japan). The faeces for the f-Hb determination were collected using the OC-Sensor probe. Moreover, we determined blood haemoglobin (b-Hb) and mean corpuscular volume with a Beckman Coulter Autoanalyzer (Beckman Coulter Inc., CA, USA). Colonoscopy was performed blinded to the questionnaire and analytical results.

NG12 referral criteria and the f-Hb based prediction models calculation
On the basis of the information obtained from the CRISP questionnaire and the analyses performed (f-Hb, b-Hb and mean corpuscular volume), we determined which of the 2017 NG12 criteria for CRC suspicion were met. Two researchers (JMH and JC) independently decided the equivalence between each NICE criterion and the information collected. Finally, they reached a consensus version. NG12 referral criteria are shown in Table 1 [13]. We considered the faecal occult blood test positive if the f-Hb concentration was ≥10 μg Hb/g faeces.
The COLONPREDICT Score is a CRC prediction model based on a multivariable logistic regression analysis [25]. It is based on eleven variables and the mathematical formula is as follows: 0.789 × rectal bleeding + 0.536 × change in bowel habit + 2.694 × rectal mass − 1.283 × benign anorectal lesions + 2.831 × f-Hb ≥20 μg Hb/g faeces + 1.561 × b-Hb (< 10 g/dL) + 0.588 × b-Hb (10-12 g/dL) + 1.511 × CEA ≥3 ng/mL + 0.040 × age (years) + 0.813 × sex (male) − 2.073 × previous colonoscopy (last 10 years) − 0.849 × continuous treatment with aspirin. It shows a high diagnostic accuracy for CRC detection. Two thresholds have been defined with 90% and 99% sensitivity for CRC: 5.6 and 3.5.
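The formula above can be written as a small function. The following is a minimal sketch in Python, not the authors' implementation: the function name, the argument names, the True/False encoding of the binary variables and the handling of the b-Hb band boundaries are assumptions made for illustration. Only the coefficients and the two thresholds are taken from the text.

```python
# Thresholds reported in the text for the COLONPREDICT Score.
THRESHOLD_90_SENS = 5.6  # reported with 90% sensitivity for CRC
THRESHOLD_99_SENS = 3.5  # reported with 99% sensitivity for CRC


def colonpredict_score(
    *,
    age_years: float,
    male: bool,
    rectal_bleeding: bool,
    change_in_bowel_habit: bool,
    rectal_mass: bool,
    benign_anorectal_lesions: bool,
    fhb_ug_per_g: float,
    bhb_g_per_dl: float,
    cea_ng_per_ml: float,
    previous_colonoscopy_10y: bool,
    continuous_aspirin: bool,
) -> float:
    """Return the linear COLONPREDICT score for one patient (illustrative)."""
    score = 0.040 * age_years
    score += 0.813 if male else 0.0
    score += 0.789 if rectal_bleeding else 0.0
    score += 0.536 if change_in_bowel_habit else 0.0
    score += 2.694 if rectal_mass else 0.0
    score -= 1.283 if benign_anorectal_lesions else 0.0
    score += 2.831 if fhb_ug_per_g >= 20 else 0.0
    score += 1.511 if cea_ng_per_ml >= 3 else 0.0
    score -= 2.073 if previous_colonoscopy_10y else 0.0
    score -= 0.849 if continuous_aspirin else 0.0
    # Blood haemoglobin bands. Treating exactly 10 and 12 g/dL as part of the
    # middle band is an assumption; the text only says "10-12 g/dL".
    if bhb_g_per_dl < 10:
        score += 1.561
    elif bhb_g_per_dl <= 12:
        score += 0.588
    return score


# Hypothetical patient, invented for illustration only.
example_score = colonpredict_score(
    age_years=70, male=True, rectal_bleeding=True, change_in_bowel_habit=False,
    rectal_mass=False, benign_anorectal_lesions=False, fhb_ug_per_g=25,
    bhb_g_per_dl=13, cea_ng_per_ml=1, previous_colonoscopy_10y=False,
    continuous_aspirin=False,
)
```

For this hypothetical patient the score is 0.040 × 70 + 0.813 + 0.789 + 2.831 = 7.233, which lies above the 5.6 threshold.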
The FAST Score is a CRC prediction model based on a multivariable logistic regression analysis [26]. It is based on three variables and the mathematical formula is as follows: 0 × f-Hb (0) μg Hb/g faeces + 0.684 × f-Hb (1, 19) μg Hb/g faeces + 2.824 × f-Hb [20, 200) μg Hb/g faeces + 4.184 × f-Hb ≥200 μg Hb/g faeces + 0.031 × age (years) + 0.479 × sex (male). Two thresholds have been defined with 90% and 99% sensitivity for CRC: 4.50 and 2.12.
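The FAST Score formula can be sketched in the same way. This is an illustrative Python version under stated assumptions: the function names are hypothetical, and any detectable f-Hb below 20 μg Hb/g faeces is assigned the 0.684 coefficient, which is one reading of the "(1, 19)" category in the text. The coefficients and thresholds are those given above.

```python
FAST_THRESHOLD_90_SENS = 4.50  # reported with 90% sensitivity for CRC
FAST_THRESHOLD_99_SENS = 2.12  # reported with 99% sensitivity for CRC


def fast_score(age_years: float, male: bool, fhb_ug_per_g: float) -> float:
    """Return the FAST score from age, sex and f-Hb (illustrative)."""
    if fhb_ug_per_g >= 200:
        fhb_term = 4.184
    elif fhb_ug_per_g >= 20:
        fhb_term = 2.824
    elif fhb_ug_per_g > 0:
        # Assumption: every detectable value below 20 falls in this category.
        fhb_term = 0.684
    else:
        fhb_term = 0.0
    return fhb_term + 0.031 * age_years + (0.479 if male else 0.0)


def fast_band(score: float) -> str:
    """Place a score relative to the two published thresholds."""
    if score >= FAST_THRESHOLD_90_SENS:
        return "at or above 4.50"
    if score >= FAST_THRESHOLD_99_SENS:
        return "2.12 to below 4.50"
    return "below 2.12"


# Hypothetical patients, invented for illustration only.
high_example = fast_score(age_years=70, male=True, fhb_ug_per_g=150)
low_example = fast_score(age_years=40, male=False, fhb_ug_per_g=0)
```

The first hypothetical patient scores 2.824 + 0.031 × 70 + 0.479 = 5.473, and the second scores 0.031 × 40 = 1.24.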

Statistical analysis
First, we performed a descriptive analysis of the population included in the study. To determine differences in diagnostic accuracy between the NG12 referral criteria and the other diagnostic criteria (the CG27 referral criteria, the f-Hb concentration, and the COLONPREDICT and FAST Scores), we performed two analyses. In the first, we determined the number of individuals with a positive result and the sensitivity and specificity for CRC and SCL detection. Using McNemar's test, we determined whether the differences in sensitivity and specificity between the NG12 referral criteria and the other diagnostic criteria (the CG27 referral criteria, the COLONPREDICT and FAST Scores at the pre-established thresholds, and the f-Hb at the 10 and 20 μg Hb/g faeces thresholds) were statistically significant. Finally, we also calculated the positive and negative predictive values (PPV, NPV), the positive and negative likelihood ratios (LR) and the diagnostic odds ratio (OR) of all the diagnostic tests. The diagnostic OR is defined as the odds of positivity in subjects with disease relative to the odds in subjects without disease.
In a second step, we evaluated the discriminatory ability using receiver-operating characteristic (ROC) curves for CRC and SCL diagnosis, and we calculated the area under the curve (AUC). We determined whether there were statistically significant differences using the chi-square test of homogeneity of areas. Additionally, we determined whether the discriminatory ability of each of the diagnostic criteria differed according to the healthcare level referring the patient to colonoscopy. Referral was classified as primary healthcare when a general practitioner requested the colonoscopy and as secondary healthcare when a specialist (gastroenterologist, surgeon, etc.) requested the exploration.
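The accuracy measures listed above all follow from a 2×2 table, and McNemar's test uses only the discordant pairs. The sketch below is illustrative: the function names are hypothetical, and it uses the exact binomial form of McNemar's test, whereas the text does not state which variant was run. The example counts are reconstructed from figures reported in this article (1572 patients, 214 CRC, 94.1% meeting NG12 with 100% sensitivity, CG27 sensitivity 68.2%) and are not taken from the study tables.

```python
from math import comb


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when it is not evaluable."""
    return numerator / denominator if denominator else None


def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table of counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        # LR+ = sensitivity / (1 - specificity)
        "lr_positive": _ratio(tp * (fp + tn), fp * (tp + fn)),
        # LR- = (1 - sensitivity) / specificity
        "lr_negative": _ratio(fn * (fp + tn), tn * (tp + fn)),
        # Diagnostic OR = (tp / fn) / (fp / tn)
        "diagnostic_or": _ratio(tp * tn, fp * fn),
    }


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Reconstructed NG12 table: 1479 positives, all 214 CRC among them.
ng12 = diagnostic_accuracy(tp=214, fp=1479 - 214, fn=0, tn=1572 - 1479)

# Sensitivity comparison NG12 vs CG27 among the 214 CRC patients:
# 146 of 214 (68.2%) met CG27, so 68 met NG12 only and none met CG27 only.
p_sensitivity = mcnemar_exact_p(b=68, c=0)
```

With these reconstructed counts the specificity is 93/1358, which rounds to the reported 6.8%. The diagnostic OR is not evaluable because there are no false negatives.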
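For a criterion that is simply met or not met, the ROC curve has a single operating point and the AUC reduces to the mean of sensitivity and specificity. The sketch below illustrates this with a rank-based (Mann-Whitney) AUC estimate. The function names are hypothetical, the counts are the same reconstruction from reported percentages used for the NG12 criteria (214 CRC all positive; 93 of 1358 non-CRC negative), and the chi-square homogeneity test of areas is not implemented here.

```python
def auc_from_scores(diseased_scores, healthy_scores):
    """Mann-Whitney AUC: P(diseased > healthy) + 0.5 * P(tie)."""
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))


def auc_binary_criterion(sensitivity, specificity):
    """AUC of a yes/no criterion: one ROC point joined to (0,0) and (1,1)."""
    return (sensitivity + specificity) / 2


# Reconstructed NG12 data: 1 = criteria met, 0 = criteria not met.
crc_patients = [1] * 214
non_crc_patients = [1] * 1265 + [0] * 93

auc_ng12 = auc_from_scores(crc_patients, non_crc_patients)
auc_ng12_shortcut = auc_binary_criterion(1.0, 93 / 1358)
```

Both calculations give about 0.534, in line with the AUC of 0.53 reported for the 2017 NG12 criteria.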
We report differences with 95% confidence intervals (CI) and their significance. We consider a p-value < 0.05 statistically significant. We carried out the analyses using the IBM SPSS Statistics for Windows version 21.0 (IBM Corp, Armonk, USA) and EPIDAT 3.1 (Dirección Xeral de Saúde Pública, Santiago de Compostela, Spain).

Description of the cohort
Among the 1572 patients included in the derivation cohort of the COLONPREDICT study, a CRC was detected in 214 (13.6%) patients and an SCL in 463 (29.5%) patients: advanced adenomas in 251 (16.0%), a polyp ≥10 mm with non-adenoma histology in 6 (0.4%), colitis in 36 (2.3%) and other SCLs in 6 (0.4%) patients. Direct referrals from primary care to endoscopic evaluation accounted for 22.9% of the patients included. As shown in Table 1, 1479 of the 1572 patients (94.1%) met at least one of the 2017 NG12 referral criteria. In contrast, 52.2% of the patients met any of the CG27 referral criteria, 38.7% had an f-Hb concentration ≥ 20 μg Hb/g faeces, 44.4% had an f-Hb concentration ≥ 10 μg Hb/g faeces, 30.9% had a COLONPREDICT Score ≥ 5.6, 60.5% had a COLONPREDICT Score ≥ 3.5, 37.1% had a FAST Score ≥ 4.50 and 88.0% had a FAST Score ≥ 2.12.

Analysis of the diagnostic accuracy
The sensitivity of the 2017 NG12 referral criteria for CRC detection reaches 100% at the expense of a low specificity (6.8%). As shown in Table 2, the sensitivity of the 2017 NG12 referral criteria is superior to that of the CG27 referral criteria, the f-Hb (≥20 μg Hb/g and ≥ 10 μg Hb/g faeces), the COLONPREDICT Score at a 5.6 threshold (p < 0.001) and the FAST Score at a 4.50 threshold. In contrast, the sensitivity is similar to that of the COLONPREDICT Score at a 3.5 threshold and the FAST Score at a 2.12 threshold (p = 1), and the specificity is inferior to that of any of the other criteria (p < 0.001). The rest of the diagnostic accuracy analysis is displayed in Table 2.
On the other hand, the 2017 NG12 referral criteria detect 98.9% of SCL. As for CRC detection, the specificity is extremely low (7.9%). As shown in Table 3, the sensitivity of the 2017 NG12 referral criteria is similar to that of the FAST Score at a 2.12 threshold (p = 1) and superior to that of the rest of the evaluated criteria. In contrast, the specificity of the 2017 NG12 criteria is inferior to that of any of the additional criteria evaluated (p < 0.001). We show the PPV, NPV, positive and negative LR and the diagnostic OR in Table 2.

Analysis of the discriminatory ability
The analysis of the discriminatory ability for CRC detection of the NICE referral criteria, the f-Hb concentration, and the COLONPREDICT and FAST Scores is shown in Fig. 1. The discriminatory ability of the 2017 NG12 referral criteria is inferior to that of any of the other evaluated criteria in the chi-square homogeneity test comparison of AUC. Additionally, we found no differences in the discriminatory ability of each diagnostic test according to the healthcare level referring the patient to colonoscopy: 2017 NG12 referral criteria (primary = 0.53, 95% CI 0.46-0.60; secondary = 0.53, 95% CI 0.48-0.58; p = 0.
Fig. 2 shows the discriminatory ability for SCL detection of the diagnostic tests evaluated. The discriminatory ability of the 2017 NG12 referral criteria is similar to that of the CG27 referral criteria and inferior to that of the rest of the evaluated criteria in the chi-square homogeneity test comparison of AUC.

Summary
We have evaluated the diagnostic accuracy of the 2017 NG12 referral criteria for suspected CRC. These updated criteria are more sensitive than the CG27 version. However, they produce a marked increase in the number of patients meeting them and a reduction in specificity. Furthermore, we had the opportunity to compare them with the f-Hb concentration and two f-Hb based prediction models. Both diagnostic tools have a higher discriminatory ability than the NICE referral criteria.

Strengths and limitations
We have used a large cohort of consecutive patients referred to colonoscopy due to gastrointestinal symptoms. Patients were evaluated homogeneously using a symptom questionnaire, and several laboratory tests, including a FIT, were performed before colonoscopy. This questionnaire allowed us to gather all the details regarding type of symptoms, duration and evolution. Thus, we could evaluate the 2017 NG12 referral criteria for suspected CRC for the first time. Furthermore, we have been able to evaluate the use of FIT with the 10 μg Hb/g faeces threshold, as recommended in the DG30 [23]. On the other hand, our study has several limitations that must be taken into consideration. We cannot exclude a risk of selection bias, since the symptomatic patients included in the study had previously been selected for colonoscopy evaluation. However, we have included all consecutive patients referred to colonoscopy from both primary and secondary healthcare.

Comparison with existing literature
The update of the NICE referral criteria is based on reducing the PPV threshold required to refer patients from primary care using a suspected cancer pathway referral. The guideline development group agreed to use a threshold value of 3% PPV to underpin their recommendations [13]. Thus, patients with symptoms (e.g. ≥ 40 years with unexplained weight loss) with a PPV > 3% for CRC should be referred for further testing. Our results demonstrate that this strategy increases the sensitivity for CRC detection in comparison with the previous criteria. However, these criteria introduce a risk of over-investigation. In fact, what our analysis confirms is that the discriminatory ability of any group of symptoms for CRC detection is suboptimal [6]. An additional innovation of the 2017 NG12 referral criteria is the inclusion of the faecal occult blood test in the evaluation of symptomatic patients. However, its use is limited to patients without rectal bleeding and with unexplained symptoms that do not meet the criteria for a suspected cancer pathway referral [13]. Our results confirm the data previously published: the f-Hb concentration measured with a FIT shows a higher discriminatory ability for CRC detection than the NICE referral criteria [14-19].
Table footnotes: 1 Values are expressed as percentages with their 95% confidence intervals. 2 Significance of the sensitivity differences when compared with the NG12 referral criteria in McNemar's test. Differences with p < 0.05 are considered statistically significant. 3 Significance of the specificity differences when compared with the NG12 referral criteria in McNemar's test. Differences with p < 0.05 are considered statistically significant. 4 Values are expressed as absolute numbers with their 95% confidence intervals. PV, predictive value; LR, likelihood ratio; OR, odds ratio; f-Hb, faecal haemoglobin; NE, non evaluable.
Therefore, the strategy for evaluating the risk of CRC detection in symptomatic patients should probably be based on the f-Hb concentration irrespective of symptoms. Indeed, in the COLONPREDICT Score, patients with an f-Hb concentration ≥ 20 μg Hb/g faeces have 17.0 times more risk of CRC detection. In contrast, patients with rectal bleeding or a change in bowel habit have 2.2 and 1.7 times more risk of CRC detection, respectively [25].
Recently, an article has evaluated the diagnostic accuracy of the 2017 NG12 referral criteria for CRC and SCL detection and compared these criteria with the f-Hb concentration [24]. This study used the databases of three diagnostic test studies evaluating FIT in symptomatic patients [15,17,19] and shows that the discriminatory ability of the 2017 NG12 referral criteria is inferior to that of the f-Hb concentration. This cohort differs significantly from ours: the prevalence of symptoms related to CRC diagnosis (rectal bleeding, changes in bowel habit, iron-deficiency anaemia or rectal mass) is lower, as is the prevalence of CRC or SCL. These differences are probably responsible for the differences in the number of patients that meet the 2017 NG12 referral criteria and for the inferior discriminatory ability documented in our study. However, our results are consistent
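The multiples quoted above appear to correspond to the exponentiated coefficients of the COLONPREDICT formula given in the methods, since in a logistic regression model exp(coefficient) is the odds ratio associated with a variable. The short Python check below is an illustration under that assumption; the dictionary name and labels are hypothetical, and the coefficients are those listed in the formula.

```python
import math

# Coefficients taken from the COLONPREDICT formula in the methods.
coefficients = {
    "f-Hb >= 20 ug Hb/g faeces": 2.831,
    "rectal bleeding": 0.789,
    "change in bowel habit": 0.536,
}

# In logistic regression the odds ratio for a variable is exp(coefficient).
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.1f}")
```

Rounded to one decimal place, the three values are 17.0, 2.2 and 1.7, matching the figures in the text. Strictly, these are odds ratios rather than relative risks.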

Implications for research and/or practice
One of the main lessons learned in recent years from CRC screening programmes is that the lack of symptoms or the presence of non-specific symptoms does not exclude a CRC in the adult population. Up to 20% of incident CRCs are detected in asymptomatic patients within a CRC screening programme based on a guaiac faecal occult blood test [4]. Therefore, strategies for CRC detection in symptomatic patients should determine which patients require urgent referral, which require a normal referral and, finally, in which situations no additional evaluation is required. The NICE referral criteria only determine the scenarios where an urgent referral is required. Owing to the higher discriminatory ability of FIT for CRC, either the f-Hb concentration alone or an f-Hb based prediction model can establish these three risk groups with different diagnostic strategies. As we have recently proposed, at least 90% of CRC should be detected in a high-risk group requiring a fast-track referral to colonoscopy. In contrast, in a low-risk group, where no additional explorations are required, the probability of a missed CRC should be well below 1%, so that the risk of CRC is balanced against the risk of colonoscopy complications, mainly perforation [28].

Conclusions
To conclude, the discriminatory ability of any symptom-based criteria is limited when compared with an f-Hb concentration based strategy. An urgent evaluation of the diagnostic accuracy of FIT in symptomatic patients attending primary care is required.
Figure footnotes: 1 Significance of the discriminatory ability differences when compared with the NG12 referral criteria in the chi-square homogeneity test. Differences with p < 0.05 are considered statistically significant. ROC, receiver-operating characteristic; NICE, National Institute for Health and Care Excellence.

Funding
This study was supported by a grant from Instituto de Salud Carlos III, Madrid, Spain (PI11/00094). Instituto de Salud Carlos III had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. JC has received an intensification grant through the European Commission funded "BIOCAPS" project (FP-7-REGPOT 2012-2013-1, Grant agreement no. FP7-316265).

Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.

Authors' contributions
The authors' contributions were as follows: JH and JC designed the analysis, PV, MS and LB collected the data, JH and JC performed the analysis and wrote the manuscript, and all the authors decided to submit the article for publication. All authors had full access to all of the data (including statistical reports and tables) in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis. JC had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. All authors read and approved the final manuscript.

Ethics approval and consent to participate
The Galician Clinical Research Ethics Committee approved this study (Code 2011/038) under a resolution dated 2nd March 2011. We accessed patients' clinical histories for study purposes in accordance with the research protocols laid down by clinical documentation departments. Patients provided written informed consent.

Consent for publication
Not applicable.

Competing interests
JC and MS had financial support from Instituto de Salud Carlos III for the submitted work but they had no financial relationships with any organisations that might have an interest in the submitted work in the previous five years, and no other relationships or activities that could appear to have influenced the submitted work. The remaining authors had no support from any organisation for the submitted work, no financial relationships with any organisations that might have an interest in the submitted work in the previous five years, and no other relationships or activities that could appear to have influenced the submitted work.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.