Our findings demonstrate that a predictive model based on sociodemographic variables (age, gender and education level), pertinent medical history (previous colonoscopy, smoking, use of NSAIDs or aspirin, previous polyps, and IBS) and symptoms (rectal bleeding, rectal mucus, anaemia and fatigue) performs well at predicting colorectal cancer and reasonably well at predicting advanced adenomas.
It is of interest to identify which variables are most strongly predictive of cancer and adenoma prevalence. Age is the dominant risk factor for cancer and for adenomas of all sizes. Having had a colonoscopy within the previous 10 years confers protection against cancers and advanced adenomas. Adding medical history and symptoms (rectal bleeding, mucus, anaemia and fatigue) to the model yields a further modest improvement in cancer prediction, but a negligible improvement in adenoma prediction.
Our finding that family history is not associated with an increase in prevalence of colorectal cancer may seem surprising. It is likely that this reflects the clinical setting of our cohort: patients with a family history of colorectal cancer have already been screened and are included among those who have previously undergone colonoscopy. Other studies have also noted that, in people with symptoms, a positive family history does not increase cancer prevalence [19, 20]. Indeed, the referral guidelines in place in Britain, which aim to identify patients with higher-risk symptoms, do not include assessment of family history.
The quality of our study rests on several factors: its size, with over 8,000 patients; the prospective nature of the data collection; the completeness of information on all patients; the requirement for complete examination of the entire colon; and pathological examination of all lesions encountered. Information about symptoms was also consistently collected using a validated questionnaire. A further strength of our study is that it represents a heterogeneous population that reflects real-world clinical practice, which allows exploration of which of the factors making up that heterogeneity predict the probability of cancer or adenomas. A potential limitation of our study is that there was no standard reporting for colonoscopy. However, the reports from which data were extracted were those used in clinical practice, and based on a caecal intubation rate of 98% we believe the procedures were of high quality.
Our model does well at predicting cancer prevalence, achieving an area under the ROC curve of 0.83, which is similar to that found in other studies, for example Selvachandran (0.86). Our models help to identify individuals who have a high probability of cancer amongst people referred to gastroenterologists and colorectal surgeons, thus helping to indicate the urgency of colonoscopy. At the low-risk end of the spectrum, prediction can be simplified to age: the probability of cancer or adenoma is very low in people under 40 and is reduced still further if they have had a colonoscopy in the previous 10 years. For them, the potential risks of colonoscopy may outweigh the potential benefits. Consideration can be given to discussing the benefits and harms of the procedure with patients to reach the best benefit-harm trade-off for each person, as has been done in other areas of health care.
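For readers less familiar with the measure, the area under the ROC curve quoted above can be read as the probability that a randomly chosen patient with cancer is assigned a higher predicted risk than a randomly chosen patient without cancer. The following is a minimal sketch of that calculation, not the analysis code used in this study; the function name `auc_from_scores` and the predicted probabilities are hypothetical and serve only as an illustration.

```python
def auc_from_scores(case_scores, non_case_scores):
    """Estimate the AUC as the proportion of (case, non-case) pairs in which
    the case has the higher predicted risk; tied pairs count as one half."""
    if not case_scores or not non_case_scores:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in case_scores:
        for n in non_case_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(non_case_scores))


# Hypothetical predicted probabilities of cancer (illustration only,
# not data from the study).
cases = [0.30, 0.12, 0.08]
non_cases = [0.10, 0.05, 0.02, 0.01]

print(f"AUC = {auc_from_scores(cases, non_cases):.2f}")
```

A value of 0.5 corresponds to a model that ranks patients no better than chance, and a value of 1.0 to a model that ranks every patient with cancer above every patient without.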
In addition, risk information from the model can be useful at a policy level. Decision making about resource utilisation at a population level should take risk assessment into account to ensure that colonoscopy is prioritised for groups at higher risk of disease. At a general practice level, resources may, for example, be directed to ensure that those in higher-risk groups are referred for colonoscopy, while at a specialist level resources should be targeted at those who have never had a colonoscopy rather than at inappropriately frequent colonoscopy. At a population level, symptoms as warnings for cancer or adenomas should be de-emphasised. Our model is not strictly applicable to patients presenting to a general practice. However, it is not feasible to conduct a study in patients presenting to a general practitioner and to obtain colonoscopies on all of them. Indeed, the major symptom prediction studies have been done in referred populations [22, 24, 25]. Our cancer prevalence (1.9%) is considerably lower than in other similar studies, which report cancer prevalences of between 4% and 12% [22, 24–26], suggesting that our population is less strongly filtered and therefore more representative of general practice.
In addition, given that in general practice the probability of cancer may be even lower than that predicted in the referred population, it seems reasonable to use the information from the model to inform decisions in general practice, in particular to identify who has a very low probability of cancer or advanced adenoma. The model will be the most reliable basis for predicting cancer or advanced adenoma for most patient characteristics. This can be supplemented with selected information, for example the effect of family history, from sources where it has been reliably estimated.
Another approach to identifying patients at higher risk of cancer or adenomas on colonoscopy is to use FOBTs [2–5], as suggested by Rozen. A recent review of FOBTs provided odds ratios for FOBT detection of cancer and advanced adenoma, which can be converted to areas under the ROC curve (AUC) and compared with our model. The AUC values were 0.93 for cancer, 0.88 for advanced adenomas and 0.69 for all adenomas. Other AUC values obtained for adenomas in a clinically presenting population were 0.72 for advanced adenomas and 0.64 for all adenomas. Overall, these are similar to or slightly higher than those found in our study. These data might suggest that FOBT would be as effective as, or more effective than, our model as a triage tool for prioritising colonoscopy. However, FOBT requires additional cost and effort, whereas our model requires only easily and immediately obtainable sociodemographic and medical history information. Models that incorporate both this information and FOBT results should be developed and evaluated, as this may boost prediction still further.
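The text does not state which conversion from odds ratios to AUC was used. One standard option, shown here purely as an assumption, treats the test as having a constant diagnostic odds ratio (DOR) along a symmetric summary ROC curve, for which AUC = DOR / (DOR - 1)^2 x [(DOR - 1) - ln(DOR)]. The sketch below implements that relationship; the function name `auc_from_dor` and the example odds ratios are hypothetical and are not values from the review.

```python
import math


def auc_from_dor(dor):
    """AUC of a symmetric summary ROC curve with a constant diagnostic
    odds ratio: AUC = DOR / (DOR - 1)^2 * ((DOR - 1) - ln(DOR)).
    Assumption: this is one common conversion, not necessarily the one
    used in the study."""
    if dor <= 0:
        raise ValueError("diagnostic odds ratio must be positive")
    # The formula has a removable singularity at DOR = 1, where the curve
    # is the chance diagonal and the AUC is 0.5.
    if math.isclose(dor, 1.0, abs_tol=1e-6):
        return 0.5
    return dor / (dor - 1.0) ** 2 * ((dor - 1.0) - math.log(dor))


# Hypothetical odds ratios (illustration only).
for dor in (1.0, 5.0, 10.0, 50.0):
    print(f"DOR = {dor:5.1f}  ->  AUC = {auc_from_dor(dor):.3f}")
```

Under this assumption the AUC rises monotonically from 0.5 at an odds ratio of 1 towards 1.0 as the odds ratio grows, which is what allows a single pooled odds ratio to be placed on the same scale as a model's AUC.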