Hormone replacement therapy is associated with gastro-oesophageal reflux disease: a retrospective cohort study

Background: Oestrogen and progestogen have the potential to influence gastro-intestinal motility; both are key components of hormone replacement therapy (HRT). Results of observational studies in women taking HRT rely on self-reporting of gastro-oesophageal symptoms and the aetiology of gastro-oesophageal reflux disease (GORD) remains unclear. This study investigated the association between HRT and GORD in menopausal women using validated general practice records.
Methods: 51,182 menopausal women were identified using the UK General Practice Research Database between 1995 and 2004. Of these, 8,831 were matched with and without hormone use. Odds ratios (ORs) were calculated for GORD and proton-pump inhibitor (PPI) use in hormone and non-hormone users, adjusting for age, co-morbidities, and co-pharmacy.
Results: In unadjusted analysis, all forms of hormone use (oestrogen-only, tibolone, combined HRT and progestogen) were statistically significantly associated with GORD. In adjusted models, this association remained statistically significant for oestrogen-only treatment (OR 1.49; 95% CI 1.18–1.89). Unadjusted analysis showed a statistically significant association between PPI use and oestrogen-only and combined HRT treatment. When adjusted for covariates, the association with oestrogen-only treatment remained significant (OR 1.34; 95% CI 1.03–1.74). Findings from the adjusted model also demonstrated greater use of PPI by progestogen users (OR 1.50; 95% CI 1.01–2.22).
Conclusions: This first large cohort study of the association between GORD and HRT found a statistically significant association between oestrogen-only hormone and both GORD and PPI use. This should be further investigated using prospective follow-up to validate the strength of association and describe its clinical significance.


Background
Gastro-oesophageal reflux disease (GORD) is a common relapsing disorder largely caused by repeated exposure in the lower oesophagus to the retrograde flow of gastric contents [1]. Epidemiological studies have shown that reflux is experienced by 3-20% of the population at least weekly [2][3][4]. It is one of the most prevalent conditions seen in primary care [5,6] and is costly in terms of pharmacological therapy and investigations [5]. Symptoms of GORD, including heartburn, regurgitation and nausea, are associated with a reduced quality of life [7,8].
The aetiology of GORD remains unclear, although research has identified the main risk factors as heredity [9], increased body mass index (BMI) [10] and tobacco smoking [11]. Some studies indicate a stronger association between GORD and obesity in females than males, suggesting a link with female sex hormones [12]. This hypothesis is supported by in-vitro and clinical data which suggest an indirect action of sex hormones on GI motility [13][14][15][16], evidence of reduced upper GI motility during the menstrual cycle [17] and reduced oesophageal sphincter pressure during pregnancy [18]. Recent studies suggest that hormone replacement therapy (HRT) may be associated with GORD in post-menopausal women [19]. A recent Swedish cohort study of female twins suggests that oestrogen-only HRT is an independent risk factor for GORD symptoms [20] (OR 1.32; 95% CI 1.18-1.47). A large US randomised controlled trial reported a similar association among post-menopausal women with hysterectomy [21], findings mirrored in a US cohort study of post-menopausal registered nurses [22]. Although a Norwegian case-control study indicated that obesity increases the influence of oestrogen use upon GORD (OR 2.3; 95% CI 1.1-4.8), it is difficult to draw firm conclusions given the small numbers [12]. These findings contrast with in-vitro evidence suggesting that progestogen use may play a greater role than oestrogen in upper GI motility [23]. Furthermore, these European and US studies relied on self-reporting of symptoms and did not take into account time from onset of menopausal symptoms. In light of these results, the goal of this study was to establish the extent of the association between upper GI symptoms and the use of forms of HRT, and the relative importance of oestrogen compared with progestogen in the UK population.

Methods
A retrospective cohort study was designed and conducted, accessing records from the General Practice Research Database (GPRD). The GPRD provides anonymised access to electronic primary care medical records and prescription data for a representative 6% of the UK population [24], and has been extensively validated as a reliable source of GORD information [25]. Diagnoses were identified using Read codes and prescriptions using Prescription Pricing Authority (PPA) codes; all codes for GORD symptoms, outcomes, medications and covariates are available on request. Data were accessed under the Medical Research Council licence for academic groups (protocol reference 07109).

Disease definitions
Menopausal symptoms were defined using a wide range of diagnostic and referral codes indicating menopause or menopausal symptoms, such as "menopause" and "syndrome menopausal". Gastro-oesophageal outcomes were categorised in two ways: firstly, using diagnostic codes specific to GORD (ICD-10 category K21), for example "gastro-oesophageal reflux"; and secondly, including Barrett's Oesophagus and more general codes indicative of symptoms of GORD, such as "reflux oesophagitis" and "waterbrash". In recognition of the reported lack of standardisation of GORD recording in general practice [25], we also investigated the association between proton-pump inhibitor (PPI) and hormone use.

Cohort selection
The original GPRD cohort contained complete records for 102,602 women with a recorded diagnosis of menopause, aged 40-70 at menopause diagnosis (T0). Diagnosis occurred within the cohort window of 01/01/1995 to 31/12/2004 (largely avoiding the subsequent reduction in HRT following reports of increased cancer and cardiovascular risk). Patients with less than 24 months of continuous GP registration were excluded, as were patients with malignancy or pregnancy recorded within the study window. The study window for identifying the cohort was defined as two years either side of menopause diagnosis (T0 ± 24 months). Patients who commenced HRT more than 2 years before or after the first record of menopause were excluded, leaving a cohort of N = 51,182 women with a record of menopause within the study window. A sub-cohort of hormone users was matched, according to calendar year, age at menopause, socio-economic status of GP practice, and date closest to menopause, with one user for every two non-users.
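The inclusion and exclusion rules above can be sketched as a single eligibility check. This is a minimal illustration only: the `Patient` record, its field names and the 730-day approximation of 24 months are assumptions made for the example, not the study's actual data structures or code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

WINDOW_DAYS = 730  # assumption: 24 months approximated as 730 days
COHORT_START = date(1995, 1, 1)
COHORT_END = date(2004, 12, 31)


@dataclass
class Patient:
    """Hypothetical per-patient summary; field names are illustrative."""
    age_at_t0: int
    t0: date  # date of first menopause record
    months_registered: int  # continuous GP registration
    malignancy_or_pregnancy_in_window: bool
    hrt_start: Optional[date] = None  # None for non-users


def eligible(p: Patient) -> bool:
    """Apply the cohort criteria described in the text."""
    if not 40 <= p.age_at_t0 <= 70:
        return False
    if not COHORT_START <= p.t0 <= COHORT_END:
        return False
    if p.months_registered < 24:
        return False
    if p.malignancy_or_pregnancy_in_window:
        return False
    # HRT must have started within T0 +/- 24 months, if it started at all
    if p.hrt_start is not None and abs((p.hrt_start - p.t0).days) > WINDOW_DAYS:
        return False
    return True
```

Non-users (no `hrt_start`) pass the final check automatically, which mirrors the text: only the timing of HRT initiation, not its absence, leads to exclusion.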
In addition to GORD, records were evaluated for hysterectomy, osteoarthritis, non-steroidal anti-inflammatory drugs (NSAIDs), bisphosphonates and calcium supplement use at menopause diagnosis. Limited by variable reporting within the GPRD, smoking and alcohol status were crudely categorised as "non-user", "user" or "ex-user" at menopause. Mean BMI was calculated using all BMI readings (kg/m²) within the study window and categorised as underweight (<18.5), normal (18.5 to 24.9), overweight (25 to 29.9), obese (30 to 39.9) or morbidly obese (≥40). GP practices were allocated a quintile score for socio-economic status based on the Index of Multiple Deprivation (IMD) [26].
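As an illustration of the BMI categorisation described above, the following sketch maps a patient's mean BMI to the five bands. The function names are hypothetical, and the use of half-open intervals (so that a mean such as 24.95 falls in the "normal" band) is an assumption, since the text gives the bands to one decimal place only.

```python
from statistics import mean
from typing import Optional, Sequence


def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to the bands given in the text.

    Half-open intervals are assumed at the band boundaries.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    if bmi < 40:
        return "obese"
    return "morbidly obese"


def mean_bmi_category(readings: Sequence[float]) -> Optional[str]:
    """Categorise the mean of all readings in the study window.

    Returns None when a patient has no BMI reading.
    """
    if not readings:
        return None
    return bmi_category(mean(readings))
```

Returning `None` for patients without readings keeps missing BMI distinct from any real category, which matters given the high proportion of missing BMI data reported later in the paper.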

Drug exposure definitions
Hormone use incorporated the following proprietary and generic classifications: combined hormone (conjugated oestrogen with progestogen; oestradiol with progestogen); oestrogen-only (oestradiol only; oestradiol, oestriol and oestrone; oestriol only; oestropipate only; conjugated oestrogens only); tibolone; and progestogen (other than for contraception; dydrogesterone; medroxyprogesterone; norethisterone; progesterone). For analytic purposes, the term 'All hormone (AH)' incorporates all of the above categories; sensitivity analysis reports on the differences between categories.
Bisphosphonates incorporated bisphosphonates and other drugs affecting bone metabolism including alendronic acid. Calcium supplements comprised calcium gluconate, calcium lactate, and calcium carbonate preparations. Non-steroidal anti-inflammatory drugs comprised all NSAIDs. Exposure to these drugs was defined as at least one prescription record in the time window of 2 years prior to or at menopause diagnosis. Proton pump inhibitors (PPIs) comprised esomeprazole, lansoprazole, omeprazole, pantoprazole, and rabeprazole sodium. Use of PPI (categorised as yes/no) was identified both before and after hormone exposure.
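The exposure rule for bisphosphonates, calcium supplements and NSAIDs (at least one prescription in the 2 years prior to or at menopause diagnosis) can be sketched as follows. The function name and the approximation of 2 years as 730 days are assumptions for illustration; the study's own coding of the window is not described in that detail.

```python
from datetime import date
from typing import Iterable


def is_exposed(prescription_dates: Iterable[date], t0: date,
               window_days: int = 730) -> bool:
    """True if any prescription falls in [t0 - window_days, t0].

    t0 is the date of menopause diagnosis; window_days = 730 is an
    assumed approximation of the 2-year look-back window.
    """
    return any(0 <= (t0 - d).days <= window_days for d in prescription_dates)
```

For example, with `t0 = date(2000, 1, 1)`, a prescription dated 1 June 1999 counts as exposure, whereas one dated after `t0` or more than 730 days before it does not.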

Analysis
The relative risk of GORD or PPI use was estimated as an odds ratio (OR) for hormone use compared to non-use. Models were subsequently adjusted for demographic, comorbidity and drug exposure variables. Data management and manipulation were performed using Stata/IC 10. Binomial logistic regression analyses used SPSS v15; the response variable was the presence or absence of GORD or PPI use, and independent variables were categorical, ordinal or continuous as appropriate. The analysis fitted odds ratios for each of the independent variables with 95% confidence intervals (CIs). The Wald statistic was used to indicate the statistical significance of each fitted logit coefficient (different from zero) corresponding to each independent variable.
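For a single binary exposure, the unadjusted odds ratio and its Wald-type 95% confidence interval can be computed directly from a 2×2 table; with one binary predictor, logistic regression gives the same odds ratio. The sketch below is illustrative only: the study itself used SPSS, the function name is hypothetical, and any counts passed to it in an example are invented rather than taken from the study.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All cells must be non-zero. z = 1.96 gives a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

The interval is symmetric on the log scale, so the product of its limits equals the square of the odds ratio; this is a quick consistency check when reading reported ORs and CIs.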
Simple regression examined the unadjusted strength of association between each different form of HRT and GORD. Cases and controls were then matched according to calendar year, age at menopause, socio-economic status of GP practice, and date closest to menopause, and simple matched analysis was conducted. The matched dataset was then used to conduct multiple regression analysis taking into account smoking (never/ever/ex), alcohol (never/ever/ex), BMI (subdivided as <18.5, 18.5-24.9, 25-29.9, 30-39.9) and current drug use (NSAID use, calcium supplements and bisphosphonate, duration subdivided into <30 days and ≥30 days). After extensive exploration, five models for PPI use and for GORD response were chosen for further detailed analysis on the basis that they allowed for full exploration of the relative effects of risk factors. Model A featured simple regression (unadjusted estimates). Models B to E featured multiple regression (adjusted estimates). Model B incorporated NSAID use (never/ever), calcium use (never/ever), bisphosphonate use (never/ever), BMI, alcohol, and smoking; Model C incorporated NSAID use (never/ever), calcium use (never/ever) and bisphosphonate use (never/ever); Model D incorporated Model B but exchanged never/ever drug use for drug duration (<30 days and ≥30 days), retaining BMI, alcohol and smoking; Model E incorporated Model D minus BMI, alcohol and smoking. All subgroup analyses were prospectively planned, informed by previous research and clinical expertise. Smoking, alcohol, and BMI had high proportions of missing or unreliable (out of range) data; the final models reported did not include these variables.
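The covariate sets of Models A to E, as described in the text, can be written out as a small lookup table. This is a sketch for orientation only: the variable names are hypothetical labels, not the study's actual variable names, and the hormone exposure term common to every model is omitted.

```python
# Hypothetical covariate labels; the hormone exposure term is in every model.
LIFESTYLE = ("bmi", "alcohol", "smoking")
DRUGS = ("nsaid", "calcium", "bisphosphonate")


def ever(drugs):
    """Never/ever coding of each drug."""
    return tuple(f"{d}_ever" for d in drugs)


def duration(drugs):
    """Duration coding of each drug (<30 days vs >=30 days)."""
    return tuple(f"{d}_duration" for d in drugs)


MODELS = {
    "A": (),                              # simple regression, unadjusted
    "B": ever(DRUGS) + LIFESTYLE,         # never/ever drugs + lifestyle
    "C": ever(DRUGS),                     # never/ever drugs only
    "D": duration(DRUGS) + LIFESTYLE,     # drug duration + lifestyle
    "E": duration(DRUGS),                 # drug duration only
}
```

Laid out this way, B and C differ only by the lifestyle variables, as do D and E, while B versus D and C versus E differ only in how drug use is coded.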

Results
Among 51,182 women with medical records taken from 414 general practices, 22,101 women (43%) were hormone users and 29,081 (57%) were non-users. Overall, users and non-users had clinically similar baseline demographic characteristics, although non-users of hormones were slightly less likely to use NSAIDs, and a higher proportion of oestrogen-only users had undergone hysterectomy (Table 1). A total of 23,210 women (45%) had a record of hysterectomy while 21,835 women (43%) had at least one record of pregnancy ever. The mean age of commencing hormone replacement was 49.7 years and the mean exposure duration was 5.4 years. The most common hormone replacement was combined oestrogen and progestogen; 68.3% of women were exposed to more than one type of hormone and 1% of users stopped taking hormone after one prescription.
A total of 42,724 women (84%) had ever been recorded as having GORD symptoms, with a total of 18.5% of all consultations coded as 'Dyspepsia' and 3.3% of consultations coded as 'Gastro-oesophageal Reflux Disease'. The majority of these occurrences were post-menopause. Prior to the study window (pre-menopause), 11,881 women overall (23%) consulted their GP for GORD symptoms; GORD reporting rates were comparable between groups prior to hormone exposure (25% of oestrogen-only and tibolone users reported GORD prior to hormone exposure, compared to 22% of combined hormone users and 23% of progestogen users and non-hormone users). Table 2 shows the simple analysis (unadjusted for other patient characteristics) of each form of hormone and the strength of its association with GORD, comparing unmatched with matched data. All forms of hormone were statistically significantly associated with reported GORD symptoms (Table 2) in both the unmatched and the matched groups.
Adjusted analyses (shown in Table 3) explored the relative strength of association between different forms of hormone and GORD, taking into account certain patient characteristics. This showed a statistically significant independent association between oestrogen-only use and GORD (OR 1.49, 95% CI 1.18-1.89, p = 0.001) when taking into account the relative effect of other risk factors (NSAID, bisphosphonate and calcium use). Other unadjusted associations between hormone therapies and GORD did not persist when models were adjusted for these risk factors. Previously known independent risk factors for GORD were confirmed; the models provide evidence of the independent but varying influence of NSAID use and calcium on GORD.
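A reported odds ratio, confidence interval and Wald p-value can be cross-checked against one another, because the 95% CI is symmetric on the log scale. The sketch below back-calculates the standard error and a two-sided p-value from the published OR and CI; the function name is hypothetical and a normal approximation is assumed.

```python
import math
from typing import Tuple


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> Tuple[float, float, float]:
    """Recover SE(log OR), the Wald z statistic and a two-sided p-value
    from an odds ratio and its 95% confidence limits."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variate
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Oestrogen-only use and GORD, values as reported in the text
se, z, p = wald_from_ci(1.49, 1.18, 1.89)
```

Applied to the oestrogen-only estimate (OR 1.49, 95% CI 1.18-1.89), this gives a p-value close to the reported p = 0.001, so the three reported quantities are mutually consistent.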
A total of 35,639 women (34.7%) were ever-users of PPIs, with a mean of 64 PPI prescriptions. Pre-menopause, PPI use was comparable across hormone and non-hormone groups, at approximately 6% of women.
Simple regression, unmatched and matched analyses (Table 2) showed a consistent statistically significant association between PPI use and oestrogen-only use only (OR 1.46, p = 0.001 and OR 1.42, p = 0.007 respectively). This association remained present in adjusted analysis (OR 1.34, 95% CI 1.03-1.74, p = 0.027, Table 4). In unadjusted models, the association between GORD or PPI use and progestogen was not statistically significant; however, the association with PPI use was significant (OR 1.50, 95% CI 1.01-2.22, p = 0.044) in the adjusted analysis. The previously known association between NSAID use and PPI use was confirmed, but small numbers of users of calcium supplements and bisphosphonates gave non-significant findings. Preliminary unadjusted models including BMI, alcohol and smoking had large numbers of missing data. These variables poorly fitted any model of GORD or PPI use in unadjusted analyses. When adjusted models of GORD were analysed with BMI, alcohol and smoking as independent variables, similarly none fitted at a statistically significant level. In unadjusted analysis, there was a small but unimportant association between BMI and PPI use (OR 1.04, 95% CI 1.01-1.056, p = 0.001). Alcohol use (OR 4.17, 95% CI 1.20-14.52, p = 0.025) was an additional independent factor in a model of PPI use among combined hormone users; this may be a chance finding in a model involving small numbers.

Discussion
Within the range of models tested, there was a consistent association between GORD or PPI use and oestrogen-only hormone replacement. Our evaluation represents the first direct controlled comparison of hormone types for a clinically meaningful GORD diagnosis in a UK population, allowing for multiple hypothesis testing. The large sample size enabled extensive model refinement and subgroup analyses in three ways. Firstly, simple analyses allowed us to identify the potential association between different forms of HRT and GORD regardless of other patient characteristics. This was the first step in understanding the differences between forms of HRT and showed that oestrogen-only HRT presented a possible association with GORD. Secondly, adjusted analysis allowed us to confirm that, even when accounting for known risk factors (NSAID use and other medications), there was an association between oestrogen-only HRT and GORD. Thirdly, we matched cases with controls according to year, age, and socio-economic status in order to account for the effect of these variables upon the incidence of GORD. This further supports the possible association between oestrogen-only HRT and GORD. This study is also the first to compare progestogen-only use with hormone replacement therapy. Although progestogen was independently associated with PPI use, this was not consistent across simple and adjusted analyses or across GORD groups, and the reasons for this are unclear. It was previously thought that progestogen may have a relaxing effect on lower oesophageal sphincter tone. More recent work suggests that oestrogen increases nitric oxide synthesis, which results in smooth muscle relaxation in both human [27] and animal models [28] and may, therefore, be involved in the pathogenesis of GORD.
Our findings suggest that oestrogen-only hormone has a stronger independent association with GORD than progestogen, refuting earlier in-vitro studies suggesting progestogen was the most important hormone in GORD aetiology. There is a well-established association between NSAIDs and GORD, which our study suggests may be higher in magnitude with tibolone use, although our findings are not statistically significant. Tibolone has oestrogenic, progestogenic, and androgenic effects which act to prevent bone loss. It has been associated with decreased levels of fibrinogen, factor VII, plasminogen activator inhibitor 1, homocysteine, and tissue plasminogen activator and with increased levels of C-reactive protein, antithrombin III, and D-dimer [29]. However, it is unclear which of these, if any, may be a contributory factor in the possible interaction between NSAIDs and tibolone. This study did not assess whether the effects of NSAIDs on tibolone users and their risk of GORD were mediated by the type of NSAID drug. Naproxen is thought to be more gastrotoxic than ibuprofen and diclofenac; whether this holds true for the potential interaction with tibolone remains to be established. In this study, the term GORD relates to a range of GORD-like symptoms; it is possible that the association between NSAID use and GORD is related to gastrotoxicity, and not to an increase in 'true' GORD. The use of HRT declined following publicity about its possible association with cancer and cardiovascular disease [30,31]. However, these data go some way to explain the risk of GORD and PPI use in this group and also provide a possible explanation for those now on HRT who may have GORD symptoms. The GP records used in this study may be a more reliable mechanism for recording GORD symptoms than the self-reporting methods used in other studies; this may explain the slightly higher level of risk identified in this study compared to other cohort studies [32].
Our findings are likely to be an underestimate of the extent of the problem, as many women who chose not to consult for GORD may simply have stopped taking hormone therapy and would therefore be excluded or miscoded in our analysis. Many women might also have wanted to avoid polypharmacy. Up to 75% of women choose to stop using HRT in the first six months [33] as a result of reported side effects including weight gain, headache, nausea and perceptions of disease risk. While the design of most studies prevents further analysis of side effects [30], it is possible that at least a proportion of these were attributable to GORD. Our findings, if validated in a prospective study, may have important consequences for the management and resources used by patients with upper GI symptoms, as these patients would normally constitute a higher use group for acid suppression therapy.

Study limitations
The GPRD inconsistently records endoscopy findings; because of the lack of secondary care data to verify the presence of GORD, there is a potential risk of misclassification. The current study explored the relationship between HRT and GORD as presenting to and clinically diagnosed by GPs. This is increasingly the only relevant definition at a time when GPs no longer routinely refer patients without alarm signs for endoscopy because of the perceived balance of risks and benefits to these low-risk patients. Because of well-acknowledged difficulties in classifying GORD, this study compared a broad definition of GORD encompassing symptoms such as waterbrash, with a more specific definition of ICD-10 codes (K21) indicative of a GORD diagnosis. Simple regression was conducted using both definitions: the similarity of findings strengthens the conclusions. The GPRD records used in this study may in fact be a more reliable mechanism for recording GORD symptoms than self-reporting methods used in other studies.
Given the weaknesses inherent in GORD recording, we also investigated PPI use in this cohort. Although most of the PPI users had a record of GORD, PPI has many other uses. In addition, many patients with GORD were not prescribed PPIs and it is likely that many women suffering side-effects of HRT would not immediately commence PPIs. We therefore treated GORD sufferers and PPI users as two separate 'proxy' groups. The fact that findings were similar across these groups strengthens the association between oestrogen-only HRT and GORD.
When seeking causative associations, weaknesses inherent in all observational designs are the problem of unmeasured variables and the difficulty of teasing out temporal relationships. The similarity of hormone users and non-users at baseline and the consistency of effect with different model formulations offer some protection against this. Selection bias is another common threat to validity, although in this study GPRD data were collected from a wide cross-section of general practices from across the UK. It is not possible to rule out some underlying GORD-related factor influencing which patients do and do not receive HRT, although there is no evidence for this.
A further potential weakness of any GPRD study is the incompleteness of BMI, smoking and alcohol data, due to the fact that most general practices do not routinely and systematically collect these data. For example, only 41% of women had any BMI record within the study window (erroneous range 0.5 to 896000); after data cleaning, only 21% of BMI records were within acceptable limits. The association between these variables and GORD is well established, and so in order to understand their relative effect in combination with HRT use, we conducted sub-group analysis on complete cases. Our findings showed that the strength of association with HRT remained the same as in the wider group, but because numbers were small, a prospective study design would be required to confirm these findings.
A further limitation of the study, which might weaken the strength of association, is the lack of a direct measure of adherence to prescribed medication. Since HRT may be used in varying doses over time for symptom control, dose dependency could not be adequately examined using these data. We identified outcomes using diagnostic codes but we were unable to assess the severity of