Tracing ancestry with methylation patterns: most crypts appear distantly related in normal adult human colon

Background The ability to discern ancestral relationships between individual human colon crypts is limited. Widely separated crypts likely trace their common ancestors to a time around birth, but closely spaced adult crypts may share more recent common ancestors if they frequently divide by fission to form clonal patches. Alternatively, adult crypts may be long-lived structures that infrequently divide or die. Methods Methylation patterns (the 5' to 3' order of methylation) at CpG sites that exhibit random changes with aging were measured from isolated crypts by bisulfite genomic sequencing. This epigenetic drift may be used to infer ancestry because recently related crypts should have similar methylation patterns. Results Methylation patterns were different between widely separated ("unrelated") crypts greater than 15 cm apart. Evidence for a more recent relationship between directly adjacent or branched crypts could not be found because their methylation pattern distances were not significantly different than widely separated crypt pairs. Methylation patterns are essentially equally different between two adult human crypts regardless of their relative locations. Conclusions Methylation patterns appear to record somatic cell trees. Starting from a single cell at conception, sequences replicate and may drift apart. Most adult human colon crypts appear to be long-lived structures that become mosaic with respect to methylation during aging.


Background
The human colon is a large, epithelium-lined organ that develops from a simple tube [1]. Crypts form from this tube and subdivide the colon into millions of distinct clonal units maintained by stem cells [2]. With further growth or damage, crypts may bifurcate to produce new adjacent crypts by a branching process called fission [3][4][5][6]. In rodent models, crypt fission is prominent in early development and declines with age [7]. Little is known about the fates of human colon crypts during aging. Unlike rodents, human colons persist for decades after development. Possibly the majority of adult crypts form early in life and subsequently survive a lifetime as independent clonal units. Alternatively, crypts may periodically die or turn over, with new crypts created to replace old crypts. Branched crypts (presumed fission intermediates) are observed in normal adult human colons at low (<1% of all crypts) frequencies, suggesting some adult crypts periodically divide [8].
Adult crypt fates may be described with somatic cell trees that identify common ancestors (progenitors), their descendants (clonal patches), and distances since common ancestors (Figure 1). Distances between cells can be defined as time or numbers of divisions since a common progenitor. Aging complicates somatic cell ancestry because even patches of descendants may eventually become distantly related over a lifetime. For example, patches containing up to 450 human colon crypts can be visualized by differential G6PD expression [9]. These patches likely reflect development from single progenitors that randomly inactivate X-chromosomes during embryogenesis. Cells in such clonal patches are more related to each other compared to cells from other patches, but even within patches, the time of a common ancestor between any two adult cells may vary from less than a day to a lifetime. Visual markers such as G6PD identify descendants but cannot directly infer times since a common ancestor.
Figure 1. Colon somatic cell ancestral trees resemble a "big bang", originating from a zygote and progressively growing with aging. All cells eventually relate to each other, and therefore "Y" shaped trees characterize the relationship between any two cells. The last possible common ancestor between widely spaced crypts is around birth, and their ancestral trees must have short trunks and essentially life-long branches. In contrast, more closely spaced crypts may be related by recent crypt fission, yielding ancestral trees with relatively longer trunks and shorter branches. Tree branch lengths may be inferred from methylation pattern drift: methylation at some CpG sites appears to randomly change with aging. Methylation patterns were not significantly different between widely or closely spaced (within the same 1-2 cm² patch or directly adjacent) crypts, consistent with stable, long-lived adult crypts.
An ideal fate marker would be identical in recently created crypt pairs but become progressively different after fission. Recent studies illustrate that methylation of some CpG rich sequences changes with aging in normal human colons [10,11]. At least some of this methylation appears to reflect random error or drift that begins from birth [12]. Methylation demonstrates somatic inheritance [13] and therefore, similar to sequences used to recreate phylogenies, random epigenetic changes acquired over time may be used to recreate relationships between cells. More closely related cells should have more similar methylation patterns compared to less-related cells. Methylation patterns at specific CpG rich sequences sampled from individual human colon crypts appear to record lifetime stem cell histories [12]. Crypt methylation patterns are consistent with niches in which stem cell numbers remain constant, but stem cells randomly turn over by a population mechanism [12,14-16].
As expected with drift and independent clonal units, methylation patterns differed between crypts randomly sampled from small human colon patches [12]. Here we further explore ancestral relationships between human colon crypts by comparing methylation patterns from crypts separated by different distances (at least 15 cm apart, within the same 1-2 cm² patch, directly adjacent, and branched crypts). Physical distance may be used as a surrogate for time since a common ancestor because crypts that are physically far apart cannot be related except around the time of birth. In contrast, cells in adjacent crypts may or may not be related through recent fission.

Patients
Directly adjacent crypt pairs were obtained from normal-appearing colon of five male patients undergoing colectomy at the Norris Cancer Center. Patient characteristics are presented in Table 1. Random single crypts were obtained from 1-2 cm² patches of colon mucosa from Patients A, D and E, either from the same patch or two different patches separated by at least 15 cm. Branched crypts were obtained from Patient E and the normal-appearing colon of a 44-year-old patient with ulcerative colitis.

Crypt isolation
Individual or connected crypts were isolated from fresh colon mucosa using an EDTA-containing solution [12,17]. Crypt morphology was verified with an inverted microscope. Adjacent crypt pairs initially attached at their luminal surfaces were placed into small culture dishes, and passive separation was achieved by further shaking. Individual crypts of each adjacent pair were placed into 0.5 ml microfuge tubes and separately analyzed. A total of 15 adjacent crypt pairs were examined.
Branched crypts were defined as two crypt bases attached below the luminal surface. A single branched crypt was obtained from Patient E and seven from the normal-appearing colon of a 44-year-old patient with ulcerative colitis. The branched crypts demonstrated asymmetrical budding, with one crypt larger than the other.

Methylation analysis
DNA was isolated and bisulfite treated using an agarose bead method [12,18]. Approximately half the bead was used for PCR. PCR products were cloned and individual clones were sequenced. From 4 to 8 sequences (average of 6.9) were obtained from each crypt in Table 1. From 10 to 16 sequences (average of 10.9) were obtained from each branched crypt pair. Clones with evidence of incomplete bisulfite conversion (i.e., Cs at non-CpG sites) were eliminated from the analysis.
Methylation patterns were sampled at the CpG rich loci BGN (with 9 CpG sites) and CSX (with 8 CpG sites) as previously described [12]. BGN is on the X-chromosome and therefore all cells contain a single BGN allele because only crypts from male individuals were examined. Patient B and the 44-year-old patient had a polymorphism in BGN such that only 8 CpG sites were present. The comparisons based on 8 versus 9 sites did not significantly change the conclusions of this study.
Each sequenced molecule is referred to as a "tag". There are 512 possible different tags for BGN and 256 for CSX. Percent methylation was calculated from the number of methylated CpG sites. Distance was measured by summing the total number of methylation site differences between two methylation tags. Intracrypt distance was the average difference between all possible tag pairs sampled from a single crypt or from a branched crypt unit. Intercrypt distance was the average difference between all possible tag pairs sampled between two crypts. Intercrypt distances were calculated between adjacent crypt pairs, and between possible combinations of crypt pairs randomly isolated from the same colon patch or patches separated at least 15 cm apart. A two-sided t-test was used to compare distributions.
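The distance definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the example tags are hypothetical, and tags are assumed to be written as strings of "1" (methylated) and "0" (unmethylated) of equal length.

```python
from itertools import combinations, product
from statistics import mean


def tag_distance(tag_a: str, tag_b: str) -> int:
    """Number of CpG sites at which two tags differ (a Hamming distance)."""
    if len(tag_a) != len(tag_b):
        raise ValueError("tags must cover the same number of CpG sites")
    return sum(a != b for a, b in zip(tag_a, tag_b))


def percent_methylation(tags: list[str]) -> float:
    """Percentage of methylated CpG sites across all sampled tags."""
    total_sites = sum(len(tag) for tag in tags)
    return 100.0 * sum(tag.count("1") for tag in tags) / total_sites


def intracrypt_distance(tags: list[str]) -> float:
    """Average distance over all tag pairs sampled from one crypt.

    Needs at least two tags; statistics.mean raises on an empty input.
    """
    return mean(tag_distance(a, b) for a, b in combinations(tags, 2))


def intercrypt_distance(tags_a: list[str], tags_b: list[str]) -> float:
    """Average distance over all tag pairs drawn one from each crypt."""
    return mean(tag_distance(a, b) for a, b in product(tags_a, tags_b))


# Hypothetical 9-site tags, invented for illustration only (not study data).
crypt_1 = ["111111111", "111111110", "111111111"]
crypt_2 = ["000000000", "000000001"]

print(intracrypt_distance(crypt_1))
print(intercrypt_distance(crypt_1, crypt_2))
print(percent_methylation(crypt_1))
```

With nine binary sites there are 2^9 = 512 possible tags, and with eight sites 2^8 = 256, matching the counts given for BGN and CSX.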

Methylation tags as markers of ancestry: the strategy
A theoretical "Y" shaped tree describes ancestry between any two cells or crypts, with the last common ancestor located at the trunk and branch junction, and the zygote at its base ( Figure 1). Crypts may remain visibly unchanged but their ancestral trees continuously change with aging. Two cells last related at birth have a short trunk tree with life-long branches. Two cells related by recent division have a life-long trunk and short branches.
The Y-shaped tree between two cells may be inferred from their methylation pattern differences, which are expected to be a function of drift or divisions since their common ancestor. Longer intervals since a common ancestor should produce greater methylation pattern differences. Methylation differences consistent with a "lifetime" of drift can be empirically inferred by comparing patterns from cells physically isolated from each other since birth, that is, from widely separated crypt pairs sampled from colon patches at least 15 cm apart. More closely spaced crypts potentially allow for recent ancestry (long trunk, short branches) from crypt fission. However, if all crypts are created around birth, all possible crypt trees are equivalent (short trunks, long branches) and methylation patterns should be unable to distinguish between crypt pairs separated by different distances.
A simple measure of drift or distance between methylation patterns is the total number of methylation site differences between two CpG rich sequences [12]. The BGN sequence contains nine CpG sites and the maximum possible difference or distance is nine (Figure 2). By representing methylated CpG sites as "1" and unmethylated sites as "0", each BGN sequence can be displayed as a binary string or tag. For example, a fully methylated tag is "111111111" and an unmethylated tag is "000000000", and the distance between these tags is nine. Of note, the BGN locus is located on the X-chromosome and only male patients were examined. Therefore, each BGN tag represents a single cell.
Distances may be measured by comparing tags within an individual crypt (intracrypt distance) or between two crypts (intercrypt distance). Numerical distances are not absolute measures of time since a common ancestor because methylation errors or drift are stochastic. For example, by chance, unrelated crypts may have similar methylation patterns. How intercrypt or intracrypt distances change with aging is illustrated in Figure 2B based on a previously published scenario [12].

Adjacent and non-adjacent crypts are equally unrelated
BGN tags sampled from crypts (Table 1) had complex methylation patterns (Figure 2C). Variable numbers of unique tags were present within each crypt, as expected because crypts contain multiple stem cells. Intracrypt and intercrypt distances were variable, likely reflecting stochastic methylation errors and random stem cell death. Average intracrypt distance was significantly less than average intercrypt distances (Figure 3), consistent with cells within a crypt being more related than cells in different crypts. Comparisons of intercrypt distances revealed no significant differences between adjacent crypt pairs, crypts randomly sampled from the same 1-2 cm² patch, or widely separated crypts sampled from patches located at least 15 cm apart (Figure 3). The data are consistent with long-lived adult crypts because it was not possible to distinguish between adjacent and nonadjacent crypt pairs.
Intercrypt distances between four adjacent crypt pairs were small (Table 1 and Figure 3). These adjacent crypts may be more closely related by recent fission, but another possibility is that they are also distantly related and happen to have similar BGN tags by chance [19].
Figure 2. Methylation tags. A) The bisulfite converted sequence of the BGN methylation tag, which contains 9 CpG sites (underlined). A capital "T" represents bisulfite conversion of a non-CpG site "C". B) Intracrypt and intercrypt distances between methylation tags. Average distances (line) change with aging according to our model of human crypt niches [11]. 95% simulation intervals (dotted lines) reflect that drift and therefore distances are stochastic and not deterministic. The lower simulation bound for intracrypt distances is zero. Intracrypt distances remain relatively constant through life because stem cell niche turnover results in the periodic loss of all stem cell lineages except one. These "bottlenecks" ensure that all cells within a crypt are relatively closely related to each other. In contrast, intercrypt distances increase with age, reflecting that methylation tags randomly drift apart in unrelated crypts. C) Examples of BGN methylation tags (5' to 3') in adjacent and branched crypts. Filled circles are methylated CpG sites and tags from individual crypts are grouped. Intracrypt and intercrypt distances are labeled.
Figure 3. Distances between BGN tags. Intracrypt distances are significantly less than intercrypt distances. There are no significant differences between intercrypt distances of adjacent crypts, crypts randomly sampled from the same 1-2 cm² patch, or crypts from widely separated patches. Intracrypt distances of branched crypts were significantly greater than those of individual crypts and not significantly different from intercrypt distances. These distance relationships are consistent with long-lived individual or branched crypts.
Shaded in yellow are adjacent crypts with the smallest intercrypt distances. These adjacent crypt pairs appeared to be distantly related when examined at another locus (see Table 2).

To better characterize crypt ancestry a second methylation tag called CSX was examined because methylation at different CpG rich loci appears to be independent [12]. CSX tags were usually methylated, with large intercrypt distances in the four adjacent crypt pairs (Table 2). Therefore, all adjacent crypt pairs appear to be distantly related with distance variations due to chance rather than recent crypt fission.

Branched crypts resemble connected but unrelated independent crypts
Tag intracrypt distances in branched crypts (Figure 4) were significantly greater than those in single crypts (p = 0.0035). They were slightly less than intercrypt distances, but the difference was not significant (p = 0.35) (Figure 3). Therefore, branched crypts appear to differ from single crypts and more closely resemble two independent but connected individual crypts.

Discussion
Fates of individual human colon crypts are uncertain and must be inferred because most crypts appear alike and long-term serial observations are infeasible. Crypt behaviors are more readily examined in rodent models, but these studies primarily examine development rather than aging because rodents only live a few years. To accommodate growth, individual crypts bifurcate into two crypts by a process called fission. A murine crypt replication cycle (total time for crypt fission) requires about 108 days (reviewed in [6]). Physical separation of branched crypts (thought to be morphologic fission intermediates) into two new crypts requires from 12 hours to 5 days [6,20]. Crypt fission is prominent early in life but declines with age in rodent models [7]. Branched crypts can also be observed at low frequencies (<1% of all crypts) in normal adult human colons [8], suggesting crypt fission recurs throughout life.
An approach based on the random drift of methylation patterns [12] was used to investigate adult human crypt ancestral relationships. Sequences can be used to trace ancestry, and like base sequences, methylation at CpG sites exhibits somatic inheritance [13], but some errors are inevitably introduced during replication. The methylation error rate appears to be greater than for sequences and has been estimated at about 2 × 10⁻⁵ per CpG site per day [12], allowing for many changes within a human lifetime. Methylation patterns sampled from crypts at certain CpG rich sequences are complex and different between crypts (Figure 2C). The patterns appear stochastic and consistent with drift or random methylation errors with aging.
Recently related cells should share similar methylation patterns whereas less related cells may have different patterns.
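To give a feel for how an error rate of about 2 × 10⁻⁵ per CpG site per day translates into tag distances, the sketch below computes the expected distance between two lineages that drift independently after a common ancestor. It is a toy calculation under assumptions that are not taken from the source: errors are treated as symmetric (gain and loss of methylation equally likely), sites are independent, each day carries one error opportunity per site, and stem cell niche bottlenecks are ignored. The function names are hypothetical.

```python
def prob_site_differs(rate_per_day: float, days: int) -> float:
    """Probability that one CpG site differs between two lineages.

    Assumption (not from the source): each day, each lineage flips the site
    with probability rate_per_day, independently. The site differs when the
    total number of flips across both lineages (2 * days opportunities) is odd.
    """
    return 0.5 * (1.0 - (1.0 - 2.0 * rate_per_day) ** (2 * days))


def expected_distance(n_sites: int, rate_per_day: float, years: float) -> float:
    """Expected number of differing sites between two tags of n_sites sites."""
    days = round(years * 365)
    return n_sites * prob_site_differs(rate_per_day, days)


# BGN has 9 CpG sites; the rate is the estimate quoted in the text.
for years in (1, 10, 30, 60):
    print(years, expected_distance(9, 2e-5, years))
```

Under these assumptions the expected distance rises with time since the common ancestor and saturates toward half the sites (4.5 for a nine-site tag), which is one reason numerical distances cannot be read as absolute measures of time.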
Extraction of ancestral information from methylation patterns may be difficult and many uncertainties exist [21]. Therefore, the starting point of our analysis compares methylation patterns between cells with relatively known ancestries. Cells from widely separated crypts (greater than 15 cm apart) likely shared a last common ancestor around birth, giving a "Y" shaped tree with life-long branches (Figure 1). In contrast, cells within a crypt likely share a more recent common ancestor. Crypts are maintained by small numbers of stem cells that reside in niches near crypt bases [2]. Niche stem cells are not strictly immortal but turn over through a population type mechanism [13][14][15] by which random stem cell loss with replacement eventually results in the extinction of all stem cell lineages except one, or niche succession. These niche "bottlenecks" periodically reduce intracrypt diversity because at various times after birth, all crypt cells become related to a new common ancestor. By one model a human crypt niche contains 64 stem cells and niche succession occurs on average every eight years [11]. Therefore, cells within an adult crypt have Y-shaped trees with long trunks and short (~eight years long) branches. Consistent with an ability to distinguish between known relationships (short trunk, life-long branches versus long trunk, short branches), methylation differences between widely separated crypts were significantly greater than within crypts (Figure 3).
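Niche succession, as described above, can be illustrated with a small neutral-drift simulation. This is a hedged sketch rather than the published niche model: it assumes a Moran-style update in which, at each event, one randomly chosen stem cell is lost and replaced by a copy of another randomly chosen stem cell, keeping the niche size constant. The function name is hypothetical, and no attempt is made to map replacement events onto calendar time (the eight-year figure comes from the cited model, not from this code).

```python
import random


def replacement_events_to_succession(n_stem_cells: int = 64, rng=None):
    """Simulate random loss with replacement until one lineage remains.

    Returns (number of replacement events, label of the surviving lineage).
    Assumption (not from the source): a neutral Moran-style update rule.
    """
    rng = rng or random.Random()
    # Each stem cell starts as the founder of its own lineage.
    lineages = list(range(n_stem_cells))
    events = 0
    while len(set(lineages)) > 1:
        lost = rng.randrange(n_stem_cells)   # stem cell that is lost
        donor = rng.randrange(n_stem_cells)  # stem cell whose copy replaces it
        lineages[lost] = lineages[donor]     # no change if donor == lost
        events += 1
    return events, lineages[0]


if __name__ == "__main__":
    events, survivor = replacement_events_to_succession(64, random.Random(0))
    print(f"succession after {events} events; surviving lineage {survivor}")
```

Every run ends with a single surviving lineage, so all cells in the simulated niche trace back to one recent ancestor. That is the "bottleneck" behavior that keeps intracrypt distances small while unrelated crypts continue to drift apart.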
Relationships between closely spaced crypts are less well known, but comparisons between any two adult crypts, or within branched crypts, revealed differences significantly greater than intracrypt differences. Furthermore, methylation distances between cells from different crypts regardless of proximity, or between cells in branched crypts were not significantly different. These findings suggest two basic Y-shaped trees describe the majority of adult crypt cells. Trees with long trunks and short branches are present within single crypts, and trees with short trunks and long branches connect cells in different crypts.
Similar methylation pattern distances between crypt pairs regardless of proximity are consistent with long-lived crypts. Exact crypt lifetimes cannot be inferred with the current analysis but are likely greater than average intervals for stem cell niche succession (estimated at eight years [11]), and possibly most crypts survive a lifetime as intact structural units. However, the intercrypt methylation patterns are also consistent with extremely long human crypt fission cycles, which have been estimated to occur every nine to 18 years [4,6,8].
Figure 4. Example of a branched crypt analyzed for methylation patterns.
Long-lived human crypts may reflect stem cell niche biology. Multiple stem cells per crypt niche may help ensure lifelong crypt survival because stem cell losses are readily and normally replaced. Therefore, crypts are physically defined more by niches that survive a lifetime than by their stem cells, which turn over. Consistent with niches defined by their surroundings rather than their contents, recent studies reveal that Drosophila niches persist even when their germ cell inhabitants are depleted [22].
Methylation patterns of branched crypts were significantly more diverse than individual crypts and similar to crypt pairs. This finding may be consistent with a crypt cycle in which fission occurs after there is an increase in the number of crypt stem cells [5,6] because methylation pattern diversity correlates with stem cell number [12]. Consistent with the low frequencies of branched crypts in normal adult colons [8], most single crypts appear not to be near such a fission threshold as measured by methylation patterns. A number of other scenarios could also account for the higher diversity of branched crypts including crypt fusion (unrelated crypts that merge) and stalled or long-lived fission intermediates. Further studies with more crypts and sequences will be needed to distinguish between these possibilities.
Comparisons of directly adjacent crypt pairs are limited by random sampling. Multiple crypts surround each crypt, and even if two adjacent crypts were recently related, it would be difficult to sample the appropriate partners. Our studies of 15 adjacent crypt pairs suggest most crypt pairs will be unrelated, consistent with low frequencies of somatically acquired human crypt clusters. For example, somatic mutations in O-acetyltransferase may be identified by PAS staining in appropriate heterozygous individuals [23]. Such mutations serve as fate markers as they are likely to be neutral and inherited. Consistent with our conclusion that most adjacent crypts are not recently related, more than 75% of O-acetyltransferase mutant crypts are single crypts, although small clusters (typically two or three mutant crypts) were also observed [23]. O-acetyltransferase mutations may occur throughout life and patches of stained cells may reflect expansion events of various ages.

Conclusions
An approach based on methylation pattern comparisons can potentially infer ancestral relationships between human crypts. With aging, crypts become mosaic with respect to methylation patterns. Initial findings are consistent with generally accepted colon crypt physiology: widely separated and most directly adjacent adult crypts are not closely related [23], branched crypts are different from single crypts and represent possible fission intermediates [5,6], and crypts are maintained by niches containing multiple stem cells [12,14-16]. Most crypts appear to be long-lived structures, consistent with an ability to observe clonal patches established before birth in adult colons [9]. Of note, ancestries based on sequences are inherently controversial [24], and the same general objections to species phylogenetic studies likely apply to our somatic cell trees.
Although the exact interpretation of methylation patterns may be problematic [21], sequences are increasingly used to infer ancestral relationships, especially when (as in the human colon) few other alternatives exist. Ancestry may potentially be completely inferred from sequences, although a synthesis with morphologic markers will likely yield a better understanding of evolutionary relationships [24]. Molecular species phylogenies are continuously refined with more sequences and better analytic methods.
Similarly, studies with more loci or other quantitative models may further refine somatic cell trees of normal and diseased human colon.