Introduction

Breast cancer is a heterogeneous disease that can be classified on the basis of a number of characteristics including tumour size, histological subtype and grade, oestrogen (ER) and progesterone (PR) receptor and HER2 expression, axillary lymph node (LN) status and expression profile (Sorlie et al., 2001). Some of these features have been associated with disease characteristics and can therefore be used to inform patient management. For example, patients with tumours that test positive for ER and HER2 can be treated with tamoxifen and herceptin, respectively, and have a significantly better prognosis than those that test negative for these markers. However, the heterogeneity that exists even within breast cancer subgroups defined by multiple markers means that for the vast majority of breast cancer cases, predicting outcome remains a challenge, and thus additional informative biomarkers are urgently needed.

Breast cancer results from abnormalities in the quality or quantity of certain gene products, including coding and non-coding genes. MicroRNAs (miRNAs) are small non-coding RNAs approximately 20 nt in length that are capable of modulating gene expression post-transcriptionally (Cullen, 2004; Boyd, 2008; Bartel, 2009). MiRNAs can exhibit either tumour suppressor or oncogenic roles by modulating key cellular processes in cell-cycle progression, apoptosis and invasion (Bartels and Tsongalis, 2009; Mirnezami et al., 2009; Visone and Croce, 2009). In several studies, differential miRNA expression has been shown to distinguish normal and breast tumour tissue, breast cancer subtypes, ER, PR and HER2 status, and to predict lymph node status and invasiveness (Iorio et al., 2005; Mattie et al., 2006; Foekens et al., 2008; Yan et al., 2008; Lowery et al., 2009). Together, these studies suggest a potential diagnostic and prognostic use of miRNAs as biomarkers in breast cancer.

Quantitative defects in miRNAs arise through several mechanisms, including aberrant DNA methylation. Human DNA methylation usually occurs at the carbon-5 position of cytosine within a CpG dinucleotide. Regions with a high density of CpGs, termed CpG islands (CGIs), are usually associated with promoter elements, and their methylation usually leads to gene repression. Aberrant DNA methylation of miRNA genes has been associated with several cancers (Lujambio et al., 2007; Lehmann et al., 2008; Lodygin et al., 2008), suggesting a possible use of miRNA DNA methylation as a prognostic tool. For example, miR-9-1 and miR-34a are hypermethylated in breast cancer (Lehmann et al., 2008; Lodygin et al., 2008). In addition, the miR-200b cluster (miR-200b, -200a and -429) has a CGI-associated promoter 4 kb upstream of the sequence encoding the mature miRNA (Bracken et al., 2008), and aberrant DNA methylation of this sequence is associated with loss of miR-200 expression in colon (Han et al., 2007), bladder (Wiklund et al., 2011) and pancreatic (Li et al., 2010) cancers. The contribution of miR-200b cluster gene methylation to breast cancer has not yet been reported.

Although much is known about the biogenesis and function of miRNAs, relatively little is known about the transcriptional regulation of miRNA genes. To date, a limited number of miRNA promoters have been experimentally characterized and only recently have several miRNA promoter prediction algorithms emerged (Zhou et al., 2007; Fujita and Iba 2008; Linhart et al., 2008; Marson et al., 2008; Ozsolak et al., 2008; Wang et al., 2009). These studies show that miRNA promoters lie anywhere from a few bases upstream of the stemloop to tens of kilobases upstream (Linhart et al., 2008; Marson et al., 2008; Ozsolak et al., 2008; Wang et al., 2009). Furthermore, several miRNAs have multiple promoters (Ozsolak et al., 2008; Wang et al., 2009; Monteys et al., 2010).

In this study, 93 miRNAs previously associated with breast cancer were prioritized for experimental analysis using bioinformatics to look for CGI-associated promoters. The CGI-associated promoters of 15 miRNAs were mapped and their methylation determined in a panel of nine breast cancer cell lines. A novel promoter for the miR-200b cluster and its role in regulating miR-200b expression were investigated. The relationship between methylation of this promoter, and of the previously described miR-200b cluster promoter, and miR-200b expression and clinical characteristics in breast cancer is described.

Results

Fifty-five miRNAs previously implicated in breast cancer are located within 5 kb of a predicted CGI

To identify candidate promoters for which methylation could be associated with breast cancer development, a list of 93 miRNAs implicated in breast cancer was collated from the literature (Supplementary Table 1). Analysis with CpGPlot and CpG Island Searcher showed that 55 (59%) of these miRNAs had a predicted CGI within 5 kb upstream of the region encoding their 5′ stemloop. The CoreBoost_HM promoter prediction algorithm was used to predict regulatory elements controlling the transcription of these 55 CGI-associated miRNAs. This algorithm uses PolII ChIP binding, histone modifications and DNA motifs associated with promoters to predict putative transcription start sites (TSS) (Wang et al., 2009). Putative promoters were defined by a CpG promoter prediction cutoff score of 0.5, representing a 90% likelihood of a TSS within 500 bp of the predicted region. Figure 1 summarizes the process.

Figure 1

Overview of the miRNA selection approach performed in this study. A literature review of miRNAs implicated in breast cancer was followed by in silico and molecular studies in human breast cancer cell lines. Analysis of methylation in clinical breast cancer specimens was performed on miR-200b.

Experimental validation of predicted novel promoters of 15 miRNAs

To determine if these predicted CpG promoter sequences had experimentally detectable promoter activity, 600–1000 bp of genomic sequence around each predicted site was cloned upstream of a luciferase gene and assayed for reporter activity in either MCF7 or MDA-MB-231 breast cancer cell lines. When a miRNA had more than one predicted promoter, a fragment encompassing each prediction was cloned. Their genomic locations are detailed in Figures 2 and 3 and Supplementary Table 1. The previously described promoters of the miR-17 cluster (Yan et al., 2009) and the miR-200b cluster (Bracken et al., 2008) were included as positive controls, whereas a non-CoreBoost_HM predicted fragment in the miR-17 cluster was used to control for background promoter activity.

Figure 2

UCSC screenshots of miRNA candidates and their associated genomic features. Bars representing miRNAs are shown in red, CGIs in green, promoter and methylation-sensitive high-resolution melt analysis fragments in black. Annotated genes are marked in blue. Orientation of genes and fragments is indicated by directional arrows. CoreBoost_HM promoter predictions are shown as black peaks.

Figure 3

UCSC screenshots of miRNA candidates and their associated genomic features. Bars representing miRNAs are shown in red, CGIs in green, promoter and methylation-sensitive high-resolution melt analysis fragments in black. Annotated genes are marked in blue. Orientation of genes and fragments is indicated by directional arrows. CoreBoost_HM promoter predictions are shown as black peaks.

Twenty-two novel promoters from 15 miRNAs exhibited at least fivefold activity compared with the promoter-less pGL3-basic control in at least one cell line (Figures 4, 5 and 6a). As expected, the previously described promoters of the miR-17 and miR-200b clusters had strong promoter activity, whereas the non-CoreBoost_HM predicted fragment had no detectable promoter function (Figures 3d and 4a). The 15 miRNAs with experimentally validated promoters are miR-9-1, miR-9-3, miR-10b, miR-22, miR-124-1, miR-124-2, miR-124-3, the miR-130b cluster, miR-193b, the miR-200b cluster, miR-210, miR-320a, miR-335, miR-373 and miR-663. Three promoters were mapped for the miR-124-3 locus; two promoters were mapped for miR-9-1, 22, 124-1, 124-2, 193b and 200b; and one promoter was mapped for miR-9-3, 10b, 130b, 210, 320a, 335, 373 and 663. To map the minimal promoter regions and to facilitate methylation analysis, promoter fragments of the 15 miRNAs were fine mapped to 300 bp (Figures 2, 3 and 6b).

Figure 4

(a–i) Promoter activities of miRNA candidates. miRNA promoter activity in cells expressed in RLU±the s.e.m. Data were generated from three independent experiments. Promoter fragments are labelled 1, 2 or 3 and A, B or C indicates the sub-fragment of that respective promoter.

Figure 5

(a–g) Promoter activities of miRNA candidates. miRNA promoter activity in cells expressed in RLU±the s.e.m. Data were generated from three independent experiments. Promoter fragments are labelled 1, 2 or 3 and A, B or C indicates the sub-fragment of that respective promoter.

Figure 6

The miR-200b cluster has two functional promoters. (a) Promoter activities of miR-200b P1 and P2 in MCF7 and MDA-MB-231 cells. Reporter activity is expressed in RLU±the s.e.m. Data were generated from three independent experiments. White bars represent activity in MDA-MB-231 cells whereas grey bars represent activity in MCF7 cells. (b) Luciferase activities of various miR-200b P2 5′ and 3′ truncations. Graphical representation of the various reporter fragments in relation to their genomic locations and their associated reporter activities expressed in RLU. Error bars represent s.e. of three separate experiments. (c) Top: relative location of the minigene to P2, Hsa-mir-200b and the CGI. Grey box represents the region deleted in the minigene promoter KO construct. Bottom: MDA-MB-231 cells were transfected with pGL3-basic, miR-200b minigene or miR-200b minigene promoter KO plasmids and assayed for mature miR-200b expression by TaqMan real-time PCR±s.e. (d) Top: schematic of the miR-200b P2 luciferase construct. The white arrow represents the P2 promoter followed by the luciferase gene. Primers R1, R2 and A to D are represented by small arrows. Bottom: DNA agarose gel of the 5′ PCR walk using the R2 primer with either Primer A, B, C or D, as labelled above each lane. (e) RNA-sequence profiles in T47D and MCF7 and RNA Polymerase II (RNA PolII) ChIP profile of MCF7 at the miR-200b locus. Peaks in RNA-sequence tracks represent expression detected at that region. Peaks in the RNA PolII track represent RNA PolII binding. Horizontal bars indicate the location of the miR-200b cluster, CGI and oestrogen response element.

The miR-200b cluster P2 promoter is sufficient to drive expression of miR-200b

To determine whether the P2 promoter could drive the expression of miR-200b in its endogenous genomic context, low miR-200b expressing MDA-MB-231 cells (Gregory et al., 2008) were transfected with a miR-200b minigene spanning the P2, but not the P1, promoter and the sequence corresponding to the mature miR-200. This minigene was generated by replacing the luciferase coding sequence of pGL3-basic with the miR-200b genomic sequence (Figure 6c). The introduction of the miR-200b minigene resulted in an eightfold increase in mature miR-200b expression over the pGL3-basic control (Figure 6c). Deletion of the minimal promoter in the minigene reduced miR-200b expression by 50% (Figure 6c). Collectively, these results indicate that the P2 promoter can regulate miR-200b, and very possibly miR-200a and miR-429, as a polycistronic primary transcript (Bracken et al., 2008).

The miR-200b cluster P1 and P2 promoters are independent

To address the hypothesis that P1 and P2 promoters function synergistically to enhance expression of the miR-200b cluster, a 2.5-kb fragment encompassing both promoters was cloned upstream of the luciferase gene and assayed for reporter activity in MDA-MB-231 cells, in which only P2 was observed to be functional, and MCF7 cells, in which functional activity was observed for both promoters. As predicted, the P1+P2 fragment produced similar reporter activity to the P2 fragment alone in MDA-MB-231 cells (P=0.34) (Figure 6a). In contrast, in MCF7 cells, the reporter activity of P1+P2 was not significantly greater than the activity of P1 (P=0.4), but was significantly stronger than that of P2 (P<0.05) (Figure 6a). The activities of P2 alone and of P1+P2 were also significantly higher (P<0.05) in MCF7 than in MDA-MB-231 cells (Figure 6a). Taken together, these data suggest that the P1 and P2 promoters function independently.

The miR-200b cluster has multiple TSS

To complement the promoter mapping experiments, attempts were made to map the TSS of P2. Classical 5′ RACE PCR was employed to determine the TSS of P2 using RNA from MDA-MB-231 cells transfected with the P2 construct to enrich for P2-derived transcripts. However, repeated attempts with the classical 5′ RACE protocol were unsuccessful, consistently producing non-specific smears (data not shown). Successful amplification of template controls indicated that the cDNA synthesis had worked and that this result was more likely to reflect heterogeneity in miR-200b cluster transcripts. An alternative ‘PCR walk’ approach to mapping the TSS was performed using a single transcript-specific reverse primer and various forward primers toward the 5′ end of the cDNA transcript. The longest transcript extended from −3032 bp to −2447 bp upstream of the 5′ stemloop as indicated by loss of PCR amplification (Figure 6d). This observation was in agreement with both the minigene and the luciferase reporter assays. To further test the hypothesis that miR-200b has multiple TSS, publicly available breast cancer-specific RNA-sequence and RNA PolII ChIP data were analysed at the miR-200b locus (Figure 6e). Multiple RNA-sequence peaks were observed along the CGI for T47D and MCF7 cells, indicating expression from the CGI. Furthermore, multiple RNA PolII binding signals in MCF7 cells were detected along the associated CGI, suggesting multiple TSS. A strong RNA PolII signal overlapping P1 also suggested preferential transcription from P1 in MCF7 (Figure 6e).

Methylation of miR-200b cluster and miR-335 promoters is associated with reduced miRNA expression

DNA methylation of the minimal promoters of the 15 miRNAs was assessed by methylation-sensitive high-resolution melt analysis (Wojdacz and Dobrovic, 2007) in a panel of nine breast cancer cell lines. The proximal miR-9-1 promoter was not included as methylation of this promoter had been previously described (Lehmann et al., 2008). The miR-17 cluster promoter was also excluded because the high density of CG dinucleotides made it unsuitable for methylation-sensitive high-resolution melt analysis. Promoter methylation was then compared with miRNA expression in the same cell lines. MiR-200b cluster and miR-335 promoter methylation were inversely associated with miRNA expression. For the miR-200b cluster, eight out of the nine cell lines displayed an inverse association (Figure 7a, Supplementary Figure 2). Only MCF7 highly expressed miR-335, and MCF7 also had the lowest methylation; the remaining eight cell lines were fully methylated and had minimal miR-335 expression (Supplementary Figure 1A). However, since the inverse association was stronger for miR-200b, the P1 and P2 promoters represented better candidates for further analysis. In contrast, the miR-210 and miR-320a promoters were unmethylated in all nine cell lines, whereas DNA methylation was not associated with miRNA expression for miR-9, miR-10b, miR-124, miR-373 and miR-663 (Supplementary Figure 1).

Figure 7

DNA methylation represses miR-200b P1 and P2 activity in breast cancer cells. (a) DNA methylation and miR-200b expression levels in a panel of nine breast cancer cell lines. Top: black bars represent the percentage methylation. Bottom: miR-200b expression was assessed by qPCR. Expression is shown relative to RNU6B and bars represent the mean±s.e. of two independent experiments. (b) Reporter activity of miR-200b P1 and P2 methylated by SssI DNA methylase (white) compared with mock methylated plasmids (grey)±s.d. of two separate experiments.

The minimal miR-200b cluster promoters are regulated by DNA methylation

The novel P2 promoter had comparable activity to P1 in MCF7 cells, but unlike P1, was functional in both cell lines tested (Figure 6a). The minimal P2 promoter maps to −2228/−1993 bp upstream of the miR-200b 5′ stemloop (Figure 6b). To confirm that DNA methylation directly repressed promoter activity, P1 and P2 were cloned into a CpG-free reporter construct (Klug and Rehli, 2006) and in vitro methylated by SssI DNA methylase. Methylated P1 and P2 constructs displayed a significant reduction in promoter activity, compared with their mock methylated constructs when transfected into T47D cells, in which both promoters are endogenously unmethylated and functional (Figure 7b). This suggests that DNA methylation represses miR200b cluster promoter activity.

miR-200b P1 and P2 promoters are differentially methylated in primary breast tumours

To study DNA methylation of the miR-200b promoters, Sequenom MassArray was performed on Grade 3 FFPE clinical samples. In all cases, P1 and P2 were differentially methylated in both tumours and lymph nodes (Figures 8a and b). In addition, P1, but not P2, was hypermethylated in lymph nodes compared with matched primary tumours (Figures 8c and d).

Figure 8

Differential methylation of P1 and P2 in clinical samples. (a) Log10 ratios of P1 to P2 in 26 primary tumours. (b) Log10 ratios of P1 to P2 in 23 lymph nodes (LN). Positive values: P1>P2; negative values: P1<P2. Graph heights represent magnitude of difference in methylation between P1 and P2. (c) Mean methylation profile in matched tumours and LN with horizontal s.e. bars for individual CpG units. t-Test P-value as indicated. (d) Box plot of the average methylation in tumours and LN. (+): median, box: 25–75 percentile, whiskers: max/min, N: sample size, Mann–Whitney P-values as indicated.

To determine if hypermethylation was associated with expression of the miR-200b cluster in primary tumours, qPCR for miR-200b was performed on tumour samples from which RNA was available. P1 hypermethylation was associated with loss of miR-200b expression in seven out of nine samples (Supplementary Figure 3A), whereas P2 hypermethylation was associated with loss of miR-200b expression in six out of seven samples tested (Supplementary Figure 3B). These results suggested that hypermethylation of miR-200b cluster promoters could regulate miRNA expression in tumours.

Methylation of the miR-200b P2 promoter is associated with ER, PR, HER2 and androgen receptor expression in primary breast tumours

To ascertain whether DNA methylation of the miR-200b cluster promoters is associated with expression of the routinely used breast cancer biomarkers ER, PR and HER2, methylation was assessed in patients positive and negative for expression of these receptors. Methylation of P2, but not P1, was significantly higher in tumours that were ER or PR negative (Figures 9a and b, respectively). Hypermethylation of P2 was also associated with HER2 positivity (Figure 9c). Androgen receptor, a potential breast cancer biomarker (Hu et al., 2011) and regulator of the miR-200 family (Xu et al., 2010; Waltering et al., 2011), was also associated with hypermethylation of P2 (Figure 9d). Although the miR-200b cluster is involved in metastasis, which in turn affects prognosis, no evidence of an association between DNA methylation and survival was found.

Figure 9

P2 methylation is associated with ER, PR, HER2 and AR receptor status. (a) Methylation status of ER positive (pos) and ER negative (neg) cohorts. (b) Methylation status of PR pos and PR neg cohorts. (c) Methylation status of HER2 pos and HER2 neg cohorts. (d) Methylation status of AR pos and AR neg cohorts. Left: mean methylation profile with horizontal s.e. bars for individual CpG units. t-Test P-value as indicated. Right: box plot of the average methylation in tumours and LN. (+): median, box: 25–75 percentile, whiskers: max/min, N: sample size, Mann–Whitney P-values as indicated.

Discussion

Transcriptional regulation of miRNA genes is poorly understood and only a few miRNA promoters have been reported. A comprehensive understanding of miRNA promoters is a prerequisite for their use as genetic or epigenetic biomarkers. In this report, novel CGI-associated miRNA promoters were mapped and analysed for associations between DNA methylation and miRNA expression. In all, 59% of the miRNAs examined were associated with a CGI within 5 kb upstream, a proportion similar to that estimated for CGI-associated coding genes and consistent with previous estimates for miRNA promoters (Ozsolak et al., 2008; Corcoran et al., 2009). Twenty-two novel promoters were identified and shown to be active in reporter assays. MiR-10b had a previously described promoter (Ma et al., 2007; Zhou et al., 2007) immediately upstream of the mature miRNA sequence (Figure 2). However, we could not detect promoter activity for this fragment in the breast cancer cells tested (Figure 4c). We were also unable to detect any activity in fragments encompassing CoreBoost_HM predicted promoter regions for miR-125a in the cell lines used (Figures 3 and 4i). A likely explanation is that neither cell line expressed miR-125a.

In all, 7 of 15 miRNAs had two or more promoters in close proximity, usually at either end of the associated CGI. Although it was not clear how the multiple promoters function in regulating their miRNA genes or why the promoters were usually at either end of the CGI, it was evident that regulation of miRNAs is a complex process. The miR-200b cluster (miR-200a, miR-200b and miR-429) is an example of a miRNA with promoters at either end of the CGI. The P1 promoter, located at the distal end of the CGI, was predicted (Bracken et al., 2008) based on the presence of a 5′ EST, the presence of E-Box motifs and the presence of a CGI, which is commonly associated with promoters of coding genes. Further, a 7.5-kb primary transcript of the miR-200b cluster was described using a ‘PCR walk’ approach and P1 promoter activity was demonstrated using a luciferase reporter assay. Using a similar approach, a novel promoter, P2, is described here. P2 was predicted 2.5 kb downstream of P1 (Figure 3) by the CoreBoost_HM promoter prediction algorithm (Wang et al., 2009), which utilizes empirical data such as ESTs, RNA PolII binding and histone modification profiles in addition to DNA motifs associated with core promoters to predict active promoter sites. We demonstrate that the P2 promoter has an activity similar to that of the P1 promoter (Figure 6a), is functional in breast cancer cell lines (Figures 6a and 7b) and is able to drive the expression of miR-200b in its endogenous genomic context (Figure 6c). Thus, the P2 promoter is likely to be important in the regulation of the miR-200b cluster. Deletion of the P2 minimal promoter reduced miR-200b levels by only 50% (Figure 6c), which may indicate multiple TSS as previously suggested (Wiklund et al., 2011). In addition, DNA methylation of both miR-200b promoters was associated with reduced miR-200b expression in eight out of nine breast cancer cell lines studied (Figure 7), suggesting regulation by DNA methylation.
However, the precise role of P1 and P2 in regulating the cluster is unclear. In our reporter assays, P1 and P2 seemed to function independently (Figure 6a). In clinical samples, DNA methylation at P1 was also different compared with P2 in both tumour and lymph node metastases (Figures 8a and b), thus supporting the hypothesis that the two promoters have different regulatory roles. This hypothesis is supported by other studies in bladder cancer cells, where a region encompassing P2, but not P1, was unmethylated and expressed high levels of miR-200b (Wiklund et al., 2011). Taken together, the evidence suggests that the P1 and P2 transcripts are regulated by different mechanisms and this could in turn have a role in regulating metastasis.

Of the nine cell lines studied, only MCF7 did not show an inverse association between methylation and miRNA expression (Figure 7a). A minority of patient samples also did not display a reciprocal relationship between promoter methylation and miRNA levels (Supplementary Figure 3). In previous studies (Han et al., 2007; Wiklund et al., 2011), miRNA repression by DNA methylation was usually accompanied by histone modifications associated with gene silencing. Thus, other mechanisms, including chromatin remodelling or post-transcriptional regulatory events, may account for this inconsistency. Perhaps these repressive histone marks were absent in these cases, resulting in open chromatin that was readily expressed.

In this study we describe, for the first time, differential methylation of the P1 and P2 regions of the miR-200b cluster in breast cancer. That the differential methylation is functional is evidenced by our observations that DNA methylation is inversely associated with miR-200b expression in both breast cancer cell lines (Figure 7) and clinical samples (Supplementary Figure 3). These observations are consistent with the previously reported tumour suppressive role of miR-200b (Korpal and Kang, 2008). They are also consistent with previous reports of aberrant DNA methylation of the miR-200b cluster proximal CGI, containing both P1 and P2, in colon, bladder and pancreatic cancers (Han et al., 2007; Li et al., 2010; Wiklund et al., 2011).

Loss of ER and PR expression was also associated with DNA methylation at P2 in breast tumours (Figures 9a and b). Patients with tumours that express these receptors often have a better prognosis because they respond well to treatments such as tamoxifen. We hypothesize that methylation at P2 is likely to be associated with a lower level of miRNA expression, resulting in a more aggressive tumour (Korpal and Kang, 2008) that is unresponsive to these therapies and generally associated with poor prognosis. In publicly available ER ChIP-sequence data (Schmidt et al., 2010), ER bound to a putative ER response element just downstream of P1 upon ER stimulation in MCF7 cells (Figure 6e). In a microarray study (Klinge, 2009), miR-200a and miR-200b were significantly upregulated in MCF7 after 6 h of E2 induction. However, in a similar independent study, miR-200a and miR-200c were found to be significantly downregulated after 48 h of E2 induction (Maillot et al., 2009). Although the studies seem to have conflicting conclusions, they do suggest a possible regulatory mechanism between ER and the miR-200 family.

A double negative feedback regulatory relationship between the miR-200 family and ZEB1 (Bracken et al., 2008; Burk et al., 2008) has been shown to regulate the delicate balance between mesenchymal and epithelial cellular states. Based on these data, we propose that miR-200b is repressed in the early stages of tumourigenesis in order to promote EMT and thus the spread of the tumour, followed by later induction of miR-200b to promote mesenchymal–epithelial transition and thus establishment of the tumour cells at a distant site (for example, lymph node). Our data are consistent with this model, as only P1 was hypermethylated in matched lymph nodes compared with their primary tumours. This, coupled with miRNA repression, suggests a DNA methylation mechanism for EMT initiation in addition to the previously described TGFB/ZEB pathway. At P2, the absence of differential methylation between primary tumours and matched lymph nodes, which possibly maintains basal levels of miR-200, is consistent with the mesenchymal–epithelial transition observed in mouse models. Metastatic murine breast cancer cells expressing low levels of miR-200 were able to invade distant tissue but unable to colonize. However, when miR-200 was overexpressed, these cells could form macroscopic tumours at distant sites (Dykxhoorn et al., 2009). Further support for this model comes from studies in bladder cancer (Wiklund et al., 2011), where hypomethylation of the P2 region was sufficient for miR-200b cluster expression. This hypomethylation could also possibly account for the elevated levels of the miR-200 family observed in other cancers (Hiroki et al., 2010; Li et al., 2010; Lee et al., 2011).

Collectively, the evidence presented here indicates that regulation of the miR-200b cluster is complex and that the cluster is controlled transcriptionally by at least two distinct promoters that are sensitive to DNA methylation. The novel P2 promoter functions independently of P1 and can drive the expression of miR-200b. However, the precise roles of P1 and P2, and the conditions under which each is utilized, remain unclear and will require further examination.

Materials and methods

Bioinformatics

A list of 93 miRNAs implicated in breast cancer was generated by literature review. Genomic sequences +5 and −1 kb of the 5′ stemloop of each miRNA were analysed for CGIs using CpGPlot and CpG Island Searcher. Putative promoters were defined by a CoreBoost_HM score of at least 0.5 (Wang et al., 2009) within this 6 kb window. Initial 600–1000 bp fragments overlapping the predicted sites were cloned and assayed for promoter activity as described later. This process is illustrated in Figure 1.
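The kind of sliding-window CGI scan described above can be sketched as follows. This is an illustrative sketch only, not the CpGPlot or CpG Island Searcher implementation: the function names are hypothetical, and the thresholds (GC content above 50%, observed/expected CpG ratio above 0.6, 200 bp window) are the commonly cited CGI criteria, assumed here because the settings actually used are not stated in the text.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # observed CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were positioned independently: (C * G) / N
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def find_cgi_windows(seq, window=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Return start offsets of windows exceeding both thresholds.

    The default thresholds are assumptions (commonly cited CGI criteria),
    not values taken from the study.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc > min_gc and oe > min_obs_exp:
            hits.append(start)
    return hits
```

Adjacent qualifying windows would normally be merged into a single island before comparison with promoter predictions; that step is omitted here for brevity.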

RNA Polymerase II ChIP-sequence data mapped to human genome HG18 were obtained from the National Center for Biotechnology Information Gene Expression Omnibus (GEO accession number GSE14664). RNA-sequence data (Wang et al., 2008) mapped to human genome HG18 were also used. RNA Polymerase II and RNA-sequence data were visualized in the Integrative Genomics Viewer using the data ranges indicated.

Cell culture

Breast cancer cell lines MDAMB157, MDAMB231, MDAMB436, MDAMB468, MCF7, T47D, ZR75-1, Hs578T and BT549 were obtained from American Type Culture Collection (ATCC, Manassas, VA, USA) and cultured according to the manufacturer's recommendations.

Patient samples

Human breast tumours and matching lymph node metastases were collected, as approved by local Human Ethics committees, from 56 patients who underwent surgical resection without preoperative radiochemotherapy at Princess Alexandra Hospital between 1988 and 2000. All patients were female, aged from 30 to 94 years, with a median age of 56 years. The ER, PR and HER2 receptor status of each patient was determined by a qualified pathologist. Details are provided in the Supplementary Information.

DNA extractions and purifications

Genomic DNA from cell lines was extracted using the NucleoSpin Tissue Prep kit (Macherey-Nagel, Germany) according to the manufacturer's instructions. Plasmid DNA was purified using the Miniprep Kit (Qiagen, Doncaster, VIC, Australia). For human tumour samples, four FFPE tumour-rich tissue cores (1 × 0.6 mm) were crushed and digested with proteinase K at 55 °C for 2 days. Genomic DNA was purified using the PureGene kit (Qiagen) according to the manufacturer's instructions.

Generation of plasmid constructs

All promoter reporter constructs were cloned into pGL3-Basic (Promega, Sydney, NSW, Australia) unless otherwise specified. For the in vitro methylation plasmids, P1 and P2 fragments were cloned into a CpG-free luciferase reporter construct, pCpG-basic (Klug and Rehli, 2006; a gift from Klug and Rehli). PCR was performed using KapaHiFi polymerase (Kapa Biosystems, Woburn, MA, USA). All constructs were confirmed by sequencing. All primers used and cloning details are provided in the Supplementary Information.

Transfections and reporter assays

All transfections used a 3 μl:1 μg ratio of Fugene (Roche, Castle Hill, NSW, Australia) transfection reagent to DNA. For luciferase assays, either MCF7 or MDA-MB-231 cells were co-transfected with 400 ng of promoter construct and 10 ng of RL-TK plasmid (Promega) as a transfection control, then harvested and assayed for reporter activity after 48 h. The Dual-Glo luciferase Assay kit (Promega) was used as recommended by the manufacturer. Firefly luciferase levels were normalized to Renilla luciferase levels and expressed relative to pGL3-basic levels (RLU). Statistical analysis was performed using an unpaired two-tailed t-test.
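The normalization described above (firefly over Renilla, then relative to pGL3-basic) can be illustrated with a minimal sketch. The function names and the example readings are hypothetical and are not taken from the study; the t-test itself is not shown.

```python
from math import sqrt
from statistics import mean, stdev


def relative_luciferase(firefly, renilla, basic_firefly, basic_renilla):
    """Firefly/Renilla ratio of a construct relative to the pGL3-basic ratio (RLU)."""
    sample_ratio = firefly / renilla              # corrects for transfection efficiency
    basic_ratio = basic_firefly / basic_renilla   # promoter-less background
    return sample_ratio / basic_ratio


def mean_sem(values):
    """Mean and standard error of the mean across independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical readings from three independent experiments
rlu = [
    relative_luciferase(5000, 100, 1000, 100),
    relative_luciferase(4000, 100, 1000, 100),
    relative_luciferase(6000, 100, 1000, 100),
]
average, sem = mean_sem(rlu)
```

Under this convention, the fivefold activity threshold used to call a functional promoter corresponds to an RLU of at least 5.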

For minigene experiments, MDA-MB-231 cells were grown to 60–70% confluence in 6-well plates, and transfected with 1 μg of DNA and harvested after 72 h.

Identification of TSS of Hsa-mir-200b

Total RNA from MCF7 cells transfected with the P2 luciferase reporter construct was extracted using Trizol (Invitrogen) and treated with DNaseI (NEB, Ipswich, MA, USA). First-strand cDNA was reverse transcribed at 50 °C using SuperScript III (Invitrogen) and a luciferase-specific primer, R1, and then served as a template for PCR amplification. PCR ‘walking’ towards the 5′ end was performed using primers A to D with R2. All PCR products were visualized on a 1% agarose gel. Primer sequences are given in Supplementary Table 3. Classical 5′RACE was performed as previously described (2005).

Quantitation of miRNAs

Total RNA was extracted from cell lines using Trizol (Invitrogen). RNA from clinical samples was extracted using the miRNeasy kit (Qiagen). For miR200b and miR335 experiments, cDNA was made from total RNA using the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems, Mulgrave, VIC, Australia) with both the miRNA and RNU6B (loading control) reverse transcription primers in the same reaction. Real-time PCR was performed using the TaqMan microRNA Assay (Applied Biosystems) according to the manufacturer's instructions. For all other miRNAs, the Qiagen miScript PCR system for miRNA quantification was used with the RNU6B loading control. Changes in expression levels were calculated using the ΔΔCt method (Livak and Schmittgen, 2001).
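As an illustration of the ΔΔCt calculation, the following Python sketch computes the fold change of a miRNA relative to a calibrator sample, with RNU6B as the reference. The function name, argument names and Ct values are hypothetical, and the calculation assumes approximately 100% amplification efficiency for both assays, as the method requires.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt   = Ct(target miRNA) - Ct(reference, e.g. RNU6B)
    ddCt  = dCt(sample) - dCt(calibrator)
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_calibrator = ct_target_calibrator - ct_ref_calibrator
    ddct = dct_sample - dct_calibrator
    return 2 ** (-ddct)


# Hypothetical Ct values: the sample reaches threshold two cycles earlier
# than the calibrator after normalization to the reference.
fold = fold_change_ddct(ct_target_sample=24.0, ct_ref_sample=20.0,
                        ct_target_calibrator=26.0, ct_ref_calibrator=20.0)
```

A value above 1 indicates higher expression in the sample than in the calibrator, and a value below 1 indicates lower expression.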

Bisulfite modification and methylation-sensitive high-resolution melt analysis

DNA extracted from cell lines (2 μg) was subjected to bisulfite modification with the MethylEasy Xceed kit (Human Genetic Signatures, Randwick, NSW, Australia) according to the manufacturer's instructions. PCR amplification and methylation-sensitive high-resolution melt analysis (Wojdacz and Dobrovic, 2007) were performed in duplicate on the RotorGene Q (Qiagen). Primers were designed according to the principles outlined by Wojdacz and Dobrovic (2007) to control for PCR bias and are shown in Supplementary Tables 4 and 5. PCR conditions are provided in the Supplementary Information. Bisulfite-treated CpGenome Universal Methylated DNA (Chemicon, Millipore, Kilsyth, VIC, Australia) and DNA from the appropriate cell lines were used as positive/methylated and negative/unmethylated controls, respectively. WGA DNA made with the GenomiPhi kit (Amersham GE Healthcare, Rydalmere, NSW, Australia) was used as the unmethylated control for miR335 and miR663. For the analysis of each region, the controls were also mixed at 25, 50 and 75% methylated to unmethylated template ratios.

In vitro methylation of plasmid DNA

DNA was methylated using SssI (NEB) as previously described (Klug and Rehli, 2006). Briefly, plasmids were incubated with SssI (2.5 U/μg) and 160 μM S-adenosylmethionine at 37 °C for 4 h, then supplemented with an additional 160 μM of S-adenosylmethionine for another 4 h at 37 °C. Mock-methylated plasmid controls were treated similarly but without enzyme. Plasmids were recovered by phenol/chloroform extraction followed by ethanol precipitation and transfected into T47D cells, and luciferase assays were performed.

Sequenom MassArray

Genomic DNA from clinical samples was bisulfite converted with the EZ-96 DNA methylation kit (Zymo Research, Irvine, CA, USA). Methylation levels in clinical samples were determined using Sequenom MassArray, performed according to the manufacturer's recommendations for the T-cleavage chemistry protocol and analysed within a mass window of 1640 to 7000 (Coolen et al., 2007). Average methylation of each patient is defined as the average percent methylation of all CpG units in each amplicon. Average methylation of each CpG cluster (or profile) is defined as the average percent methylation of the cohort for that specific CpG cluster. Methylation levels of 0–100% are represented as values from 0.0 to 1.0. Primer sequences are provided in Supplementary Table 4.
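The two averages defined above can be sketched in Python as follows. The data layout (a dictionary of patients, each mapping CpG unit names to values on the 0.0 to 1.0 scale), the names used and the example values are hypothetical. Skipping CpG units without a measurement (`None`) is also an assumption, since the text does not describe how missing values were handled.

```python
from statistics import mean


def patient_average(cpg_values):
    """Average methylation across the CpG units of one amplicon for one patient."""
    measured = [v for v in cpg_values if v is not None]
    return mean(measured)


def cpg_unit_averages(cohort):
    """Average methylation of each CpG unit across all patients in the cohort."""
    units = sorted({unit for values in cohort.values() for unit in values})
    averages = {}
    for unit in units:
        measured = [values[unit] for values in cohort.values()
                    if values.get(unit) is not None]
        averages[unit] = mean(measured)
    return averages


# Hypothetical cohort: methylation as fractions (0.0 = 0%, 1.0 = 100%).
cohort = {
    "P1": {"CpG_1": 0.2, "CpG_2": 0.4},
    "P2": {"CpG_1": 0.6, "CpG_2": None},  # CpG_2 not measured in P2
}

per_patient = {p: patient_average(v.values()) for p, v in cohort.items()}
per_unit = cpg_unit_averages(cohort)
```

The per-patient value summarizes one amplicon in one tumour, whereas the per-unit value gives the cohort-wide methylation profile along the amplicon.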