Integrating multiple oestrogen receptor alpha ChIP studies: overlap with disease susceptibility regions, DNase I hypersensitivity peaks and gene expression.
Handel AE., Sandve GK., Disanto G., Handunnetthi L., Giovannoni G., Ramagopalan SV.
BACKGROUND: A wealth of nuclear receptor binding data has been generated by the application of chromatin immunoprecipitation (ChIP) techniques. However, there have been relatively few attempts to apply these datasets to human complex disease or traits. METHODS: We integrated multiple oestrogen receptor alpha (ESR1) ChIP datasets in the Genomic Hyperbrowser. We analysed these datasets for overlap with DNase I hypersensitivity peaks, genes differentially expressed after estradiol treatment and regions near single nucleotide polymorphisms associated with sex-related diseases and traits. We used FIMO to scan ESR1 binding sites for classical ESR1 binding motifs drawn from the JASPAR database. RESULTS: We found that binding sites present in multiple datasets were enriched for classical ESR1 binding motifs, DNase I hypersensitivity peaks and differentially expressed genes after estradiol treatment compared with those present in only a few datasets. There was significant enrichment of ESR1 binding present in multiple datasets near genomic regions associated with breast cancer (7.45-fold, p = 0.001), height (2.45-fold, p = 0.002), multiple sclerosis (5.97-fold, p < 0.0002) and prostate cancer (4.47-fold, p = 0.0008), and suggestive evidence of ESR1 enrichment for regions associated with coronary artery disease, ovarian cancer, Parkinson's disease, polycystic ovarian syndrome and testicular cancer. Integration of multiple cell line ESR1 ChIP datasets also increases overlap with ESR1 ChIP-seq peaks from primary cancer samples, further supporting this approach as helpful in identifying true positive ESR1 binding sites in cell line systems. CONCLUSIONS: Our study suggests that integration of multiple ChIP datasets can highlight binding sites likely to be of particular biological importance and can provide important insights into understanding human health and disease. However, it also highlights the high number of likely false positive binding sites in ChIP datasets drawn from cell lines and illustrates the importance of considering multiple independent experiments together.
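
ILLUSTRATION: The two quantities this abstract relies on, the number of datasets supporting a binding site and the fold enrichment of binding sites in disease-associated regions, can be sketched as below. This is a minimal illustration under stated assumptions, not the Genomic Hyperbrowser analysis used in the study: the function names, the half-open interval convention, the toy coordinates and the simple coverage-based expectation are all hypothetical, and no significance test is included.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def support_count(site: Interval, datasets: List[List[Interval]]) -> int:
    """Number of ChIP datasets with at least one peak overlapping the site."""
    return sum(any(overlaps(site, peak) for peak in peaks) for peaks in datasets)


def fold_enrichment(sites: List[Interval], regions: List[Interval],
                    genome_size: int) -> float:
    """Observed fraction of sites overlapping the regions, divided by the
    fraction of the genome the regions cover.

    Assumes the regions do not overlap one another and that binding sites
    would otherwise fall uniformly across the genome (a simplification).
    """
    if not sites or not regions:
        raise ValueError("sites and regions must be non-empty")
    observed = sum(any(overlaps(s, r) for r in regions) for s in sites) / len(sites)
    expected = sum(end - start for _, start, end in regions) / genome_size
    return observed / expected


# Toy data (invented for illustration only).
datasets = [
    [("chr1", 100, 200), ("chr1", 500, 600)],
    [("chr1", 150, 250)],
    [("chr1", 180, 220), ("chr1", 900, 1000)],
]
shared_site = ("chr1", 100, 200)    # supported by all three toy datasets
private_site = ("chr1", 500, 600)   # supported by one toy dataset only
disease_regions = [("chr1", 0, 250)]

n_shared = support_count(shared_site, datasets)
n_private = support_count(private_site, datasets)
fold = fold_enrichment([shared_site, private_site], disease_regions,
                       genome_size=1000)
```

In this sketch, sites could be filtered by `support_count` before calling `fold_enrichment`, mirroring the comparison between sites present in multiple datasets and those present in only a few. A real analysis would also need a null model to attach p-values to the fold values.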