ValIdated Systematic IntegratiON of hematopoietic epigenomes
The VISION project began in the summer of 2016. This description and list of deliverables outline what we plan to accomplish. Some resources are available now, and links to them are provided on the VISION home page.
The VISION project (ValIdated Systematic IntegratiON of hematopoietic epigenomes) is under consideration for funding via the FOA on Collaborative Interdisciplinary Team Science in NIDDK Research Areas. The problem we address is how to use the enormous amounts of emerging epigenetic data effectively for both basic research and precision medicine. We will consolidate hundreds of epigenomic datasets and apply integrative approaches to generate robust candidate functional assignments for DNA segments. These assignments, coupled with gene target predictions and results of genome editing experiments, will be the input to machine-learning approaches that will generate quantitative models of how each candidate cis-regulatory module (CRM) contributes to the regulation of its target gene. Importantly, these models will be rigorously tested and validated by targeted genome editing in reference loci, and then applied genome-wide. Furthermore, we will expand resources to enable more accurate translation of regulatory insights between mouse and human.
The data and resources generated in our VISION project are intended to enable better research by a large community of investigators. The deliverables from our project will harvest the truly valuable information within the flood of epigenomic data, and provide the results of the integrative analysis, modeling, and experimental validations in a form that the larger community can readily use. Each member of our investigative team is committed to the goal of building resources to help the larger community find answers to enduring questions in hematopoiesis and accelerate improvements in therapy for hematological disorders.
Thus, we embrace the imperative that the data and resources be released to the public rapidly and in a form that is both understandable and usable by the wider community. Furthermore, we are fully aware of the need for transparency and accuracy at all levels of data acquisition and analysis, from the metadata describing each sample analyzed (mouse strain, cell type and how it was isolated, experimental procedures used, sequencing methodology used, etc.) to the pipelines used for mapping and analyzing sequencing reads and the integrative analyses. Some of the PIs are heavily involved in the Galaxy project, which not only provides a computational platform enabling sophisticated analysis of large datasets by a wide community, but is also designed to ensure transparency and reproducibility in analysis. Several of the PIs are active in the ENCODE projects and have been active in large-scale sequencing projects.
Deliverables
This project will deliver three categories of information, each supported by web-based resources:
The modes of delivery are readily accessible, web-based platforms including customized browsers, databases with facile query interfaces, and data-driven on-line tools. The links on the homepage take you to existing interfaces.