Data Analytics Core
The PreMiEr Data Analytics Core is integral to the success of the ERC: it supports the hardware and software needs of all other research thrusts. Its goals fall into three main areas:
A seamless, central repository with core services providing
- Scientific data processing and analysis support
- Report and figure creation services
- Seamless and automated ability to start and modify processes and analyses
Ensuring transparency and reproducibility
- Eliminate “silos” of data
- Allow teams to easily work together
- Publish results that fully expose the processing pipeline, together with the virtualized containers used to produce them
- Support version control
Being “hardware agnostic”
- Allow any PreMiEr researcher (or other interested scientist) to faithfully reproduce analyses regardless of hardware
- Provide virtualized containers for on-campus and other systems, including a primary system at UNC Charlotte, the ViCAR system at N.C. A&T, and the cloud
Currently Funded Projects
DA-1: Overcoming contamination in low-biomass microbiome measurements
Indoor environments are often low-biomass settings, meaning the number of microbes present is relatively low. Accurately sampling and sequencing the DNA or RNA of these microbes, and distinguishing the resulting communities from contaminants, is a major concern. This project aims to develop the next generation of low-biomass microbial measurements that are effective, economical, and relevant to the wide spectrum of built environment sample types.
Collaborations
Benjamin Callahan
NC State, Project Lead
Kristen Rhinehardt
N.C. A&T
Lawrence David
Duke
DA-2: Building a Building Genome Collection (BBuGCo)
Microbiome research depends on the microbial genomes available in databases. Microbiome sequence data that cannot be mapped to a genome in a database is biological “dark matter”, and it is difficult or impossible to derive biological meaning from it. This project aims to determine optimal sequencing and analysis approaches for different sample types and to identify, for each sample type, which sequencing approach is most amenable to generating high-quality metagenome-assembled genomes (MAGs).
Collaborations
Joshua Granek
Duke, Project Lead
Claudia Gunsch
Duke
Rachel Noble
UNC-Chapel Hill
Jennifer Kuzma
NC State
Benjamin Callahan
NC State
Scott Harrison
N.C. A&T
DA-3: Development of a computing infrastructure network and bioinformatic workflows for microbiome research
This project will determine the computational assets and needs across the center and develop a comprehensive PreMiEr computational research and training infrastructure to support bioinformatics and microbiome research. It will do so by assessing the computational resources within PreMiEr and by developing bioinformatic workflows with training modules accessible to all participants.
Collaborations
Kristen Rhinehardt
N.C. A&T, Project Lead
Anthony Fodor
UNC Charlotte
Joshua Granek
Duke