Data Analytics Core

The PreMiEr Data Analytics Core is integral to the success of the ERC, supporting the hardware and software needs of all other research thrusts. Its goals fall into three main areas:

A seamless, central repository with core services providing

  1. Scientific data processing and analysis support
  2. Report and figure creation services
  3. Seamless and automated ability to start and modify processes and analyses

Ensuring transparency and reproducibility

  1. Eliminate “silos” of data
  2. Allow teams to easily work together
  3. Publish results that fully expose the processing pipeline and publish virtualized containers
  4. Support version control

Being “hardware agnostic”

  1. Allow any PreMiEr researcher (or other interested scientist) to faithfully reproduce analyses regardless of hardware
  2. Provide virtualized containers for on-campus and other systems, including a primary system at UNC Charlotte, the ViCAR system at N.C. A&T, and the cloud

Currently Funded Projects

DA-1: Overcoming contamination in low-biomass microbiome measurements

Indoor environments often reflect low-biomass conditions, in which the number of microbes is relatively low. Accurately sampling and sequencing the DNA or RNA of these microbes, and distinguishing these communities from contaminants, is a major concern. This project aims to develop the next generation of low-biomass microbial measurements that are effective, economical, and relevant to the wide spectrum of built environment sample types.

Collaborations

Benjamin Callahan

NC State, Project Lead

Kristen Rhinehardt

N.C. A&T

Lawrence David

Duke

DA-2: Building a Building Genome Collection (BBuGCo)

Microbiome research depends on the microbial genomes in databases. Microbiome sequence data that cannot be mapped to a genome in a database is biological “dark matter”, from which it is difficult or impossible to derive biological meaning. This project aims to determine optimal sequencing and analysis approaches for different sample types and to identify, for each sample type, which sequencing approach is most amenable to generating high-quality metagenome-assembled genomes (MAGs).

Collaborations

Joshua Granek

Duke, Project Lead

Claudia Gunsch

Duke

Rachel Noble

UNC-Chapel Hill

Jennifer Kuzma

NC State

Benjamin Callahan

NC State

Scott Harrison

N.C. A&T

DA-3: Development of a computing infrastructure network and bioinformatic workflows for microbiome research

This project will determine the computational assets and needs across the center and develop a comprehensive PreMiEr computational research and training infrastructure to support bioinformatics and microbiome research. This will be accomplished through an assessment of the computational resources within PreMiEr and the development of bioinformatic workflows with training modules accessible to all participants.

Collaborations

Kristen Rhinehardt

N.C. A&T, Project Lead

Anthony Fodor

UNC Charlotte

Joshua Granek

Duke