Poster Presentation Lorne Infection and Immunity 2014

Introducing RedDog: a mapping-based genome comparison pipeline for high-throughput bacterial sequence data. (#138)

David J Edwards 1, Kathryn E Holt 1, Bernard J Pope 2
  1. Biochemistry and Molecular Science, University of Melbourne, Melbourne
  2. VLSCI, Melbourne

It is now feasible to sequence thousands of bacterial isolates using high-throughput (HTP) sequencing platforms, which is revolutionizing the study of bacterial evolution and pathogen outbreak investigation. HTP sequencers produce millions of short reads, which can be compared to reference genomes to identify single nucleotide polymorphisms (SNPs); these SNPs form the basis for further phylogenetic and evolutionary analyses. Whilst sequencing is increasingly affordable, the major barrier to wider use of bacterial genomics is the sheer volume of sequence data that needs to be processed.
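As a toy illustration of SNP identification against a reference, the sketch below compares a reference sequence with an isolate's consensus sequence of the same length and lists the mismatching positions. It is a simplification for illustration only and is not RedDog's method: real pipelines call SNPs from read alignments with quality filtering. The function name `call_snps` and the example sequences are hypothetical.

```python
def call_snps(reference, consensus):
    """Return (position, ref_base, alt_base) for each mismatch.

    Positions are 1-based. Sites where either sequence holds a
    non-ACGT character (for example N for an uncalled base) are skipped.
    """
    if len(reference) != len(consensus):
        raise ValueError("sequences must be the same length")
    snps = []
    for pos, (ref, alt) in enumerate(zip(reference, consensus), start=1):
        if ref != alt and ref in "ACGT" and alt in "ACGT":
            snps.append((pos, ref, alt))
    return snps


# Hypothetical 10-base reference and isolate consensus.
reference = "ACGTACGTAC"
consensus = "ACGTTCGNAC"
snps = call_snps(reference, consensus)
print(snps)
```

The uncalled base at position 8 is ignored rather than reported, which mirrors the general idea that low-confidence sites should not contribute SNPs.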

Here we present RedDog, a computational pipeline for mapping, SNP calling and gene content comparison that can handle both large numbers of isolates and extensive pan-genomes (reference genomes with large numbers of chromosome, plasmid and accessory sequences). RedDog uses fast and accurate open-source software and has been tested on thousands of genomes from over a dozen pathogenic bacterial species. Several quality control steps are built into the pipeline, and quality statistics are reported for downstream analysis. RedDog is free and open source, can be easily modified, and provides a fast, robust and reliable method for generating high-resolution phylogenies and gene content matrices from large sets of raw sequencing reads.
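To make the idea of a gene content matrix concrete, the following minimal sketch turns per-gene mapping coverage into a presence/absence matrix across isolates. The input layout, the function name `gene_content_matrix` and the 0.95 coverage threshold are assumptions chosen for illustration; they are not taken from RedDog.

```python
def gene_content_matrix(coverage, min_coverage=0.95):
    """Build a presence/absence matrix from per-gene coverage.

    coverage maps isolate -> {gene: fraction of the gene covered by reads}.
    A gene is scored present (1) when its covered fraction reaches
    min_coverage (an assumed threshold), otherwise absent (0).
    Returns the sorted gene list and a dict of isolate -> row of 0/1 values.
    """
    genes = sorted({g for per_gene in coverage.values() for g in per_gene})
    matrix = {}
    for isolate, per_gene in coverage.items():
        matrix[isolate] = [
            1 if per_gene.get(g, 0.0) >= min_coverage else 0 for g in genes
        ]
    return genes, matrix


# Hypothetical coverage values for two isolates and two genes.
coverage = {
    "isolate_A": {"geneX": 1.00, "geneY": 0.40},
    "isolate_B": {"geneX": 0.97, "geneY": 0.99},
}
genes, matrix = gene_content_matrix(coverage)
print(genes)
for isolate, row in matrix.items():
    print(isolate, row)
```

Each row can then be compared across isolates, for example to see which accessory genes separate one lineage from another.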