Metagenomic Assembly Evaluation

Decreasing the assembler space

Ino de Bruijn / @inodb
Presentation at ino.pm/2014hack

de Bruijn Graph

de Bruijn graph
  • cut up reads into overlapping kmers
  • the number of unique kmers determines the computational requirements
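
As a rough illustration of the two bullets above, here is a minimal sketch of cutting reads into overlapping kmers and counting the unique ones. The function names `kmers` and `unique_kmers`, the toy reads, and the choice of k are assumptions made for this example and are not from the talk. It also ignores reverse complements, which real assemblers typically collapse into canonical kmers.

```python
def kmers(read, k):
    """Yield every overlapping kmer of length k from a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def unique_kmers(reads, k):
    """Collect the set of distinct kmers seen across all reads."""
    seen = set()
    for read in reads:
        seen.update(kmers(read, k))
    return seen


# Toy input (hypothetical): two reads that overlap by five bases.
reads = ["ACGTAC", "CGTACG"]
total = sum(len(r) - 4 + 1 for r in reads)
print(total, len(unique_kmers(reads, 4)))  # 6 kmers in total, 4 of them unique
```

The set of unique kmers is what has to be held in memory in this sketch, so its size rather than the number of reads is the quantity that grows the footprint.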

Purity noscaf recipes

purity hist

Purity thresholds

Kmer origin

kraken venn

Impure contigs

kraken venn
  • Number of chimeric kmers: 219020
  • Number of erroneous kmers: 89314
kraken venn
  • unclassified: not in any of the categories below
  • a_tip_error: has erroneous kmers at a tip, but also other erroneous kmers
  • one_tip_error: has one tip with erroneous kmers and no other erroneous kmers
  • two_tip_error: has two tips with erroneous kmers and no other erroneous kmers
  • one_break: has one point not located at the tips with only erroneous kmers
  • 100% qrycov: the query contig is completely aligned to the reference
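
The kmer-based categories above could be assigned roughly as follows. This is a sketch under stated assumptions, not the code behind the slides: each contig is represented as a list of booleans (True = erroneous kmer at that position), a "tip" is taken to be a run of erroneous kmers touching the first or last position, and the names `erroneous_runs` and `classify_contig` are hypothetical. The 100% qrycov category depends on the alignment to the reference and is not covered here.

```python
from itertools import groupby


def erroneous_runs(flags):
    """Return (start, end) pairs, end exclusive, for each run of erroneous kmers."""
    runs = []
    pos = 0
    for is_err, group in groupby(flags):
        length = len(list(group))
        if is_err:
            runs.append((pos, pos + length))
        pos += length
    return runs


def classify_contig(flags):
    """Assign one of the kmer-based categories to a contig (assumed definitions)."""
    n = len(flags)
    runs = erroneous_runs(flags)
    # Assumption: a run counts as a tip if it touches either end of the contig.
    tip_runs = [r for r in runs if r[0] == 0 or r[1] == n]
    inner_runs = [r for r in runs if r[0] != 0 and r[1] != n]
    if tip_runs and inner_runs:
        return "a_tip_error"
    if len(tip_runs) == 1 and not inner_runs:
        return "one_tip_error"
    if len(tip_runs) == 2 and not inner_runs:
        return "two_tip_error"
    if not tip_runs and len(inner_runs) == 1:
        return "one_break"
    return "unclassified"


print(classify_contig([True, False, False, False]))  # erroneous kmers at one end only
print(classify_contig([False, True, True, False]))   # a single internal erroneous run
```

Contigs with several internal runs and no tip errors fall through to unclassified in this sketch, which matches the "not in any of the categories below" wording.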

Predicting contig impurity

  • Tried several assembly quality predictors without luck
    • FRCurve
    • REAPR
    • ALE

Current State masmvaliweb

masmvali django

Current State masmvaliweb

masmvali flask

Stop

Ask questions

Time