De Novo Genome Assembly: Difference between revisions

From Algorithm Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 6: Line 6:
== Parameters ==  
== Parameters ==  


No parameters found.
$n$: sum of lengths of reads
 
$f$: number of input sequences


== Table of Algorithms ==  
== Table of Algorithms ==  
Line 22: Line 24:
| [[de Bruijn Graph (Idury, Waterman) (De Novo Genome Assembly De Novo Genome Assembly)|de Bruijn Graph (Idury, Waterman)]] || 1994 || $O(n^{2})$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/7497130 Time]
| [[de Bruijn Graph (Idury, Waterman) (De Novo Genome Assembly De Novo Genome Assembly)|de Bruijn Graph (Idury, Waterman)]] || 1994 || $O(n^{2})$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/7497130 Time]
|-
|-
| [[String Graph (Myers) (De Novo Genome Assembly De Novo Genome Assembly)|String Graph (Myers)]] || 1994 || $O(nlogn)$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/7497129 Time]
| [[String Graph (Myers) (De Novo Genome Assembly De Novo Genome Assembly)|String Graph (Myers)]] || 1994 || $O(n \log n)$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/7497129 Time]
|-
|-
| [[String Graph with Ferragina–Manzini Index (Simpson, Durbin) (De Novo Genome Assembly De Novo Genome Assembly)|String Graph with Ferragina–Manzini Index (Simpson, Durbin)]] || 2010 || $O(n)$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881401/pdf/btq217.pdf Time]
| [[String Graph with Ferragina–Manzini Index (Simpson, Durbin) (De Novo Genome Assembly De Novo Genome Assembly)|String Graph with Ferragina–Manzini Index (Simpson, Durbin)]] || 2010 || $O(n)$ || $O(n)$? || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881401/pdf/btq217.pdf Time]

Revision as of 08:22, 10 April 2023

Description

De novo sequencing refers to sequencing a novel genome where there is no reference sequence available for alignment. Sequence reads are assembled as contigs, and the coverage quality of de novo sequence data depends on the size and continuity of the contigs (ie, the number of gaps in the data).

Parameters

$n$: sum of lengths of reads

$f$: number of input sequences

Table of Algorithms

Name Year Time Space Approximation Factor Model Reference
Overlap Layout Consensus 1987 $O(n^{2})$ $O(n^{2})$? Deterministic
Greedy SEQAID 1984 $O(n^{2})$? $O(n^{2})$? Deterministic Time
de Bruijn Graph (Idury, Waterman) 1994 $O(n^{2})$ $O(n)$? Exact Deterministic Time
String Graph (Myers) 1994 $O(n \log n)$ $O(n)$? Exact Deterministic Time
String Graph with Ferragina–Manzini Index (Simpson, Durbin) 2010 $O(n)$ $O(n)$? Exact Deterministic Time
Hybrid Algorithm 1999 $O(n^{2})$ Exact Deterministic

Time Complexity Graph

De Novo Genome Assembly - Time.png

Space Complexity Graph

De Novo Genome Assembly - Space.png

Time-Space Tradeoff

De Novo Genome Assembly - Pareto Frontier.png