Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb.) : materials of temporary research teams

Description

Publication type: conference paper, conference abstract, article from conference proceedings

Conference: International Multiconference on Bioinformatics of Genome Regulation and Structure\Systems Biology (BDRS\SB); Novosibirsk, RUSSIA

Year of publication: 2019

DOI: 10.1186/s12859-018-2570-y

Keywords: de novo genome assembly, Siberian larch, Larix sibirica, Forestry, Numerical methods, Arabidopsis thaliana, De novo assemblies, Genome assembly, Nucleotide sequences, Pinus lambertiana, Pseudotsuga menziesii, Genes, article, Douglas fir, genome, human, Larix

Abstract: Background: De novo assembly of large genomes, such as those of conifers (~12-30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intense endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly programs (DNA assemblers). As a rule, modern assemblers are designed to assemble genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide sufficient coverage for a high-quality assembly. Results: This paper presents an original stepwise method of de novo assembly by parts (sets), which makes it possible to bypass the limitations of modern assemblers associated with the huge amount of data being processed. The results of numerical assembly experiments conducted using the model plant Arabidopsis thaliana, Prunus persica (peach), and the four most popular assemblers (ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell) showed the validity and effectiveness of the proposed stepwise assembly method. Conclusion: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo with the CLC Assembly Cell assembler. It is the first genome assembly for a larch species, in addition to only five other conifer genomes sequenced and assembled: Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii.

Full-text links

Publication

Journal: BMC BIOINFORMATICS

Journal issue: Vol. 20

Page numbers: 37

Journal ISSN: 14712105

Place of publication: LONDON

Publisher: BMC

Persons

  • Kuzmin Dmitry A. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Siberian Fed Univ, Inst Space & Informat Technol, Dept High Performance Comp, Krasnoyarsk 660074, Russia)
  • Feranchuk Sergey I. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Natl Res Tech Univ, Dept Informat, Irkutsk 664074, Russia; Russian Acad Sci, Limnol Inst, Siberian Branch, Irkutsk 664033, Russia)
  • Sharov Vadim V. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Siberian Fed Univ, Inst Space & Informat Technol, Dept High Performance Comp, Krasnoyarsk 660074, Russia)
  • Cybin Alexander N. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Siberian Fed Univ, Inst Space & Informat Technol, Dept High Performance Comp, Krasnoyarsk 660074, Russia)
  • Makolov Stepan V. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Siberian Fed Univ, Inst Space & Informat Technol, Dept High Performance Comp, Krasnoyarsk 660074, Russia)
  • Putintseva Yuliya A. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia)
  • Oreshkova Natalya V. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Russian Acad Sci, VN Sukachev Inst Forest, Lab Forest Genet & Select, Siberian Branch, Krasnoyarsk 660036, Russia)
  • Krutovsky Konstantin V. (Siberian Fed Univ, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk 660036, Russia; Georg August Univ Gottingen, Dept Forest Genet & Forest Tree Breeding, D-37077 Gottingen, Germany; Russian Acad Sci, NI Vavilov Inst Gen Genet, Lab Populat Genet, Moscow 119333, Russia; Texas A&M Univ, Dept Ecosyst Sci & Management, College Stn, TX 77843 USA)