PH.D DEFENCE - PUBLIC SEMINAR

Improvement and evaluation of genome assembly

Speaker
Mr Xie Luyu
Advisor
Dr Wong Lim Soon, KITHCT Chair Professor, School of Computing


29 Nov 2019 Friday, 02:00 PM to 03:30 PM

MR1, COM1-03-19

Abstract:

After ten years of development, Next-Generation Sequencing (NGS) has been successfully commercialized and is widely applied in many scenarios. As a basic application, de novo genome assembly has also benefitted from the development of many strategies, e.g. mate-pair libraries, optical maps, chromatin interactions, and genetic maps. Among these strategies, the genetic map is the most widely adopted in breeding studies. However, genetic map construction requires substantial computational resources and underperforms when the draft assembly contains misassemblies.

To address these limitations, I propose a new assembly-improving method, CAST. It corrects and scaffolds a draft assembly using the genetic information in progenies' NGS data, without constructing a genetic map. In principle, it first splits the draft assembly at genetically incoherent positions and then scaffolds contigs at genetically coherent positions. The method was evaluated on two public datasets and significantly improved the contiguity and correctness of the draft assemblies.

In the course of the CAST project, it was found that existing metrics provide only limited insight into specific aspects of genome assembly quality and sometimes even disagree with each other. To enable a more integrated comparison between assemblies, this thesis also proposes a new genome assembly metric, PDR. It is derived from a common question in genetic studies and takes completeness, contiguity, and correctness into account. Results on publicly available datasets show that it can assess the quality of a genome assembly in an integrated manner.