Detection of Novel Splicing and Structural Variants through Next-generation Transcriptome Sequencing
My current research project focuses on the computational analysis of next-generation high-throughput transcriptome sequencing datasets, with the aim of developing algorithms that detect transcriptional variations of an individual organism in terms of alternative splicing and novel structural alterations. The recent advent of next-generation sequencing technologies, which allow high-throughput, low-cost genome and transcriptome sequencing at the price of very short reads (30-100 base pairs) with a high error rate, has fundamentally changed genomic and transcriptomic studies and necessitates efficient, reliable and highly sophisticated computational methods. A fundamental problem for transcriptional variation studies is the gapped alignment problem, in which a transcript sequence (in the form of a short read or a contiguous sequence of assembled short reads) must be aligned to a large genomic sequence, spanning long gaps in between, in order to identify the exon-intron structure of the transcript. A specific problem that I am currently investigating is an advanced version of this gapped alignment problem, in which complex structural alterations, which can appear in the form of duplications, inversions, rearrangements and fusions, are also taken into account. These alterations can be caused either by genomic mutations that alter the structure of an active gene, or by an abnormal execution of the transcriptional machinery in which the same sequence is transcribed twice, a sequence is inverted during transcription, and/or two separate transcripts are fused into a single product. A computational method that can effectively identify such complex events would be of great significance to personalized genome research, since it could identify complex transcriptional variations that might lead to the discovery of personal biomarkers for cancer diagnosis and tumour prognosis.
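
To make the gapped alignment problem described above concrete, the following is a minimal toy sketch, not the method developed in this project. It assumes error-free reads, exact string matching, and a single gap per read. The function names (`find_all`, `split_align`), the `min_anchor` parameter and the example sequences are all hypothetical and chosen only for illustration.

```python
def find_all(text, pattern):
    """Return every start index at which pattern occurs exactly in text."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def split_align(read, genome, min_anchor=4):
    """Toy two-block gapped alignment (exact matches only, one gap).

    Tries every split point of the read and reports placements where the
    prefix and the suffix both match the genome, in order, with a gap
    (a candidate intron) between them.

    Returns a list of (block1_start, block1_end, block2_start, block2_end)
    tuples in 0-based, end-exclusive genome coordinates.
    """
    placements = []
    # Each block must be at least min_anchor bases long (hypothetical cutoff).
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        for p in find_all(genome, prefix):
            block1_end = p + split
            for s in find_all(genome, suffix):
                if s > block1_end:  # require a gap of at least one base
                    placements.append((p, block1_end, s, s + len(suffix)))
    return placements


# Hypothetical example: two exons separated by one intron.
exon1 = "ACGTACGGTC"
intron = "GTAAG" + "T" * 8 + "CAG"
exon2 = "CCATGGATTC"
genome = exon1 + intron + exon2

# A short read spanning the exon1/exon2 junction of the spliced transcript.
read = exon1[-6:] + exon2[:6]

hits = split_align(read, genome)
print(hits)  # [(4, 10, 26, 32)]
```

In this example the gap between the two reported blocks, `genome[10:26]`, is exactly the intron. Handling the structural alterations discussed above would require relaxing the assumptions of this sketch, for example by also searching the reverse complement of a block (inversions) or by dropping the requirement that the second block lies downstream of the first (rearrangements and duplications), and by tolerating sequencing errors.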