Ph.D. Bioinformatics and Genomics, Pennsylvania State University, 2010
B.S. Biotechnology, Peking University, 2002
2014-present, Assistant Professor, Crop and Soil Environmental Science, Virginia Tech
2010-2014, Postdoctoral Associate, Biology Department and Institute for Genome Sciences and Policy, Duke University
2002-2010, Graduate Research Assistant, Biology Department, Pennsylvania State University
2001-2002, Undergraduate Research Assistant, Department of Biotechnology, Peking University
Bioinformatics for Next Generation Sequencing, Fall 2014
GBCB 5874 (CSES course number pending): Problem Solving in Genetics, Bioinformatics and Computational Biology, Spring 2015
The complete sequences of many reference genomes of important crop species provide the opportunity to accelerate crop improvement using genomic and computational methods. The growing capability and falling cost of sequencing technology have transformed genomics into a data-rich science. The new challenge for computational biologists is how to extract biologically meaningful knowledge from this “Big Data”. Our research team focuses on developing computational tools for genome-scale data analysis. The goal is to understand the connections between genotypes and phenotypes by developing computational algorithms that integrate large-scale data from genomics, transcriptomics, proteomics and metabolomics. Our algorithms will be able to generate testable hypotheses for bench or field validation.
Our current main research projects include:
1) Comparative transcriptome analysis: Recent advances in high-throughput sequencing have revealed many novel components of the transcriptome, such as non-coding RNAs, antisense transcripts and alternative splicing variants. We are interested in identifying functional components of the transcriptome using comparative genomics approaches in various plant and crop species.
2) Regulatory networks: Temporal and spatial gene expression patterns are key to the physiological functions of living organisms, and these patterns are controlled by complex regulatory networks. We are applying advanced statistical inference and machine learning algorithms to integrate multiple genomic datasets and to identify the regulatory networks and key regulatory mechanisms underlying gene expression.
3) Next generation functional genomics: We are also interested in developing new and powerful statistical methods for GWAS and QTL mapping. Our methods will integrate genetic data and gene expression data to provide improved functional genomic annotation and to identify candidate genes underlying traits of agricultural importance.
- Identifying conserved alternative splicing variants and antisense transcripts in diverse plant species.
- Developing a computational framework to systematically define the best parameters and pipelines for mapping and assembly of next generation sequencing reads in crop species.
- Developing machine learning methods that identify active regulatory networks controlling cell type- or condition-specific gene expression.
- Improving Hidden Markov Model (HMM)-based gene prediction methods. Our new method will incorporate diverse genomic evidence, such as PEAT-seq, Ribo-seq and evolutionary conservation, to improve splicing variant discovery and annotation in both model species and crop species.
Role of Graduate Students
Computational biology is a fast-changing discipline. For a student to achieve a successful career in this field, knowledge of how to operate existing software is necessary but not sufficient. Five years from now, many software packages will be obsolete because better-performing technology will have replaced them, and new computational tools will continue to emerge. To be successful in this dynamic field, a graduate student should master a range of programming skills, understand the principles of the machine learning and statistical methods commonly used in bioinformatics, and learn to formulate research questions in genomics and bioinformatics as problems that can be solved by well-grounded computational algorithms. A graduate student in our group will also gain working experience in a highly collaborative environment. Graduate students should adhere to the principle of reproducible research, develop robust and easy-to-use software packages, prepare reports, publish results in scientific journals, and collaborate with colleagues at Virginia Tech and in the broader scientific community.
I will strive to provide a positive and rewarding research environment, help students improve their public speaking and scientific writing skills, create opportunities for internal and external collaboration, and encourage students to attend scientific conferences to present their work. Together, we will perform exciting, original and interdisciplinary research in computational biology to solve challenging problems in our discipline.
The long-term goal of our research program is to advance our knowledge of basic biology and to apply that knowledge to crop improvement. Our future computational tools will aim to translate molecular function and pathway information from model organisms into testable predictions in crop species. We will also develop multi-level, multi-scale modeling tools to help us better understand how biological systems interact with each other. One particularly exciting direction we would like to explore is the interaction between soil microbial communities and crop species.
(540) 231-2756
220 Ag Quad Lane
508 Latham Hall