Song Li


Ph.D. Bioinformatics and Genomics, Pennsylvania State University, 2010

B.S. Biotechnology, Peking University, 2002


2014-present, Assistant Professor, Crop and Soil Environmental Science, Virginia Tech

2010-2014, Postdoctoral Associate, Biology Department and Institute for Genome Sciences and Policy, Duke University

2002-2010, Graduate Research Assistant, Biology Department, Pennsylvania State University

2001-2002, Undergraduate Research Assistant, Department of Biotechnology, Peking University

Link to curriculum vitae

Courses Taught

Bioinformatics for Next Generation Sequencing, Fall 2014

GBCB 5874 (CSES course number pending): Problem Solving in Genetics, Bioinformatics and Computational Biology, Spring 2015

Undergraduate Research

Working in a research laboratory is one of the most valuable experiences for undergraduate students. Our group provides the opportunity for motivated and enthusiastic undergraduate students to participate in research and analysis in the exciting, interdisciplinary field of computational biology. We are constantly looking for students with experience and skills in the following areas: programming and algorithm implementation in C, C++, Java, R or Python; website design and implementation using HTML and JavaScript; database administration; and proficiency with the Linux environment. You will gain experience by applying the programming and computational skills you have learned in the classroom to a realistic research setting. Please feel free to contact me regarding potential undergraduate research projects.

Program Focus

The complete sequences of many reference genomes of important crop species provide the opportunity to accelerate crop improvement using genomic and computational methods. The growing capability and falling cost of sequencing technology have transformed genomics into a data-rich science. The new challenge for computational biologists is how to extract biologically meaningful knowledge from this “Big Data.” Our research team focuses on developing computational tools for genome-scale data analysis. The goal is to understand the connections between genotypes and phenotypes by developing computational algorithms that integrate large-scale data from genomics, transcriptomics, proteomics and metabolomics. Our algorithms will generate testable hypotheses for bench or field validation. Our current main research projects include:

1) Comparative transcriptome analysis: Recent advances in high-throughput sequencing have revealed many novel components of the transcriptome, such as non-coding RNAs, antisense transcripts and alternative splicing variants. We are interested in identifying functional components in the transcriptome using comparative genomics approaches in various plant and crop species.

2) Regulatory networks: Temporal and spatial gene expression patterns are key to the physiological functions of living organisms. These patterns are controlled by complex regulatory networks. We are applying advanced statistical inference and machine learning algorithms to integrate multiple genomic datasets and to identify the regulatory networks and key regulatory mechanisms underlying gene expression.

3) Next generation functional genomics: We are also interested in developing new and powerful statistical methods for GWAS and QTL analysis. Our methods will integrate genetic data and gene expression data to provide improved functional genomic annotation and to identify candidate genes underlying traits of agricultural importance.

Current Research

  •  Identifying conserved alternative splicing variants and antisense transcripts in diverse plant species.
  •  Developing a computational framework to systematically define the best parameters and pipelines for mapping and assembly of next generation sequencing reads in crop species.
  •  Developing machine learning methods that identify active regulatory networks controlling cell type- or condition-specific gene expression.
  •  Improving Hidden Markov Model (HMM)-based gene prediction methods. Our new method will incorporate diverse genomic evidence such as PEAT-seq, Ribo-seq and evolutionary conservation to improve splicing variant discovery and annotation in both model species and crop species.

Role of Graduate Students

Computational biology is a fast-changing discipline. For a student to achieve a successful career in this field, knowledge of how to operate existing software is necessary but not sufficient. Five years from now, many software packages will be obsolete simply because better-performing technology will have replaced the old, and new computational tools will continue to emerge. To be successful in this dynamic field, a graduate student should master a range of programming skills, understand the principles of the machine learning and statistical methods commonly used in bioinformatics, and learn to formulate research questions in genomics and bioinformatics as problems that can be solved by well-grounded computational algorithms. A graduate student in our group will also gain working experience in a highly collaborative environment. Graduate students should adhere to the principle of reproducible research, develop robust and easy-to-use software packages, prepare reports and publish results in scientific journals, and collaborate with colleagues at Virginia Tech and in the broader scientific community. I will strive to provide a positive and rewarding research environment, help students improve their public speaking and scientific writing skills, create opportunities for internal and external collaboration, and encourage students to attend scientific conferences to present their work. Together, we will perform exciting, original and interdisciplinary research in computational biology to solve challenging problems in our discipline.

Future Research

The long-term goal of our research program is to advance our knowledge of basic biology and to apply that knowledge to crop improvement. Our future computational tools will aim to translate molecular function and pathway information from model organisms into testable predictions in crop species. We will also develop multi-level, multi-scale modeling tools to help us better understand how biological systems interact with each other. One particularly exciting direction we would like to explore is the interaction between soil microbial communities and crop species.

  • (540) 231-2756
  • 220 Ag Quad Lane
    508 Latham Hall
    Blacksburg, Virginia