Computational Techniques for Genome Assembly and Analysis

COM S 551

Offered in the fall semester of odd-numbered years.

  1. Credits: 3 credit hours
  2. Instructor's or course coordinator's name: Xiaoqiu Huang
  3. Textbook, title, author, and year: None
  4. Other supplemental materials: None

Course Information

  1. Brief description of the content of the course: Introduction to a big data research area in bioinformatics. Focus on applying computational techniques to huge genomic sequence data. These techniques include finding optimal sequence alignments, generating genome assemblies, finding genes in genomic sequences, mapping short sequences onto a genome assembly, finding single-nucleotide and structural variations, building phylogenetic trees from genome sequences, and performing genome-wide association studies.
  2. Prerequisites or co-requisites: COM S 311

Course Outcomes

  • Be able to understand and use bioinformatics tools for the analysis of next-generation sequencing data.
  • Be able to conduct a research project in genomics by analyzing next-generation sequencing data.

Topics Covered

  1. Computing local alignments with SIM
  2. Computing a global alignment with GAP3
  3. Genome assembly with PCAP.Solexa
  4. Genome assembly using Velvet
  5. Viewing an assembly in Consed
  6. Transcriptome assembly with Trinity
  7. Gene finding with Augustus
  8. Gene finding with AAT
  9. Read mapping with Bowtie2
  10. Sequence mapping with BWA
  11. Manipulating alignment/map formats with Samtools and Picard
  12. Calling SNVs and CNVs with SpeedSeq
  13. Genome alignment using TBA
  14. Phylogenetic analysis using SNPs
  15. Viewing read alignments to a reference sequence using IGV
  16. Performing a genome-wide association study with PLINK and Haploview
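
Illustration: alignment scoring by dynamic programming

Topics 1 and 2 concern optimal local and global sequence alignments. As a rough illustration of the underlying idea only, the sketch below computes the optimal alignment score of two short strings with the textbook dynamic-programming recurrences (Needleman-Wunsch for global, Smith-Waterman for local). It is not the algorithm implemented in SIM or GAP3. The function name, the scoring values, and the linear gap penalty are assumptions chosen for brevity.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal alignment score of strings a and b.

    local=False: global alignment (Needleman-Wunsch recurrence).
    local=True:  local alignment (Smith-Waterman recurrence).
    The scoring values are illustrative assumptions; gaps are scored linearly.
    """
    # Row 0 of the DP matrix: aligning the empty prefix of a with prefixes of b.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0  # best cell seen so far (only used for local alignment)
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b.
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] aligned with a gap
                curr[j - 1] + gap,  # b[j-1] aligned with a gap
            )
            if local:
                # A local alignment may start anywhere, so scores never go below 0.
                score = max(score, 0)
                best = max(best, score)
            curr.append(score)
        prev = curr  # keep only one row: memory is linear in len(b)
    return best if local else prev[-1]


if __name__ == "__main__":
    print(alignment_score("ACGT", "AGT"))                      # global score
    print(alignment_score("TTACGTT", "GGACGGG", local=True))   # local score
```

Only two rows of the matrix are kept, so memory grows with the length of one sequence rather than with the product of both lengths. Recovering the alignment itself, rather than just its score, requires either the full matrix or a divide-and-conquer strategy.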