# Main Page

This is the course web page for BTRY 4840/6840, "Computational Genomics" (Fall 2011).

Please check this page frequently throughout the semester. It will be updated continually with information you will need. Keep in mind that the lecture and homework schedules are provisional.

## Announcements

- Homework #5 has been posted.
- Homework #4 has been posted.
- Homework #3 has been posted.
- Homework #2 has been posted.
- Class is cancelled for Sep. 8. This will be true even if the University opens at 11AM.
- The due date for homework #1 has been moved to Sep 12.
- We now have a room for the discussion section: Comstock B108 (4:00-4:50PM Tues). Unfortunately, this room does not support videoconferencing.
- Homework #1 has been posted. It is due Sep 9. Note that you will need a password to access PDFs from this site. If you need it before Tues you can email me.
- The course mailing list is now in place (btry4840-l@cornell.edu). To join, follow these instructions.
- We have tentatively rescheduled the discussion section for Tues 4-5PM. We are still working on finding a room. We plan to hold the first meeting on Tues 8/30.
- Office hours have been rescheduled to Thurs 4-5PM, to avoid a conflict with the discussion section.
- The room for the course has been changed to Weill 224, to permit videoconferencing with NYC. We will meet in Weill 224 every day **except October 20**, when we will be in **Weill 321**.

## General Information

- Lectures: Tues/Thurs, 11:40-12:55, Weill 224
- Recitations: Tues, 4:00-4:50, Comstock B108
- Credit Hours: 4 (S/U or letter)
- Instructor: Adam Siepel, 102E Weill
- TA: Brad Gulko, 102 Weill
- Instructor Office Hours: Thurs, 4:00-5:00
- Mailing list: btry4840-l@cornell.edu

## Resources

- Syllabus
- Project guidelines and ideas
- Practicum Slides

## Books

- Primary textbook:
  - Durbin R, Eddy SR, Krogh A, and Mitchison G [DEKM], *Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids*, Cambridge University Press, 1998 (10th printing, 2006).
- Auxiliary bioinformatics books:
  - Felsenstein J [F], *Inferring Phylogenies*, Sinauer Associates, Inc., 2004.
  - Jones NC, Pevzner PA [JP], *An Introduction to Bioinformatics Algorithms*, MIT Press, 2004.
  - Deonier RC, Tavare S, Waterman MS [DTW], *Computational Genome Analysis: An Introduction*, Springer, 2005.
  - Ewens WJ, Grant G [EG], *Statistical Methods in Bioinformatics: An Introduction*, Springer, 2005.
- Recommended reference books:
  - Hogg RV, Craig AT [HC], *Introduction to Mathematical Statistics*, Prentice Hall, 1995.
  - Casella G, Berger RL [CB], *Statistical Inference*, Duxbury Press, 2001.
  - Cormen TH, Leiserson CE, Rivest RL, Stein C [CLRS], *Introduction to Algorithms*, MIT Press, 2001.
  - Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R [MBG], *Molecular Biology of the Gene*, CSHL Press, 2004.

## Online tutorials

Here is a small collection of possibly useful online tutorials on genomics and bioinformatics. Please email me if you find others that are particularly useful.

- DOE HGP Genomics Primers
- NCBI "What is a Genome" Primer
- Wikipedia on bioinformatics

## Lecture Schedule

Date | Readings | Topics | Slides |
---|---|---|---|
Aug 25 | MBG or similar as needed | Course introduction. | |
Aug 30 | DEKM ch 1, pp 300-314; DTW ch 2&3; HC, CB or similar as needed | Molecular biology background. Probability and statistics intro. | |
Sep 1 | J&P ch 2 | More statistics background. | |
Sep 6 | – | More statistics background. | |
Sep 8 | DEKM ch 2; JP ch 5 or CLRS ch 15 | Finish statistics. Dynamic programming. | |
Sep 13 | DEKM pp 320-322 | Sequence alignment. | |
Sep 15 | Wasserman & Sandelin review article | More on alignment; introduction to motif models. | |
Sep 20 | – | Information theory; Markov models. | |
Sep 22 | DEKM pp 46-58; Supplementary: Eddy Rabiner | Hidden Markov models. | |
Sep 27 | DEKM pp 58-61, 68-79 | More on HMMs. | |
Sep 29 | DEKM pp 160-165, 173-176, 192-202; F ch 1&2, pp 196-206, 248-255 | More on HMMs; phylogenetic models. | |
Oct 4 | – | Phylogenetic models. | |
Oct 6 | DEKM pp 202 | More on phylogenetic models. | |
Oct 11 | – | Happy Fall Break! | – |
Oct 13 | DEKM pp 165-172,189-191; F ch 11 | Phylogeny reconstruction (Ilan Gronau). | |
Oct 18 | Jordan ch 2; See also this review | General graphical models. | |
Oct 20 | Jordan ch 3,4,9 | More on graphical models. | |
Oct 25 | DEKM pp 323-325, Jordan ch 9-10 | Expectation maximization (EM). | |
Oct 27 | DEKM pp 63-66 | EM for HMMs and motif models. | |
Nov 1 | DEKM pp 314-319, 154-159; Jordan ch 21 | Introduction to MCMC. | |
Nov 3 | – | Gibbs sampling and use in motif finding. | |
Nov 8 | – | Positive selection (Leo Arbiza). | |
Nov 10 | DEKM ch 4 | Statistical pairwise alignment. | |
Nov 15 | – | Fast heuristic alignment and short-read sequence analysis. | |
Nov 17 | DEKM ch 9 & 10 | Stochastic context-free grammars and RNA structure prediction. | |
Nov 22 | DEKM ch 6 | Multiple alignment. | |
Nov 24 | – | Happy Thanksgiving! | – |
Nov 29 | – | Gene duplication and loss (Matt Rasmussen) | |
Dec 1 | Phylo-HMM review | Applied comparative genomics. | |

## Homework Schedule

Homework | Date Assigned | Date Due | Topics | Data |
---|---|---|---|---|
HW#1 | Aug 26 | Sep 12 | Probability and statistics warm up | |
HW#2 | Sep 10 | Sep 23 | Dynamic programming and sequence alignment | sequences.fa |
HW#3 | Sep 24 | Oct 7 | Motif models, HMMs | sequence.fa |
HW#4 | Oct 11 | Oct 24 | Phylogenetic models | apoe.fa |
Proposal | Oct 25 | Nov 11 | Detailed project proposal | |
HW#5 | Oct 24 | Nov 7 | EM and Gibbs sampling | |