MBV-INFX410 2015


Bioinformatics for Molecular Biology - 2015

THIS IS THE WIKI FOR 2015. INFORMATION FOR 2018 IS FOUND HERE!

Jan 15: The exam results have been submitted to the IBV administration and will be available soon. Some brief comments on the exam:

  • This was a home exam with plenty of time, and basically all materials were permitted. Roughly 1/3 of the students answered almost everything correctly. For this reason there was relatively little tolerance for careless mistakes such as not reading the questions properly, not answering all sub-questions, very odd or incomprehensible argumentation or spelling, and leaving parts of the exercises blank
  • Exercise 1b: 3 proteins are identical because they all represent identical splicing of a pre-mRNA corresponding to the human reference genome. The last sequence corresponds to a (different) individual and different splicing and uses the wrong ATG as start codon
  • Exercise 1h: 3 codons is 9 nucleotides, not 3. 5' end is at "the beginning" of the exon
  • If FTO from cow or sheep is more similar to human FTO than the other, this is completely by chance
  • All Sauropsida (birds and reptiles) have a common ancestor more recently than the common ancestor of Sauropsida and mammals, while amphibians are remote cousins. All the phylogenetic trees were ok
  • Exercise 2a: First we make one long string with all the information in the input file, then we split on the string "\n>" and get a list of fasta sequences. We loop through this list and in an inner loop split on "\n"
  • Exercise 2b: (i) There are two main ways to do this. The best(?) is to use len() to find the length of the sequence string, but remember not to count the end-of-line character at the end of every line. You must also use str() to convert the number to a string for printing. It was also possible to extract the length from the header, but in that case one should not cut out a substring of only three characters, since this will only work for sequences with lengths between 100 and 999. (ii) There are many ways to do this task and almost all were ok
  • Exercise 3: You have to normalise the data because the total amount of RNA is not the same in the various samples. This might be due to different concentrations of the samples, different degrees of RNA degradation, etc. If you look at the data it is pretty obvious that all the top 5 miRNAs are upregulated in HCHF
  • Exercise 4a: This was quite difficult and time-consuming, unless you had read the article "A beginners guide to SNP calling from high-throughput DNA-sequencing data" in the required reading material. Not all students had read the curriculum

Dec 9: The exam for the course was sent, by e-mail, to all qualified students on December 9. If you did not receive the exam, please contact Jon immediately!

Dec 3: The solutions for the obligatory assignment can be downloaded here. Jon will go through the oblig with solutions on Thursday.

Dec 1: The handouts and material for Monday Nov 30 were updated.

Nov 25: On Friday Dec 4th, between 13:00 and 15:00, there will be a helpdesk session in lecture room Java (2. etasje). This is not a compulsory part of the course and there will not be any lectures. There will be no new messages, and no messages or hints about the exam that are not also posted on the wiki. This is an opportunity to get help with parts of the curriculum you are struggling with. Jon will be present, but will most likely not be able to answer all questions about the parts of the curriculum that have been taught by other teachers.

Nov 25: The following rooms will be used for lectures in week 49: Monday - Postscript, Tuesday - Lille Auditorium in Kristen Nygaards hus, Wednesday - Pascal, and Thursday - C. For more information go to the Time and place section below.

Nov 20: Download the obligatory assignment by following this link. It must be returned by 09:00, Monday November 30. Good luck! Students who do not get this assignment approved will not be allowed to take the exam.

All students must be present at the start of the course, at 09:00, on November 9. Missing students will not be allowed to follow the course or to take the exam. If you for some reason are not able to attend, please contact Jon (See contact details below).

This is the wiki for the courses MBV-INF4410 and MBV-INF9410 offered by the Department of Biosciences and Department of Informatics at the University of Oslo (UiO). Both MBV-INF4410 (M.Sc. level) and MBV-INF9410 (Ph.D. level) are 10 study point courses. The 8 study points variant of the course (MBV-INF9410A) will not be given after 2013.

The course consists of five weeks of lectures, exercises, obligatory assignments, and a take-home exam (one week). Obligatory assignments must be completed and approved before the exam and in the same semester. An additional, limited, oral examination may be arranged in cases where this is necessary for the student evaluation. The course is also open to non-UiO students. It is only necessary to be physically present in Oslo for certain parts of the course.

Course description

This intensive course will introduce students to bioinformatics resources and tools for molecular biology research. All the lecturers are among the top researchers working within the fields of bioinformatics and computational biology in the Oslo region. Students must bring their own laptop for in-course demonstrations as well as for practical lab exercises. The course is mainly intended for biology students, but also for computer science students or students from other fields of science with an interest in and some experience with molecular biology. No prior background in bioinformatics or computer science is required. All students should have a basic understanding of molecular biology, at least roughly corresponding to 5-10 university study points in molecular biology, biochemistry, or similar. If you are uncertain whether your biology background is strong enough, please contact Jon (see contact details below) before you sign up for the course.

The person responsible for the course is Dr. Jon K. Lærdahl (jonkl@medisin.uio.no) from the Department of Microbiology, Oslo University Hospital (OUH) - Rikshospitalet. Lærdahl is also employed by the CLS initiative at UiO and the Bioinformatics Core Facility (CF) at OUH and UiO.

Links to the web pages for the years 2009-2011 are found here (Aug 2014: this server might be permanently down...), for 2012 here, for 2013 here and for 2014 here.

Notes on the course format: The course was previously given as an intensive course over two weeks with a take-home exam in the 3rd to 4th week. A take-home assignment was also a compulsory part of the 10 study points versions of the course. The compact format was ideal for students coming from outside Oslo, but it was also exhausting for students and lecturers. It gave no time to digest and dive more deeply into the various topics presented in the course. Since 2013, the course has been given over 5 weeks. However, it is only necessary to be physically present in Oslo for parts of the course, i.e. the lectures/exercises. The schedule is presented below, but there might be small adjustments to this later.

[Figure: teaching plan for MBV-INFX410, 2015 (TeachingPlanMBVX410 15A2.png)]

Time and place

The course will be offered in weeks 46 to 50, autumn 2015, i.e. starting on Monday November 9 (see schedule below). The take-home exam must be handed in during week 51, on Wednesday December 16. Each day of lectures/exercises will consist of three time slots for lectures and/or exercises/practical labs between 09:00 and 16:00. Lunch will usually be between 12:45 and 13:30. You will have to bring your own lunch or buy lunch in the local canteen.

Lecture room: All lectures/exercises in weeks 46 and 47 will be given in lecture room Python in Ole-Johan Dahls hus (IFI2). A map showing the location of the building is found here. The building is located next to the Forskningsparken metro and tram stations. The room Python is on the 1st floor (2. etasje) in the northern end of the building, the end closest to the tram line. The easiest access to Python is through the entrance in the tunnel going through the building.

In week 49 we will use the following rooms:

  • Monday, Nov 30th: Room Postscript, 2. etasje (opposite direction of Python when you enter the building through the tunnel)
  • Tuesday, Dec 1st: Lille Auditorium in Kristen Nygaards hus, the building next to OJDs hus, ground floor
  • Wednesday, Dec 2nd: Room Pascal, back in OJDs hus, 2. etasje, further south than Postscript
  • Thursday, Dec 3rd: Room C, 3. etasje, one floor up from Pascal

On Friday Dec 4th, between 13:00 and 15:00, there will be a helpdesk session in lecture room Java (2. etasje). This is not a compulsory part of the course and there will not be any lectures. There will be no new messages, and no messages or hints about the exam that are not also posted on the wiki. This is an opportunity to get help with parts of the curriculum you are struggling with. Jon will be present, but will most likely not be able to answer all questions about the parts of the curriculum that have been taught by other teachers.

Contacts

Jon K. Lærdahl (Course coordinator) - e-mail: jonkl@medisin.uio.no, phone: +47 99 507 335

Torill Rørtveit (Course administrator, registration) - e-mail: torill.rortveit@ibv.uio.no

Computers/laptops, internet access, and UiO user account

All students must bring a laptop with either a Windows (Windows 7 or more recent), Unix/Linux, or OS X (i.e. an Apple computer) operating system.

  • The computer should not be more than 2-3 years old
  • It should be possible to connect the computer to the UiO wireless network
  • You must have a root/administrator password that allows you to install new software on the computer
  • Bring an external mouse, and do not rely on touchpad/trackpad only
  • You must have a valid UiO user account and must be able to log onto a computer on the UiO network
  • If you are unsure if you have a UiO user account and a valid password, you should try to log in using kiosk.uio.no or win.uio.no as described here. If you are unable to log in, try the hints you find here.
  • Instructions (in Norwegian) about how to find your user name and get a new password can be found here.

If you are struggling with any of the above, in particular if you have forgotten your UiO user name/password or you do not have one, you must contact Jon (see contact details above) as soon as possible, and at least one week before the start of the course.

To get a UiO username/password at the UiO helpdesk you need valid ID.

On the first day of the course we will set up your laptop so that it can be used for the exercises/tutorials, the home exam and hopefully in your future work. How to get a reasonable setup is described here.

If you already are an expert programmer and Unix guru, go here.

Programme

The schedule below is tentative, and may be changed prior to, and possibly even during, the course. Requests and suggestions are welcome.

Week 46: Monday, November 09 - Friday, November 13
Session 1: 09:00 - 10:45, Session 2: 11:00 - 12:45, Session 3: 13:30 - 16:00

Monday 9th

  • Session 1: Course introduction - lectures (Jon K Lærdahl)
  • Session 2: Biological databases, Unix & setting up laptops - lectures/exercises (Jon K Lærdahl)
  • Session 3: Basic Unix - exercises (Jon K Lærdahl)

Tuesday 10th

  • Session 1: Ensembl genome browser & Jalview - lectures/demo (Jon K Lærdahl)
  • Session 2: Working with sequences/databases - lectures/exercises (Jon K Lærdahl)
  • Session 3: Applied sequence bioinformatics - exercise (Jon K Lærdahl)

Wednesday 11th

  • Session 1: Structural Biology Review - lectures/review (Jon K Lærdahl)
  • Session 2: More Unix & UCSC Genome browser - lectures/demo/exercises (Jon K Lærdahl)
  • Session 3: Galaxy/Life Portal - lectures/demo/exercises (Jon K Lærdahl)

Thursday 12th

  • Session 1: Python workshop - lectures/exercises (Karin Lagesen)
  • Session 2: Python workshop - lectures/exercises (Karin Lagesen)
  • Session 3: Python workshop - lectures/exercises (Karin Lagesen)

Friday 13th

  • Session 1: Python workshop - lectures/exercises (Karin Lagesen)
  • Session 2: Python workshop - lectures/exercises (Karin Lagesen)
  • Session 3: Python workshop - lectures/exercises (Karin Lagesen)

Week 47: Monday, November 16 - Friday, November 20
Session 1: 09:00 - 10:45, Session 2: 11:00 - 12:45, Session 3: 13:30 - 16:00

Monday 16th

  • 09:00 - 11:45: Sequence searching, alignments, and multiple alignments - lectures (Torbjørn Rognes)
  • 12:30 - 16:00: Sequence searching and multiple sequence alignments - exercise (Jon K Lærdahl)

Tuesday 17th

  • Session 1: Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)
  • Session 2: Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)
  • Session 3: Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)

Wednesday 18th

  • Session 1: Reproducibility/Statistical epigenomics - lectures/exercises (Sveinung Gundersen/Boris Simovski)
  • Session 2: Reproducibility/Statistical epigenomics - lectures/exercises (Sveinung Gundersen/Boris Simovski)
  • Session 3: Reproducibility/Statistical epigenomics - lectures/exercises (Sveinung Gundersen/Boris Simovski)

Thursday 19th

  • 09:00 - 11:45: Docking and drug discovery - lectures/exercises (Bjørn Dalhus)
  • 12:30 - 16:00: Practical Unix/Python exercise - lectures/exercises (Jon K Lærdahl)

Friday 20th

  • Session 1: More Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)
  • Session 2: More Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)
  • Session 3: More Structural Bioinformatics - lectures/exercises (Jon K Lærdahl)

Week 48: Monday, November 23 - Friday, November 27
Work on obligatory home assignment and study days (no lectures)
Week 49: Monday, November 30 - Thursday, December 3
Session 1: 09:00 - 10:45, Session 2: 11:00 - 12:45, Session 3: 13:30 - 16:00

Monday 30th

  • Session 1: Introduction to statistical inference and R - lectures (Ole Christian Lingjærde)
  • Session 2: Basic R and exploring your data - lectures/exercises (Jon K Lærdahl/Ksenia Khelik)
  • Session 3: Basic R and exploring your data - lectures/exercises (Jon K Lærdahl/Ksenia Khelik)

Tuesday 1st

  • Session 1: Next generation sequencing (NGS) - lectures (Tim Hughes & Aravind Sundaram)
  • Session 2: NGS & variant calling lab - lectures/exercises (Tim Hughes & Aravind Sundaram)
  • Session 3: NGS & variant calling lab - lectures/exercises (Tim Hughes & Aravind Sundaram)

Wednesday 2nd

  • Session 1: Analysis of gene expression data using R - lectures/exercises (Ståle Nygård)
  • Session 2: Analysis of gene expression data using R - lectures/exercises (Ståle Nygård)
  • Session 3: Analysis of gene expression data using R - lectures/exercises (Ståle Nygård)

Thursday 3rd

  • Session 1: What does it mean to do bioinformatics? - lectures (Lex Nederbragt)
  • Session 2: Gene lists & over-representation analysis (ORA) - lectures/exercises (Ståle Nygård)
  • Session 3: End of course summary - lecture/discussion (Jon K Lærdahl)

Ksenia Khelik from the BMI group, Ifi, will help during the Python and R courses.

Ivar Grytten and Boris Simovski from the BMI group, Ifi, will help during the laptop setup session on Monday, November 09.

The brief lecture on PCSK9 can be found here, if you are interested. This is not a part of the curriculum.

Required reading material/curriculum

The curriculum comprises all lectures, lecture handouts, exercises and the reviews/articles/written material listed on this page.

An obligatory assignment must be returned, by e-mail, to the course coordinator Jon K. Lærdahl (e-mail address: jonkl@medisin.uio.no) before Monday November 30, at 09:00. Note that students who do not get this assignment approved will not be allowed to take the exam.

The obligatory assignment can be downloaded here.

Exam

##########################################

Some messages:

  • Exercise 1, subsection i): "Zoom in on exon 2 so that..." means "Zoom in on exon 2 in the gene so that..."
  • Exercise 2, subsection b): The script should print out the length of the shortest and the longest sequence. It is not necessary that the script prints out the names of these sequences. You can find this manually
  • The exam for the course was sent, by e-mail, to all qualified students on December 9. If you did not receive the exam, please contact Jon immediately!

##########################################

The exam for this course will be a one-week take-home exam. Only students who have returned the obligatory assignment, and had it approved, will be allowed to take the exam.

The exam will be sent to all qualified participants at 3 pm, Wednesday December 9, by e-mail.

Your completed exam must be returned, at the latest, at 3 pm, Wednesday December 16. It should be sent by e-mail to the course administrator Torill Rørtveit (e-mail address: torill.rortveit@ibv.uio.no). Please put the course code and your candidate number (for this course) in the subject field (e.g. "Exam MBV-INF4410 Candidate:15").

The exam must be handed in as a single PDF document (Microsoft Word or an Open Office Document is also acceptable). The document should be named with the course code and your candidate number only (e.g. MBV-INF4410-15.pdf). Do not place your name in the document.

An additional, limited, oral examination may be arranged in cases where this is necessary for the student evaluation.

Bioinformatics mailing list for the Oslo region

The mailing list for computational biology and bioinformatics in the Oslo region is cbo-all@usit.uio.no. The list has approximately 400 members. The list is used to distribute news about seminars, positions, courses, meetings and other topics that might be of interest to students and researchers with an interest in computational life science in south-eastern Norway. If you want to receive e-mails that are sent to the list, sign up here

https://sympa.uio.no/usit.uio.no/info/cbo-all

by following the link termed "Subscribe".

Useful links

Trond Hasle Amundsen's Local guide to Linux and Unix

EMBnet Quick guide Unix

Jalview and Jalview refcard

UCSC Genome browser

Free UCSC Genome browser tutorial from OpenHelix

Portal to Galaxy

Galaxy 101 and other Galaxy screencasts/tutorials

UiO Lifeportal and more information on the Lifeportal

The Genomic HyperBrowser

Links Directory from bioinformatics.ca

The 2015 Nucleic Acids Research Database Issue

The 2015 Nucleic Acids Research Web Server Issue

UiO and OUS Bioinformatics Core Facility