
Materials for #NGSchool2019 - Machine Learning for Biomedicine

Here you will find the materials for the workshops, hackathons and lectures at #NGSchool2019, together with installation instructions and tips for running the software needed to participate in the NGSchool2019.

Table of Contents

General instructions

Colab

Google Colab is an online service in which you can run Jupyter notebooks (and even use a limited GPU!). It comes with a number of preloaded libraries, which makes it easier to teach and run tutorials without spending too much time fixing dependencies.
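
For example, once you switch the runtime type to GPU (Runtime → Change runtime type), you can confirm the accelerator is visible with a quick check in a notebook cell (a minimal sketch, assuming the preinstalled TensorFlow):

# run in a Colab cell; prints the device name if a GPU runtime is attached
import tensorflow as tf
print(tf.test.gpu_device_name() or "No GPU attached")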

Working on Prometheus

Here you will find a short guide on how to work on the Prometheus supercomputer.

Prometheus Quick Start Guide

Talks

Guillaume Filion - "An experiment on anti-academic research"

Workshops

Intro to HPC

tutor: Klemens Noga

The website with info about the workshop can be accessed here

Intro to R

tutor: Maja Kuzman

Intro to Python

tutor: Kasia Kędzierska

The whole workshop will be executed in a Jupyter notebook and will rely on several Python packages. In the intro_to_python directory you can find a setup_check.sh script you can run to check whether your environment satisfies all the requirements.

Run the setup script to check whether the requirements are satisfied:

bash intro_to_python/setup_check.sh

Requirements:

  • python3
  • Jupyter
  • python3 modules:
    • numpy
    • pandas
    • matplotlib
    • scipy
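
If you prefer to check by hand, the quick import test below is equivalent in spirit to setup_check.sh (a minimal sketch covering the modules listed above):

# check that each required module is importable
import importlib

for mod in ("numpy", "pandas", "matplotlib", "scipy"):
    try:
        importlib.import_module(mod)
        print(mod, "is available")
    except ImportError:
        print(mod, "is MISSING")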

Intro to Stats

tutor: German Demidov

Unsupervised learning

tutor: Kasia Kędzierska

Slides: unsupervised_learning/unsupervised_learning_slides.pdf

The workshop will be run in an R notebook. We will work locally, and the following packages are required.

Requirements:

  • R 3.5+
  • tidyverse 1.2.1+
  • factoextra 1.0.5+
  • ggpubr 0.2+
  • ggsci 2.9+
  • MASS 7.3-50+
  • tsne 0.1-3+
  • umap 0.2.3.1+
required_packages <- c("tidyverse", "factoextra", "ggpubr", 
                       "ggsci", "MASS", "tsne", "umap")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Bayesian Inference

tutor: Roman Cheplyaka

Either in RStudio or in an interactive R session, run the following commands:

required_packages <- c("rstan", "StanHeaders", "magrittr", "reshape2", 
                       "forcats", "stringr", "dplyr", "purrr", "readr",
                       "tidyr", "tibble")

# install any missing packages, then load them
for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
    library(pkg, character.only = TRUE)
  }
}

print("All done! :)")

Natural language processing

tutor: Noura Al Moubayed

Installation guidelines

  1. Install miniconda

Start by installing miniconda.

https://docs.conda.io/en/latest/miniconda.html

  2. Create conda environment

To simplify things, we can create the environment from the yml file: nlp/workshop.yml

conda env create -f nlp/workshop.yml

  3. Install the missing package from a local copy:

a. Copy the file from USB

Due to its large size (>1 GB), we distribute the en_core_web_lg model on USB sticks on site. After copying the file from a USB stick, change the install command below to point to the location of the file.

b. Copy from server

If you did not copy the file from a USB stick, copy it from the local server:

scp <your-user>@10.0.0.200:/srv/en_core_web_lg-2.2.0.tar.gz ~/

Now, install it.

# python -m spacy download en_core_web_lg
conda activate workshop
pip install /path/to/folder/with/en_core_web_lg-2.2.0.tar.gz
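
To confirm the model installed correctly, you can run a quick check with the workshop environment active (a minimal sketch; the sentence is just an example):

# load the model and tag a short example sentence
import spacy

nlp = spacy.load("en_core_web_lg")
doc = nlp("This is a quick sanity check.")
print([(token.text, token.pos_) for token in doc])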

  4. Clone the repository

Make sure your local copy of the github repository is up to date, and unpack the tutorial file from the nlp directory. The file is gzipped to reduce its size.

git pull origin master
gunzip nlp/tutorial_features.pkl.gz
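
If you want to make sure the unpacked file is readable, here is an optional check (the tutorial notebook loads the file for you, so this step is purely a sanity test):

# confirm the unpacked pickle can be deserialized
import pickle

with open("nlp/tutorial_features.pkl", "rb") as fh:
    features = pickle.load(fh)
print(type(features))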

Running the workshop

cd nlp
conda activate workshop
jupyter notebook

Reinforcement Learning

tutor: Robert Loftin

Presentation Slides

In order to run the tutorial locally:

# create and activate a clean environment
conda create --name reinforced python=3.7
conda activate reinforced
# core numerical, RL and plotting libraries
pip install numpy==1.17.3
pip install gym==0.15.3
pip install matplotlib==3.0.3
# CPU-only PyTorch from the pytorch channel (instead of pip install torch==1.3.0)
conda install pytorch torchvision cpuonly -c pytorch
pip install chainer
pip install minerl
pip install opencv-python-headless
pip install roboschool
# jupyter, plus the Java runtime that minerl needs
conda install jupyter
conda install -c anaconda openjdk
jupyter-notebook
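
Once the environment is built, you can run a quick sanity check that the core pieces import and work (a minimal sketch, not part of the tutorial itself):

# run one random CartPole episode to confirm gym works,
# and confirm the CPU-only PyTorch build imports
import gym
import torch

env = gym.make("CartPole-v1")
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()
print("gym OK; torch version:", torch.__version__)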

Deep learning methods for genomics

tutor: Ron Schwessinger

Slides

The hands-on part of the workshop will be run in a google colab notebook, so a google account is required. Additional information can be found in this repo, but there is no need to install anything for the workshop.

Deep Generative Models for dimensionality reduction

tutor: Kaspar Märtens

Link to slides

In the hands-on part of the tutorial, we will implement an Autoencoder on MNIST data. See google colab notebook for Autoencoders on MNIST.

For those interested, there is also an additional colab notebook for Variational Autoencoders.
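
For orientation, a dense autoencoder on MNIST of the kind covered in the hands-on part might look like the sketch below (layer sizes and training settings are illustrative, not taken from the notebook):

# minimal dense autoencoder: 784 -> 32-d latent code -> 784
from tensorflow import keras

(x_train, _), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

inputs = keras.Input(shape=(784,))
code = keras.layers.Dense(32, activation="relu")(inputs)       # encoder
outputs = keras.layers.Dense(784, activation="sigmoid")(code)  # decoder

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train, epochs=5, batch_size=256)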

Tree based methods

tutor: Rosa Karlic

You will work locally in RStudio; execute the following code to install the packages:

required_packages <- c("caret", "rpart", "e1071", 
                       "ranger", "dplyr", "randomForest", "rpart.plot",
		       "ipred", "bst", "plyr")

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Lasso workshop

tutor: Tim Padvitski

You will work locally in RStudio; execute the following code to install the packages:

required_packages <- c("c060", "glmnet", "igraph)

for (pkg in required_packages) {
  if(!require(pkg, character.only = TRUE, 
              quietly = TRUE, 
              warn.conflicts = FALSE)) {
    print(paste0("Warning! Installing package: ", pkg, "."))
    install.packages(pkg)
  } 
}

print("All done! :)")

Hackathons

Dilated Convolutional Neural Nets for DNase-seq and ATAC-seq footprinting

Requirements:

  • python3
  • Keras with TensorFlow v1.14 as the backend
  • numpy
  • scikit-learn
  • google account for colab notebook work
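
For orientation, a stack of dilated 1-D convolutions over one-hot-encoded DNA, of the kind such footprinting models use, might look like this (a hedged sketch; sequence length, filter counts and dilation rates are illustrative):

# dilated Conv1D stack over one-hot DNA (batch, 1000 bp, 4 bases)
from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Conv1D(32, 3, dilation_rate=1, padding="same",
                              activation="relu", input_shape=(1000, 4)))
for rate in (2, 4, 8):  # widen the receptive field without pooling
    model.add(keras.layers.Conv1D(32, 3, dilation_rate=rate, padding="same",
                                  activation="relu"))
model.add(keras.layers.Conv1D(1, 1, activation="sigmoid"))  # per-base footprint score
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()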

Literature:

Cell Of Origin Hackathon!

tutors: German Demidov, Maja Kuzman

The Goals:

Day 1:

  1. Explain the data set
  2. Explore the data
    • Explore the response data set
    • Explore the predictors data set

Day 2:

  1. Predict mutational patterns using random forest regression (a minimal sketch follows after this list)
    • Find important features
  2. Predict mutational patterns using different methods
  3. Use mutational profiles to predict cancer type
  4. Try to beat Rosa!
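
For Day 2, step 1, a random forest regression in scikit-learn might look like the sketch below (the random arrays are stand-ins for the hackathon's predictor matrix and mutational-pattern response):

# fit a random forest regressor and inspect feature importances
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X = np.random.rand(200, 50)  # stand-in predictors
y = np.random.rand(200)      # stand-in response

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)
rf = RandomForestRegressor(n_estimators=500, random_state=42)
rf.fit(X_tr, y_tr)
print("R^2 on held-out data:", rf.score(X_te, y_te))

# "Find important features": rank predictors by importance
top = np.argsort(rf.feature_importances_)[::-1][:10]
print("Ten most important predictor indices:", top)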

Day 3:

  1. Complete the presentations
  2. Good luck!