Data analyses usually entail the application of many command line tools or scripts to transform, filter, aggregate or plot data and results. With ever increasing amounts of data being collected in science, reproducible and scalable automatic workflow management becomes increasingly important. Snakemake is a workflow management system, consisting of a text-based workflow specification language and a scalable execution environment, that allows the parallelized execution of workflows on workstations, compute servers and clusters without modification of the workflow definition. A scheduling algorithm based on a multidimensional knapsack problem allows Snakemake to maximize workflow execution speed without exceeding given constraints such as the number of available processor cores, cluster nodes or auxiliary hardware like graphics cards.
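
The scheduling idea can be illustrated with a small Python sketch. It is not Snakemake's actual scheduler: the job names, resource figures, priorities and the select_jobs helper are all hypothetical, and an exhaustive search stands in for a real knapsack solver. It only shows what choosing jobs under several resource limits at once might look like.

```python
from itertools import combinations


def select_jobs(jobs, capacity):
    """Return the set of job names with the highest total priority
    whose combined needs stay within every resource limit.

    Exhaustive search: fine for a handful of jobs, far too slow for
    real workflows (assumption: illustration only).
    """
    best, best_score = (), 0
    names = list(jobs)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Total demand of this subset for each constrained resource.
            usage = {
                res: sum(jobs[name]["needs"].get(res, 0) for name in subset)
                for res in capacity
            }
            if all(usage[res] <= capacity[res] for res in capacity):
                score = sum(jobs[name]["priority"] for name in subset)
                if score > best_score:
                    best, best_score = subset, score
    return set(best)


# Hypothetical jobs that are ready to run, with their resource needs.
jobs = {
    "align": {"needs": {"cores": 4}, "priority": 4},
    "sort": {"needs": {"cores": 2}, "priority": 2},
    "train": {"needs": {"cores": 2, "gpus": 1}, "priority": 3},
    "plot": {"needs": {"cores": 1}, "priority": 1},
}
capacity = {"cores": 6, "gpus": 1}  # hypothetical machine limits

chosen = select_jobs(jobs, capacity)
print(sorted(chosen))
```

With these made-up numbers, the selection uses all six cores and the single graphics card. Each additional resource type adds one more dimension to the constraint check, which is what makes the problem multidimensional.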

Since its publication, Snakemake has been widely adopted and has been used to build analysis workflows for a variety of high-impact publications. With about 5000 homepage visits per month, it has a large and stable user community.


Do you feel unable to analyse data statistically, despite having already followed an introductory course on statistics? If so, this course was created for you.
The goal of this training is to provide researchers with the practical skills required to analyse real biomedical data. This includes:

  • how to explore data
  • how to choose and apply an analysis method (statistical tests in particular)
  • how to manage common issues encountered during data analysis, such as outliers, batch effects, and the handling of biological versus technical replicates
  • how (and when) to evaluate the power of an experiment
  • how to communicate the results

During this two-day training, you will be provided with datasets to analyse in small groups, using information provided by the trainers. The results will then be discussed together. The datasets will be chosen to cover the most common questions that arise during a statistical analysis, including the assumptions of tests (and the requirement for normality of data in particular), the handling of outliers, and missing data.

As technologies constantly evolve, laboratory biologists face an increasing need for bioinformatics skills to deal with high-throughput data storage, retrieval and analysis.

Although several resources developed for such tasks have a web interface (most of the time, the first choice of biologists), many operations can be handled more efficiently on the command line (CLI).

The goal of this course is to provide basic theoretical and practical knowledge of the mass spectrometry (MS) techniques used in proteomics to identify proteins in simple and complex mixtures. The course will cover the following main subjects:
• Fundamental concepts of mass spectrometry (MS) of peptides and proteins
• Strategies for protein identification with MS data: theory, concepts, variants
• Database searches with MS data: practical examples and exercises
• Validation and interpretation of results

The course will alternate between theory and exercise periods. All participants will learn how to analyse simple mass spectrometry data and how to use an online tool for protein identification. They will also be introduced to data processing software (Scaffold) used to filter, validate, compare and export complex proteomics results. Examples of typical proteomics workflows will be presented.

This course is designed to provide researchers in biomedical sciences with experience in the application of basic statistical analysis techniques to a variety of biological problems.

The course will combine lectures on statistics and practical exercises. The participants will learn how to work with the widely used "R" language and environment for statistical computing and graphics.

Topics covered during the course include a review of numerical and graphical summaries and of hypothesis testing, followed by multiple testing, linear models, correlation and regression, among others. Participants will also have the opportunity to ask questions about the analysis of their own data.


The use of next-generation sequencing (NGS) is increasing in several biological fields due to a very rapid decrease in cost. However, it often produces hundreds of Gb of data, which makes downstream analysis very challenging and requires bioinformatics skills.

In this module, we will introduce the most widely used sequencing technologies and explain the principles on which they are based.

We will also introduce repositories, such as the European Nucleotide Archive (ENA) and the Sequence Read Archive (SRA), from which you can retrieve raw data for specific experiments. We will practice using command line tools to search for and fetch raw NGS data efficiently.

Finally, using different datasets, we will practice screening for quality control, filtering reads for better downstream analysis, mapping reads to a reference genome, and visualising the output.

The goal of this course is to introduce the participants to the 3-dimensional structures of proteins. It describes the experimental methods used to solve these structures, and the databases used to archive, annotate and classify protein structures. Analysis and visualisation software will be used to display, analyse, compare and interpret protein structures. Students will also be introduced to protein structure prediction by homology modelling techniques.

The second part of the course is dedicated to molecular modelling, an introduction to the docking of small molecules (drugs, peptides) to macromolecules, and molecular graphics.