Monday Aug 15th
Speaker: Sven Bergmann (University of Lausanne and SIB)
Title: Integrative analysis of large-scale data
Abstract: High-throughput technologies like microarrays, next-generation sequencing or mass-spec allow for measuring huge sets of observables for very large collections of samples. The processing of these data in order to extract biological insights is very challenging. One important aspect is the reduction of complexity by grouping similar features or samples together. Specifically, modules contain subsets of samples that exhibit a coherent pattern over some of the measured features. I will present a computational tool, the Iterative Signature Algorithm (ISA), which enables the efficient extraction of such modules from large datasets. Existing structured information on the features, such as that available in ontologies, can be used to annotate modules. The modular approach can also be used to co-analyze several sets of data. For example, we developed the so-called Ping-pong Algorithm to identify co-modules from two large datasets. We applied this tool for the integrative analysis of gene expression and drug response data from the NCI60 sample collection. This approach allows for predicting gene-drug interactions using only high-throughput data with a significant increase in true positives. Modular analysis is also useful in the context of genome-wide association studies (GWAS) that aim at linking molecular phenotypes (like gene expression or mass-spec data) with genotypes. We will present recent research in this field that is aimed at an integrative analysis of organismal and molecular phenotypes with very large collections of genotypic markers.
Slides of the lecture.
Speakers: Rico Rueedi, Tanguy Corre, Andrea Prunnotto, Barbara Piasecka (University of Lausanne and SIB)
The goal of the ISA afternoon workshop is to give you hands-on experience of performing a modular analysis. To this end, we ask participants to install the following software prior to arrival: R and our ISA Bioconductor package, eisa.
Usually, eisa can be installed by issuing the commands (within R):
although, depending on the platform, preliminary steps may be required.
If the basic installation command does not work, several packages need to be installed "manually" first. The command:
would install the required packages if needed.
Then the basic command:
While we will provide some toy data, we also strongly encourage interested participants to bring their own data set for analysis. Although the ISA was originally developed for the analysis of gene expression data, it can be applied to any large-scale two-dimensional table of numerical data. By "large-scale" we mean that the smaller dimension should be at least around 50, while the larger dimension can be up to 100,000 (although that may imply a relatively long computation time, so for practical purposes one should focus first on a smaller subset). For participants with their own data set, we can also provide an ISA package for Matlab, if preferred.
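To give a feel for what "any two-dimensional table of numerical data" means in practice, here is a minimal Python sketch of an ISA-style iteration on a toy table. It is not the eisa package or the Matlab implementation: the function names (`isa_sketch`, `make_toy_data`, and the helpers), the binary thresholding, the normalization choices and the default thresholds are all assumptions made for illustration. The sketch alternates between scoring columns over the current row set and scoring rows over the current column set, and stops when both sets no longer change.

```python
import random
import statistics


def zscore_rows(matrix):
    """Scale each row to zero mean and unit (population) standard deviation."""
    scaled = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row) or 1.0  # guard against constant rows
        scaled.append([(x - mean) / sd for x in row])
    return scaled


def transpose(matrix):
    """Swap rows and columns of a list-of-lists table."""
    return [list(column) for column in zip(*matrix)]


def select(scores, t):
    """Indices whose score lies more than t standard deviations above the mean."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores) or 1.0
    return {i for i, s in enumerate(scores) if (s - mean) / sd > t}


def isa_sketch(data, seed_rows, t_row=1.0, t_col=1.0, max_iter=50):
    """Simplified ISA-style iteration (illustrative assumption, not eisa itself).

    data: list of rows (features), each a list of values (one per sample).
    Returns (rows, cols) as sets of indices, or two empty sets if a
    thresholding step leaves nothing.
    """
    by_row = zscore_rows(data)                        # each row standardized
    by_col = transpose(zscore_rows(transpose(data)))  # each column standardized
    n_rows, n_cols = len(data), len(data[0])
    rows, cols = set(seed_rows), set()
    for _ in range(max_iter):
        # Score every column by its average over the current row set.
        col_scores = [statistics.fmean([by_col[r][c] for r in rows])
                      for c in range(n_cols)]
        new_cols = select(col_scores, t_col)
        if not new_cols:
            return set(), set()
        # Score every row by its average over the selected columns.
        row_scores = [statistics.fmean([by_row[r][c] for c in new_cols])
                      for r in range(n_rows)]
        new_rows = select(row_scores, t_row)
        if not new_rows:
            return set(), set()
        if new_rows == rows and new_cols == cols:
            break  # fixed point reached
        rows, cols = new_rows, new_cols
    return rows, cols


def make_toy_data(n_rows=60, n_cols=50, module_size=10, shift=5.0, seed=0):
    """Gaussian noise with one planted module in the top-left corner."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    for r in range(module_size):
        for c in range(module_size):
            data[r][c] += shift
    return data


toy = make_toy_data()
found_rows, found_cols = isa_sketch(toy, seed_rows=range(5))
print("rows:", sorted(found_rows))
print("cols:", sorted(found_cols))
```

The toy table is 60 by 50, in line with the "smaller dimension of at least around 50" guideline above. Starting from a few rows of the planted block, the iteration should settle on (roughly) the planted rows and columns; with real data one would typically try many random seeds and several thresholds.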
Recommended introductory preparation:
- Bergmann S. et al., Phys. Rev. E 67, 031902 (2003)
- Csárdi G., Kutalik Z., Bergmann S., Bioinformatics 26(10), 1376-7 (2010)
- Lüscher A., Csárdi G., Morton de Lachapelle A., Kutalik Z., Peter B., Bergmann S., Bioinformatics 26(16), 2062-3 (2010)
- The documentation of the R-implementation of ISA, on which the workshop will be loosely based
Slides of the tutorial.
Exercises of the tutorial.
18:00-19:00: Pre-session with Nicolas Le Novere (refresher on elementary chemical kinetics and phase diagrams)
20:30- : Free time