Mine your protein sequences
Lausanne, 27 June 2008
Protein sequence and protein domain analyses have become standard in silico resources for molecular biologists. Beyond database searches with the BLAST program and the organization of proteins into domains as provided by InterPro, there are many other methods for investigating the different aspects of protein sequences: their modular organization, their classification, and the relationships between structure and function. The purpose of this workshop is to provide insight into these methods. One general principle will be promoted throughout the workshop: work with groups of proteins rather than single sequences. Aligned protein families, protein sub-sequences (domains), or sets of unaligned sequences always contain more information than individual sequences.

Domain hunting methods (PSI-BLAST, profile search, profile-HMM) are "classical" but powerful methods for characterizing protein domains. They require that the sequences, or parts thereof (a domain), be arranged into a multiple sequence alignment. An introduction to these methods, with exercises, will be given in the morning using the MyHits web server.

Several other methods do not require the protein sequences to be arranged into a multiple sequence alignment before any analysis. These methods can be used to classify sequences automatically and to detect conserved "diagnostic" motifs. An introduction to these lesser-known techniques, with exercises, will be given in the afternoon.
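
To give a flavour of what an alignment-free comparison can look like, here is a minimal Python sketch. It is only an illustration under stated assumptions and is not necessarily one of the methods taught in the workshop: it compares sequences by their k-mer (short word) composition and lists the words shared by every member of a set. The function names, the choice of k, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping words of length k in a sequence (no alignment needed)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count tables (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shared_kmers(seqs, k=3):
    """Words of length k present in every sequence: crude 'diagnostic' candidates."""
    sets = [set(kmer_counts(s, k)) for s in seqs]
    return set.intersection(*sets) if sets else set()


# Toy sequences, made up for illustration only (not real proteins).
toy = ["ACDEFG", "XXCDEYY", "CDEAAA"]
print(shared_kmers(toy, k=3))
print(cosine_similarity(kmer_counts(toy[0]), kmer_counts(toy[1])))
```

Pairwise similarities of this kind could feed a clustering step to group sequences, while the shared words hint at conserved motifs; real tools use more careful statistics than this sketch does.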