Electrophoresis 2009, 30, S162–S173
Celebrating 30 years
Nicolas Guex (1), Manuel C. Peitsch (1,2), Torsten Schwede (3,4)
(1) Swiss Institute of Bioinformatics, Lausanne, Switzerland
(2) Philip Morris International, Research and Development, Neuchâtel, Switzerland
(3) Biozentrum, University of Basel, Switzerland
(4) Swiss Institute of Bioinformatics, Basel, Switzerland
Received March 3, 2009
Revised April 13, 2009
Accepted April 14, 2009
Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: A historical perspective
SWISS-MODEL pioneered the field of automated modeling as the first protein modeling
service on the Internet. In combination with the visualization tool Swiss-PdbViewer, the
Internet-based Workspace and the SWISS-MODEL Repository, it provides a fully integrated sequence to structure analysis and modeling platform. This computational
environment is made freely available to the scientific community with the aim to hide the
computational complexity of structural bioinformatics and encourage bench scientists to
make use of the ever-increasing structural information available. Indeed, over the last
decade, the availability of structural information has significantly increased for many
organisms as a direct consequence of the complementary nature of comparative protein
modeling and experimental structure determination. This has a very positive and
enabling impact on many different applications in biomedical research as described in
this paper.
Keywords:
Bioinformatics / Homology modeling / Protein structure / SWISS-MODEL /
Swiss-PdbViewer
DOI 10.1002/elps.200900140
Correspondence: Professor Torsten Schwede, Swiss Institute of
Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50/70, CH-4056 Basel, Switzerland
E-mail: torsten.schwede@unibas.ch
Fax: +41-61-267-15-84
Abbreviations: DAS, distributed annotation system; HMM,
hidden Markov model; PDB, protein data bank; PSI, protein
structure initiative; RMSD, root mean square deviation
© 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
1 Introduction
Comparative protein structure modeling and experimental
efforts complement each other with the goal of providing
structural models for diverse applications in biomedical
research. Stable, accurate, reliable and fully automated
modeling pipelines are required to provide structural information for the rapidly growing amount of sequence data. SWISS-MODEL pioneered the field of automated modeling as the first
protein modeling service on the Internet (e-mail-based interface in 1991 and the first web-based interface in 1993). In
combination with the visualization tool Swiss-PdbViewer (aka
DeepView), it provides a fully integrated sequence-to-structure
platform, which has been described in our 1997 paper
“SWISS-MODEL and the Swiss-PdbViewer: An environment
for comparative protein modelling” in Electrophoresis [1]. When
the original SWISS-MODEL and Swiss-PdbViewer article was
published, protein modeling was still a very specialized field,
mostly due to the necessity to use specialized hardware and
software. Our goal was to hide much of the complexity of
comparative modeling and make this technology accessible to
a broader audience and to empower non-structural scientists to
leverage the available molecular structure information
to design experiments. Since the pioneering days, several
other groups have followed suit and developed similar servers
to automate a variety of algorithms and methods, focusing on
different aspects of the modeling workflow: While programs
like SWISS-MODEL [1–5] or COMPOSER [6] derive the model
coordinates using information from aligned template fragments in Cartesian space, many servers are based on
MODELLER, applying satisfaction of spatial restraint techniques to generate the model coordinates [7, 8]. The introduction
of hidden Markov model (HMM) methods [9, 10] has
significantly improved the sensitivity of template detection
and accuracy of target–template alignments in comparative
modeling. Several methods have been developed, which
attempt to combine information from multiple template
structures, e.g. through iterative clustering approaches
[11], conformational space annealing methods [12] or by
profile–profile threading alignment followed by iterative
refinement of the assembly of threading fragments [13, 14].
Recently, methods originally developed for fragment-based de
novo modeling have been shown to be effective for comparative
modeling [15] and refinement of structure models [16].
Authors appear in alphabetical order
www.electrophoresis-journal.com
General
Several modeling servers for specialized tasks have been
developed, e.g. modeling of antibodies [17, 18]. For a list of
available modeling servers, please refer to [19, 20]. During the
most recent critical assessment of techniques for protein
structure prediction experiments (CASP), it became apparent
that the best fully automated modeling methods have
improved to a level where they challenge many human
predictors in producing accurate models [14, 20, 21].
In the following paragraphs, we will describe the
SWISS-MODEL protein structure prediction and analysis
environment, which today consists of a modeling server [3],
a web-based personalized workspace [2, 22], the visual front
end Swiss-PdbViewer [1] and a repository of annotated
comparative models [23–25].
2 How SWISS-MODEL and Swiss-PdbViewer evolved over the last decade
2.1 Comparative protein structure modeling
Homology (or comparative) protein structure modeling is the
method of choice for generating reliable and accurate 3-D
models of proteins that share significant sequence similarity
with proteins of known structure. Automated modeling servers
made different modeling algorithms easily accessible to the
general user and removed the need to learn idiosyncratic
software commands – making them valuable tools for both
modeling experts and non-experts alike. By removing the
individual personal expert bias, the development of automated
modeling pipelines has made modeling reproducible. Moreover, the assessment of automated modeling methods on
larger data sets allows estimating their expected accuracy [20,
26–29]. Today, all protein structure modeling approaches make
use of one or more automated pipelines.
The SWISS-MODEL pipeline consists, similarly to most
homology modeling approaches, of the following steps: First, a
library of experimental template structures is searched for
templates sharing significant sequence similarity with the
targeted protein, and the most suitable template(s) are selected.
Based on the alignment between the sequence of the target
protein and the template structure(s), the coordinates of the
model are constructed for the structurally conserved regions of
the model. Residues corresponding to insertions and deletions
in the target–template alignment have to be modeled de novo
without using template information. After applying limited
molecular mechanics-based energy minimization to regularize
the geometry of the models, model quality estimation methods
are used to detect potential errors and inaccuracies.
Since the initial implementation over a decade ago, all of
these steps have been further developed and significantly
improved. Comparative modeling critically depends on the
detection of suitable templates from a library of structures. To
this end, we created the SWISS-MODEL template library
derived from the remediated protein data bank (PDB), which
aims to remove some of the inconsistencies in the original
depositions [30]. The SWISS-MODEL template library contains
searchable sequence databases, profiles and structure quality
annotation (e.g. experimental resolution, mean force potential
scores) for each chain, excluding low quality entries (e.g. entries
consisting only of Cα coordinates). The introduction of profile-based sequence comparison methods such as PSI-BLAST [31]
and later HMM-HMM profile methods [10] has significantly
improved the sensitivity and precision of template selection
and alignment. Today, SWISS-MODEL is using a hierarchical
approach to first identify target regions sharing high sequence
similarity to their templates before applying more sensitive
HMM-HMM profile methods to detect and align more
distantly related templates. Possible templates are ranked
according to their E-value, sequence identity to the target,
resolution and structure quality [23]. Templates are progressively selected from this list, where new templates are added if
they significantly increase the coverage of the target sequence,
or add new information (e.g. templates spanning several
domains help to infer relative domain orientation). Coordinate
building in the SWISS-MODEL pipeline is performed by
transferring template information from aligned template
fragments in Cartesian space. Regions corresponding to
insertions and deletions in the alignment are built using both
backbone libraries and de novo loop-building procedures. First,
an ensemble of fragments compatible with the flanking
regions is constructed using constraint satisfaction programming. The best fragment is selected using a scoring scheme,
which accounts for force field energy, steric hindrance and
favorable interactions like hydrogen bond formation. In cases
where constraint satisfaction programming does not give a
satisfying solution and for loops above ten residues, a library
derived from experimental structures is searched to find
compatible fragments. The reconstruction of the amino acid
side chains is based on the weighted positions of corresponding residues in the template structures. Starting with
conserved residues, the model side chains are built by
isosterically replacing template structure side chains. Feasible
side chain conformations are selected from a backbone-dependent rotamer library [32], which has been carefully
constructed taking the quality of the source structures into
account. A scoring function assessing favorable interactions
(hydrogen bonds, disulfide bridges) and unfavorably close
contacts is applied to select the most likely conformation. The
stereochemistry of the resulting models is regularized using a
short energy minimization procedure with the GROMOS96
force field [33]. Model quality estimation is performed using
mean force potential approaches such as ANOLEA [34] and
QMEAN [35].
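The template ranking and progressive selection described above can be illustrated with a minimal sketch. It is not the SWISS-MODEL implementation: the `Template` fields, the sort order, and the threshold of ten newly covered residues are assumptions made for illustration. The text states only that templates are ranked by E-value, sequence identity, resolution and structure quality, and that further templates are added when they significantly increase coverage. Structure quality scores are omitted here for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str          # hypothetical identifier (e.g. PDB code plus chain)
    e_value: float     # alignment E-value, lower is better
    identity: float    # percent sequence identity to the target
    resolution: float  # experimental resolution in Angstrom, lower is better
    start: int         # first target residue covered (1-based, inclusive)
    end: int           # last target residue covered (inclusive)


def rank_templates(templates):
    """Sort by E-value, then by descending identity, then by resolution."""
    return sorted(templates, key=lambda t: (t.e_value, -t.identity, t.resolution))


def select_templates(templates, min_new_residues=10):
    """Greedily keep templates that cover enough not-yet-covered residues.

    The threshold is an assumed stand-in for "significantly increases coverage".
    """
    covered = set()
    selected = []
    for t in rank_templates(templates):
        new = set(range(t.start, t.end + 1)) - covered
        if len(new) >= min_new_residues:
            selected.append(t)
            covered |= new
    return selected


candidates = [
    Template("tmplA", 1e-50, 62.0, 1.8, 1, 150),
    Template("tmplB", 1e-45, 58.0, 2.5, 5, 148),    # redundant with tmplA
    Template("tmplC", 1e-20, 35.0, 2.1, 140, 300),  # extends coverage
]
chosen = [t.name for t in select_templates(candidates)]
print(chosen)
```

In this toy input the second template adds no new residues and is skipped, while the third one is kept because it extends the covered region of the target.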
2.2 Automated modeling server
The first automated protein modeling server was built
before the advent of the Web and its highly interactive
technology. In 1991, for the first time a modeling request
could be submitted using a formatted E-mail. With the
arrival of the World Wide Web, SWISS-MODEL was among
the first bioinformatics services available on the Web as part
of the ExPASy system [36]. The first Web-based user
interface to SWISS-MODEL automatically created a correctly
formatted E-mail and sent it to the modeling server [5]. In
more recent years, this aging interface and communication
mode was replaced by the SWISS-MODEL Workspace.
Today, an interactive personalized web-based working
environment [2, 22] allows several projects to be performed
in parallel. In addition to structure modeling, SWISS-MODEL Workspace offers different types of modeling-related tasks such as domain assignment, template selection, prediction of secondary structure or disordered
segments, or model quality estimation. In-page visualization
using Java applets provides a fast preview of the overall fold
of the model, while further detailed exploration and fine-tuning of the models is possible with Swiss-PdbViewer (see
below). Currently, SWISS-MODEL Workspace receives
1500 interactive modeling requests every day.
2.3 SWISS-MODEL Repository
In spring 1998, we subjected all entries of Swiss-Prot and
TrEMBL (equivalent to all protein sequences known at that
time) to the SWISS-MODEL pipeline in a completely
automated process called 3D-Crunch [26]. This followed
several experiments that tested the concept of genome-scale
protein modeling on bacterial [37–39] and yeast genomes
[38, 40]. This was the first time a large-scale data set was
available to analyze the performance of an automated
modeling pipeline. 3D-Crunch and the early
experiments, which used both confirmed and putative proteins
derived from several bacterial genomes and the yeast genome,
enabled us to make a first analysis of the potential
of protein modeling to close the sequence-to-structure
gap. In the following months, this was instrumental
in improving the server’s performance and provided the
initial seed models for the SWISS-MODEL Repository
[23–25, 39].
In later years [23–25], the SWISS-MODEL Repository
has been developed as a relational database of annotated
models, aiming at comprehensive and up-to-date coverage of
selected model proteomes. As interactive model building
can be relatively time-consuming, a comprehensive database
of pre-computed models provides the opportunity to cross-link model information with other biological data resources,
such as sequence databases or genome browsers, in real
time. In the repository, model target sequences are uniquely
identified by the md5 cryptographic hash of the full-length
amino acid sequence. This mechanism allows the redundancy in protein sequence databases to be reduced,
and facilitates cross-referencing with resources using
different accession code systems [23]. Regular incremental
updates include new target sequences from the UniProt
database [41] and newly available template structures [42].
However, when major improvements to the underlying
modeling algorithms have been made, full updates are
required.
The SWISS-MODEL Repository web interface (Fig. 1)
can be queried for specific proteins using database accession
codes (e.g. UniProt AC and ID, GenBank, IPI, Refseq) or
directly with the protein amino acid sequence, or fragments
thereof, e.g. for a specific domain (http://swissmodel.expasy.org/repository/). The functional and domain annotation for the target protein is retrieved dynamically using
web service protocols in real time to ensure that the latest
annotation information is provided – even if the model has
been built some time before.
In order to allow for additional (not pre-computed)
analyses on the models or on the underlying protein target
sequence, we have implemented a tight link between the
SWISS-MODEL Repository and the corresponding modules
in the Workspace, which allow, e.g. for estimation of model
quality using different global and local quality scores.
2.4 Database interoperability and programmatic
access
The integration of different types of data, such as sequence
annotations and 3-D structure information for large
amounts of diverse data in heterogeneous formats, is still
an open challenge in Bioinformatics. Protein models
provide a natural bridge connecting sequence-based data
resources, such as genome browsers and protein structure
information. However, unlike experimental results that
remain static once entered into the corresponding databases, model information is intrinsically dynamic as models
need to be re-calculated when better template structures
become available or improvements in modeling algorithms
allow building better models for a given target sequence. We
have therefore developed technologies capable of dynamic
integration of sequence, experimental and model structure
information.
The Protein Model Portal [43] is a component of the
Protein Structure Initiative (PSI) structural genomics
knowledge base [44] and provides a single interface to access
several million pre-built models from (i) the SWISS-MODEL
Repository [23], (ii) ModBase [7], (iii) several large-scale PSI
centers as well as (iv) to experimental structures from the
PDB [42].
The “distributed annotation system” (DAS) [45] is a
light-weight mechanism for web-service-based annotation
exchange, which is widely used in genome browsers and
other software frameworks for sequence annotation. The
DAS concept relies on an XML specification that defines the
communication between server and client. We have implemented a DAS-server for the SWISS-MODEL Repository
based on the DAS/1 standard. Any DAS compatible annotation system can thereby extend its sequence annotation by
3-D model information using either UniProt accession
codes or md5-hashes of the corresponding amino acid
sequences as identifiers. The SWISS-MODEL Repository
DAS service is accessible at http://swissmodel.expasy.org/
service/das/swissmodel/.
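The use of md5 hashes of amino acid sequences as identifiers can be illustrated with a short sketch. The normalization step (removing whitespace and upper-casing) and the accession strings are assumptions for illustration; the source does not specify how sequences are normalized before hashing.

```python
import hashlib


def sequence_md5(sequence: str) -> str:
    """Return the md5 hex digest of an amino acid sequence.

    Whitespace removal and upper-casing are assumed normalization steps.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


# Hypothetical records from two databases with different accession schemes.
records = {
    "DB1:ABC123": "MKTAYIAKQR QISFVKSHFS",
    "DB2:XYZ789": "mktayiakqrqisfvkshfs",   # same protein, different formatting
    "DB1:DEF456": "MKTAYIAKQRQISFVKSHFA",   # differs in the last residue
}

by_hash = {}
for accession, seq in records.items():
    by_hash.setdefault(sequence_md5(seq), []).append(accession)

for digest, accessions in by_hash.items():
    print(digest, accessions)
```

Records with identical sequences collapse onto one key regardless of their accession code, which is the property that reduces redundancy and simplifies cross-referencing.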
Figure 1. Example for a SWISS-MODEL Repository entry for a model of UniProt entry A4C2S2, a protein of unknown function from Polaribacter irgensii 23-P.
2.5 Accuracy of automated models
Possible applications of protein models depend largely on the
quality of the models. Therefore, evaluation of model quality is a
crucial step in homology modeling. During the 3D-Crunch
experiment, a control set of 1200 models for proteins of known
3-D structure was generated, sharing 25–95% sequence identity
between the template and the target. For the first time it was
thereby possible to analyze the reliability of automated modeling
on a large scale [26, 46]. SWISS-MODEL was the first
comparative modeling service to join the EVA project for the
continuous and automated assessment of modeling servers [29].
Between 2000 and 2006 (256 weekly releases of the PDB) the
sequences of 21 318 proteins representing 18 078 distinct
protein target chains have been submitted to SWISS-MODEL. The resulting models have been evaluated based on
the root mean square deviation (RMSD) of Cα atoms following
global superposition of the model and the experimental target
structures. This process allows estimating the overall expected
accuracy as a function of the percentage of sequence identity
between target and best template. As expected, model RMSD
increases with decreasing alignment accuracy. All models and
evaluation results are available on the EVA website [29].
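For reference, the Cα RMSD used in this kind of evaluation is the square root of the mean squared distance between equivalent Cα atoms of model and experimental structure. The sketch below computes it for coordinates that are assumed to be already superposed and paired residue by residue. The superposition step itself (typically a least-squares fit) is not shown, and the coordinates are toy values.

```python
import math


def ca_rmsd(model, reference):
    """RMSD between paired C-alpha coordinates that are already superposed."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared / len(model))


# Toy example: every model atom is displaced by 1 Angstrom along z.
model_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
rmsd = ca_rmsd(model_ca, target_ca)
print(f"{rmsd:.2f}")
```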
While the assessment of a prediction method can
provide an estimate of the average performance of a method,
the differences in accuracy reached for different modeling
targets are much larger than the differences between
different methods for the same modeling target [20, 21, 29].
At the time of modeling, the accuracy of a model is
unknown and cannot be measured directly as the “real”
structure is unknown. Therefore, the accuracy of each
model has to be estimated individually using model quality
estimation methods [35, 47].
2.6 Protein structure visualization and analysis with
Swiss-PdbViewer
The aim of enabling non-specialists to utilize structural data
on standard desktop computers creates a particular challenge. Indeed, while providing such an environment creates
the opportunity for scientists with no particular expertise in
structural biology to obtain and visualize protein models in
a completely automated way, it also opens the door to over-interpretation, as these models are certainly not devoid of
errors and inaccuracies. Over the years we incorporated
various validation tools on the SWISS-MODEL server
(WhatIf [48], ANOLEA [34]) and coloring schemes in
Swiss-PdbViewer (“protein problems”) or tools such as
Ramachandran plots and mean force potential to highlight
residues with abnormal topologies [49]. Further guidance on
the proper utilization and limitation of those modeling tools
has been disseminated through the publication of a chapter
in Current Protocols [50] and through on-line courses and
tutorials for students. In particular, our long-standing
collaboration with Prof. Gale Rhodes (University of Southern Maine), who continuously maintained and updated his
tutorial for each new release of Swiss-PdbViewer, has been
key to the success of this application.
2.7 New Swiss-PdbViewer features
The basic functionality of Swiss-PdbViewer [1, 49] has
remained the same with the main purpose to serve as (i) a
simple way to visualize, align and compare structures and (ii)
an interface with the SWISS-MODEL server. Since its
inception, particular emphasis has been put on the user
interface and interface reactivity. The interface and all
algorithms are implemented in native C code and are very
efficient. Windows are synchronized to provide visual
feedback between the structure(s) displayed, the sequence
alignment and the residues selected for display. SwissPdbViewer also provides an extended set of tools to
manipulate sequence–structure alignments, selections,
display and allow general protein analysis and validation.
However, compared with the version described in the
original article, the current version offers a more complete
set of tools, among which the possibility to compute
molecular surfaces, detect cavities and electrostatic potentials. Furthermore, it is now possible to perform homology
modeling directly within the application. The original
loop building approach that uses a loop library has been
supplemented with a de novo loop-building method based
on satisfaction of spatial restraints and the rotamer library
has been updated to a backbone-dependent rotamer
library [32]. Support for energy minimization is provided
through an implementation of the GROMOS96 [33] force
field.
In the original version of Swiss-PdbViewer, the paradigm was to use several structures to facilitate the exploration of one given target-sequence. Therefore, it was not well
suited to the exploration of structural differences in
sequence families until structural models for each individual member of the family were obtained. Thus, we changed the paradigm and several sequences and/or structures
can now be loaded simultaneously. With the increasing
number of whole genome association studies, mapping
the non-synonymous SNPs in the structural context has
become a common task. However, it is relatively tedious
to map SNPs to structures principally because of the
necessity to convert genomic coordinates to structural
coordinates. To facilitate this process, we introduced the
possibility to load cDNA sequences in the software: the
translated amino acid sequences and their predicted structural information remain associated with their nucleotide
sequences throughout the process, even when alignments
are altered.
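The link between nucleotide and residue positions can be sketched as follows. This is not the Swiss-PdbViewer implementation: it assumes a cDNA that starts at the first base of the coding sequence (frame 0, no untranslated regions) and uses the standard genetic code. The function names and the toy sequence are hypothetical, and the preceding conversion from genomic to cDNA coordinates is not covered.

```python
# Standard genetic code, built from the canonical TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna):
    """Translate a coding sequence in frame 0 into one-letter amino acids."""
    cdna = cdna.upper()
    usable = len(cdna) - len(cdna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cdna[i:i + 3]] for i in range(0, usable, 3))


def snp_to_residue(cdna, nt_pos, alt_base):
    """Map a SNP at a 1-based cDNA position to (residue number, ref aa, alt aa)."""
    cdna = cdna.upper()
    index = nt_pos - 1
    residue = index // 3 + 1
    codon_start = 3 * (residue - 1)
    ref_codon = cdna[codon_start:codon_start + 3]
    offset = index - codon_start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    return residue, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


cds = "ATGGCCGAAGTT"                   # toy coding sequence
protein = translate(cds)
effect = snp_to_residue(cds, 8, "T")   # changes codon GAA to GTA
print(protein, effect)
```

Because the residue number is derived from the nucleotide position rather than from the alignment, the association survives any later editing of the sequence alignment.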
Since its first release, Swiss-PdbViewer has been tightly
linked to SWISS-MODEL, and thus it has been extended to
support the recently released SWISS-MODEL Workspace [2]
through direct communication with the server. Modeling
templates can be searched and retrieved from the server
using BLAST, modeling requests can be submitted and then
models can be retrieved directly from the server. Overall, the
communication capabilities of Swiss-PdbViewer have been
increased and it is now possible to import sequences,
structures and compounds or to align sequences using
MUSCLE [51] on a remote server (Fig. 2). Furthermore, the
addition of a scripting language created the possibility for
users to write additional commands for the user interface
and/or to process sequences in batch mode.
Figure 2. Swiss-PdbViewer – a tool for protein structure modeling, visualization and analysis. Structural data can be retrieved directly
from the PDB [42] using accession numbers or simple text queries. When available, electron density maps can be retrieved from the
Uppsala Electron-Density Server [92]. Small molecular compounds can be retrieved from PubChem [93], and energy minimized with the
Dundee PRODRG2 server [94]. cDNA sequences can be retrieved from GenBank [93], whereas amino acid sequences can be imported
from ExPASy [36] or GenBank. Identification of homologous sequences or structures is achieved using the BLAST service of the SWISS-MODEL Workspace [2, 3]. Protein structures can be searched for the presence of user-defined 3-D motifs, and sequences can be aligned
using built-in tools or external tools, such as MUSCLE [51] running at the Vital-IT (http://www.vital-it.ch) Center for high-performance
computing of the Swiss Institute of Bioinformatics. Protein modeling requests can be directly submitted to SWISS-MODEL and results
re-imported into the workspace for further refinement.
We added new ways to superpose structures. The
popular “Magic Fit” command relies on the correct detection of a stretch of similar residues at sequence level to
“seed” a structural fit. However, this will fail when no
sequence similarity can reliably be identified for distantly
related proteins. Therefore, we included a method for
sequence-independent superposition using vectorized
secondary structure information as seed to search for
possible ways to superpose distantly related proteins. As this
method relies on similarly organized secondary structure
elements, it cannot be used to explore the conservation of
finer-grained local similarities based on sparse residues
such as catalytic ones. Thus, as a way to identify common
local structural arrangements of residues, we also developed
a method that allows searching for specific 3-D motifs in a
given set of proteins. Briefly, for each position of the motif,
it is possible to specify a list of desired amino acids, the
secondary structure, the minimum and maximum backbone
separation between residues and a set of additional distance
constraints between any pairs of atoms. Those 3-D motifs
can be generated directly from within Swiss-PdbViewer and
then submitted to the Vital-IT cluster (http://spdbv.vital-it.ch/) to search a non-redundant set of structures. Results
can then be retrieved directly in the interface for visualization, superposition and analysis.
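The 3-D motif description above (allowed amino acids, secondary structure, sequence separation and distance constraints per position) can be made concrete with a small sketch. It is only an illustration of the idea, not the search method used by Swiss-PdbViewer or the Vital-IT service. All class and function names are hypothetical, distance constraints are simplified to Cα–Cα distances although the text allows any pair of atoms, and motif positions are assumed to occur in sequence order.

```python
import math
from dataclasses import dataclass


@dataclass
class Residue:
    number: int
    aa: str     # one-letter amino acid code
    ss: str     # secondary structure: 'H', 'E' or 'C'
    ca: tuple   # C-alpha coordinates (x, y, z)


@dataclass
class MotifPosition:
    allowed: str            # acceptable amino acids, e.g. "ST"
    ss: str = ""            # required secondary structure, "" means any
    min_sep: int = 1        # min sequence separation from the previous position
    max_sep: int = 10 ** 6  # max sequence separation from the previous position


def find_motif(residues, positions, distances):
    """Return tuples of residue numbers that satisfy the motif.

    `distances` maps motif index pairs (i, j) to (min, max) C-alpha distances.
    """
    hits = []

    def distances_ok(chosen):
        last = len(chosen) - 1
        for (a, b), (lo, hi) in distances.items():
            if max(a, b) == last:  # constraint becomes checkable at this depth
                if not lo <= math.dist(chosen[a].ca, chosen[b].ca) <= hi:
                    return False
        return True

    def extend(chosen, start):
        if len(chosen) == len(positions):
            hits.append(tuple(r.number for r in chosen))
            return
        pos = positions[len(chosen)]
        for idx in range(start, len(residues)):
            r = residues[idx]
            if r.aa not in pos.allowed or (pos.ss and r.ss != pos.ss):
                continue
            if chosen and not pos.min_sep <= r.number - chosen[-1].number <= pos.max_sep:
                continue
            if distances_ok(chosen + [r]):
                extend(chosen + [r], idx + 1)

    extend([], 0)
    return hits


# Toy structure and a three-residue motif (His, Asp/Glu, Ser/Thr).
structure = [
    Residue(10, "H", "C", (0.0, 0.0, 0.0)),
    Residue(20, "G", "C", (20.0, 0.0, 0.0)),
    Residue(55, "D", "C", (5.0, 0.0, 0.0)),
    Residue(70, "D", "H", (30.0, 0.0, 0.0)),
    Residue(120, "S", "C", (0.0, 6.0, 0.0)),
]
motif = [
    MotifPosition("H"),
    MotifPosition("DE", min_sep=20, max_sep=80),
    MotifPosition("ST", min_sep=30),
]
constraints = {(0, 1): (3.0, 8.0), (0, 2): (3.0, 8.0)}
hits = find_motif(structure, motif, constraints)
print(hits)
```

Checking each distance constraint as soon as both of its residues are assigned prunes the search early, which matters when the same motif is screened against many structures.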
3 Structural coverage – the structure gap
Since DNA sequencing data is outgrowing structure
determination efforts at exponential rates (Fig. 3), protein
structure modeling will be the only available method to
generate accurate structural models for the vast majority of
proteins. Recent work by Levitt (personal communication)
has confirmed the notion that, although the number of
multi-domain architecture families grows rapidly and at the
same rate as the number of newly sequenced genes [52],
almost all of this complexity arises from the arrangement of
known single domains within a chain, particularly for
eukaryotes. For model organisms, humans and known
pathogens, the repertoire of structural domains is finite by
definition. Comprehensive coverage of the complete protein
domain space by representative structures appears as a
reachable goal in the mid-term perspective, and has been set
as one of the scientific aims of the PSI structural genomics
efforts [53–55]. Structural genomics has considerably
increased the structural coverage of protein sequence space,
contributed significantly to the description of novel structural
families and often provided the first representatives for
functional groups that had not been structurally characterized before [56–58]. As a consequence, for many organisms,
availability of structure information has significantly changed over the last decade.
Figure 3. Number of entries in public sequence and structure databases. Although the number of entries in the PDB [42] is growing exponentially, sequence databases [59, 95] are growing at a much higher rate – widening the structure knowledge gap.
Figure 4. Retrospective analysis of structural coverage of E. coli over time. We have analyzed retrospectively which structure information – either experimental structures or models of various levels of target–template sequence identity – was available at a given point in time for the residues in the proteome of the model organism E. coli.
The quality of a protein model reflects the deviation of
the template structure relative to the actual structure of the
target as well as limitations of sequence comparison and
alignment methods. It is generally accepted that the
percentage of sequence identity between target and template
allows for a reasonable first estimate of the model quality,
and that the core Cα atoms of protein models sharing 50%
sequence identity with their templates will deviate by
approximately 1.0 Å RMSD from their experimentally
elucidated structures for regions of proteins not subject to
molecular rearrangements upon binding to another molecular entity. Taking Escherichia coli as an example, during
the 3D-Crunch experiment in 1998 [26] only a very small
fraction of sequence entries were amenable to protein
modeling using templates sharing more than 30% sequence
identity with the target protein, resulting in a coverage of
15% of the target sequences. Today, profile-based methods
for sequence comparison and alignment allow
target–template alignments to be extended to more remotely related
templates, while, in the meantime, experimental template
structures have become available for many more protein families. We
have computed a retrospective estimate of structural coverage of the E. coli proteome (Fig. 4). For each of the 4173
sequences in the complete E. coli proteome obtained from
UniProt [59], a PSI-BLAST [31] profile was calculated using
a non-redundant protein sequence database (current as of
December 2008). This profile was used to search the
sequences of experimentally determined structures deposited in the PDB [42] as of December 2008 for suitable
templates. For each year starting in 1972, we recorded the
highest sequence identity to the closest template for each of
the 1 358 278 residues in the E. coli proteome. Figure 4
shows the steady increase in structural knowledge. Today,
for about 50% of all E. coli protein sequences, a model can
be built using a template sharing at least 30% sequence
identity with the target sequence, covering approx. 23% of
all residues – compared with ca. 11% in 1998. This observation may lend support to the early prediction (1990s) that
by 2020 we will be able to generate at least one reasonable
quality protein structure model for most proteins of the
major model organisms.
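The bookkeeping behind such a retrospective coverage estimate can be sketched in a few lines. The data layout (template hits as tuples of release year, covered residue range and percent identity), the function names and the toy numbers are assumptions for illustration. The actual analysis was based on PSI-BLAST searches against the PDB, which are not reproduced here.

```python
def best_identity_per_residue(length, hits, year):
    """Highest template identity per residue, using templates released up to `year`."""
    best = [0.0] * length
    for release_year, start, end, identity in hits:
        if release_year > year:
            continue  # template was not yet available in that year
        for pos in range(start - 1, end):
            best[pos] = max(best[pos], identity)
    return best


def coverage(best, threshold=30.0):
    """Fraction of residues with a template at or above the identity threshold."""
    return sum(1 for value in best if value >= threshold) / len(best)


# Toy protein of 200 residues with three hypothetical template hits:
# (release year, first residue, last residue, percent sequence identity)
hits = [(1985, 1, 50, 25.0), (1999, 1, 100, 45.0), (2005, 81, 160, 33.0)]
timeline = {
    year: coverage(best_identity_per_residue(200, hits, year))
    for year in (1990, 2000, 2008)
}
print(timeline)
```

Summing such per-residue values over all sequences of a proteome, year by year, gives the kind of residue-level coverage curve that the retrospective analysis reports.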
4 Applications of models in biomedical
research
There is a wide range of applications for comparative
models [60, 61], such as designing experiments for site-directed mutagenesis or protein engineering, predicting
ligand binding sites and docking small molecules in
structure-based drug discovery [62, 63], studying the effect
of mutations and SNPs [64, 65], phasing X-ray diffraction
data in molecular replacement [16, 66], as well as protein
engineering and design. Hereafter, we provide just a
few examples of applications of models mainly built with
our modeling environment.
4.1 Functional analysis of proteins
Insights into the 3-D structure of a protein can be of great
assistance in assigning its molecular function, while its
biological role and localization are much more difficult to
relate to its structure. Predicting the molecular function of a
protein on the sole basis of a 3-D structure is however, in
itself, a very challenging task. Indeed, if the active site has
been observed previously [41, 67, 68] or, if the protein has
been co-crystallized with a cognate ligand, we have a better
chance of succeeding. In our hands, we were able to verify
and confirm the assignment of several Caenorhabditis elegans
insulin-like genes using low-accuracy models [69]. Similarly,
the trimeric nature of the CD40L was first proposed based
on a low-accuracy model where the target and the TNF-α
template share less than 26% sequence identity [70].
Generally speaking, the study and comparison of 3-D
structural features, as opposed to the study of linear
sequence alignments, allows one to reason about how proteins
might interact with other molecular entities and to
map functional epitopes [71, 72]. Similarly, combining
experimental mutagenesis data with biophysical measurements makes it possible to build models that fit the data and that can in
turn be used to propose new hypotheses [73, 74].
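The sequence identity between target and template, quoted above as less than 26% for CD40L and its TNF-α template, is derived from a pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned, that gaps are written as "-", and that identity is taken over columns where both sequences have a residue. Other normalisations are in use, so the value depends on the convention chosen. The function name and the toy sequences are hypothetical and are not part of SWISS-MODEL.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over aligned (non-gap) columns of a pairwise alignment."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0  # no aligned residue pairs at all
    return 100.0 * identical / aligned


# Toy alignment (hypothetical sequences, for illustration only).
print(round(percent_identity("MKT-AYIAKQR", "MRTLAY-AKHR"), 1))
```

In this toy case nine columns are aligned and seven of them are identical.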
4.2 Studying the impact of mutations and SNPs on
protein function
Diseases, or less-severe phenotypic variations, which can be
unequivocally assigned to single point mutations, provide a
good framework to understand the molecular function and
biological role of a protein. Therefore, protein models can be
readily applied to interpret the impact mutations can have on
the overall structure and, thus, the function of a protein [64, 65].
It is through “visual inspection”, combined with a good
knowledge and understanding of the rules underlying protein
structure, that the most useful hypotheses regarding the reasons
for mutant malfunction can be made (for concrete examples see
[45, 64, 65, 75, 76]). There is an increasingly large body of data
on naturally occurring mutations (over 43 000 human sequence
variants are reported in Swiss-Prot) and SNPs, of which a
sizeable proportion will alter the translated protein sequences.
Interpreting the potential functional effects of these mutants
will be crucial to elucidate the molecular basis of human
diseases. The ability to map mutations onto structures or
models is also particularly relevant in the context of infectious
diseases, where agents such as HIV and influenza viruses have a high
mutation rate and for which a wealth of sequence data is being
collected.
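Before a mutation can be inspected visually, it has to be located in the model. The following is a minimal sketch of that bookkeeping step, assuming variants are written in the common one-letter form (wild-type residue, 1-based sequence position, mutant residue, e.g. "H12Y") and that the model covers a contiguous stretch of the target sequence starting at a known position. The names `parse_variant`, `map_to_model`, `model_sequence` and `model_start` are hypothetical and do not correspond to any SWISS-MODEL or Swiss-PdbViewer function.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    wild_type: str
    position: int  # 1-based position in the full-length protein sequence
    mutant: str


VARIANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(text: str) -> Variant:
    """Parse a point mutation such as 'H12Y' into its three components."""
    match = VARIANT_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised variant: {text!r}")
    return Variant(match.group(1), int(match.group(2)), match.group(3))


def map_to_model(variant: Variant, model_sequence: str, model_start: int) -> Optional[int]:
    """Return the 0-based index of the variant in the model, or None if not covered.

    model_start is the 1-based target position of the first modeled residue.
    A mismatch between the reported wild-type residue and the model sequence
    usually signals a numbering problem, so it is raised as an error.
    """
    index = variant.position - model_start
    if index < 0 or index >= len(model_sequence):
        return None  # the variant lies outside the modeled region
    if model_sequence[index] != variant.wild_type:
        raise ValueError(
            f"expected {variant.wild_type} at position {variant.position}, "
            f"found {model_sequence[index]}"
        )
    return index


# Hypothetical model covering target residues 10 to 15.
model_sequence, model_start = "GSHMKV", 10
for name in ("H12Y", "K99E"):
    print(name, map_to_model(parse_variant(name), model_sequence, model_start))
```

Variants that fall outside the modeled region are reported as `None` rather than dropped silently, since for many targets only part of the sequence is covered by a template.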
4.3 Planning site-directed mutagenesis experiments
One definite advantage of 3-D structures and models in
functional protein analysis is that they provide a solid base
for site-directed mutagenesis experiments aimed at the
elucidation of the molecular function of proteins. Even
medium and low-accuracy models can be used as frameworks
for experiment planning to guide the selection of key mutants
designed to test functional hypotheses [77] or to modulate a
protein’s biophysical properties [78]. These experimentally
generated mutants complement the naturally occurring ones
mentioned in Section 4.2, and together with the mapping of
other facts such as sites of post-translational modifications,
greatly contribute to the elucidation of protein function [79].
For instance, the comparative models that were generated for
the Fas ligand, its protein family members [5] and receptor
illustrate how models can be applied to (i) understand the
impact of naturally occurring mutations [80, 81], (ii) plan experimental mutagenesis and (iii) interpret and map other known
features such as glycosylation sites to understand the finer
molecular function of a protein.
4.4 Molecular replacement
Solving the phase problem in crystallography experiments is
a crucial step towards reconstructing atomic structures that
optimally fit the experimental data. As phases cannot be
measured directly, they have to be obtained indirectly using
experimental methods such as heavy-atom isomorphous
replacement, anomalous scattering or by molecular replacement [16, 61, 66, 82]. The first application of a model built
with SWISS-MODEL in molecular replacement was
performed by Karpusas and co-workers [83] to obtain a 2 Å
resolution X-ray structure of the human CD40 ligand (PDB
entry 1aly). The authors used our published murine
homology model [70] (PDB entry 1cda) to build a human
model of CD40L and then applied the latter “model of a
model” to the molecular replacement approach. A more
recent example is described in [13].
5 Concluding remarks
Over the last 15 years, we have witnessed the transition from
a situation where structural information was available only
in rare cases to today’s context, where for many model
organisms either experimental structures or models are
available for a large fraction of their proteins. Protein modeling today
is well established and routinely used in various biomedical
research applications. However, there are still major
challenges ahead:
(i) Template coverage: Systematic international structural
genomics projects have contributed significantly to the
increase in novel structural information in the PDB in
recent years [56]. However, continued effort in this
direction is required to map out the remaining
uncharted regions of the protein universe. Membrane
proteins in particular will require significant attention.
Depending on the biomedical interest in specific
protein families, different levels of sampling granularity
may be appropriate. Since large protein families
tend to be functionally more diverse, finer grained
sampling will be required to elucidate functional
differences.
(ii) Modeling complexes from individual domains: Often, the
structures of individual domains are experimentally
more tractable than those of multi-domain proteins or
complexes. Computational modeling of the relative
orientation of the individual domain components is
therefore an important goal. Remarkable progress has
been made in this endeavor in recent years, as
documented in the community-wide experiment on
the comparative evaluation of protein–protein docking
for structure prediction [84]. With the
increasing amount of complete genome data becoming
available, approaches based on mutual information
analysis are becoming increasingly powerful
[85–87].
(iii) Model refinement: Comparative modeling methods are
based on the assumption that structural information
for the target protein can be inferred from the
template structure of an evolutionarily related protein.
However, with increasing evolutionary distance,
considerable structural differences between target
and template will occur. Recently, significant progress
has been reported for model refinement using Monte
Carlo sampling approaches initially developed for
ab initio modeling [16].
(iv) Modeling small induced differences: While evolutionary
inference often allows modeling conserved properties
of a protein such as its overall fold, it is often desirable
to predict small, functionally divergent features, such
as variations in substrate specificity or ligand affinity
within a family of proteins, structural effects of
mutations or non-synonymous SNPs, or other functional
properties. Also, some regions in the apo-structure
of a protein may not correspond to the
conformation it adopts when binding a partner, e.g.
activation loops of kinases.
(v) Model quality estimation: Possible applications of
models ultimately depend on their accuracy. However,
at the time of modeling, the accuracy of a model is
unknown and has to be predicted. Several approaches
for estimating the expected accuracy of models have
been developed [35, 47, 88–90]. However, there is still
a long way to go until the suitability of a model for a
certain application can be predicted reliably.
(vi) Visualization of uncertainty and precision: Experimental
structures as well as models have limitations in their
precision, which may even vary for different regions
within the same structure. However, many graphical
molecular representations suggest invariable atomic
precision throughout the structure, and do not
visualize the uncertainty of the underlying structure
data. With the increase in available low-resolution
experimental data and composite experimental/
computational models, the question of visualization
of uncertainty will become more urgent.
(vii) Integrative/hybrid modeling: Ultimately, all structure
determination methods are ‘‘hybrid’’ methods as they
rely – to different extents – on both experimental data
and computational components such as molecular
force fields. While many low-resolution experimental
techniques do not produce sufficient data to directly
derive atomic precision structures, they still provide
valuable information about certain aspects of the
macromolecular assembly. By combining various
complementary sources of information, both experimental and computational, it is possible to derive an
integrative model that would not have been possible
with any of the individual components alone, as has
been impressively demonstrated for the nuclear pore
complex (NPC) [91].
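The mutual information analysis mentioned under challenge (ii) scores how strongly two columns of a multiple sequence alignment co-vary, which can hint at residues that are in contact. The following is a minimal sketch, assuming each column is given as a string with one character per aligned sequence and using the plain plug-in estimate I(A;B) = Σ p(a,b) log2[p(a,b) / (p(a) p(b))]. It omits the sequence weighting and background corrections that practical methods rely on and is not the procedure of the cited studies [85–87]. The function name and example columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(column_a: str, column_b: str) -> float:
    """Plug-in mutual information (in bits) between two alignment columns."""
    if len(column_a) != len(column_b) or not column_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(column_a)
    count_a = Counter(column_a)                 # residue counts in column A
    count_b = Counter(column_b)                 # residue counts in column B
    count_ab = Counter(zip(column_a, column_b)) # counts of residue pairs
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p(a,b) / (p(a) * p(b)) expressed with raw counts
        ratio = n_ab * n / (count_a[a] * count_b[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Two columns that change together, and two that vary independently.
print(mutual_information("AAKK", "DDEE"))
print(mutual_information("AAKK", "DEDE"))
```

Perfectly co-varying columns give a high score and independent columns give zero. With real alignments, small sample sizes and phylogenetic relatedness inflate the raw score, which is why corrected variants are normally used.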
In many ways, the challenges and limitations of
comparative modeling that existed 12 years ago are still valid
today. However, protein structure modeling has made the
transition from a niche approach for anecdotal examples to a
mainstream technology applicable to a majority of proteins.
The availability of whole genome data does not only allow
for better evolutionary inference and improved sequence
alignments; in combination with automated structure
modeling it opens the possibility to compare proteins and to
analyze functional differences in their structural context.
Now is the time to change our mental picture of a
protein from a “linear string of letters” to a “3-D structure in
the functional context of its evolutionary relatives”. We hope
that the SWISS-MODEL and Swiss-PdbViewer suite of tools
will contribute to making that change.
We are particularly thankful to Timothy N. C. Wells
(GlaxoWellcome, now at Medicines for Malaria Venture),
Jonathan C. K. Knowles (GlaxoWellcome, now at Roche), and
Allan Baxter (GlaxoSmithKline) who have established the
necessary environment in the beginning of this project, and to
Michael W. Lutz and David B. Searls for their support.
Furthermore, we are deeply indebted to Stanley K. Burt, Robert
W. Lebherz and Jack R. Collins as well as the entire staff at the
Advanced Biomedical Computing Center at NCI-Frederick
(Frederick, MD, USA) for their support in operating the US
mirror of the SWISS-MODEL server. We are extremely grateful
to Gale Rhodes of the University of Southern Maine for coordinating the active Swiss-PdbViewer user community and his
outstanding commitment to teaching in structural biology. We
thank Alexander Diemand for his contributions to the Swiss-PdbViewer Linux code. We are deeply indebted to Konstantin
Arnold, Jürgen Kopp, Rainer Pöhlmann, Michael Podvinec,
Lorenza Bordoli and Florian Kiefer for their many contributions
to the development and daily operations of the SWISS-MODEL
Server, Repository and Workspace. We gratefully acknowledge
financial support by GlaxoSmithKline, Novartis, the Swiss
National Science Foundation (SNF), the Biozentrum of the University
of Basel and the Swiss Institute of Bioinformatics.
The authors have declared no conflict of interest.
6 References
[1] Guex, N., Peitsch, M. C., Electrophoresis 1997, 18,
2714–2723.
[2] Arnold, K., Bordoli, L., Kopp, J., Schwede, T., Bioinformatics 2006, 22, 195–201.
[3] Schwede, T., Kopp, J., Guex, N., Peitsch, M. C., Nucleic
Acids Res. 2003, 31, 3381–3385.
[4] Peitsch, M. C., Biochem. Soc. Trans. 1996, 24, 274–279.
[5] Peitsch, M. C., Biotechnology 1995, 13, 658–660.
[6] Srinivasan, N., Blundell, T. L., Protein Eng. 1993, 6, 501–512.
[7] Pieper, U., Eswar, N., Webb, B. M., Eramian, D., Kelly, L.,
Barkan, D. T., Carter, H. et al., Nucleic Acids Res. 2009,
37, D347–D354.
[8] Sali, A., Blundell, T. L., J. Mol. Biol. 1993, 234, 779–815.
[9] Karplus, K., Barrett, C., Hughey, R., Bioinformatics 1998,
14, 846–856.
[10] Soding, J., Bioinformatics 2005, 21, 951–960.
[11] Fernandez-Fuentes, N., Madrid-Aliste, C. J., Rai, B. K.,
Fajardo, J. E., Fiser, A., Nucleic Acids Res. 2007, 35,
W363–W368.
[12] Joo, K., Lee, J., Lee, S., Seo, J. H., Lee, S. J., Lee, J.,
Proteins 2007, 69, 83–89.
[13] Zhou, H., Pandit, S. B., Lee, S. Y., Borreguero, J., Chen, H.,
Wroblewska, L., Skolnick, J., Proteins 2007, 69, 90–97.
[14] Zhang, Y., Proteins 2007, 69, 108–117.
[15] Chivian, D., Baker, D., Nucleic Acids Res. 2006, 34,
e112.
[16] Qian, B., Raman, S., Das, R., Bradley, P., McCoy, A. J.,
Read, R. J., Baker, D., Nature 2007, 450, 259–264.
[17] Sivasubramanian, A., Sircar, A., Chaudhury, S., Gray,
J. J., Proteins 2009, 74, 497–514.
[18] Marcatili, P., Rosi, A., Tramontano, A., Bioinformatics
2008, 24, 1953–1954.
[19] Fox, J. A., McMillan, S., Ouellette, B. F., Nucleic Acids
Res. 2006, 34, W3–W5.
[20] Battey, J. N., Kopp, J., Bordoli, L., Read, R. J., Clarke, N.
D., Schwede, T., Proteins 2007, 69, 68–82.
[21] Kopp, J., Bordoli, L., Battey, J. N., Kiefer, F., Schwede,
T., Proteins 2007, 69, 38–56.
[22] Bordoli, L., Kiefer, F., Arnold, K., Benkert, P., Battey, J.,
Schwede, T., Nat. Protoc. 2009, 4, 1–13.
[23] Kiefer, F., Arnold, K., Kunzli, M., Bordoli, L., Schwede, T.,
Nucleic Acids Res. 2009, 37, D387–D392.
[24] Kopp, J., Schwede, T., Nucleic Acids Res. 2006, 34,
D315–D318.
[25] Kopp, J., Schwede, T., Nucleic Acids Res. 2004, 32,
D230–D234.
[26] Peitsch, M. C., Schwede, T., Guex, N., Pharmacogenomics 2000, 1, 257–266.
[27] Marti-Renom, M. A., Stuart, A. C., Fiser, A., Sanchez, R.,
Melo, F., Sali, A., Annu. Rev. Biophys. Biomol. Struct.
2000, 29, 291–325.
[28] Rychlewski, L., Fischer, D., Protein Sci. 2005, 14, 240–245.
[29] Koh, I. Y., Eyrich, V. A., Marti-Renom, M. A., Przybylski,
D., Madhusudhan, M. S., Eswar, N., Grana, O. et al.,
Nucleic Acids Res. 2003, 31, 3311–3315.
[30] Henrick, K., Feng, Z., Bluhm, W. F., Dimitropoulos, D.,
Doreleijers, J. F., Dutta, S., Flippen-Anderson, J. L. et al.,
Nucleic Acids Res. 2008, 36, D426–D433.
[31] Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J.,
Zhang, Z., Miller, W., Lipman, D. J., Nucleic Acids Res.
1997, 25, 3389–3402.
[32] Lovell, S. C., Word, J. M., Richardson, J. S., Richardson,
D. C., Proteins 2000, 40, 389–408.
[33] van Gunsteren, W. F., Billeter, S. R., Eising, A., Hünenberger, P. H., Krüger, P., Mark, A. E., Scott, W. R. P. et al.,
Biomolecular Simulations: The GROMOS96 Manual and
User Guide, VdF Hochschulverlag ETHZ, Zürich 1996.
[34] Melo, F., Feytmans, E., J. Mol. Biol. 1998, 277,
1141–1152.
[35] Benkert, P., Tosatto, S. C., Schomburg, D., Proteins
2008, 71, 261–277.
[36] Appel, R. D., Bairoch, A., Hochstrasser, D. F., Trends
Biochem. Sci. 1994, 19, 258–260.
[37] Peitsch, M. C., Wilkins, M. R., Tonella, L., Sanchez, J. C.,
Appel, R. D., Hochstrasser, D. F., Electrophoresis 1997,
18, 498–501.
[38] Peitsch, M. C., Guex, N., in: Wilkins, M. R., Williams,
K. L., Appel, R. O., Hochstrasser, D. F. (Eds.), Proteome
Research: New Frontiers in Functional Genomics,
Springer 1997, pp. 177–186.
[39] Peitsch, M. C., Proc. Int. Conf. Intell. Syst. Mol. Biol.
1997, 5, 234–236.
[40] Sanchez, R., Sali, A., Proc. Natl. Acad. Sci. USA 1998, 95,
13597–13602.
[41] Bairoch, A., Apweiler, R., Wu, C. H., Barker, W. C.,
Boeckmann, B., Ferro, S., Gasteiger, E. et al., Nucleic
Acids Res. 2005, 33, D154–D159.
[42] Berman, H., Henrick, K., Nakamura, H., Markley, J. L.,
Nucleic Acids Res. 2007, 35, D301–D303.
[43] Arnold, K., Kiefer, F., Kopp, J., Battey, J. N., Podvinec, M.,
Westbrook, J. D., Berman, H. M. et al., J. Struct. Funct.
Genomics 2009, 10, 1–8.
[44] Berman, H. M., Westbrook, J. D., Gabanyi, M. J.,
Tao, W., Shah, R., Kouranov, A., Schwede, T. et al.,
Nucleic Acids Res. 2009, 37, D365–D368.
[45] Jenkinson, A. M., Albrecht, M., Birney, E., Blankenburg,
H., Down, T., Finn, R. D., Hermjakob, H. et al., BMC
Bioinformatics 2008, 9, S3.
[46] Schwede, T., Diemand, A., Guex, N., Peitsch, M. C., Res.
Microbiol. 2000, 151, 107–112.
[47] Cozzetto, D., Kryshtafovych, A., Ceriani, M., Tramontano, A., Proteins 2007, 69, 175–183.
[48] Hooft, R. W., Vriend, G., Sander, C., Abola, E. E., Nature
1996, 381, 272.
[49] Guex, N., Diemand, A., Peitsch, M. C., Trends Biochem.
Sci. 1999, 24, 364–367.
[50] Guex, N., Schwede, T., Peitsch, M. C., Curr. Protoc.
Protein Sci. 2001, Chapter 2, Unit 2 8.
[51] Edgar, R. C., Nucleic Acids Res. 2004, 32, 1792–1797.
[52] Yooseph, S., Sutton, G., Rusch, D. B., Halpern, A. L.,
Williamson, S. J., Remington, K., Eisen, J. A. et al., PLoS
Biol. 2007, 5, e16.
[53] Burley, S. K., Nat. Struct. Biol. 2000, 7, 932–934.
[54] Kim, S. H., Curr. Opin. Struct. Biol. 2000, 10, 380–383.
[55] Sanchez, R., Pieper, U., Melo, F., Eswar, N., Marti-Renom, M. A., Madhusudhan, M. S., Mirkovic, N. et al.,
Nat. Struct. Biol. 2000, 7, 986–990.
[56] Levitt, M., Proc. Natl. Acad. Sci. USA 2007, 104, 3183–3188.
[57] Nair, R., Liu, J., Soong, T. T., Acton, T. B., Everett, J. K.,
Kouranov, A., Fiser, A. et al., J. Struct. Funct. Genomics
2009, 10, 181–191.
[58] Redfern, O. C., Dessailly, B., Orengo, C. A., Curr. Opin.
Struct. Biol. 2008, 18, 394–402.
[59] UniProtConsortium, Nucleic Acids Res. 2009, 37,
D169–D174.
[60] Schwede, T., Sali, A., Honig, B., Levitt, M., Berman,
H. M., Jones, D., Brenner, S. E. et al., 2009, 17, 151–159.
[61] Tramontano, A., in: Schwede, T., Peitsch, M. C. (Eds.),
Computational Structural Biology, World Scientific
Publishing, Singapore 2008.
[62] Hillisch, A., Pineda, L. F., Hilgenfeld, R., Drug Discov.
Today 2004, 9, 659–669.
[63] Vangrevelinghe, E., Zimmermann, K., Schoepfer, J.,
Portmann, R., Fabbro, D., Furet, P., J. Med. Chem. 2003,
46, 2656–2662.
[64] Feyfant, E., Sali, A., Fiser, A., Protein Sci. 2007, 16,
2030–2041.
[65] Wattenhofer, M., Di Iorio, M. V., Rabionet, R., Dougherty, L., Pampanos, A., Schwede, T., Montserrat-Sentis,
B. et al., J. Mol. Med. 2002, 80, 124–131.
[66] Raimondo, D., Giorgetti, A., Giorgetti, A., Bosi, S.,
Tramontano, A., Proteins 2007, 66, 689–696.
[67] Bartlett, G. J., Porter, C. T., Borkakoti, N., Thornton,
J. M., J. Mol. Biol. 2002, 324, 105–121.
[68] Laskowski, R. A., Thornton, J. M., Humblet, C., Singh, J.,
J. Mol. Biol. 1996, 259, 175–201.
[69] Duret, L., Guex, N., Peitsch, M. C., Bairoch, A., Genome
Res. 1998, 8, 348–353.
[70] Peitsch, M. C., Jongeneel, C. V., Int. Immunol. 1993, 5,
233–238.
[71] Wan, Y., Zheng, Y. Z., Harris, J. M., Brown, R., Waters,
M. J., Mol. Endocrinol. 2003, 17, 2240–2250.
[72] Guimaraes, A. J., Hamilton, A. J., de, M. G. H. L.,
Nosanchuk, J. D., Zancope-Oliveira, R. M., PLoS ONE
2008, 3, e3449.
[73] Scheib, H., McLay, I., Guex, N., Clare, J. J., Blaney, F. E.,
Dale, T. J., Tate, S. N. et al., J. Mol. Model 2006, 12,
813–822.
[74] Sanders, R. W., Hsu, S. T., van Anken, E., Liscaljet, I. M.,
Dankers, M., Bontjer, I., Land, A. et al., Mol. Biol. Cell
2008, 19, 4707–4716.
[75] O’Hara, F. P., Guex, N., Word, J. M., Miller, L. A., Becker,
J. A., Walsh, S. L., Scangarella, N. E. et al., J. Infect. Dis.
2008, 197, 187–194.
[76] Pajerowska-Mukhtar, K. M., Mukhtar, M. S., Guex, N.,
Halim, V. A., Rosahl, S., Somssich, I. E., Gebhardt, C.,
Planta 2008, 228, 293–306.
[77] Junne, T., Schwede, T., Goder, V., Spiess, M., Mol. Biol.
Cell 2006, 17, 4063–4068.
[78] Schwede, T. F., Badeker, M., Langer, M., Retey, J.,
Schulz, G. E., Protein Eng. 1999, 12, 151–153.
[79] Peitsch, M. C., Bioinformatics 2002, 18, 934–938.
[80] Hahne, M., Peitsch, M. C., Irmler, M., Schroter, M.,
Lowin, B., Rousseau, M., Bron, C. et al., Int. Immunol.
1995, 7, 1381–1386.
[81] Notarangelo, L. D., Peitsch, M. C., Immunol. Today
1996, 17, 511–516.
[82] Stirnimann, C. U., Grütter, M. G., in: Schwede, T.,
Peitsch, M. C. (Eds.), Computational Structural Biology,
World Scientific Publishing, Singapore 2008.
[83] Karpusas, M., Hsu, Y. M., Wang, J. H., Thompson, J.,
Lederman, S., Chess, L., Thomas, D., Structure 1995, 3,
1031–1039.
[84] Janin, J., Wodak, S., Structure 2007, 15, 755–759.
[85] Gobel, U., Sander, C., Schneider, R., Valencia, A.,
Proteins 1994, 18, 309–317.
[86] Burger, L., van Nimwegen, E., Mol. Syst. Biol. 2008, 4,
165.
[87] Weigt, M., White, R. A., Szurmant, H., Hoch, J. A.,
Hwa, T., Proc. Natl. Acad. Sci. USA 2009, 106, 67–72.
[88] Eramian, D., Eswar, N., Shen, M. Y., Sali, A., Protein Sci.
2008, 17, 1881–1893.
[89] Paluszewski, M., Karplus, K., Proteins 2009, 75,
540–549.
[90] Wallner, B., Elofsson, A., Proteins 2007, 69, 184–193.
[91] Alber, F., Dokudovskaya, S., Veenhoff, L. M., Zhang, W.,
Kipper, J., Devos, D., Suprapto, A. et al., Nature 2007,
450, 683–694.
[92] Kleywegt, G. J., Harris, M. R., Zou, J. Y., Taylor, T. C.,
Wahlby, A., Jones, T. A., Acta Crystallogr. D Biol. Crystallogr. 2004, 60, 2240–2249.
[93] Sayers, E. W., Barrett, T., Benson, D. A., Bryant, S. H.,
Canese, K., Chetvernin, V., Church, D. M. et al., Nucleic
Acids Res. 2009, 37, D5–D15.
[94] Schuttelkopf, A. W., van Aalten, D. M., Acta Crystallogr.
D Biol. Crystallogr. 2004, 60, 1355–1363.
[95] Boutet, E., Lieberherr, D., Tognolli, M., Schneider, M.,
Bairoch, A., Methods Mol. Biol. 2007, 406, 89–112.
Dr. Nicolas Guex studied plant biology and biochemistry at the University of Lausanne. In the
early nineties, during the course of his Ph.D., he pioneered the use of molecular biology in the
Institute of Plant Biology, isolated and sequenced two genes of the glyoxylate cycle, built a
molecular model for one of them and initiated the development of Swiss-PdbViewer. He
obtained his Ph.D. in 1995 and joined the group of Dr. Manuel Peitsch at GlaxoWellcome,
where he contributed to the development of SWISS-MODEL. From 1996 to 2002, he also taught
postgraduate Structural Biology modules at the University of Geneva, EPFL and for the Swiss
Institute of Bioinformatics.
During his 12 years at GlaxoSmithKline, he occupied positions of increasing responsibility,
led a group specialized in Evolutionary and Structural Bioinformatics and contributed to
several drug discovery research programs. In 2008 he returned to the Swiss Institute of
Bioinformatics, in the Vital-IT team, where he continues the development of Swiss-PdbViewer,
contributes his bioinformatics and biology expertise to research groups and develops specialized
software to support research projects that necessitate high performance computing. Nicolas has
been developing and optimizing computer software since 1979.
Manuel C. Peitsch is Director Computational Sciences and Bioinformatics with Philip Morris
International Research and Development, which he joined from the Novartis Institutes of
BioMedical Research (NIBR) where he successively led Informatics & Knowledge Management and later Systems Biology. Prior to joining Novartis in 2001, Manuel held several
leadership positions in bioinformatics, scientific computing and knowledge management with
GlaxoWellcome and GlaxoSmithKline. Manuel obtained his Ph.D. in biochemistry from the
University of Lausanne (Switzerland) and spent his post-doctoral years at the Laboratory of
Mathematical Biology of the National Cancer Institute in Frederick MD and at the University
of Lausanne. Since 2002 he has been Professor for Bioinformatics at the University of Basel. Manuel
is a co-founder of several initiatives, including two start-up companies and the Swiss Institute
of Bioinformatics. He is a member of the Swiss National Research Council, the Chairman of
the Executive Board of the Swiss Institute of Bioinformatics and an active scientific advisor to
several academic and commercial entities.
Torsten Schwede obtained his Ph.D. in chemistry from the Albert-Ludwigs University of
Freiburg i.Br. (Germany) for his studies in the field of protein X-ray crystallography. As a
postdoctoral fellow at GlaxoWellcome in Geneva, and later as research scientist at GSK R&D,
his research interests focused on computational structural biology. In the group of Manuel
Peitsch, he took the responsibility for the further development of the SWISS-MODEL server.
Since 2001 he has been professor for Structural Bioinformatics at the Biozentrum of the University of
Basel and group leader at the Swiss Institute of Bioinformatics (SIB).
His research group is devoted to molecular modeling of protein structures and their functional
properties. Central to this aspect is the development of fully automated expert systems, such as
the SWISS-MODEL server for comparative protein structure modeling, and the Protein
Model Portal of the PSI Structural Genomics Knowledgebase. Applied aspects such as
simulation of protein ligand interactions and structure based protein engineering complement
his group’s research activities.
Torsten is chairman of the Biozentrum's research core program "Computational and Systems
Biology". In addition he serves on several boards, including the executive board of the Swiss
Institute of Bioinformatics and the scientific advisory board of the PDBe (EMBL-EBI).