SARA (Side-chain Angular Replacement Algorithm) 1.0

SARA is a fast, coarse-grained method for single side-chain replacements in protein structures. It is over five times faster than the leading all-atom approach and generates biologically realistic side-chain angles. The solutions found by SARA typically deviate less than 1 Å and 12 degrees from native structures or the best all-atom solution. The run time of the algorithm is highly predictable and can easily be tuned by the user. These characteristics make SARA an excellent choice for high-throughput applications such as structural genomics, evolutionary simulations and structure-based phylogenetics.

SARA was written by Johan Grahnen in object-oriented C++ and is encapsulated in a collection of classes for easy integration with existing software.

SARA is maintained by David Liberles.



We supply SARA 1.0 as a compressed archive containing a pre-compiled 32-bit Linux executable, the C++ source code and some brief documentation. Why not download SARA and try it out right now? Refer to the paper describing the algorithm and this web page for further details.


See Grahnen,J.A., Kubelka,J. and Liberles,D.A. (2010) (submitted) for a description of the algorithm. Please cite the same reference in any published work employing some or all of our code.


Included in the software package is a Debian Linux 32-bit binary 'sara', which can be used immediately if your system is compatible (see Testing Your Installation). To compile from source, verify that your system supports GNU make and has GCC version 4.3 or later installed. Then simply type

make

to compile the source code. To remove the object files after finishing, type

make clean

Note that support for the C++0x standard is required; in particular, the <tr1/memory> library must exist and shared_ptr must be available in the standard namespace. See Scott Meyers' summary for a listing of compatible compiler versions. We cannot support any configurations beyond Linux with GCC 4.3 or later, but the code should run on any system supporting a Boost-derived shared_ptr template. It has been successfully compiled on Debian 5.0.4, Ubuntu 10.04 and MacOS Darwin 9.8.0 with GCC upgraded to 4.4.

You may wish to refer to Boris Kolpackov's blog for specific information on how to access the shared_ptr template in other compilers and earlier versions of GCC. See also Using A Local Copy of GCC.

For problems with compiling the pseudo-random number generator files (randomc.h, stocc.h, mersenne.cpp and stoc1.cpp), refer to the documentation on Agner Fog's web page.

Testing Your Installation

To test your installation, try making the 30L->30Y replacement in chain A of PDB structure 1D4T (necessary files are included):

./sara 1d4tA.pdb 1d4tA-new.seq 1d4tA-new.bead 100 1 1 1

Open 1d4tA.bead and 1d4tA-new.bead in your favorite viewing software to examine the replacement. Note that bead sizes in our 2-bead model are typically not the same as atom sizes in your molecular viewer, and they may need to be adjusted.


SARA takes 7 parameters on the command line. Attempting to run the program with fewer inputs will trigger a usage message:

Usage: ./sara [protein PDB file] [protein seq file] [output file] [# steps] [temp] [step length] [step variance]

The protein PDB file should be cleaned up prior to use (see Pre-processing PDB Structures), and the sequence file should contain the novel sequence in upper-case one-letter code, all on one line. See files 1d4tA.pdb and 1d4tA-new.seq for examples. The number of steps, temperature, step length and step variance all affect the speed and accuracy of the algorithm: read the paper for a description of their effects. We recommend 100 steps, a temperature of 1.0, a step length of 1.0 and a step variance of 1.0 for a fast single replacement.
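
Because a malformed sequence file is an easy mistake to make, it can help to check the file contents before running SARA. The following is a minimal sketch of such a check, not part of the SARA distribution: the function name is hypothetical, and it assumes that only the 20 standard amino acid codes are allowed, which may be stricter than what SARA itself accepts.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of SARA): returns true if `text` is a single
// line made up only of upper-case one-letter amino acid codes.
// Assumption: only the 20 standard residues are allowed.
bool is_valid_sequence_line(const std::string& text) {
    static const std::string kResidues = "ACDEFGHIKLMNPQRSTVWY";
    assert(kResidues.size() == 20);  // sanity check on the lookup table

    std::string line = text;
    // Tolerate one trailing line ending ("\n" or "\r\n").
    if (!line.empty() && line[line.size() - 1] == '\n') line.erase(line.size() - 1);
    if (!line.empty() && line[line.size() - 1] == '\r') line.erase(line.size() - 1);
    if (line.empty()) return false;

    // Any other character, including an embedded newline or a lower-case
    // letter, makes the sequence invalid.
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        if (kResidues.find(line[i]) == std::string::npos) return false;
    }
    return true;
}
```

Read the contents of the .seq file into a string and pass it to the function; a false result points to lower-case letters, non-standard codes or a sequence split over several lines.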

Pre-processing PDB Structures

Input PDB structures must contain a single chain, with residues consecutively numbered starting with 1, and conforming to a particular version of the PDB standard (see 1d4tA.pdb for an example). Unfortunately not all structures in the PDB look like this when you first download them. For your convenience, we provide two Perl scripts to help with formatting your file:

  1. Split your PDB file (1ABC in this example) into its constituent chains:

    perl 1ABC.pdb ./

  2. Renumber the chain of interest (here, chain A):

    perl 1ABCA.pdb 1ABCA-num.pdb

  3. Verify that it meets the format specifications with e.g. your favorite text editor (compare with 1d4tA.pdb).
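
To illustrate what "consecutively numbered starting with 1" means at the level of PDB columns, here is a minimal sketch of the renumbering step. It is not a replacement for the supplied Perl scripts, and the function name is hypothetical. It assumes a single-chain file, touches only ATOM records, and relies on the standard PDB layout in which columns 23-26 hold the residue number and column 27 the insertion code.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of SARA): renumbers the residues of a
// single-chain PDB file so that they run 1, 2, 3, ... in order of appearance.
// Only ATOM records are modified; all other lines are copied unchanged.
std::vector<std::string> renumber_residues(const std::vector<std::string>& lines) {
    std::vector<std::string> out;
    std::string previous_id;  // original residue number + insertion code
    int counter = 0;          // new residue number

    for (std::size_t i = 0; i < lines.size(); ++i) {
        std::string line = lines[i];
        if (line.compare(0, 4, "ATOM") == 0 && line.size() >= 27) {
            // Columns 23-27 (0-based 22-26) identify the residue.
            const std::string id = line.substr(22, 5);
            if (id != previous_id) {
                ++counter;
                previous_id = id;
            }
            assert(counter <= 9999);  // the field is only four characters wide

            std::ostringstream field;
            field << std::setw(4) << counter;  // right-justified, space-padded
            line.replace(22, 4, field.str());
            line[26] = ' ';  // drop the insertion code
        }
        out.push_back(line);
    }
    return out;
}
```

A residue boundary is detected whenever the original number or insertion code changes, so gaps and inserted residues such as 31A both end up with the next consecutive number.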


Changing The Energy Function

SARA replaces a side chain by inserting the new one into roughly the same position as the previous one, and then optimizing the angle of the new side chain by finding a minimum of an energy function. The default implementation uses a Lennard-Jones approximation of the van der Waals energies of the replaced side chain (see the paper for more detail), but an alternative linear repulsion function is also provided. If desired, you can supply your own energy function.
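
For orientation, the two kinds of pairwise energy term mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the Lennard-Jones term is the textbook 12-6 form, the linear repulsion is one plausible form chosen for the example, and all function and parameter names are hypothetical. The exact functional forms and parameters used by SARA are given in the paper and the source code.

```cpp
#include <cassert>
#include <cmath>

// Textbook 12-6 Lennard-Jones energy between two beads at distance r.
// epsilon: well depth, sigma: distance at which the energy crosses zero.
// (Illustrative only; SARA's own parameterization may differ.)
double lennard_jones(double r, double epsilon, double sigma) {
    assert(r > 0.0 && sigma > 0.0);
    const double s6 = std::pow(sigma / r, 6);
    return 4.0 * epsilon * (s6 * s6 - s6);
}

// Hypothetical linear repulsion: zero beyond the contact distance, and
// rising linearly with slope `slope` as the beads overlap.
double linear_repulsion(double r, double contact, double slope) {
    assert(r >= 0.0);
    return r < contact ? slope * (contact - r) : 0.0;
}
```

The Lennard-Jones term has a shallow attractive well and a steep repulsive wall, whereas the linear term only penalizes overlap. A custom energy function would typically sum such a term over the beads near the replaced side chain.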

Using A Local Copy Of GCC

If you have installed a local copy of GCC that is version 4.3 or later and you are still having problems compiling, try the following:

  1. In the supplied Makefile, change

    CC = g++

    to

    CC = /usr/local/bin/gcc

    assuming that you installed GCC to /usr/local/.

  2. In the same Makefile, alter

    $(CC) -o sara $(sidechain-replacer-objects)

    to

    $(CC) -o sara -L /usr/local/lib -lstdc++ $(sidechain-replacer-objects)

    again assuming that the new installation is in /usr/local/.

  3. Run

    make

    again, which should compile and link the code properly.

Adjust the directory names as necessary, and consult the GCC man page for details on the options "-L" and "-l".