.:: MCCE ::.
 
.:: Database ::.
 
 
.:: About Us ::.
 
 
.::  Login ::.
 
 
.:: Documentation ::.
 

3.3. Customizing control file - line by line remarks of "run.prm"

3.3.1. The most modified section

Five entries in this section outline a mcce run. One can specify the input pdb file and steps.

prot.pdb                                                    (INPDB)

This lin specifies the input structural file of step 1.

f        step 1: pre-run, pdb-> mcce pdb                    (DO_PREMCCE)

Step 1 converts regular pdb file to mcce extended pdb format. The flag "t" designates "do this step".

f        step 2: make rotatmers                             (DO_ROTAMERS)

Step 2 makes and optimizes rotamers of the protein. The flag "t" designates "do this step".

f        step 3: do energy calculations                     (DO_ENERGY)

Step 3 prepares energy look-up table for conformers. The flag "t" designates "do this step".

f        step 4: monte carlo sampling                       (DO_MONTE)

Step 4 simulates a pH or Eh titration with Monte Carlo sampling. The flag "t" designates "do this step.

3.3.2. The less modified entries

These are additional important controls of the program.

8.0      Protein dielectric constant for DelPhi             (EPSILON_PROT)

Dielectric constant of protein. usually epsilon 8 is good for small soluble protein while 4 is good for big trans-membrane protein (>200 residues). If one wants to use dielectric constant other than 8 and 4, one has to prepare a parameter directory named "param##" where "##" is the dielectric constant as an integer. In this parameter directory, the "RXN" entries of tpl files should be recalibrated for the new dielectric constant.

/usr/local/mcce/extra.tpl                                   (EXTRA)

This line points to a file that contains entries to overwrite those in the parameter directory and provides extra controls over the program.

/usr/local/mcce/name.txt MCCE renaming rule.                (RENAME_RULES)

This is the renaming rule file. It is sometimes necessary to divide a big group into separate ionizable groups, or merge ligands and cofactors as one group. This can be done by renaming the atoms and residues instructed by this file.

ph       "ph" for pH titration, "eh" for eh titration       (TITR_TYPE)

The type of titration simulation of step 4, pH or Eh.

0.0      Initial pH                                         (TITR_PH0)

The starting pH of titration simulation, or the pH of an Eh titration.

1.0      pH interval                                        (TITR_PHD)

The increment of pH titration (size of step).

0.0      Initial Eh                                         (TITR_EH0)

The starting Eh, or the Eh of a pH titration.

30.0     Eh interval (in mV)                                (TITR_EHD)

The increment of Eh titration (size of step).

15       Number of titration points                         (TITR_STEPS)

Number of titration points in step 4, Monte Carlo simulation.

3.3.3. Customizing step 1, preformatting pdb file

These entries are specifically for step 1 and do not have effect on other steps.

t        Label terminal residues?                           (TERMINALS)

The program will identify N and C terminal residues based on the geometry if the flag is "t". When some residues are missing in the input pdb file, false terminal residues would be labeled at the "breakage". Sometimes one needs to disable this feature to work with a stand alone amino acid.

0.1      cut off water if % SAS exceeds this number         (H2O_SASCUTOFF)

As surface water and salt can be taken into account by the continuum model, these molecues are recursively stripped off if they are exposed with the SAS exceeding this cutoff threshold.

2.0      distance limit of reporting clashes                (CLASH_DISTANCE)

The mcce program reports atom pairs with abnormally short atom-atom distances. This line specify the distance threshold in angstroms. Distance 2.0 is good to find atoms within bond distance and switching 2.0 to 2.4 will report possible ligand sites.

3.3.4. Customizing step 2, rotamer making

Step 2 is rotamer making, These lines are for the rules of making rotammers.

f        Use control file "head1.lst" for rotamer making    (ROT_SPECIF)

This line is the flag of using head1.lst or not. "t" means to use. The initial rotamer placing is controlled by either the following lines or an input file "head1.lst" created by step 1 based on the same rules and an addition file "list_rot.gold" which defines hot spot of the protein. The file "head1.lst" contains residue specific rotamer making rules. One can edit this file, and resume the program from step 2 with the flag "t" to get more initial rotamers certain some residues.

t        Do swap (stereo isotope)                           (ROT_SWAP)

This is the flag to make stereo-isotope of Threonine residues, in which, the OG1 and CG2 are swapped.

t        Do rotate?                                         (PACK)

When the mcce program optimize side chain rotamers, the initial rotamers is rotated from the native conformer. This flag turns on the "rotate" subroutine to rotate the single bonds by given number of steps.

6        number of rotamers in a bond rotation              (ROTATIONS)

The number of steps of a bond rotation. This entry and $(PACK) defined above is used by the same "rotate" subroutine.

f        Do swing?                                          (SWING)

Another way of rotating single bonds is to rotate a predefined angle rather than given number of steps to complete a cycle. "swing" subroutine will do this type of rotamers if the flag is "t". Each single bond will yield two new rotamers which are clock-wise and counter-clock-wise rotamers. The rotamers made here is based on the rotamers given by "rotate" subroutine. Turing on both $(PACK) and $(SWING) is not recommended because the combination of two subroutines creates too many rotamers and the computation is both CPU and memory demanding.

10.0     phi in degrees of swing                            (PHI_SWING)

This is the rotation angle of each "swing". This entry and $(SWING) defined above is used by the same "swing" subroutine.

1.0      SAS cuttoff of making fewer rotamers               (SAS_CUTOFF)

Surface residues don't have as many rotamers as surface residues because their electrostatic interactions are largely screened by solvent, and rotamers don't help the calculation very much. The residues whose fractional SAS are over the number defined in this line will get half rotation step as defined in line (ROTATIONS). To use a smaller number like "0.25" to reduce surface side chain rotamers.

10.0     Cutoff of self vdw in kcal/mol                     (VDW_CUTOFF)

After initial rotamers are placed, those with big self VDW potential (VDW within side chain and VDW with backbone) will be deleted. This number is the threshold of such deleting action. First the minimum VDW will be found in side chain conformers of a residue, and then conformers with high self energy are deleted. Only heavy atoms (non Hydrogen) are considered at this stage.

5000     number of repacks                                  (REPACKS)

While self-energy pruning removes most inaccessible side chain conformers, "repacking" finds which conformers are more accessible than others in context of other residues. Starting from random selections of microstates (one conformer from each residue), the protein is repacked with optimizing residues sequentially. Here is the number of starting microstates.

0.03     occupancy cutoff of repacks                        (REPACK_CUTOFF) 

If fractional counts of a conformer in the "repacking" is less than this number, the conformer gets removed.

t        h-bond directed rotamer making                     (HDIRECTED)

This is flag of making Hydrogen bond directed rotamers. With rotamers obtained after self energy pruning, the conformers are checked with the conformers of neighboring residues. When O-O and O-N distances fall into 2.5 to 3.5 Angstroms, the pair of conformers is optimized to have O-O or O-N distance closer to 2.9 Angstroms.

36 Limit number of the H bond conformers                    (HDIRLIMT)

When the number of hydrogen bond directed rotamers are over this number, "better" rotamers with more ideal hydrogen bond distance are elected.

0.2      threshold for two conformers being different       (HDIRDIFF) 

At the end of the rotamer making, the RMSD of atoms between two side chain rotamers should exceed this number, otherwise one of them will be deleted. This entry is different from the following $(PRUNE_RMSD) entry. Any two conformers are within the distance defined by this line is treated as similar conformers and the second one would be deleted, while two conformes have to satidfy three threshold values $(PRUNE_RMSD), $(PRUNE_ELE) and $(PRUNE_VDW) to be similar confomers.

0.5      Pruning cutoff of RMSD                             (PRUNE_RMSD)

The conformers are pruned by the geometry and pairwise interaction vector at the end of the step 2. $(PRUNE_RMSD) defines the geometry difference. If the maximum of the atom distance between two conformers is above this number, these two conformers are treated as non-similar conformers. When two conformers are similar conformers as defined by $(PRUNE_RMSD), $(PRUNE_ELE) and $(PRUNE_VDW), the second conformer would be deleted.

1.0      Pruning cutoff of eletrostatic pairwise            (PRUNE_ELE)

The conformers are pruned by the geometry and pairwise interaction vector at the end of the step 2. $(PRUNE_ELE) defines the electrostatic vector difference. If any dimension of the electrostatic pairwise interaction vector is different by this number, these two are treated as non-similar conformers. When two conformers are similar conformers as defined by $(PRUNE_RMSD), $(PRUNE_ELE) and $(PRUNE_VDW), the second conformer would be deleted.

8.0      Pruning cutoff of vdw pairwise                     (PRUNE_VDW)

The conformers are pruned by the geometry and pairwise interaction vector at the end of the step 2. $(PRUNE_VDW) defines the electrostatic vector difference. If any dimension of the van der Waals potential pairwise interaction vector is different by this number, these two are treated as non-similar conformers. When two conformers are similar conformers as defined by $(PRUNE_RMSD), $(PRUNE_ELE) and $(PRUNE_VDW), the second conformer would be deleted.

 f        Do relaxation on water                             (RELAX_WAT)

Flag of doing water relaxation. Water molecules are given both translational and rotational degrees of freedom in this relaxation process.

3.2   Distance between water and heavy atom to move water (WATER_RELAX_THR)

If the distance between water (O) and heavy atom is bigger than this number, them move the water away.

3.3.5. Customizing step 3, energy lookup table calculation

Step 3 is using DelPhi to compute reaction field energy and electrostatic interaction. This section is set up of DelPhi program.

f        Use SAS + Coulomb's law for ele interaction         (QUICK_ENERGIES)

Step 3 can optionally do the energy lookup table with emprical analytical equations rather than solving PB equation. Using these equations would be very fast (a few minutes vs. a few hours by DelPhi) but the error of calculated pKa would be as large as 3 pH units. If you just want find interesting spots of a big protein or refine rotamers, then try this. To turn the emprical calculation on, mark the flag as "t".

/usr/local/mcce/bin/delphi DelPhi executable                 (DELPHI_EXE)

The location of the DelPhi executable. This line is configured at the time of installation. The DelPhi program in this location is modified to read only unformatted pdb file.

80.0     Solvent dielectric constant for DelPhi              (EPSILON_SOLV)

The dielectric constant of solvent. Although it is editable here, modifying this entry would void the calculation as all other parameters were calibrated at dielectric constant 80.

65       Grids in each DelPhi                               (GRIDS_DELPHI)

The DelPhi grid size. It must be an odd number. More grid points produce more accurate result but costs much more CPU time. Since mcce program uses focusing technique, a larger number is not necessary.

2.0      The target grids per angstrom for DelPhi           (GRIDS_PER_ANG)

The DelPhi focusing run will end up with this targeted resolution. A minimum value of 1.0 grid per angstrom is recommended.

1.4 Radius of the probe (RADIUS_PROBE)

The size of the probe to evaluate solvent accessible surface (SAS) and defines dielectric boundary. If this value is set to be 0, then the surface detected becomes van der Waals surface.

2.0      Ion radius                                         (IONRAD)

Ion radius is a parameter of Poisson-Boltzmann equation solver, DelPhi.

0.15     Salt                                               (SALT)

Salt concentration of the solvent in unit M (Moles/Liter).

f        Reassign charge and radii before delphi            (REASSIGN)

The mcce program will reassign charges and radii to atoms from the parameter files if this flag is "t". Usually, the charges and radii are read in from extended pdb file "step2_out.pdb", in which these parameters are already assigned but editable.

f        Recalculate torsion energy when write out head3.lst (RECALC_TORS)
If this flag is "t", then recalculate torsion energy for all conformers even when step 3 only caculates a part of conformers.

3.3.6. Customizing step 4, Monte Carlo sampling

In this step, the mcce program first loads self energy of conformers from file "head3.lst" and pairwise interaction from directory "energies", then simulate the titration with Monte Carlo sampling.

f        Average the pairwise, "f" uses the smaller         (AVERAGE_PAIRWISE)

The electrostatic pairwise interaction calculated in two directions are slightly different. We can either average them or choose the smaller (absolute) value. As the difference may be from the inaccurate dielectric boundary, using the smaller value reduces the dielectric boundary error.

20.      Warning Threshold of difference in pairwise        (WARN_PAIRWISE)

When the difference of pairwise interaction in two directions is very different, that is, absolute value bigger than 0.5 kcal/mol and percentage difference is bigger than this number, the program will issue warning message.

5.0 Big pairwise threshold to make big list (BIG_PAIRWISE)

Multiple-flips happen on the big list of a residue, which enables proton transfer from site to site, instead of site to solvent. The big list is the neighbor list with big pairwise interaction. The threshold here defined here is used to make such big lists for each residue.

-1       Random seed, -1 uses time as random seed           (MONTE_SEED)

Random seed, -1 to use time as the random seed.

298.15   Temperature                                        (MONTE_T)

The temperature of Monte Carlo simulation. As the reference standard free energy (pKa of acid in solution) is at room temperature, using other temperatures may require modifications on other parameters.

3        Number of flips                                    (MONTE_FLIPS)

A new microstate can be created by flipping conformers of one or multiple residues. The strategy we used is to flip one residue and its neighbors so that concerted motion can be properly sampled. The number here is the maximum of flipping sites.

100      Annealing = n_start * confs                        (MONTE_NSTART) 

The steps of annealing. An independent Monte Carlo sapling goes through annealing, reducing, and equilibration stages. The sampling starts from a totally random microstate which might be at "high" energy state, and gradually "cooled" down to equilibrated states. The microstates in the annealing stage would be discarded.

1000     Equalibration = n_eq * confs                       (MONTE_NEQ)

The length (steps) of reducing stage. After reducing stage, those conformers who are not occupied or 100% occupied will be given such occupancy and not be sampled.

0.001    Cut-off occupancy of the reduction                 (MONTE_REDUCE)

Some conformers are never selected in the reducing stage, so they will be flaged and not be sampled any more. The cutoff decision is based on the value defined here, a fractional number of the occupancy. The total occupancy of conformers in a residue is 1 unless all of them are fixed.

The following 4 lines are for basic Monte Carlo sampling.

6        Number of independent monte carlo sampling         (MONTE_RUNS)

The number of independent Monte Carlo samplings at one titration point. The standard deviation of these independent runs will be reported for each conformer in progress file "mc_out" and the biggest standard deviation will be output to stdout.

5000     Sampling = n_iter * confs                          (MONTE_NITER)

The number of steps in sampling at the equilibration stage. The conformer occupancy statistics is collected at this stage.

5000     Trace energy each MONTE_TRACE steps, 0 no trace    (MONTE_TRACE)

The interval of tracing the running microstate energy. Every this number of steps, the mcce program will write out the microstate energy to file "mc_out". You can see how this energy fluctuates and decide if the sampling is trapped in local minima. In addition, the average of this energy is the system energy corresponding to Enthalpy H.

1000000  Maximum microstates for analytical solution        (NSTATE_MAX)

If the microstates are less than this number, calculate the conformer occupancy analytically, otherwise, use Monte Carlo sampling..

The following lines are for advanced Monte Carlo sampling. The advanced Monte Carlo sampling uses less time by detecting early convergence.

f        Using Yifan's monte carlo                          (MONTE_ADV_OPT)

The flag to turn on advanced Monte Carlo sampling.

f        Using format from old version                      (MONTE_OLD_INPUT)

Flag it with "t" to load energy terms from old MCCE program.

5000     Min Sampling = n_iter * confs                      (MONTE_NITER_MIN)

The minumal number of steps of Monte Carlo sampling. No termination before this number of steps.

-1       Max Sampling = n_iter * confs(-1 stop @ converged)  (MONTE_NITER_MAX)

The maximal number of steps of Monte Carlo sampling. -1 means ndefinite sampling until converged.

10000    Number of iterations each cycle                  (MONTE_NITER_CYCLE)

One cycle is a set of Monte Carlo runs without book keeping, convergence checking etc., in order to save time.

1000     niter * n_conf check convergence                   (MONTE_NITER_CHK)

Convergence is checked every x cycle. x = (int)(this number * (number of conformer) / $(MONTE_NITER_CYCLE)) + 1

-1       Number of the reduce steps(-1 stop @ converge)     (MONTE_N_REDUCE)

A reduce step is a separate MC run restarting from a different random microstate. Conformers with occupancy < $(MONTE_REDUCE) is fixed to have occupancy 0, therefore no longer sampled. Convergence will be checked again between different reduce runs. New reduce run is initialized until number of reduce run reaches this number ($(MONTE_N_REDUCE) >0) or convergence criteria is satisfied ($(MONTE_N_REDUCE) <0).

0.01     Threshold for convergence                          (MONTE_CONVERGE)

This is convergence criteria both for single monte carlo run (explained in $(MONTE_NITER_MAX)) and for reduce steps (explained in $(MONTE_N_REDUCE)). RMSD is get from comparing average occupancy at this checking point to average occupancy at last checking point. When RMSD of all active conformer occupancy is smaller than this number, convergence criteria is satisfied.

f        Calculate free energy                              (MONTE_DO_ENERGY)

If this flag is "t", free energy subroutine is called. Caution, free energy calculation is expensive.

Free energy is calculated by non-linear average of energies from all the unique microstates sampled in MC. Therefore unique microstates are collected. And in every MC step the occupied microstate is checked against all collected microstates to see if it is a new unique one.

298.15   Starting temperature for annealing                (ANNEAL_TEMP_START)

Annealing can be run in a different temperature (usually high) before data collection to bring the system out of local minimum. This parameter gives starting temperature of annealing step. Final temperature is $(MONTE_T).

0        Number of steps of annealing                      (ANNEAL_NSTEP)

Each annealling run changes the system temperature by ($(MONTE_T)-$(ANNEAL_TEMP_START))/$(ANNEAL_NSTEP), to bring the temperature linearly to the MC temperature.

1000     Number of iterations for each annealling step    (ANNEAL_NITER_STEP)

Each annealing step runs x iterations. x = (This number) * (Number of active conformers)

3.3.7. Miscellaneous entries

These lines are for advanced use of the program.

/usr/local/mcce                                           (MCCE_HOME)

The home directory of mcce. It is generally set up by the installation script. The program will use this path to search the parameter directories.

1         delphi start conformer number, 0 based            (DELPHI_START)

You can control to do which conformers in step 3, energy lookup table calculation. Step 3 is the most expensive calculation in mcce program. Sometimes you may want to split to calculation into pieces to take advantage of computer clusters. This number is the starting number of conformers. The conformer list before step 3 can be found in step2_out.pdb.

99999     delphi end conformer number, self included        (DELPHI_END)

The end conformer number of the step 3. To run all conformers, make sure this number is bigger than the number of all conformers.

f        skip delphi in step3 (DEBUG option)                (SKIP_ELE) 

Do step 3 without really calling DelPhi when the flag is "t". This if for debug use.

/tmp     delphi temporary file folder, "/tmp" uses node     (DELPHI_FOLDER)

The directory to store DelPhi temporary files for focusing runs. When running mcce on a computer cluster, it is expensive to read and write files to the master node disk. As /tmp directory is always on compute node's local disk, the temporary files are not actually ransferred to the master node and the disk I/O is shared by nodes. Other times, you may want to check the error of DelPhi runs. Then use "./" to place the DelPhi temporary files to the working directory. DelPhi directory is named as delphi_?????? (?????? is a 6 character unique random string) under the directory defined by this line.

t        clean up delphi focusing directory?                (DELPHI_CLEAN) 

The flag "t" will remove the DelPhi directory after step 3.

debug.log                                                   (DEBUG_LOG)

The file name of the debug output file. Renaming this file is not recommended.

new.tpl local parameter file for unrecognized res           (NEWTPL)

The file name of temporary parameter file of the unrecognized cofactor. Once a cofactor parameter file is created and put into the permenant parameter directory, you should remove this file, otherwise the duplicated entries in this file void those in parameter directory.

1.7      defalut van der Waals radius, for SAS              (DEFAULT_RADIUS)

This value will be assigned to any atoms without explicitly defined radii in the parameter files.

0.5      factor to 1-4 LJ potontial (1.0 is full)           (FACTOR_14LJ)

VDW scaling factor for 1-4 interaction as suggested by AMBER.

1.0      dielectric constant for Coulomb's law              (EPSILON_COLUMB)

In step 2, optimization was done with Coulomb's law for the electrostatic interactions. This value is the dielectric constant for Coulomb's law for this purpose.

<Prev | Contents | Next >

 
 

:: Doc Index

 

:: Complete Doc