QCT Bayesian Statistics
From UC_Chemistry
The three steps used in our version of QCT (JCP129(13):134505) are
- Calculate <DU>_0(lambda) from a solvent-only system
- Calculate <DU>_1(lambda) from a solute-solvent system, and
- Calculate HS,IS(lambda) profiles from (1) and (2), respectively.
- Combine all of the above into a neat summary file.
Computationally, three major python scripts (calc_du.py, HS_est.py, and get_stats.py) implement this functionality, and their usage is described below. Several helper scripts also exist and their usage will be described in context.
Contents |
Getting ucgrad Working
First, in order to use most of these scripts, you must have python with numpy installed on your system and have access to our own utilities directory, ucgrad.
You can get ucgrad from forcesolve (forcesolve/cg_topol/ucgrad) by downloading or directly using svn:
svn export http://forcesolve.svn.sourceforge.net/svnroot/forcesolve/trunk/cg_topol/ucgrad ucgrad
To access it using from ucgrad import * within python, you have three options:
- Add it to your python site-packages directory (/usr/lib64/python2.4/site-packages/ on my system).
- Set the variable PYTHONPATH to wherever you keep ucgrad (e.g. $HOME/src/ucgrad).
- Make sure it is in the current directory.
I use the first and second ones, having added a symlink in site-packages.
Calculating <DU> Profiles, calc_du.py
Scripts:calc_du.py takes the 'DU' direction, model/biasing energies, list of 'rmin'-s, and list of DU values for each frame and it outputs the re-weighted lambda,effective sample number,DU,variance,simulation with best bound,bound, and bound error plot.
Calculating <F> Profiles, calc_u.py
Scripts:calc_u.py takes the same input as calc_du.py, but du can be any number tabulated for each frame, and only lambda,effective sample number,DU, and variance are output. This lets you calculate the conditional profile for any quantity (F(x)) as a function of lambda -- i.e. <F|lambda>.
Calculating HS,IS Profiles, HS_est.py
Scripts:HS_est.py takes a single parameter file (listing the location of histogram data and its format) as input.
Combining All of the Above
Scripts:get_stats.py takes the HS, IS, <DU|lambda>_0, and <DU|lambda>_1 files as input and outputs a file containing lambda, Upper bound+err, Upper Est.+err, Lower Est.+err, Lower Bound+err., LR Avg. Est.+err, HS+err, IS+err, total+err, err2. err2 is the half the difference between the upper and lower estimates (those computed from the mean and variance of only their respective <DU> data). This error may give some indication of the applicability of the Gaussian assumption which states LR Avg. = mu^ex_OS,LR.
