NAMD Trajectory Conversion
From UC_Chemistry
| Revision as of 04:04, 20 July 2008 Songhk (Talk | contribs) ← Previous diff |
Current revision David Rogers (Talk | contribs) (added VMD section) |
||
| Line 9: | Line 9: | ||
| * pdb to dcd conversion | * pdb to dcd conversion | ||
| catdcd -o tip3p.dcd -pdb tip3p.pdb | catdcd -o tip3p.dcd -pdb tip3p.pdb | ||
| + | |||
| + | == Manually Using VMD == | ||
| + | If you're not tied to the command line, you can also use VMD to do this by loading and then writing the molecule. Of course, you also have to load the psf for your dcd to write a pdb. This came in handy for some of my latest projects converting Amber NetCDF files to gromacs trr files. These trr files are handy for QCT analysis, since '''g_mindist''' just needs a trr and an index file (no tpr necessary) to output r_min for each frame. | ||
| + | Now, if you didn't save the box info, you can type the following into the vmd console before writing your file to get that right (replacing 150x150x98 with your own dimensions). | ||
| + | <pre> | ||
| + | set sel [atomselect top all] | ||
| + | set ns 0 | ||
| + | while { $ns < [molinfo top get numframes] } { | ||
| + | molinfo top set frame $ns | ||
| + | molinfo top set { a b c } { 150.0 150.0 98.0 } | ||
| + | molinfo top set { alpha beta gamma } { 90.0 90.0 90.0 } | ||
| + | incr ns | ||
| + | } | ||
| + | </pre> | ||
| == Direct NAMD Binary == | == Direct NAMD Binary == | ||
| Line 36: | Line 50: | ||
| __EOF | __EOF | ||
| </pre> | </pre> | ||
| + | |||
| + | You can even avoid the '''od''' maneuver if you are using python: | ||
| + | <pre> | ||
| + | from numpy import * | ||
| + | f = open("$i.bout", 'rb+') | ||
| + | f.seek(4) | ||
| + | x = fromfile(f, float64, 3*62, '') | ||
| + | x = reshape(x, (-1,3)) | ||
| + | print sum(x,0)/len(x) | ||
| + | |||
| + | f.seek(4+8*3*61) | ||
| + | x[371].tofile(f, '') | ||
| + | </pre> | ||
| + | Where I have illustrated changing the coordinate of the last atom on the fly for possible subsequent analysis. | ||
| + | |||
| == Fixing atom name spacing == | == Fixing atom name spacing == | ||
Current revision
I googled the web for NAMD conversion of dcd to pdb, since all my in-house analysis programs read pdbs and the dcd file format is a too bit cryptic and hacky for me to invest a lot of time writing an i/o interface for. The results were less-than-spectacular, since the major contribution assumes you are already familiar running VMD scripts -- in which case you wouldn't really need help doing this would you? After looking through some crazy tcl scripts plugging into vmd / namd as an interpreter, I realized there was a program called catdcd to get this done.
Enlightened process:
- download and run catdcd, a program from the makers of NAMD designed to solve these sorts of problems.
- My command was as follows, but your mileage may vary.
./catdcd -o traj.pdb -otype pdb -first 5000 -last 10000 -stride 1000 -s protein.pdb traj.dcd
- pdb to dcd conversion
catdcd -o tip3p.dcd -pdb tip3p.pdb
Contents |
Manually Using VMD
If you're not tied to the command line, you can also use VMD to do this by loading and then writing the molecule. Of course, you also have to load the psf for your dcd to write a pdb. This came in handy for some of my latest projects converting Amber NetCDF files to gromacs trr files. These trr files are handy for QCT analysis, since g_mindist just needs a trr and an index file (no tpr necessary) to output r_min for each frame. Now, if you didn't save the box info, you can type the following into the vmd console before writing your file to get that right (replacing 150x150x98 with your own dimensions).
set sel [atomselect top all]
set ns 0
while { $ns < [molinfo top get numframes] } {
molinfo top set frame $ns
molinfo top set { a b c } { 150.0 150.0 98.0 }
molinfo top set { alpha beta gamma } { 90.0 90.0 90.0 }
incr ns
}
Direct NAMD Binary
Some of my recent analyses have used the binary trajectory directly. This was a simple matter of
1. Converting a frame from the .dcd file into a NAMD binary format
- (I still haven't found an easy way to access dcd-s directly)
i=10 catdcd -o $i.bcoor -first $i -last $i -otype namdbin traj.dcd od -N4 -td4 $i.bcoor od -w24 -j4 -tf8 $i.bcoor | head
- The od command reveals the true form of the bcoor file -- a 4-byte int giving the number of atoms (skipped with -j4) followed by a list of 8-byte real numbers.
2. Scam the binary 8-byte floats into their own file.
dd if=$i.bcoor of=$i.bout bs=4 skip=1 count=372
- Here I have extracted only atom numbers 1-62 by skipping the first block (4-byte int) and outputting only 372=62*3 coord.s * 2 blocks/coord.
3. Read using your favorite program.
python <<__EOF
from numpy import *
x = fromfile("$i.bout")
x = reshape(x, (-1,3))
print sum(x,0)/len(x)
__EOF
You can even avoid the od maneuver if you are using python:
from numpy import *
f = open("$i.bout", 'rb+')
f.seek(4)
x = fromfile(f, float64, 3*62, '')
x = reshape(x, (-1,3))
print sum(x,0)/len(x)
f.seek(4+8*3*61)
x[371].tofile(f, '')
Where I have illustrated changing the coordinate of the last atom on the fly for possible subsequent analysis.
Fixing atom name spacing
- This program does not maintain the proper spacing for the atom names, which can mess up some downstream processing (my own python pdb reader in particular). Fix using the following sed script, fixnames.sed:
s/ \([N,C,O]\) / \1 /; s/ \([H,C,N,S]\)\([1-4,A-Z]\) / \1\2 /;
- invocation: sed -f fixnames.sed traj.pdb >traj_out.pdb
- Make sure you choose a different output name, since sed will delete your output file before it begins processing!
- invocation: sed -f fixnames.sed traj.pdb >traj_out.pdb
Fixing the MODEL tags and atom name spacing
Here is another script you can use if you want to add the MODEL tags back in to the file.
- fixnames2.sed:
s/ \([N,C,O]\) / \1 /; s/ \([H,C,N,S,O]\)\([1-4,A-Z]\) / \1\2 /; s/END/END\ ENDMDL\ MODEL/;
- invocation:
echo MODEL >traj_out.pdb; sed -f fixnames2.sed traj.pdb | sed -e '$d' >>traj_out.pdb
- note that pymol only requires ENDMDL tags be present to read multi-conformer pdbs (so you can just switch END -> ENDMDL if you are lazy)
- invocation:
Deleting Stuff From the PDB
Your PDB file may be too large to process with programs written by mere mortals. Try some/all of the following to make the file smaller.
Deleting unwanted residues
Since the residue names are listed for each line in the file, it is enough to run a simple grep -v on the pdb to get rid of the annoying residues.
grep -v ' POPC \| HOH \| SOL \| TIP3 ' bigger.pdb >smaller.pdb
Deleting nonpolar hydrogen atoms
Deleting all lines matching a pattern is easily accomplished with grep -v
- Here is the list of nonpolar hydrogens in my system (20 amino acids with CHARMM22 atom names) skip_lines:
HB[1-3] *ALA HB[1-2] *CYS H[B,G,E][1-3] *MET HB[1-2] *ASP HB[1-2] *ASN H[B,G,D][1-2] *ARG H[B,G,D][1-2] *PRO H[B,G][1-2] *GLU HB[1-2] *GLN HG[1-2] *GLN HG[1-2][1-3] *VAL HB *VAL HB[1-2] *TRP H[E,H,Z][1-3] *TRP HB[1-2] *LEU HD[1-2][1-3] *LEU HG *LEU HG[1-2][1-3] *ILE HD[1-3] *ILE HB *ILE H[B,G,D,E][1-2] *LYS H[B,D,E][1-2] *PHE HZ *PHE H[B,D,E][1-2] *TYR HG2[1-3] *THR HB *THR HB[1-2] *SER HB[1-2] *HIS HA *ASP HA *CYS HA *ALA HA *MET HA *ASN HA *ARG HA *PRO HA *GLU HA *GLN HA *VAL HA *TRP HA *LEU HA *ILE HA *LYS HA *PHE HA *THR HA *SER HA[1-2] *GLY HA *TYR HA *HIS
- invoke with grep -v -f skip_names traj_out.pdb >traj_ua.pdb
- be aware of any issues with further processing, because the standard atom masses are now incorrect
- as an alternative, you also could use pdb2gmx with a ua gromos forcefield to accomplish the above
