Happy New Year!
Happy New Year to everyone! 2007 is here and I am back at work and programming. I managed to take a week off and barely touched a computer at all (well, I did mess around a little with the GNU multi-precision library, GMP, to play with large prime numbers, but since this wasn't Sire it doesn't count ;-) ).
I've been back coding for a few days now and have been concentrating on really sorting out the forcefield API. Having spent lots of time playing with forcefields and exploring their power, I am now confident that I have come up with a generic API that fits them all. Part of the way I have done this is to attach generic, typed properties to molecules. So, for example, let's imagine I have a TIP4P water molecule (I know, my favorite molecule - though I've tested this on proteins as well!).
I create charges for all of the atoms via the Python lines:
    chgs = AtomicCharges( [ 0.0,
                            0.52 * mod_electrons,
                            0.52 * mod_electrons,
                            -1.04 * mod_electrons ] )
and LJ parameters via:
    ljs = AtomicLJs( [ LJParameter( 3.15365 * angstrom,
                                    0.1550 * kcal_per_mol ),
                       LJParameter.dummy(),
                       LJParameter.dummy(),
                       LJParameter.dummy() ] )
I can then associate these parameters with a TIP4P molecule via:
    tip4p.setProperty( "charges", chgs )
    tip4p.setProperty( "ljs", ljs )
Now, when I add this molecule to a forcefield that requires charge and LJ parameters, I can do so via a generic interface function:
    cljff.add( tip4p, [ cljff.parameters().coulomb() == "charges",
                        cljff.parameters().lj() == "ljs" ] )
This line adds the TIP4P molecule to the forcefield, and tells the forcefield that the coulomb parameters are contained in the property called 'charges', while the LJ parameters are contained in the property called 'ljs' (these are actually the default names for this forcefield, so I could just have written cljff.add(tip4p) - but this gives you a better idea).
The code is clever enough to be able to check that the property that you have set is of the right type for the forcefield (e.g. the charges are indeed held by AtomicCharges, with one charge per atom).
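To make the idea of that check concrete, here is a minimal sketch in plain Python. It is not Sire's implementation: ToyMolecule, ToyCLJFF and the stripped-down AtomicCharges and AtomicLJs classes are hypothetical stand-ins, units are left out, and None stands in for a dummy LJ parameter. It only shows how a forcefield could map each parameter it needs to a named property and then verify the type and the one-value-per-atom rule.

```python
class AtomicCharges:
    """Toy stand-in (not the Sire class): holds one charge per atom."""
    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)


class AtomicLJs:
    """Toy stand-in (not the Sire class): holds one LJ parameter per atom."""
    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)


class ToyMolecule:
    """Hypothetical molecule that carries named properties."""
    def __init__(self, name, natoms):
        self.name = name
        self.natoms = natoms
        self._properties = {}

    def setProperty(self, key, value):
        self._properties[key] = value

    def property(self, key):
        return self._properties[key]  # raises KeyError if the property is missing


class ToyCLJFF:
    """Hypothetical forcefield needing coulomb and LJ parameters."""
    REQUIRED = {"coulomb": AtomicCharges, "lj": AtomicLJs}
    DEFAULTS = {"coulomb": "charges", "lj": "ljs"}

    def __init__(self):
        self.molecules = []

    def add(self, molecule, sources=None):
        # Start from the default property names, then apply any overrides.
        names = dict(self.DEFAULTS)
        names.update(sources or {})
        for parameter, expected_type in self.REQUIRED.items():
            value = molecule.property(names[parameter])
            if not isinstance(value, expected_type):
                raise TypeError(
                    f"property '{names[parameter]}' must be {expected_type.__name__}")
            if len(value) != molecule.natoms:
                raise ValueError(
                    f"property '{names[parameter]}' needs one value per atom")
        self.molecules.append((molecule, names))


tip4p = ToyMolecule("TIP4P", natoms=4)
tip4p.setProperty("charges", AtomicCharges([0.0, 0.52, 0.52, -1.04]))
tip4p.setProperty("ljs", AtomicLJs([(3.15365, 0.1550), None, None, None]))

cljff = ToyCLJFF()
cljff.add(tip4p, {"coulomb": "charges", "lj": "ljs"})  # same as cljff.add(tip4p)
```

In this sketch, pointing "lj" at a property that holds AtomicCharges raises a TypeError, and supplying three charges for a four-atom molecule raises a ValueError, so a bad mapping is caught when the molecule is added rather than during an energy evaluation.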
This is very powerful, as it means that everything a molecule needs can now be stored with the molecule itself. So, for example, if a molecule has different charges for different forcefields, then one set of charges could be stored as the property called 'charges1' while the second could be stored as 'charges2'. Indeed, I now have tons of different properties, so it is possible to associate lots of different meta information with each molecule (including, for example, provenance information such as the file from which the molecule was loaded, or how the parameters were generated). It will even be possible for one molecule to hold another molecule as a property, thus allowing molecules to be nested.
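As a rough illustration of what such a property bag might look like, here is a small self-contained sketch. Again, this is a toy and not the Sire Molecule class: ToyMolecule, the "provenance" key, the file name and the second charge set (simply the first set scaled by one half) are all made up for the example.

```python
class ToyMolecule:
    """Hypothetical stand-in for a molecule that carries named properties."""
    def __init__(self, name):
        self.name = name
        self._properties = {}

    def setProperty(self, key, value):
        self._properties[key] = value

    def property(self, key):
        return self._properties[key]

    def propertyKeys(self):
        return sorted(self._properties)


tip4p_charges = [0.0, 0.52, 0.52, -1.04]

water = ToyMolecule("TIP4P")
water.setProperty("charges1", tip4p_charges)
# Placeholder second charge set: arbitrary values, just the first set halved.
water.setProperty("charges2", [0.5 * q for q in tip4p_charges])
# Made-up provenance record; the key name and file name are illustrative only.
water.setProperty("provenance", {"file": "tip4p.pdb",
                                 "parameters": "entered by hand"})

# One molecule held as a property of another, i.e. nesting.
host = ToyMolecule("host")
host.setProperty("guest", water)


def charges_for(molecule, charge_property):
    """Each forcefield names the charge set it wants to read."""
    return molecule.property(charge_property)


print(water.propertyKeys())
print(charges_for(host.property("guest"), "charges1"))
```

Each forcefield simply asks for the property name it was told to use, so two forcefields can read different charge sets from the same molecule without either knowing about the other.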
Another key problem that I think I have now solved is that of recording arbitrary selections of atoms in a molecule, and of representing and manipulating those selections in a time- and memory-efficient manner. I'll leave the details of how I have solved this for another time (if you are interested, look in AtomSelection in the chryswoods branch), but the upshot is that I can now efficiently hold selections of atoms. These selections can also be given as a property to a molecule. This means that I could have a property called "qm atoms", which contains a selection of atoms and rules for how to generate QM link atoms. The QM forcefield can look to see if such a property has been supplied, and if it has, that property can be used to select the atoms that should be in the QM region and to build any necessary link atoms. The beauty of this is that the atom selection is done once, and is forever held by the molecule, even when it is not being used in a QM forcefield.
Since Molecule objects are easily saved to and restored from a binary format, one worker could set up the Molecule for a QM calculation, but give it to another worker who performs the MM equilibration. The molecule returned by the MM worker will still contain the QM selection property, which is guaranteed to still be valid. Indeed, the MM worker may themselves have added other properties to the molecule that may then be of use to the QM worker (e.g. a "log" property that describes everything that this molecule has been subjected to, and who did what). I think that this will be very useful for collaborative projects, and will allow the Molecule class to be used as a lingua franca for the sharing of molecular data between people and groups. The only downside, of course, is that the data format is binary (but versioned and completely portable), though I am looking into how I could go about representing my Molecule in XML.
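Going back to the atom selections for a moment, here is one way a compact selection with a versioned binary form could be sketched. This is an assumption made purely for illustration and is not how AtomSelection works in Sire: ToyAtomSelection, FORMAT_VERSION and the byte layout (a little-endian version and atom count followed by a bitmask) are all hypothetical.

```python
import struct

FORMAT_VERSION = 1   # hypothetical version tag for this toy format
HEADER = "<HI"       # little-endian: 2-byte version, 4-byte atom count


class ToyAtomSelection:
    """Toy selection that stores one bit per atom in a Python int."""
    def __init__(self, natoms, selected=()):
        self.natoms = natoms
        self._bits = 0
        for index in selected:
            self.select(index)

    def _check(self, index):
        if not 0 <= index < self.natoms:
            raise IndexError(f"atom index {index} out of range")

    def select(self, index):
        self._check(index)
        self._bits |= 1 << index

    def is_selected(self, index):
        self._check(index)
        return bool((self._bits >> index) & 1)

    def indices(self):
        return [i for i in range(self.natoms) if (self._bits >> i) & 1]

    def to_bytes(self):
        # Fixed byte order and field sizes keep the format portable.
        nbytes = (self.natoms + 7) // 8
        header = struct.pack(HEADER, FORMAT_VERSION, self.natoms)
        return header + self._bits.to_bytes(nbytes, "little")

    @classmethod
    def from_bytes(cls, data):
        version, natoms = struct.unpack_from(HEADER, data, 0)
        if version != FORMAT_VERSION:
            raise ValueError(f"unsupported format version {version}")
        selection = cls(natoms)
        selection._bits = int.from_bytes(data[struct.calcsize(HEADER):], "little")
        return selection


# The "QM worker" stores the selection as a property in binary form...
properties = {"qm atoms": ToyAtomSelection(4, selected=[0, 3]).to_bytes()}
# ...the "MM worker" adds its own property without touching the selection...
properties["log"] = ["MM equilibration"]
# ...and the selection is restored unchanged afterwards.
restored = ToyAtomSelection.from_bytes(properties["qm atoms"])
print(restored.indices())
```

Writing the version number first means a reader can reject (or later convert) data written in a layout it does not understand, which is the property that makes a binary format safe to pass between workers.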