Performance of the Signature Program

This example is a case study designed to quantify how well the stochastic structure generator performs.

In the example, the potential energy distribution of the isomers of C8H10 was analyzed using the deterministic and the stochastic versions of the structure generator presented above.

Note that in the present example there is no need to solve the signature equation: the structures to be generated are all composed of 8 carbon atoms and 10 hydrogen atoms, and these atoms constitute the list of molecular fragments.
The deterministic version of the structure generator computed 4008 different isomers of C8H10 (not counting possible stereoisomers). Generating these isomers took 353.0 s of CPU time on an SGI Personal Iris workstation.

For comparison, the stochastic version of the structure generator estimated the number of isomers to be 3399. To calculate this number, the stochastic generator was run for 16.5 s of CPU time, until the deviation between the estimates was lower than 1000.
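
The trade-off between the two runs can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (4008 isomers in 353.0 s, an estimate of 3399 in 16.5 s); the variable names are illustrative and not part of the original program.

```python
# Figures quoted in the text; variable names are illustrative.
exact_count = 4008        # isomers enumerated by the deterministic generator
exact_time_s = 353.0      # CPU seconds for the deterministic run
estimated_count = 3399    # isomer count estimated by the stochastic generator
estimate_time_s = 16.5    # CPU seconds for the stochastic run

# Relative error of the stochastic estimate with respect to the exact count.
relative_error = abs(estimated_count - exact_count) / exact_count

# How many times faster the stochastic run was.
speedup = exact_time_s / estimate_time_s

print(f"relative error of the estimate: {relative_error:.1%}")
print(f"speedup of the stochastic run: {speedup:.1f}x")
```

With these numbers, the estimate is about 15% below the exact count and is obtained roughly 21 times faster.
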
To compute the energy distribution, the potential energy of each isomer was calculated using the molecular mechanics (MM) simulations provided by Polygraf 3.21 (Molecular Simulations Inc.). All energies were minimized with the DREIDING force field, using a conjugate gradient algorithm for 500 steps or until the root mean square between two successive conformations was lower than 0.1 (kcal/mol)/Å.

The potential energy distribution of the 4008 isomers is shown in the figure. This distribution was obtained in 114 285 s of CPU time on an SGI Personal Iris workstation. Most of that time was spent minimizing the potential energy (as mentioned above, generating all the isomers without minimization took only 353 s). On the same hardware, it took 12 223 s to generate a potential energy distribution from a random sample of 500 isomers, and 1 344 s from a random sample of 50 isomers.
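
The timings above imply a roughly constant cost per isomer, so the total time scales with the sample size. A minimal sketch of that arithmetic, using only the CPU times quoted in the text (the dictionary keys and variable names are illustrative):

```python
# (number of isomers, CPU seconds) for each run quoted in the text.
runs = {
    "all 4008 isomers": (4008, 114285.0),
    "sample of 500": (500, 12223.0),
    "sample of 50": (50, 1344.0),
}

full_time_s = runs["all 4008 isomers"][1]
generation_time_s = 353.0  # deterministic generation without minimization

per_isomer_s = {}
fraction_of_full = {}
for label, (count, cpu_s) in runs.items():
    per_isomer_s[label] = cpu_s / count             # average cost per isomer
    fraction_of_full[label] = cpu_s / full_time_s   # share of the full run
    print(f"{label}: {per_isomer_s[label]:.1f} s per isomer, "
          f"{fraction_of_full[label]:.1%} of the full run")

# Share of the full run spent generating structures rather than minimizing.
generation_share = generation_time_s / full_time_s
print(f"structure generation: {generation_share:.2%} of the full run")
```

Each run costs between about 24 s and 29 s per isomer, and structure generation accounts for well under 1% of the full run, which is consistent with minimization dominating the cost.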

Let f be the fraction of isomers of the total population whose potential energy falls in a given range of the figure. Let f500 be the same fraction obtained from the sample of 500 isomers, and let f50 be the fraction obtained from the sample of 50 isomers. It can be seen from figure 2 that, on average, |f500 - f| = 0.17 f and |f50 - f| = 0.35 f. Therefore, the sample of 50 isomers gives an acceptable approximation of the potential energy distribution.
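
The comparison between a sample and the full population can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes that "on average" means the mean of |f_sample - f| / f over the energy ranges that are populated in the full set, and it uses synthetic Gaussian energies in place of the real minimized energies. The function names `bin_fractions` and `mean_relative_deviation`, the bin edges, and the synthetic data are all hypothetical.

```python
import random
from bisect import bisect_right


def bin_fractions(energies, edges):
    """Fraction of energies in each half-open bin [edges[k], edges[k+1]).

    Energies outside the covered range are not counted in any bin but
    still contribute to the total used as the denominator.
    """
    counts = [0] * (len(edges) - 1)
    for energy in energies:
        k = bisect_right(edges, energy) - 1
        if 0 <= k < len(counts):
            counts[k] += 1
    return [c / len(energies) for c in counts]


def mean_relative_deviation(population, sample, edges):
    """Mean of |f_sample - f| / f over the bins populated in the full set."""
    f_pop = bin_fractions(population, edges)
    f_smp = bin_fractions(sample, edges)
    ratios = [abs(fs - fp) / fp for fp, fs in zip(f_pop, f_smp) if fp > 0]
    if not ratios:
        raise ValueError("no populated bins in the given energy range")
    return sum(ratios) / len(ratios)


# Synthetic stand-in for the 4008 minimized energies (assumption, kcal/mol).
rng = random.Random(0)
population = [rng.gauss(100.0, 25.0) for _ in range(4008)]
edges = [25.0 * k for k in range(9)]  # eight bins covering 0 to 200

for size in (500, 50):
    sample = rng.sample(population, size)
    deviation = mean_relative_deviation(population, sample, edges)
    print(f"sample of {size}: mean relative deviation = {deviation:.2f}")
```

The values printed for the synthetic data are not expected to reproduce 0.17 and 0.35; the sketch only shows how such averages could be computed once the energies are available.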

The same calculations were performed for all the hydrocarbons CnH(n+i), with n varying between 2 and 8 and i varying between 0 and 2. For each of these hydrocarbons, it was concluded that a good approximation of the potential energy distribution can be obtained by generating a sample that represents only a small fraction of the total population.