where $\rho_{1,2,3}$ are weights that control the trade-off among energy, forces and dipole moments. The predicted forces $F^{\mathrm{pred}}$ are constructed as negative analytical derivatives of the energies,

$$F^{\mathrm{pred}}(\alpha) = -\frac{\partial E^{\mathrm{pred}}}{\partial R(\alpha)}, \tag{6.1}$$

where $R$ are atomic coordinates, so that energy and forces are consistent and can be used for optimization or molecular dynamics. The loss weights were set empirically to account for the different scales of energy, forces and dipole moments. In our case we set them to 0.9, 0.3 and 0.1 for energy, forces and dipole moments, respectively. Additional details of the NN setup can be found in [116].
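The consistency between energy and forces expressed by Eq. (6.1) can be illustrated with a minimal sketch. The harmonic pair energy below is only a hypothetical stand-in for the NN energy, and the names `pair_energy`, `pair_forces` and `numerical_forces` are illustrative, not part of the actual setup. The analytical forces are compared with central finite differences of the energy.

```python
import math

def pair_energy(coords, k=1.0, r0=1.0):
    """Toy energy (assumption): harmonic springs between all atom pairs."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            energy += 0.5 * k * (r - r0) ** 2
    return energy

def pair_forces(coords, k=1.0, r0=1.0):
    """Analytical forces F = -dE/dR for the toy energy above."""
    forces = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            for a in range(3):
                # dE/dR_{i,a} = k (r - r0) (R_{i,a} - R_{j,a}) / r
                g = k * (r - r0) * (coords[i][a] - coords[j][a]) / r
                forces[i][a] -= g
                forces[j][a] += g
    return forces

def numerical_forces(coords, h=1e-5):
    """Central finite differences of the energy, used only as a check."""
    forces = []
    for i in range(len(coords)):
        row = []
        for a in range(3):
            plus = [list(p) for p in coords]
            minus = [list(p) for p in coords]
            plus[i][a] += h
            minus[i][a] -= h
            row.append(-(pair_energy(plus) - pair_energy(minus)) / (2 * h))
        forces.append(row)
    return forces

# Arbitrary three-atom test geometry (illustrative values only).
coords = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.0, 0.9, 0.3]]
analytic = pair_forces(coords)
numeric = numerical_forces(coords)
max_dev = max(abs(analytic[i][a] - numeric[i][a])
              for i in range(3) for a in range(3))
print("largest analytic vs. numerical deviation:", max_dev)
```

Because the forces are exact derivatives of the energy, the two agree up to the finite-difference error, and the net force on the system vanishes. In an actual NN potential the derivative would typically be obtained by automatic differentiation rather than by hand.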
Atomic forces (the negative gradients of the energy) were also part of the training. One would expect structures with s = 0 to have zero forces, because they correspond to very stable structures. However, numerical errors in the computations shift the values away from zero.
[Figure 6.5. Axes: $\tilde{F}^{(\mathrm{NN-DFT})}/n$ [Ha/Bohr] versus number of C atoms, $n$.]
FIGURE 6.5: Variation of the difference between predicted and DFT forces per atom as a function of the number of carbon atoms for the unperturbed s = 0 dataset. $\tilde{F}^{(\mathrm{NN-DFT})} = \sqrt{\sum_{i=1}^{n} \sum_{\alpha=x,y,z} \left(F^{\mathrm{NN}}_{i,\alpha} - F^{\mathrm{DFT}}_{i,\alpha}\right)^2 / 3n}$
Nevertheless, the trained NN reproduces the atomic forces with high accuracy (see Fig. 6.5). The largest deviation in the set is 0.002 Ha/Bohr and the smallest is below 0.0002 Ha/Bohr.
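The deviation measure given in the caption of Fig. 6.5 can be written as a short function. This is a minimal sketch: the function name `force_rmsd` and the two-atom force arrays are illustrative assumptions, not data from this work.

```python
import math

def force_rmsd(f_nn, f_dft):
    """Root mean squared deviation over all 3n Cartesian force components."""
    n = len(f_nn)
    squared = sum((f_nn[i][a] - f_dft[i][a]) ** 2
                  for i in range(n) for a in range(3))
    return math.sqrt(squared / (3 * n))

# Made-up forces in Ha/Bohr for a two-atom system (illustration only).
f_dft = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
f_nn = [[0.001, 0.0, 0.0], [0.0, -0.001, 0.0]]
value = force_rmsd(f_nn, f_dft)
print("force RMSD:", value)
```

The sum runs over all atoms and all three Cartesian components, so the normalization by 3n makes the measure comparable between fullerenes of different size.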
Apart from energies and atomic forces, the dipole moments were also part of the training, as can be seen from Fig. 6.6, where the Cartesian components of $p^{\mathrm{NN}} - p^{\mathrm{DFT}}$ are shown. The average error is about 0.015 D.
[Figure 6.6. Axes: $\tilde{p}^{(\mathrm{NN-DFT})}/n$ [Debye] versus number of C atoms, $n$. Legend: x, y, z.]
FIGURE 6.6: The difference between predicted and DFT dipole moment per atom for the unperturbed s = 0 dataset. $\tilde{p}^{(\mathrm{NN-DFT})} = \sqrt{\left(p_m^{\mathrm{NN}} - p_m^{\mathrm{DFT}}\right)^2}$
In summary, the essentials of the unperturbed s = 0 structures were learned by the neural network. For a complete picture of the NN training, the statistics for the perturbed structures should be mentioned as well. The losses in energy $E_\ell$, force $F_\ell$ and dipole moment $p_\ell$ for the structures with s ≠ 0 are presented in Table 6.1.
As expected, the losses increase with increasing perturbation s. The underlying idea holds: we guided the NN training to capture the essentials of the unperturbed s = 0 systems together with some features of the perturbed ones. The next step is to verify a much finer feature.
TABLE 6.1: Loss statistics for perturbed fullerenes.

s       E_ℓ [Ha]   F_ℓ [Ha/Bohr]   p_ℓ [Debye]
0.005   0.0009     0.0046          0.025
0.01    0.0014     0.066           0.077
0.015   0.01       0.668           0.167
0.025   0.38       9.22            0.61
6.4.2 Energy minimization
After training, we used the NN as the kernel of a minimization procedure. In this way, we verify from another perspective whether the MLP is accurate enough.
Fig. 6.7a shows the difference between the minimized and DFT energies as a function of the number of carbon atoms for the unperturbed s = 0 dataset. The difference decreases as the number of carbon atoms in the system increases. However, relative to the original value, the energy difference becomes positive in some cases, which means that after minimization the DFT energy is still lower than the NN energy prediction.
[Figure 6.7. Panel (a): $\tilde{E}^{(\mathrm{NNmin-DFT})}/n$ [Ha] versus number of C atoms, $n$. Panel (b): position/scale RMSD [Bohr] versus number of C atoms, $n$.]
FIGURE 6.7: (a) The energy variation between minimized (NN) and original (DFT) configurations and (b) position/scale root mean squared deviation (RMSD) as a function of the number of carbon atoms for the unperturbed s = 0 dataset. $\tilde{E}^{(\mathrm{NNmin-DFT})} = \left(E^{\mathrm{NNmin}} - E^{\mathrm{DFT}}\right)$, where $E^{\mathrm{NNmin}}$ is the energy value minimized by the NN.
Moreover, Fig. 6.7b shows how much the minimization procedure changed the systems. The root mean squared deviation (RMSD) ranges up to 0.04 Bohr. Nevertheless, for C60, C70 and C80, for example, it is very small. This confirms that the changes for the most important fullerenes are small and that the neural network is capable of identifying the minimum of the system.
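The idea of minimizing a structure and then measuring how far it moved can be sketched as follows. Several things here are assumptions made only for illustration: the harmonic pair energy stands in for the NN, plain steepest descent with a fixed step stands in for the unspecified minimization procedure, and the RMSD is taken directly over atomic positions without any alignment or scaling.

```python
import math

def energy(coords, k=1.0, r0=1.0):
    """Toy energy (assumption): harmonic springs between all atom pairs."""
    return sum(0.5 * k * (math.dist(coords[i], coords[j]) - r0) ** 2
               for i in range(len(coords))
               for j in range(i + 1, len(coords)))

def forces(coords, k=1.0, r0=1.0):
    """Analytical forces, i.e. the negative gradient of the toy energy."""
    f = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            for a in range(3):
                g = k * (r - r0) * (coords[i][a] - coords[j][a]) / r
                f[i][a] -= g
                f[j][a] += g
    return f

def minimize(coords, step=0.1, tol=1e-8, max_steps=10000):
    """Steepest descent: move atoms along the forces until they vanish."""
    x = [list(p) for p in coords]
    for _ in range(max_steps):
        f = forces(x)
        if max(abs(c) for row in f for c in row) < tol:
            break
        for i in range(len(x)):
            for a in range(3):
                x[i][a] += step * f[i][a]
    return x

def position_rmsd(a, b):
    """RMSD of atomic positions between two geometries (no alignment)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

# Arbitrary distorted three-atom start geometry (illustrative values only).
start = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.0, 0.9, 0.3]]
relaxed = minimize(start)
e_start, e_min = energy(start), energy(relaxed)
rmsd = position_rmsd(start, relaxed)
print("energy before/after:", e_start, e_min)
print("position RMSD:", rmsd)
```

A small RMSD after minimization indicates that the starting geometry was already close to a minimum of the model energy, which is the criterion applied to the fullerenes above.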
6.4.3 Comparison with GAP-20
Lastly, we compare our neural network with GAP-20 [118], perhaps the best MLP for carbon so far.
GAP-20 shows good results for many carbon structures, including fullerenes.
Because the two methods use different reference states, we compare the isomerization energies defined above for both methods in Fig. 6.8a.
Both methods approach the DFT results as the number of carbon atoms n increases, where the fullerenes become more similar to one another. Our NN, which is specialized to fullerenes and trained on this particular DFT (although not on the test samples), seems to be slightly more accurate. A similar trend can be seen in the comparison of forces shown in Fig. 6.8b, where our NN is more accurate because the forces were included in the model.
[Figure 6.8. Panel (a): $\tilde{E}^{(X-\mathrm{DFT})}/n$ [Ha] versus number of C atoms, $n$. Panel (b): $\tilde{F}^{(X-\mathrm{DFT})}/n$ [Ha/Bohr] versus number of C atoms, $n$. Legend: GAP-20, NN.]
FIGURE 6.8: A comparison of energy (a) and force (b) predictions between GAP-20 (empty box) and our NN (solid box) for the same DFT configurations. The reference point (Ih-C60) was the same configuration for both, but its energy was computed separately by each method.
[Figure 6.9. Grid of fullerene renderings; panel labels run from X=1, Y=2 to X=7, Y=8.]
FIGURE 6.9: Fullerenes overview. X and Y define the number of C atoms in the system. IPR fullerenes are highlighted by the color frame.