1. Getting Started

This package is a re-implementation of the xc-diff network architecture, originally implemented in PyTorch as a stand-alone SCF framework for Kohn-Sham density functional theory.

The architecture has been modified and translated to JAX-compliant methods and objects, based on the equinox library.

This package is designed to work well with the pyscfad package, an end-to-end autodifferentiable version of pyscf written with JAX functions.

2. Example

The following example works through the directories located in xcquinox/examples, going from generating the exchange and correlation networks to optimizing them and using them in PySCF-AD calculations.

This directory and the sub-directories within will walk you through the process of taking randomly initialized exchange/correlation networks (MGGA and non-local ones) through the pre-training and optimization processes, and then using these networks to do a calculation in PySCF-AD.

In the following examples, some Python scripts have a variable XCQUINOX_EXAMPLE_DIR that should be set to point to the xcquinox/examples directory that comes with the package.

# 00.generation

Here, generate.py calls the xcquinox.net.make_net utility function to generate eight networks in total: for each of exchange and correlation, two MGGA-level and two non-local networks are created, each with a different random seed so that the initial network weights differ.

DEPTH and NHIDDEN (the number of nodes in a layer) are set to 3 and 16, respectively, but can easily be changed for hyperparameter exploration.

Many options to xcquinox.net.make_net are available; any option that is not specified takes its default value. You’re encouraged to take a look at the documentation for this function to learn what options exist and what the default networks will do.

After running generate.py, two directories (mgga/ and nl/) will appear in 00.generation, each containing two exchange and two correlation networks. These will be used in the pre-training process.
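As a rough illustration of how the eight networks break down, the sketch below enumerates the level, kind, and seed combinations described above. It does not call xcquinox.net.make_net; the seed values, the dictionary keys, and the directory naming are assumptions made for this example, not what generate.py actually uses.

```python
from itertools import product

# Hyperparameters quoted in the text; everything else below is hypothetical.
DEPTH = 3
NHIDDEN = 16

LEVELS = ("mgga", "nl")  # output directories named in the text
KINDS = ("x", "c")       # exchange and correlation
SEEDS = (0, 1)           # placeholder seeds, not the values used by generate.py


def plan_networks():
    """Return one settings dict per network that would be generated."""
    plan = []
    for level, kind, seed in product(LEVELS, KINDS, SEEDS):
        plan.append({
            "level": level,
            "kind": kind,
            "seed": seed,
            "depth": DEPTH,
            "nhidden": NHIDDEN,
            # Hypothetical directory naming, for illustration only.
            "savepath": f"{level}/{kind}_{DEPTH}_{NHIDDEN}_s{seed}",
        })
    return plan


if __name__ == "__main__":
    for entry in plan_networks():
        print(entry["savepath"])
```

Changing DEPTH or NHIDDEN in one place and regenerating the plan is one simple way to organize a hyperparameter exploration.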

# 01.pretraining

Here, we have re-created the same mgga/ and nl/ directory structure in which to pre-train the generated networks to fit SCAN exchange/correlation energy densities, via the invocation of the xcquinox/scripts/pretrain_exc.py script. We have proactively copied the network.config* files from the 00.generation directories into the corresponding 01.pretraining directories, as the utility functions use these config files to generate the correct network structures during the next steps.

Running bash run_script.sh in any of these directories will pre-train the generated networks to fit to the exchange or correlation energy densities generated by the SCAN functional, for a small subset of atoms/molecules from the G2/97 dataset.

These networks are optimized separately as a conditioning step before using the networks in the optimization process.
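Conceptually, pre-training is a regression of the network output onto reference energy densities. The toy sketch below shows that idea only and is not how pretrain_exc.py is implemented: it swaps in the closed-form LDA exchange energy density as the reference (instead of SCAN), and a one-parameter model stands in for the neural network. All function names here are hypothetical.

```python
import math

# Closed-form LDA exchange coefficient: e_x(rho) = C_X * rho**(4/3).
C_X = -0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def reference_density(rho):
    """Reference exchange energy density (LDA used as a stand-in for SCAN)."""
    return C_X * rho ** (4.0 / 3.0)


def model_density(theta, rho):
    """One-parameter model standing in for a neural network."""
    return theta * rho ** (4.0 / 3.0)


def pretrain(rhos, theta=0.0, lr=0.5, steps=200):
    """Minimise the mean squared error to the reference by gradient descent."""
    for _ in range(steps):
        grad = 0.0
        for rho in rhos:
            residual = model_density(theta, rho) - reference_density(rho)
            # d(residual**2)/d(theta) = 2 * residual * rho**(4/3)
            grad += 2.0 * residual * rho ** (4.0 / 3.0)
        theta -= lr * grad / len(rhos)
    return theta


if __name__ == "__main__":
    grid = [0.1 * i for i in range(1, 11)]  # sample densities in atomic units
    print(pretrain(grid), C_X)
```

In the real workflow the model is an exchange or correlation network and the reference values come from SCAN evaluated on the chosen atoms and molecules, but the loss has the same regression character.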

# 02.optimization

First, we run makexcs.py, which calls xcquinox.xc.make_xcfunc to generate the xcquinox.xc.eXC object that combines the separate, pre-trained exchange and correlation networks. This creates four XC functionals, two in each of the generated mgga and nl directories.

Then, we descend into the train directory, where a few scripts already exist – namely

  • supervisor.sh, a short bash script that starts the sv_train_traj.py supervisor script to monitor the optimization process. sv_train_traj.py is useful because memory usage during optimization might max out, or come close enough to maxing out that the training relies on swap space or otherwise sub-optimal data loading. To avoid this, sv_train_traj.py watches the system’s memory utilization; once it exceeds a given percentage, the script kills the current training job and restarts it in the next run*/ directory. Importantly, this script looks for xc_config_dir to copy the cnetwork.config and xnetwork.config files into each run*/ directory, so that the network can be reconstructed with the correct architecture in each cycle.

  • run_script_local.sh, the initial start script for the training cycle. In it, we point to a training set located at xcquinox/scripts/script_data/training_subsets/01/subat_ref.traj, an ASE trajectory containing the ozone molecule as its first entry and the oxygen atom as its second (thus --singles_start 1, telling the script that the atoms used for the atomization energies are located from this index onward in the ASE trajectory). We point the script to the pre-trained XC network with which we wish to start, and tell it that we’re training an MGGA network with xc_xc_level MGGA. To maximize the number of epochs per cycle, we set mf_grid_level 1 to keep the calculation arrays relatively low-cost, but for production training this will likely need to be changed.

  • run_script_restart_local.sh, which is the same script as run_script_local.sh but with xc_xc_net_path changed to point to CHECKPOINTPATH, which sv_train_traj.py will sed replace to point to the copied checkpoint from the previous cycle.
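The supervisor logic described in the list above can be sketched in a few small functions. This is a minimal illustration under stated assumptions, not the code in sv_train_traj.py: the function names, the run0/run1 directory numbering, and the template line are hypothetical, and the actual memory reading and job handling are left out.

```python
import re


def should_restart(memory_percent, threshold_percent):
    """True once memory use exceeds the allowed percentage."""
    return memory_percent > threshold_percent


def next_run_dir(existing_dirs):
    """Pick the next run directory name, e.g. run0, run1 -> run2."""
    indices = [int(m.group(1)) for d in existing_dirs
               if (m := re.fullmatch(r"run(\d+)", d))]
    return f"run{max(indices) + 1 if indices else 0}"


def fill_checkpoint(template_text, checkpoint_path):
    """Replace the CHECKPOINTPATH placeholder, as a sed substitution would."""
    return template_text.replace("CHECKPOINTPATH", checkpoint_path)


if __name__ == "__main__":
    # Hypothetical template line and checkpoint location, for illustration.
    template = "xc_xc_net_path CHECKPOINTPATH"
    if should_restart(memory_percent=91.0, threshold_percent=85.0):
        run_dir = next_run_dir(["run0", "run1"])
        print(run_dir, fill_checkpoint(template, f"{run_dir}/checkpoint"))
```

In a live supervisor the memory percentage could come from something like psutil.virtual_memory().percent, polled at a fixed interval while the training job runs.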

Try running bash supervisor.sh and allow the training cycle to continue for a bit. You can edit the various parameters to use a different pre-trained network, such as one of the non-local networks we pre-trained.
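To make the singles_start convention used in run_script_local.sh more concrete, the sketch below splits a trajectory-like list at that index and forms an atomization energy as the sum of atomic energies minus the molecular energy. The helper names are hypothetical, the energies are arbitrary placeholder numbers rather than computed values, and the sign convention is the common one, assumed here rather than taken from the training script.

```python
def split_trajectory(entries, singles_start):
    """Split entries into molecules and the single atoms that follow them."""
    return entries[:singles_start], entries[singles_start:]


def atomization_energy(molecule_energy, atom_energies, atom_counts):
    """Sum of constituent atom energies minus the molecular energy."""
    total_atoms = sum(atom_energies[sym] * n for sym, n in atom_counts.items())
    return total_atoms - molecule_energy


if __name__ == "__main__":
    # Same ordering as the example trajectory: ozone first, then the O atom.
    molecules, atoms = split_trajectory(["O3", "O"], singles_start=1)
    print(molecules, atoms)

    # Placeholder energies, chosen only to exercise the arithmetic.
    print(atomization_energy(-3.5, {"O": -1.0}, {"O": 3}))
```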

# 03.calculation

This directory similarly contains three separate scripts: supervisor.sh, which calls sv_calculate_traj.py, a supervisor script that functions similarly to sv_train_traj.py; run_script.sh, the initial script that starts the calculations; and run_script_restart.sh, the script used for restarted calculations should memory usage exceed the allowed amount. The restart script contains the string INSERTINDEXHERE, which sv_calculate_traj.py replaces with the correct index when a calculation is restarted.

These scripts work in concert to use the specified XC functional as the driver behind a PySCF-AD calculation, carried out with the calculate_traj.py script. In this script, since networks are specified to be loaded, the PySCF-AD kernel driver is overridden with one that uses the network as the SCF calculation driver.
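The restart mechanism can be pictured as a placeholder substitution followed by resuming the loop over the trajectory. The sketch below assumes that the "correct index" is the first entry that has not yet been calculated; the function names and the template line are hypothetical and are not taken from sv_calculate_traj.py.

```python
def fill_restart_index(template_text, start_index):
    """Replace the INSERTINDEXHERE placeholder with a concrete index."""
    return template_text.replace("INSERTINDEXHERE", str(start_index))


def remaining_entries(entries, start_index):
    """Return (index, entry) pairs still to be calculated after a restart."""
    return list(enumerate(entries))[start_index:]


if __name__ == "__main__":
    # Hypothetical option name in the template line, for illustration only.
    template = "start_index INSERTINDEXHERE"
    finished = 2  # assume two entries completed before the job was stopped
    print(fill_restart_index(template, finished))
    print(remaining_entries(["A", "B", "C", "D"], finished))
```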

Feel free to play around with the networks used – you are encouraged to use the optimized network generated in the example and compare with the pre-trained networks used to initiate optimization. As it stands, the network used by default in this portion is set to be one of the MGGA networks produced by 02.optimization/makexcs.py.