Building Atomic Graphs
Because it uses the Atomic Simulation Environment (ASE) via Julia's PyCall, ChemistryFeaturization can build graphs from any file type that the ase.io.read() function can read into an Atoms object. The full list of supported formats is in the ASE documentation, or can be printed with ase info --formats on the command line.
The primary function that builds the graphs is (no surprise) the build_graph function. Its only required argument is the path to the structure file, but it has a few configurable options as well. The main one is the method used to construct the edge weights of the graph. The "cutoff" method corresponds to the one used in the original cgcnn.py package: consider all neighbors within some cutoff radius (default 8 Å), and keep at most some maximum number of them (default 12). In this implementation the number cutoff is "soft": if additional neighbors lie at exactly the same distance as the last neighbor kept, they are added as well, even if the total then exceeds 12. The last option for this method is the function used to set the edge weights from the separation distance. Currently the two options are inverse_square and exp_decay.
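The cutoff logic described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the package's Julia implementation: the function and parameter names (select_neighbors, cutoff_radius, max_num_nbr, tol) are hypothetical, and the functional forms assumed for inverse_square (1/r^2) and exp_decay (exp(-r)) are guesses based on the option names.

```python
import math


def inverse_square(r):
    # Assumed form: weight falls off as 1 / r^2.
    return 1.0 / (r * r)


def exp_decay(r):
    # Assumed form: weight falls off as exp(-r).
    return math.exp(-r)


def select_neighbors(distances, cutoff_radius=8.0, max_num_nbr=12, tol=1e-8):
    """Return indices of the neighbors kept for one central atom.

    distances: distances (in Å) from the central atom to each candidate.
    Neighbors beyond cutoff_radius are dropped. At most max_num_nbr are
    kept, except that ties with the last kept neighbor are also included
    (the "soft" number cutoff).
    """
    within = sorted((d, i) for i, d in enumerate(distances) if d <= cutoff_radius)
    if len(within) <= max_num_nbr:
        return [i for _, i in within]
    # Distance of the last neighbor a hard count limit would allow.
    d_last = within[max_num_nbr - 1][0]
    # Keep everything up to and including that distance, ties included.
    return [i for d, i in within if d <= d_last + tol]


def edge_weights(distances, weight_fn=inverse_square, **kwargs):
    """Map each kept neighbor index to its (unnormalized) edge weight."""
    kept = select_neighbors(distances, **kwargs)
    return {i: weight_fn(distances[i]) for i in kept}


# Three neighbors tie at 2.0 Å, so all are kept although max_num_nbr is 2.
print(select_neighbors([1.0, 2.0, 2.0, 2.0, 3.0, 9.0], max_num_nbr=2))  # [0, 1, 2, 3]
```

The tolerance tol is an addition for floating-point safety; the text above only speaks of neighbors at exactly the same distance.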
The other method is the Voronoi method (activated by setting use_voronoi=true), adopted from the Ulissi Group's fork of cgcnn.py. Note that this method can only be used on fully periodic structures! It constructs a Voronoi tessellation to generate the neighbor list, and sets the edge weights to the inverse of the polyhedron face areas.
The function also has a normalize option (defaults to true), which rescales all edge weights so that the maximum value is 1.0.
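As a small illustration of what this rescaling does, here is a minimal Python sketch that divides every entry of a weight matrix by the largest entry. The function name normalize_weights and the list-of-lists representation are assumptions made for the example; the package itself may store and rescale weights differently.

```python
def normalize_weights(weights):
    """Rescale a weight matrix (list of lists) so its largest entry is 1.0."""
    max_w = max(max(row) for row in weights)
    if max_w == 0:
        # No edges at all: return an unchanged copy rather than divide by zero.
        return [row[:] for row in weights]
    return [[w / max_w for w in row] for row in weights]


# Largest weight is 4.0, so every entry is divided by 4.0.
print(normalize_weights([[0.0, 2.0, 4.0], [2.0, 0.0, 1.0], [4.0, 1.0, 0.0]]))
# [[0.0, 0.5, 1.0], [0.5, 0.0, 0.25], [1.0, 0.25, 0.0]]
```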
The function build_graphs_batch processes sets of structure files with the same options outlined above. Its required inputs are simply the path to a folder containing structure files and the path to a folder in which to serialize the built graphs (as .jls files). Optionally, you can pass a featurization scheme (a vector of AtomFeat objects) and a set of feature vectors (a dictionary from atomic symbols to prebuilt vectors; it is built automatically from the featurization scheme if not provided) if you'd like to featurize the graphs as well. It also takes all the same keyword arguments as the build_graph function.
There is a corresponding function, read_graphs_batch, which takes a path to a folder of serialized graphs and reads them into an array of AtomGraph objects.
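The build-then-read round trip follows a common pattern: walk a folder, build one object per file, serialize each to an output folder, and later load them all back. The Python sketch below shows that pattern only. It does not call ChemistryFeaturization: build_graph_stub is a hypothetical stand-in for the real graph builder, pickle stands in for Julia's .jls serialization, and all names ending in _sketch are invented for this example.

```python
import pickle
from pathlib import Path


def build_graph_stub(structure_path):
    # Hypothetical stand-in for a real graph builder: records only the
    # file name and size so the round trip can be demonstrated.
    return {"source": structure_path.name, "num_bytes": structure_path.stat().st_size}


def build_graphs_batch_sketch(input_folder, output_folder, build_fn=build_graph_stub):
    """Build one object per file in input_folder and serialize it to output_folder."""
    input_folder, output_folder = Path(input_folder), Path(output_folder)
    output_folder.mkdir(parents=True, exist_ok=True)
    written = []
    for structure_path in sorted(input_folder.iterdir()):
        if not structure_path.is_file():
            continue
        graph = build_fn(structure_path)
        out_path = output_folder / (structure_path.stem + ".pkl")
        with open(out_path, "wb") as f:
            pickle.dump(graph, f)
        written.append(out_path)
    return written


def read_graphs_batch_sketch(folder):
    """Load every serialized object in folder into a list, sorted by file name."""
    graphs = []
    for path in sorted(Path(folder).glob("*.pkl")):
        with open(path, "rb") as f:
            graphs.append(pickle.load(f))
    return graphs
```

Serializing each graph to its own file, as the batch function described above does, means a later run can load the results without rebuilding anything from the structure files.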