Terminology

Rachel Kurchin edited this page Oct 14, 2020 · 2 revisions

Several similar-sounding terms in this package refer to quite different things. Here is my best attempt to clarify how I use them here:

  • Feature: A quality or quantity associated with an atom that we wish to encode, such as atomic mass, row in the periodic table, etc.
  • Feature vector: The (typically one-hot-style) encoding of the values of a set of features associated with a particular atom (or, in the case of Weave featurization, a pair of atoms, etc.). For example, if we were encoding the atomic mass (across five possible bins) and the periodic table block (s, p, d, or f) of hydrogen, the associated feature vector would be [1 0 0 0 0 1 0 0 0], with the first five slots corresponding to atomic mass and the last four to block.
  • Featurization: Either the process of assigning feature vectors to chemical elements, or a description of a scheme for doing so, encoded as a Vector of AtomFeat objects (eventually, we plan to implement BondFeat, PairFeat, etc.).
  • Feature matrix: The collection of feature vectors for the atoms in a structure, with one column per atom (node). Its shape should be (# features, # nodes).
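
To make the feature vector and feature matrix definitions concrete, here is a minimal sketch in plain Python. It is not this package's API: the helper names (`one_hot`, `mass_bin`, `feature_vector`, `feature_matrix`) and the bin edges are assumptions chosen only for illustration, and the package itself describes featurizations with AtomFeat objects instead.

```python
# Illustrative sketch only: bin edges and helper names are hypothetical,
# not taken from the package.
MASS_BIN_EDGES = [0.0, 50.0, 100.0, 150.0, 200.0, 300.0]  # five bins, atomic mass units
BLOCKS = ["s", "p", "d", "f"]


def one_hot(index, length):
    """Return a list of zeros of the given length with a 1 at `index`."""
    vec = [0] * length
    vec[index] = 1
    return vec


def mass_bin(mass):
    """Return the index of the half-open bin [edge_i, edge_i+1) containing `mass`."""
    for i in range(len(MASS_BIN_EDGES) - 1):
        if MASS_BIN_EDGES[i] <= mass < MASS_BIN_EDGES[i + 1]:
            return i
    raise ValueError(f"mass {mass} is outside the binned range")


def feature_vector(mass, block):
    """Concatenate the one-hot encodings of atomic mass (5 slots) and block (4 slots)."""
    n_mass_bins = len(MASS_BIN_EDGES) - 1
    return one_hot(mass_bin(mass), n_mass_bins) + one_hot(BLOCKS.index(block), len(BLOCKS))


def feature_matrix(atoms):
    """Stack per-atom feature vectors as columns: rows are slots, columns are atoms."""
    vectors = [feature_vector(mass, block) for mass, block in atoms]
    return [list(row) for row in zip(*vectors)]


hydrogen = (1.008, "s")
oxygen = (15.999, "p")

h_vec = feature_vector(*hydrogen)
water = feature_matrix([hydrogen, hydrogen, oxygen])

print(h_vec)                      # [1, 0, 0, 0, 0, 1, 0, 0, 0]
print(len(water), len(water[0]))  # 9 3
```

With these assumed bins, hydrogen reproduces the vector from the example above, and a water molecule (two hydrogens and one oxygen) gives a matrix with 9 rows and 3 columns, one column per atom.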