Releases · project-gemmi/gemmi

30 Nov 20:02

wojdyr

v0.7.0

32ed576

0.7.0 Latest


C++14 (or later) is required to build the library, C++17 (or later) to build Python bindings.
Expect breaking changes, especially in Python bindings.
The lists below are not complete, but should cover most of the changes.

Library

Added unified logging of warnings/errors from various gemmi functions (class Logger)
replaced string Model::name with int Model::num
mmcif: better handling of null auth_comp_id
fixes for mmJSON
Removed deprecated functions:
- UnitCell.fractionalization_matrix and orthogonalization_matrix – use frac.mat and orth.mat
- count_hydrogen_sites() – use has_hydrogen() or count_atom_sites(gemmi.Selection('[H,D]'))
- Grid::resample_to() – use interpolate_grid()
unified API of Grid interpolation functions. They now take a parameter, order, which can be 0 (nearest value), 1 (linear interpolation), or 3 (cubic). In C++ there are also functions such as trilinear_interpolation() to ensure no overhead.
to_pdb: write HET records
Extended selection syntax with: [metals] and [nonmetals].
Added function set_is_metal() intended for debatable metalloids
improved interoperability with MMDB (a CCP4 library)
MonLib: removed read_cif args
mtz: fixed writing BATCH records
hydrogen placement: fixes needed for new files with metals in CCP4 Monomer Library
pdb: fixed reading TLS S tensor
Structure metadata: expanded RefinementInfo
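
The meaning of the order parameter of the Grid interpolation functions (0, 1 or 3) can be illustrated without gemmi. The sketch below works on a one-dimensional periodic grid in plain Python. The function name interpolate_1d is hypothetical, and the cubic branch uses a Catmull-Rom spline as a stand-in; the exact cubic scheme used by gemmi is not stated in these notes.

```python
import math

def interpolate_1d(values, x, order=1):
    """Interpolate a periodic 1D grid at fractional index x.

    order=0: nearest value, order=1: linear, order=3: cubic
    (Catmull-Rom here, which is an assumption of this sketch).
    """
    n = len(values)
    if order == 0:
        # round to the nearest grid point, wrapping around the period
        return values[math.floor(x + 0.5) % n]
    i = math.floor(x)
    t = x - i  # position between grid points i and i+1, in [0, 1)
    if order == 1:
        return (1 - t) * values[i % n] + t * values[(i + 1) % n]
    if order == 3:
        p0, p1, p2, p3 = (values[(i + k) % n] for k in (-1, 0, 1, 2))
        return p1 + 0.5 * t * (
            (p2 - p0)
            + t * ((2 * p0 - 5 * p1 + 4 * p2 - p3)
                   + t * (3 * (p1 - p2) + p3 - p0)))
    raise ValueError("order must be 0, 1 or 3")

grid = [0.0, 1.0, 4.0, 9.0, 16.0, 9.0, 4.0, 1.0]
for order in (0, 1, 3):
    print(order, interpolate_1d(grid, 2.4, order))
```

In three dimensions the same idea is applied along each axis, which is where names such as trilinear come from.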

Python

Python bindings migrated from pybind11 to nanobind.
- Much lower runtime overhead, faster build times, better error diagnostics.
- Built-in typing stubs.
- Only Python 3.8+.
- Sadly, no support for Buffer Protocol. It was replaced with NumPy __array__ methods.
  For NumPy, you can also use .array properties, which were also available in previous releases.
- No implicit conversions from list to ndarray, and from bytes to string (let me know where it causes problems)
- gemmi.ValueSigmaAsuData.value_array now has shape (N,2)
Added pickling support for Structure, Model, Chain, Residue, Atom, cif.Document, cif.Block.
Added function interpolate_position_array (#323).
Python extension module is now installed into site-packages/gemmi/ (this change should be invisible to the user)

Program

gemmi convert --sifts-num is now more customizable
gemmi sf2map: added option --check (see docs)
gemmi cif2mtz: add a rule to spec to convert pdbx_F_calc_with_solvent to F-model (+phase)
gemmi xds2mtz: handles merged files from XSCALE
gemmi mtz2cif and merge: recognize extension .ahkl as XDS file


06 Sep 20:56

wojdyr

v0.6.7

0da57ac

0.6.7

This is primarily a bug-fix release. New Python bindings are not included yet.

Enhancements:

New subcommand gemmi set for changing coordinates, B-factors and occupancies in coordinate files (mmCIF and PDB). Unlike other tools, it replaces numbers while leaving the rest of the file intact. An alternative to CCP4 PDBSET keywords: BFACTOR, OCCUPANCY, SHIFT, NOISE. Note that gemmi convert offers overlapping capabilities. For instance, gemmi convert --apply-symop=x+0.123,y,z shifts the coordinates similarly to gemmi set --shift='9.3 0 0' (the latter takes the shift in Angstroms).
Improved anisotropic scaling of structure factors. More work is planned in this area.
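
The relation between a fractional shift (as in --apply-symop=x+0.123,y,z) and a shift in Angstroms (as in --shift='9.3 0 0') depends on the unit cell. A minimal sketch of the conversion in plain Python follows. It assumes the usual PDB orthogonalization convention (a along x), and the cell parameters are hypothetical values chosen so that the numbers match the example above.

```python
import math

def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Fractional-to-Cartesian matrix, PDB convention (a along x)."""
    ca, cb, cg = (math.cos(math.radians(v)) for v in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # unit-cell volume divided by a*b*c
    vf = math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, c * vf / sg],
    ]

def fractional_to_angstrom(frac, cell):
    """Convert a fractional shift vector to a Cartesian one (in Angstroms)."""
    m = orthogonalization_matrix(*cell)
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))

# hypothetical monoclinic cell with a = 75.6 A
cell = (75.6, 80.0, 90.0, 90.0, 100.0, 90.0)
shift = fractional_to_angstrom((0.123, 0.0, 0.0), cell)
print(shift)  # about 9.3 A along x, zero along y and z
```

With this convention a shift along the fractional x axis is always parallel to Cartesian x, so only the cell length a matters in this particular case.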

Fixes:

fixed reading of mmCIF files without _atom_site.auth_seq_id
in Topology preparation: fixed a couple of bugs, peptide links are now assumed to be CIS for ω=0±60° (previously, ω=0±30°)
fixed re-assignment of ATOM/HETATM record types (gemmi convert --assign-records)
fixed gemmi convert --sifts-num for UniProt sequence numbers >5000

And various minor changes that are hard to describe concisely.
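
The changed CIS criterion (ω=0±60° instead of ω=0±30°) can be written as a small test on the torsion angle. This is only an illustrative sketch: the function name is hypothetical, and whether the boundary values are inclusive in gemmi is an assumption.

```python
def is_cis_peptide(omega_deg, tolerance=60.0):
    """Return True if omega is within `tolerance` degrees of 0.

    The angle is first wrapped to the range [-180, 180).
    Treating the boundary as inclusive is an assumption of this sketch.
    """
    wrapped = (omega_deg + 180.0) % 360.0 - 180.0
    return abs(wrapped) <= tolerance

# omega = 45 degrees is CIS under the new rule, but was not under the old one
print(is_cis_peptide(45.0), is_cis_peptide(45.0, tolerance=30.0))
```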


28 May 19:22

wojdyr

v0.6.6

0b23c06

0.6.6

Library:

SmallStructure: changed how the space group is read and accessed.
Relying on H-M space group names alone was not always sufficient. The new mechanism uses the list of operations and Hall symbol in preference to the H-M symbol – the order is configurable.
symmetry triplets: parse decimal fractions (small molecule files may use notation such as x+0.25 instead of x+1/4)
tabulated space groups: a few more settings: B 1 2 1, B 1 21 1, F 1 m 1, F 1 d 1, F 1 2 1
X-ray scattering coefficients: changed the default value of IT92::ignore_charge to true (i.e. charges are now ignored by default; before version 0.6.3 they were always ignored)
cif::Table: added method ensure_loop() that converts tag-value pairs into a loop; might be needed before calling append_row()
place_hydrogens(): fix for NH3-like configurations
improved gemmi->mmdb conversion
Grid: tweaked good_grid_size() to ensure that when creating a grid up to a certain d_min, all reflections up to d_min are in the grid (it matters when no oversampling is applied)
DensityCalculator: deprecated function set_grid_cell_and_spacegroup(), use grid.setup_from()
fixed TNT-compatible reciprocal space ASU calculation for non-standard settings
infer_polymer_end(): complicate the heuristic even more, to detect files that have HETATM incorrectly used for standard residues in a polymer (such files were reported, they are either a result of mutating from non-standard residues, or a buggy program)
added function assign_het_flags() to re-set ATOM/HETATM flags
Model: added functions calculate_b_iso_range() and calculate_b_aniso_range(); the first one can be used to detect if pLDDT is in the range 0-100 (like from AlphaFold) or 0-1 (like from ESMFold)
writing mmCIF: write _entity_poly_seq.hetero
added flag Entity::reflects_microhetero that shows if sequences were read from SEQRES (and don't account for point mutations) or from _entity_poly_seq; new function add_microhetero_to_sequences() changes the former to the latter
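
To illustrate the note on symmetry triplets with decimal fractions, here is a minimal parser in plain Python that treats x+0.25 and x+1/4 as the same operation. It is not gemmi's parser: the function names are hypothetical, only unit coefficients of x, y, z are handled (no 2x or x/2), and rounding the translation to a denominator of at most 24 is an assumption of this sketch.

```python
import re
from fractions import Fraction

# optional sign followed by a decimal, a ratio, an integer, or an axis letter
TOKEN = re.compile(r'([+-]?)(\d+\.\d+|\d+/\d+|\d+|[xyz])')

def parse_triplet_part(text):
    """Parse one component, e.g. '-y+1/2', into ((cx, cy, cz), shift)."""
    text = text.replace(' ', '').lower()
    coeffs = {'x': 0, 'y': 0, 'z': 0}
    shift = Fraction(0)
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError('cannot parse: ' + text)
        sign = -1 if m.group(1) == '-' else 1
        item = m.group(2)
        if item in coeffs:
            coeffs[item] += sign
        else:
            # decimals such as 0.3333 are snapped to a nearby simple fraction
            shift += sign * Fraction(item).limit_denominator(24)
        pos = m.end()
    return (coeffs['x'], coeffs['y'], coeffs['z']), shift

def parse_triplet(triplet):
    """Parse a full triplet such as 'x+0.25,-y+1/2,z'."""
    return [parse_triplet_part(part) for part in triplet.split(',')]

print(parse_triplet('x+0.25,-y+1/2,z'))
```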

Program:

gemmi sfcalc: added a few more options
gemmi convert: added options --assign-records[=A|H], improved --sifts-num, adding microheterogeneities to _entity_poly_seq when converting from PDB
gemmi cifdiff: added option -t for basic comparison of values for a single tag

Other:

minimal WebAssembly port (C++ code compiled with emscripten) of Structure, as a proof-of-concept and for reading mmCIF files in UglyMol
examples/to_rdkit.py: example of conversion of gemmi ChemComp to RDKit Mol

and a number of less important changes


17 Feb 13:40

wojdyr

v0.6.5

e471b13

0.6.5

Library:

gemmi can now be built with zlib-ng, a faster fork of zlib (good for working with large, compressed files)
experimental: binary serialization of Structure (contained objects, such as Model, Chain or UnitCell, can also be serialized separately)
finalized handling of 5-character monomer names; uses the tilde-hetnam extension (ABCDE ↔ ~DE) for PDB files
when atom names in the coordinate file match previous names (_chem_comp_atom.alt_atom_id) from the monomer library (the names in the CCD and therefore also in the ML change occasionally), print better diagnostic; added function MonLib::update_old_atom_names() to update the names in a Structure
topology: fixed handling of two bonds between the same two residues
options for handling mmCIF files with incorrect entities (modified add_entity_ids() when called with overwrite=true)
added function Intensities::prepare_merged_mtz()
a few bug fixes (for instance, in handling of negative residue numbers in the selection syntax)

Python bindings:

generating type stubs - see #293
python: cif.Loop.val() has been replaced with __getitem__/__setitem__
fixed Mtz.Batch.ints and Mtz.Batch.floats

Program

subcommand diff has been renamed to cifdiff
subcommand prep has been renamed to crd
validate: more options for checking monomer files
gemmi-grep: added option --extended-regexp
mtz2cif: added column names Iplus/Iminus (used by ccp4i2) to the default conversion spec

Note: this list is meant to show important changes only.


13 Dec 16:20

wojdyr

v0.6.4

84b5803

0.6.4

Library

completely changed build system for Python module, from setuptools to scikit-build-core
optimized electron density calculation: single-precision version is now about 2x faster and slightly less exact; some other grid-based calculations also got optimized in the process
as part of the above optimizations, some of the grid computations require that the model is in the standard orientation (conventional axis directions); in other cases (which are very rare after the remediation of non-standard coordinate frames in the PDB) call standardize_crystal_frame()
CIF output: more flexible formatting
mmCIF writing: category _entity_poly is included by default, with pdbx_strand_id and pdbx_seq_one_letter_code
minor changes in reading mmCIF coordinate files
cif: added functions Loop::add_columns(), Loop::remove_column(), Column::erase()
MRC map format: ORIGIN record is ignored (previously, if ORIGIN was non-zero, Ccp4::full_cell() returned false and some map properties were not set)
new function Grid::symmetrize_avg()
fixed bug in ReciprocalGrid::prepare_asu_data()
added function read_pir_or_fasta() for reading sequences (previously it was undocumented and more limited)
added function pdbx_one_letter_code() which returns a string like AA(MSE)H…, for _entity_poly.pdbx_seq_one_letter_code
new functions expand_one_letter() and expand_one_letter_sequence(), which take ResidueKind.AA/RNA/DNA as an argument, replace expand_protein_one_letter*()
adjusted weights in align_sequence_to_polymer()
added function assign_best_sequences()
PDB reading: added Structure::ter_status flag to indicate if TER records were absent, present, or clearly in wrong places
experimental (not documented yet) new functions: Model::get_cra(), Model::get_parent_of()
Topo::Bond stores a flag for bonds between different symmetry images
ChemComp::Atom: store _chem_comp_atom.alt_atom_id as old_id, use it in new function update_old_atom_names()
riding hydrogens: fixed wrong occupancy of added H in special, rare cases
added Vec3f – Vec3 with single-precision numbers
minor API changes: Binner::setup() doesn't return anything, changed argument types of Scaling::scale_data(), align_sequences()
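
The string format mentioned above for _entity_poly.pdbx_seq_one_letter_code, such as AA(MSE)H…, puts non-standard residue names in parentheses. A minimal plain-Python sketch of that convention for protein sequences follows. The function and table names are hypothetical, and this is not gemmi's pdbx_one_letter_code() implementation.

```python
# one-letter codes of the 20 standard amino acids
STANDARD_AA = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}

def one_letter_with_parentheses(residue_names):
    """Build a one-letter string; non-standard names go in parentheses."""
    parts = []
    for name in residue_names:
        code = STANDARD_AA.get(name)
        parts.append(code if code is not None else '(' + name + ')')
    return ''.join(parts)

print(one_letter_with_parentheses(['ALA', 'ALA', 'MSE', 'HIS']))
```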

Program

new tool gemmi-diff that compares categories and tags in two (mm)CIF files
gemmi-align prints vertical list with option --verbose
gemmi-residues has new options: -e, -sss, --chains
gemmi-rmsz: added option --missing to print missing atoms
gemmi-validate: more options for validating monomer files
gemmi-h: more options
gemmi-mtz: prints info about SYMM records


07 Sep 13:08

wojdyr

v0.6.3

28b5670

0.6.3

new: normalization of amplitudes using the so-called "Karle" approach, similar to that in the CCP4 program ECALC
added X-ray scattering coefficients for ions (previously, the charge of atom was ignored)
pdb: reading CONECT records, and an option to also write them
when reading pdb, if any chain has 2+ TER records, all TER records are ignored
more configuration options for writing pdb files
added functions Mtz::expand_to_p1() and Mtz::read_file_gz()
cif::Block::find_value(tag) now also returns the value from the corresponding loop if that loop has only one row
changes in gemmi-validate related to validation with DDL2
gemmi-sfcalc: added option --sigma-cutoff
gemmi sf2map --mapmask: if the unit cell in the coordinate file differs from the one in the SF file, use only the latter
improved transform_to_assembly(), expand_ncs() and rename_chain()
cif2mtz: Mtz column for pdbx_DELPHWT has now label PHDELWT (#272)
fixed ensure_asu(): phase-shift (for phases and H-L coefficients) was wrong
fixed UnitCell::find_nearest_image() for non-crystals with NCS
fixed DensityCalculator::requested_grid_spacing()
changes and enhancements in add_chemcomp_to_block(), in solvent masking, in mtz2cif, and in several other places
added python bindings to MtzToCif, cif::Ddl, PdbWriteOptions, changed how options for PDB writing are passed, more bindings for Mtz::Batch


25 May 17:22

wojdyr

v0.6.2

3f7b8d7

0.6.2

a number of fixes, mostly in topology preparation
support for extended (longer) CCD and PDB codes that are about to be introduced by the PDB
gemmi-convert: added option to rename a monomer
a few changes and additions in cif2mtz, including:
- anomalous data written as separate rows for F+ and F- is now converted as expected
- _refln.F_squared_meas is now a synonym for F_squared_meas
gemmi-grep: new option --only-tags
gemmi-validate: a couple of new checks and options
pdb and mmCIF: convert MODRES <-> _pdbx_struct_mod_residue
cif.Block: blocks with no name (just data_) used to have the name set to "#", now it's " "


19 Apr 12:39

wojdyr

v0.6.1

7639b1f

0.6.1

changed how CISPEP is stored: previously, it was assumed that a link between two residues is either TRANS or CIS; if the residues have atoms with alternative conformations, both link types can be present at the same time
riding hydrogens: previously, hydrogens had the same altlocs as the parent atom; now if the parent atom has a single conformation, but it has neighbors in multiple conformations, the hydrogens will be also added in multiple conformations
major changes in NearestNeighbor: now it's possible to search atoms not only in the first nearest cells, but in any number of nearest cells; find_nearest_atom() was changed to find, by default, a nearest atom within any distance
changes in Mtz.reindex(), primarily to fix determination of the new space group
gemmi-convert: added option --all-auth to write _atom_site.auth_atom_id and auth_comp_id, which are skipped by default (because they are always the same as label_…_id)
added more options to gemmi-rmsz, gemmi-xds2mtz, gemmi-cif2mtz
fix a recent regression in check_polymer_type(): RNA was returned instead of DNA
improved heuristic of detecting where the polymer ends (if TER record is missing)
selection syntax: fix parsing a single sequence id such as "A/208" (it was parsed as A/208-/)
removed SMat33::calculate_eigenvector() – use eigen_decomposition() instead; SoftwareItem::pdbx_ordinal; NeighborSearch::Mark::x,y,z (use ::pos)
more code was moved from headers to src/*.cpp


06 Mar 12:41

wojdyr

v0.6.0

73968c5

0.6.0

C++ library is no longer header-only; several functions were moved from headers to src/ to make compilation faster
major changes in cmake build, requiring now cmake 3.15+
improvements in calculating riding hydrogen positions
changed again the scheme of automatically assigned subchain names (A-p -> Axp, because PDB software can't handle non-alphanumeric characters there)
a function for calculating polarization correction for XDS INTEGRATE.HKL
improvements in xds2mtz, converting more data and filling more records in MTZ batch headers
added SpaceGroup::change_of_hand_op()
various bug fixes and small improvements


22 Dec 11:27

wojdyr

v0.5.8

1b45de2

0.5.8

gemmi program has new subcommand xds2mtz that converts from XDS_ASCII to multirecord MTZ
subcommand gemmi-residues has new option -s (--short) for shorter overview of model chains (can be used twice)
cif2mtz: more flexible spec for converting symbols to numbers
preparation of Refmac intermediate files – it can be used to substitute a part of Refmac
MonLib and Topo: a number of changes related to reading a monomer library and in prepare_topology()
changed the scheme of automatically assigned subchain names (Apoly -> A-p)
read_structure(): added optional arg save_doc that stores cif.Document if the read file is mmCIF or mmJSON
reading PDB files: more metadata is read by default
writing mmCIF files: _atom_site.group_PDB is written by default
support for mmCIF extension _atom_site.ccp4_deuterium_fraction
added function copy_from_mmdb(mmdb::Manager* manager) -> Structure
improved check_polymer_type()
Grid::set_size_from_spacing() with different rounding modes
rename src/ to prog/; in the next version the library won't be fully header-only, cpp files will go into src/
