
Releases: jjmccollum/teiphy

Support through Python 3.12, 62-state support for NEXUS outputs, and support for PHYLIP distance/similarity matrices

04 Feb 15:40
f989492

This release incorporates contributions from @catsmith to ensure compatibility with Python versions 3.9 through 3.12. (As a result of these changes, Python 3.8 is no longer supported.) To accommodate software like PAUP* and @edmondac's fork of MrBayes (https://github.com/edmondac/MrBayes), the symbol set for NEXUS outputs has been extended to 62 symbols (0-9, a-z, and A-Z). This release also adds support for the use of --table distance and --table similarity options (along with the --proportion and --show-ext flags) with outputs in PHYLIP (.phy and .ph) format to produce PHYLIP-formatted distance or similarity matrices.
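As a rough illustration of the extended symbol set, the sketch below builds the 62 symbols in the order listed above (0-9, then a-z, then A-Z) and maps a zero-based reading index to a symbol. The names `SYMBOLS` and `symbol_for` are hypothetical, and the ordering is an assumption based on the description; this is not teiphy's actual code.

```python
import string

# Assumed ordering: digits first, then lowercase, then uppercase (62 symbols).
SYMBOLS = string.digits + string.ascii_lowercase + string.ascii_uppercase


def symbol_for(reading_index: int) -> str:
    """Map a zero-based reading index to a single-character state symbol."""
    if not 0 <= reading_index < len(SYMBOLS):
        raise ValueError(
            f"reading index {reading_index} is outside the {len(SYMBOLS)}-symbol range"
        )
    return SYMBOLS[reading_index]
```

Under these assumptions, a variation unit with more than 62 readings would fall outside the symbol set and raise an error here.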

Support for similarity matrices and common variation unit counts in distance/similarity matrices

15 Jan 16:24
abfa61b

This release introduces the --table similarity option, which produces a tabular output with counts of pairwise agreements between witnesses (or, if the --proportion flag is specified, proportions of agreements among variation units where both witnesses have non-ambiguous readings). It also introduces the --show-ext flag, which appends to each cell's value the number of variation units where both witnesses have non-ambiguous readings (e.g., 47/50 or 0.94/50). The --show-ext flag can also be used with distance matrices specified with --table distance.
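The cell values described above can be sketched as follows. This is a minimal illustration, not teiphy's implementation: the function name `similarity_cell`, the use of `None` for a missing or ambiguous reading, the two-decimal formatting, and the handling of a zero denominator are all assumptions.

```python
from typing import Optional, Sequence


def similarity_cell(
    a: Sequence[Optional[str]],
    b: Sequence[Optional[str]],
    proportion: bool = False,
    show_ext: bool = False,
) -> str:
    """Format one similarity-matrix cell for two witnesses' reading sequences."""
    # Keep only variation units where both witnesses have a usable reading
    # (None is assumed to mark a missing or ambiguous reading).
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    extant = len(pairs)
    agreements = sum(1 for x, y in pairs if x == y)
    if proportion:
        # Assumption: report 0.00 when the witnesses share no usable units.
        value = agreements / extant if extant else 0.0
        text = f"{value:.2f}"
    else:
        text = str(agreements)
    # With show_ext, append the number of mutually extant units.
    return f"{text}/{extant}" if show_ext else text
```

For two witnesses that agree at 47 of 50 mutually extant units, this yields `47/50` with `show_ext=True` and `0.94/50` when `proportion=True` is also set, matching the examples above.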

Support for exclusion of fragmentary witnesses

12 Jan 16:57
dbdaaa5

With this release, you can exclude fragmentary witnesses from your collation by specifying the --fragmentary-threshold command-line option, followed by a number between 0 and 1 indicating the proportion of variation units at which a witness must be extant (i.e., have a non-missing reading according to the reading type(s) specified with the -m option) to be included in the output. Thus, --fragmentary-threshold 0.7 will exclude all witnesses with more than 30 percent of their readings missing, while --fragmentary-threshold 1.0 will exclude witnesses with any missing readings. (Note that this check is performed after correctors' hands have been filled in, if you have supplied the --fill-correctors option.)
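The threshold check can be sketched like this. The function name `keep_witness` and the use of `None` for a missing reading are assumptions made for illustration; teiphy's own data structures may differ.

```python
from typing import Optional, Sequence


def keep_witness(readings: Sequence[Optional[str]], threshold: float) -> bool:
    """Return True if the witness is extant at a large enough share of units."""
    if not readings:
        # Assumption: a witness with no variation units counts as not extant.
        return False
    extant = sum(1 for reading in readings if reading is not None)
    # A witness is kept when its extant proportion meets the threshold.
    return extant / len(readings) >= threshold
```

With a threshold of 0.7, a witness extant at 8 of 10 units is kept and one extant at 6 of 10 is dropped; with a threshold of 1.0, a single missing reading is enough to drop the witness.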

Extended number of states for BEAST 2 outputs

08 Jan 03:55

In principle, any number of states should be permissible in BEAST 2.7 XML inputs, since the states are specified as sequences of probabilities rather than as one-character symbols. But even with sequences encoded in this way, BEAST 2 still requires code maps (for some reason), so we are limited by the set of allowable single-character symbols. Previously, teiphy restricted the set of BEAST state symbols to 0-9 and a-z. This release adds A-Z to the symbol set.

Support for variation unit identification through combination of "n", "from", and "to" attributes

08 Jan 03:05

Previously, teiphy assumed that each variation unit (i.e., an app element) would be uniquely identified by its xml:id attribute or its n attribute alone. While this assumption holds for xml:id attributes (which, by definition, must be unique), it does not hold for n attributes. In practice, TEI XML collations often give all app elements in the same larger passage of text (e.g., a verse) the same n value as that passage, and then add from and to attributes specifying word indices to locate each variation unit uniquely within the passage. To accommodate this practice, the VariationUnit class of teiphy now checks for from and to attributes in addition to an n attribute and combines them to form a unique ID for the variation unit.
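The ID construction can be sketched with the standard library's ElementTree. The function name `variation_unit_id`, the preference for xml:id when it is present, and the underscore separator are assumptions for illustration; the exact ID format produced by the VariationUnit class may differ.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def variation_unit_id(app: ET.Element) -> str:
    """Build an ID for an app element from xml:id, or from n plus from/to."""
    xml_id = app.get(f"{{{XML_NS}}}id")
    if xml_id is not None:
        # xml:id values are unique by definition, so use them directly.
        return xml_id
    parts = [app.get("n", "")]
    for attr in ("from", "to"):
        value = app.get(attr)
        if value is not None:
            parts.append(value)
    # Assumed separator: an underscore between n, from, and to.
    return "_".join(parts)
```

Two app elements that share the same n value but cover different word ranges then receive different IDs, which is the point of the change.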

Support for supplying/updating witness date ranges through external CSV file

07 Jan 03:43

This release provides a new feature for the convenience of users who have derived their collation data and witness date ranges from different sources: a CSV file containing witness IDs and (potentially empty) minimum and maximum dates can be specified with the --dates-file command-line option. For witnesses in the CSV file, the specified date range will overwrite any existing date range in the TEI XML collation.
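The overwrite behavior can be sketched as below. The CSV layout (no header row; witness ID, minimum date, maximum date in that order), the integer dates, and the function names are assumptions for illustration, so check the teiphy documentation for the exact file format it expects.

```python
import csv
import io
from typing import Dict, Optional, Tuple

DateRange = Tuple[Optional[int], Optional[int]]


def load_date_ranges(csv_text: str) -> Dict[str, DateRange]:
    """Parse rows of (witness ID, min date, max date); empty dates become None."""
    ranges: Dict[str, DateRange] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        # Pad short rows so that missing date columns are treated as empty.
        cells = [cell.strip() for cell in row] + ["", ""]
        wit_id, min_date, max_date = cells[:3]
        ranges[wit_id] = (
            int(min_date) if min_date else None,
            int(max_date) if max_date else None,
        )
    return ranges


def apply_date_ranges(
    existing: Dict[str, DateRange], overrides: Dict[str, DateRange]
) -> Dict[str, DateRange]:
    """Return a copy of existing in which ranges from the CSV take precedence."""
    updated = dict(existing)
    updated.update(overrides)
    return updated
```

Witnesses that do not appear in the CSV keep whatever date range the collation already gives them.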

Update to mirror current version of STEMMA; dependency updates

01 Jan 13:43

This release increases the number of states for STEMMA outputs from 22 to 62 in accordance with the latest updates to STEMMA. It also updates several dependencies to address vulnerabilities noted by Dependabot.

Minor fix for STEMMA outputs and updates to dependencies

10 Nov 08:00

This release corrects the previous release's fix for STEMMA outputs (so that they support 22 states rather than 24) and updates several dependencies to address vulnerabilities noted by Dependabot.

Fixes for BEAST loggers and STEMMA state encodings

24 Oct 23:34
654ad14

This minor release adds some missing attributes to state/ancestral logger elements in BEAST outputs to ensure that root frequencies (corresponding to intrinsic probability judgments) are incorporated into probability calculations for state sampling. It also fixes a previous bug in mapping variant reading indices to state codes in STEMMA outputs, so that reading indices (up to a maximum of 24 per variation unit) are now mapped to single-character state codes.

Support for time-dependent transcriptional relations in BEAST 2.7 outputs

24 Apr 13:55
5724b9f

The main change introduced in this release is support for tagging of potential transcriptional explanations with notBefore and notAfter attributes. If these attributes are present in a variation unit's transcriptional relations list, teiphy will now map the transcriptional relations to an EpochSubstitutionModel with a different substitution model for different slices of time. This feature is only supported for BEAST 2.7 XML outputs. This means that BEAST users can now model time-dependent transcriptional changes (like assimilation to later popular texts, paleographic confusions possible only for earlier or later scripts, etc.) more accurately.
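To make the idea of time slices concrete, the sketch below partitions the timeline at every notBefore and notAfter date and lists which transcriptional relations apply in each slice. It is only an illustration of the concept: the function name `time_slices`, the representation of relations as (notBefore, notAfter) pairs with `None` for an open bound, and the treatment of boundary dates are assumptions, and the sketch does not generate any BEAST XML.

```python
from typing import Dict, List, Optional, Tuple

Bounds = Tuple[Optional[int], Optional[int]]
Slice = Tuple[Optional[int], Optional[int], List[str]]


def time_slices(relations: Dict[str, Bounds]) -> List[Slice]:
    """Split time at each bound and list the relations active in each slice."""
    boundaries = sorted(
        {d for bounds in relations.values() for d in bounds if d is not None}
    )
    # None at either end stands for an unbounded start or end of time.
    edges: List[Optional[int]] = [None] + boundaries + [None]
    slices: List[Slice] = []
    for start, end in zip(edges, edges[1:]):
        active = sorted(
            name
            for name, (not_before, not_after) in relations.items()
            # A relation covers a slice if it begins by the slice's start
            # and does not end before the slice's end.
            if (not_before is None or (start is not None and not_before <= start))
            and (not_after is None or (end is not None and not_after >= end))
        )
        slices.append((start, end, active))
    return slices
```

Each resulting slice would correspond to one epoch with its own set of permitted transcriptional changes; a relation with no bounds is active in every slice.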

A related change is the addition of more comprehensive rules for updating witness date ranges based on the date range of the work's composition (and vice-versa). This change affects age/date calibrations for NEXUS and BEAST 2.7 XML formats (including the MrBayes NEXUS input format).

This release also fixes an error that prevented the --verbose flag from working correctly.