Mappings between the headwords of various NT Greek lexicons, the lemmas of MorphGNT and Nestle 1904, and Strong's and GK numbers.
Note that the focus of this repo is on data (and soon code) for mapping between lemmas and other identifiers for lexical items. It is not intended to be the home of glosses, definitions, morphological information, etc. Rather, it is a Rosetta Stone of sorts to help integrate the various resources that hold that information.
This data is made available under a CC-BY-SA 4.0 License.
Back in 2006, Ulrik Sandborg-Petersen and I spent time integrating various Greek New Testament analyses we'd separately worked on. One output of that work was our joint paper A New Numbering System for Greek New Testament Lexemes. A few years later, as part of my larger work on a Morphological Lexicon of New Testament Greek, I started a broader integration of various lexical resources including BDAG, Danker's Concise Lexicon and even an old word list Bill Mounce had shared with me in 1997. A lot of the data and scripts for that work are in the morphgnt/morphological-lexicon GitHub repo. But when it recently became clear that many people were not aware of the existing work, I decided it was worth extracting the lemma mapping data and code out of that repo (where it's easy for it to be lost) and starting a new repo with much more focus. This repo is the in-progress result.
The main file is lexemes.yaml. The keys in this file are the lemmas used by the SBLGNT edition of the MorphGNT. The properties under each key are:
full-citation-form
the canonical full citation form (with genitive and article for nouns, etc.)
bdag-headword
the corresponding headword in BDAG
danker-entry
the corresponding full citation form from Danker's Concise Lexicon
dodson-entry
the corresponding full citation form from Dodson's lexicon
mounce-headword
the corresponding headword from the list Bill Mounce gave me in 1997 (I need to update this with his much more up-to-date data)
abbott-smith-header
the corresponding headword from Abbott-Smith NEW
abbott-smith-entry
the corresponding full citation form from Abbott-Smith NEW (probably has some extraneous info I need to manually remove)
strongs
the corresponding Strong's number
gk
the corresponding G/K number
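As a rough illustration of how these properties might be read, here is a minimal Python sketch. It assumes the file parses to a dictionary keyed by lemma, with each value a dictionary of the properties listed above. The sample entry, the value types, and the helper names `load_lexemes` and `identifiers_for` are illustrative assumptions, not taken from the repo.

```python
# Illustrative only: the shape and value types of this entry are assumptions,
# not copied from lexemes.yaml.
SAMPLE_LEXEMES = {
    "λόγος": {
        "full-citation-form": "λόγος, ου, ὁ",
        "bdag-headword": "λόγος",
        "strongs": 3056,
        "gk": 3364,
    },
}


def load_lexemes(path="lexemes.yaml"):
    """Parse the main file into a dict keyed by lemma (requires PyYAML)."""
    import yaml  # third-party: pip install pyyaml

    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


def identifiers_for(lemma, lexemes):
    """Return a copy of the properties recorded for a lemma, or {} if absent."""
    return dict(lexemes.get(lemma) or {})


entry = identifiers_for("λόγος", SAMPLE_LEXEMES)
print(entry.get("strongs"), entry.get("gk"))
```

Using `.get` throughout reflects the likelihood that not every lemma has every property; a lemma missing from one lexicon would simply lack that key.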
There are also some additional mapping files (all derived from lexemes.yaml):
alt_mapping.yaml
a mapping from alternative spellings to the key found in lexemes.yaml
gk_mapping.yaml
a mapping from G/K number to the key found in lexemes.yaml
strongs_mapping.yaml
a mapping from Strong's number to the key found in lexemes.yaml
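Since these files are derived from lexemes.yaml, a reverse mapping of this kind can be sketched as a simple inversion of one property. The following is not the script actually used to generate the files; the function name `build_reverse_mapping`, the placeholder data, and the assumption that a property may hold either a single value or a list are all hypothetical.

```python
def build_reverse_mapping(lexemes, field):
    """Map each value of `field` back to the lexemes.yaml key(s) that carry it."""
    reverse = {}
    for lemma, props in lexemes.items():
        value = (props or {}).get(field)
        if value is None:
            continue  # this lemma has no value for the requested property
        # Assumption: a property may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        for v in values:
            reverse.setdefault(v, []).append(lemma)
    return reverse


# Placeholder data, not taken from the repo.
sample = {
    "lemma-a": {"strongs": 1, "gk": 10},
    "lemma-b": {"strongs": [2, 3]},
    "lemma-c": {"gk": 10},
}
strongs_to_lemma = build_reverse_mapping(sample, "strongs")
gk_to_lemma = build_reverse_mapping(sample, "gk")
```

Each number maps to a list of keys here, so a number shared by more than one lemma is kept visible instead of being silently overwritten.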
There is also an initial mapping for Nestle 1904 lemmas (via Strong's numbers) at nestle-match.txt.
differences.yaml and NOTES.md contain old notes from my work on this back in 2013. I need to work through them and update them.
A newly added file conflation_sets.txt lists all the sets of conflated words. Not every pair within a set has been directly conflated, but the sets partition the conflated words so that any two words that have been conflated appear in the same set.
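The partition property described here is what you get from taking connected components over the conflated pairs. A minimal union-find sketch of that idea follows; it is not the code that produced conflation_sets.txt, and the function name, the input format (a list of pairs), and the placeholder words are assumptions.

```python
def conflation_sets(pairs):
    """Group words into sets so that every conflated pair shares a set."""
    parent = {}

    def find(word):
        # Register unseen words as their own root, then walk up to the root.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path halving
            word = parent[word]
        return word

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    groups = {}
    for word in parent:
        groups.setdefault(find(word), set()).add(word)
    return sorted(sorted(group) for group in groups.values())


# Placeholder words: w1 and w3 are never paired directly, but both pair with w2.
pairs = [("w1", "w2"), ("w2", "w3"), ("w4", "w5")]
sets = conflation_sets(pairs)
```

In this example w1 and w3 end up in the same set even though they were never conflated with each other, which is the situation the paragraph above describes.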
I need to look at more recent data from Mounce, sort out some of the mismatch issues with Nestle 1904, clean up the Abbott-Smith entries more, reintroduce a lot of the validation and utility code from before, integrate other headword lists such as LSJ and possibly Brill, and eventually expand things to more GNT editions and a broader corpus of texts.
Ulrik Sandborg-Petersen was my original collaborator on projects that ultimately led to this work. Jonathan Robie is responsible for building the community which has reinvigorated my interest in continuing this and much other work. Eliran Wong encouraged me immensely with his use of earlier versions of this work and the feedback he has given.
For more of my work on linguistics and Ancient Greek, see http://jktauber.com/.