This project implements a transformer-based language model inspired by GPT-2.
The model is trained separately on two datasets:
- English: Shakespeare's works
- Hebrew: A Hebrew language corpus
Results for both models can be found in `report.pdf`.
The heatmap below shows the attention scores from Layer 1, Head 2 on a sample from the Shakespeare dataset. Brighter values indicate stronger attention. The strong diagonal reflects high attention to the previous token, as expected in causal language modeling.
To train the Transformer-based GPT-2 language model on Hebrew or English text:

```bash
python main.py
```

Make sure your configuration file is defined in `conf/conf.json`:
```json
{
  "seq_len": 128,
  "batch_size": 64,
  "data_path": "data/heb-data/",
  "n_layers": 8,
  "n_heads": 8,
  "embed_size": 192,
  "learning_rate": 5e-4,
  "gradient_clipping": 1.0,
  "weight_decay": 1e-4,
  "num_batches_to_train": 50000
}
```
To train on English data, just change `data_path` to `"data/eng-data/"`.
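For reference, here is a minimal sketch of how such a configuration might be read at startup, assuming `main.py` loads `conf/conf.json` with Python's standard `json` module (the `load_config` helper shown here is hypothetical, not the repo's actual code):

```python
import json
from pathlib import Path

def load_config(path: str = "conf/conf.json") -> dict:
    """Read training hyperparameters from the JSON config file."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)

conf = load_config()
# e.g. conf["seq_len"] == 128, conf["data_path"] == "data/heb-data/"
print(f"Training on {conf['data_path']} for {conf['num_batches_to_train']} batches")
```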
- Model Checkpoint: After training, the model's weights are saved as `model_weights.pth`.
- Generated Samples: During training, the model periodically generates and logs example sequences from the current model state.
Example log output:

```text
Model sample: '''To be, or not to be: that is the question...'''
```
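The saved `model_weights.pth` can later be reloaded to generate text like the sample above. A minimal loading sketch, assuming the model is a standard PyTorch `nn.Module`; the `GPT` class name, its constructor arguments, and the `vocab_size` value are assumptions and should be adapted to the actual model definition:

```python
import torch

from model import GPT  # hypothetical import; use the model class defined in this repo

# Rebuild the architecture with the same hyperparameters as in conf/conf.json.
model = GPT(n_layers=8, n_heads=8, embed_size=192, seq_len=128, vocab_size=50000)  # vocab_size is an assumption
model.load_state_dict(torch.load("model_weights.pth", map_location="cpu"))
model.eval()  # disable dropout for inference
```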
Samples are generated every 100 batches using a custom `better_sample_continuation` method. This method takes a prompt (prefix) and generates new tokens one by one, applying:

- Temperature scaling to control randomness (lower = more deterministic).
- Top-K sampling to choose only from the top K most probable next tokens, improving quality and coherence.
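For illustration, here is a minimal sketch of such a decoding loop, assuming a PyTorch model that returns logits of shape `(batch, seq_len, vocab_size)`; the function name, signature, and default values are assumptions and may differ from the actual `better_sample_continuation` in this repository:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sample_continuation(model, prefix_ids, max_new_tokens=100, temperature=0.8, top_k=40):
    """Generate tokens one by one with temperature scaling and top-k filtering."""
    tokens = prefix_ids.clone()                        # shape: (1, prompt_len)
    for _ in range(max_new_tokens):
        logits = model(tokens[:, -128:])[:, -1, :]     # keep context within seq_len (128 in the config above)
        logits = logits / temperature                  # temperature scaling: lower = sharper, more deterministic
        top_vals, top_idx = torch.topk(logits, top_k)  # keep only the K most probable next tokens
        probs = F.softmax(top_vals, dim=-1)
        next_tok = top_idx.gather(-1, torch.multinomial(probs, num_samples=1))
        tokens = torch.cat([tokens, next_tok], dim=-1)
    return tokens
```

Lower `temperature` values sharpen the distribution toward the most likely token, while smaller `top_k` values prune unlikely continuations before sampling.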