Building a basic transformer from scratch using only PyTorch

Description

F.R.I.E.N.D.S-GPT is a model designed to generate scripts based on the episodes of the TV show Friends. This project leverages the principles outlined in the "Attention is All You Need" paper, which introduces transformer models, a groundbreaking architecture for NLP-related tasks.

How It Works

The generator focuses solely on producing scripts, implementing only the decoder part of the transformer model. Here's an overview of the approach:

Data Preparation: The model is trained on a dataset of Friends scripts (see the Dataset entry under References).
Model Architecture: Only the decoder part of the transformer model is used, as the generator doesn't require prompts or initial inputs. For tasks such as language translation or query-response systems, the original architecture would also use the encoder to process the input.
Training: The model is trained on script data to learn the patterns and structure of the dialogues.
Note - This is a fairly basic representation of how such NLP tasks are accomplished. The final output consists of valid words and text, but not necessarily meaningful sentences. This can be improved with more in-depth analysis and better hyperparameter tuning.
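The data preparation step above can be sketched at character level, since the project generates character-based scripts. This is a minimal illustration, not the repository's actual code: the sample text, the 90/10 split, and the helper names (`encode`, `decode`, `make_example`) are all assumptions made for the example.

```python
# Hypothetical sample text; the real project trains on the Friends scripts.
text = "Joey: How you doin'?\nRachel: Oh, hi Joey!\n"

# The vocabulary is every distinct character in the corpus, in sorted order.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character


def encode(s):
    """Turn a string into a list of integer ids."""
    return [stoi[c] for c in s]


def decode(ids):
    """Turn a list of integer ids back into a string."""
    return "".join(itos[i] for i in ids)


data = encode(text)

# Assumed 90/10 train/test split; the original split is not stated.
split = int(0.9 * len(data))
train_data, test_data = data[:split], data[split:]


def make_example(ids, start, block_size):
    """Return (input, target) where target is input shifted by one character."""
    x = ids[start:start + block_size]
    y = ids[start + 1:start + block_size + 1]
    return x, y
```

Each target sequence is the input shifted one position to the right, so the model learns to predict the next character at every position.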

Basic Transformer Architecture

Only the decoder section of the transformer is built in this project.
This is because the model is designed to generate scripts freely, without any situational context or prompt, so it has no input sequence that an encoder would need to process.
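A decoder-only block can be sketched in PyTorch as follows. This is a simplified, single-head illustration under stated assumptions, not the code in Friends_model.py: the class names, the pre-layer-norm placement, the ReLU feed-forward layer, and the sizes are all choices made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalSelfAttention(nn.Module):
    """Single-head masked self-attention (illustrative only)."""

    def __init__(self, n_embd, block_size):
        super().__init__()
        self.query = nn.Linear(n_embd, n_embd, bias=False)
        self.key = nn.Linear(n_embd, n_embd, bias=False)
        self.value = nn.Linear(n_embd, n_embd, bias=False)
        # Lower-triangular mask: position t may attend only to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / (C ** 0.5)  # (B, T, T)
        scores = scores.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return weights @ v  # (B, T, C)


class DecoderBlock(nn.Module):
    """Masked attention plus a feed-forward layer, each with a residual connection."""

    def __init__(self, n_embd, block_size):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ff = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))
        x = x + self.ff(self.ln2(x))
        return x
```

The mask is what makes this a decoder: each position sees only earlier characters, so the same block works for both training and left-to-right generation. There is no cross-attention layer because there is no encoder output to attend to.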

Results

After training for 5000 epochs, the model achieved a train loss of 1.05 and a test loss of 1.16. These results can be improved further by adjusting hyperparameters and training for more epochs.
A basic script generated by this model can be found in Generated_script.txt.
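One way to read the losses reported above is to convert them to perplexity. The sketch below assumes the loss is the mean cross-entropy per character measured in nats (the default behavior of PyTorch's cross-entropy loss); the README does not state the loss function, so treat this as an assumption. The `perplexity` helper is a name introduced for this example.

```python
import math


def perplexity(loss_nats):
    """Perplexity for a mean cross-entropy loss given in nats."""
    return math.exp(loss_nats)


train_ppl = perplexity(1.05)
test_ppl = perplexity(1.16)
print(f"train: {train_ppl:.2f}, test: {test_ppl:.2f}")
# prints "train: 2.86, test: 3.19"
```

Under that assumption, the model is on average about as uncertain as if it were choosing uniformly among roughly three characters at each step.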

References

Dataset: Friends Netflix Script Data
Paper: Attention is All You Need

This project is my first venture into NLP, using transformer models to generate character-based scripts. While this implementation may not be the only or the most accurate way to achieve the task, it provides a foundational understanding and can be built upon for further improvements. Another very good watch on this topic is https://www.youtube.com/watch?v=kCc8FmEb1nY. This video helped me gain a deeper understanding of how transformers work and how to turn them into code.

