This project is a student project for the AMOS SS 2024 course, run jointly by the Technical University of Berlin, the Friedrich-Alexander University of Erlangen-Nuremberg, and the Free University of Berlin, with the industry partner Kubermatic. It is supervised by Prof. Riehle; the contact persons at Kubermatic are Mario Fahlandt and Sebastian Scheele.
Welcome to the Cloud Native LLM Project for the AMOS SS 2024!
This project aims to simplify the Cloud Native ecosystem by resolving information overload and fragmentation within the CNCF landscape.
Our vision is a future where developers and users can effortlessly obtain detailed, context-aware answers about CNCF projects, thereby boosting productivity and enhancing comprehension.
The project is developed in an open-source and open-model fashion.
The folder structure is as follows: [TBD]
- Select and Train an Open-Source LLM: Identify a suitable open-source LLM and train it with Kubernetes-specific data.
- Automate Data Extraction: Develop tools to automatically gather training data from publicly available Kubernetes resources such as white papers, documentation, and forums.
- Incorporate Advanced Data Techniques: Use concept and relationship extraction to enrich the training dataset, deepening the LLM's understanding of Kubernetes.
- Open Source Contribution: Release the fine-tuned model and dataset preparation tools. Potentially work in tandem with the AMOS project on knowledge graph extraction to synergize both projects’ outcomes.
- Benchmark Development: Construct a manual benchmark to serve as ground truth for quantitatively evaluating the LLM's performance.
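The automated data extraction step described above could start from something as simple as splitting project documentation into section-level training records. The sketch below is illustrative only: it assumes documentation is available as local Markdown files, and the record layout (question/answer pairs keyed on section headings) is an assumption, not a decided format.

```python
import re
from pathlib import Path

def extract_sections(markdown_text: str) -> list[dict]:
    """Split a Markdown document into heading/body training records."""
    records = []
    # Split on first- or second-level headings, keeping the heading text.
    parts = re.split(r"^(#{1,2}\s+.+)$", markdown_text, flags=re.MULTILINE)
    # After the split, parts alternates: [preamble, heading, body, heading, body, ...]
    for heading, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if body:
            records.append({
                "question": f"What does the section '{heading.lstrip('# ').strip()}' cover?",
                "answer": body,
            })
    return records

def extract_corpus(doc_dir: str) -> list[dict]:
    """Collect records from every Markdown file under doc_dir."""
    corpus = []
    for path in Path(doc_dir).rglob("*.md"):
        corpus.extend(extract_sections(path.read_text(encoding="utf-8")))
    return corpus
```

A real pipeline would also need per-source crawlers (docs sites, forums, white papers) and deduplication, but the section-splitting idea stays the same.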
- Data Sources: Collect documentation, white papers, blog posts, and technical documents from CNCF landscape projects.
- Preprocessing: Normalize and structure the collected data.
- Knowledge Extraction: Use Named Entity Recognition (NER) to extract key entities and create relationships between them.
- LLM Selection: Evaluate and select an appropriate open-source/open-model LLM based on performance, computational requirements, and licensing.
- Fine-tuning Procedure: Use the structured dataset for model training in a repeatable and reproducible manner, ideally using Cloud Native tools like KubeFlow and Kubernetes.
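To make the fine-tuning run repeatable on Cloud Native infrastructure as described above, one option is to package the training script in a container and run it as a Kubernetes Job (or a Kubeflow pipeline step). The manifest below is a hedged sketch: the image name, script arguments, dataset path, and resource sizes are all placeholders, not project decisions.

```yaml
# Illustrative Kubernetes Job for a reproducible fine-tuning run.
# Image, paths, model id, and resource requests are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: llm-finetune
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: registry.example.com/cnllm/trainer:latest  # placeholder image
          command: ["python", "train.py"]
          args:
            - "--dataset=/data/cncf_qa.jsonl"   # structured dataset from the extraction step
            - "--base-model=base-model-id"      # placeholder for the selected open model
            - "--seed=42"                       # fixed seed for reproducibility
          resources:
            limits:
              nvidia.com/gpu: 1
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: training-data
```

Pinning the image tag, dataset version, and random seed in the manifest is what makes the run reproducible end to end.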
- Quantitative Metrics: Use specific benchmarks such as BLEU score and Factual Question Accuracy to assess model performance.
- Qualitative Evaluation: Domain experts and project maintainers will evaluate the LLM’s comprehensiveness, accuracy, and clarity.
This project aims to become a definitive knowledge base for cloud computing, enriching the knowledge of engineers in cloud-native development and supporting the maintenance and growth of open-source projects.
To get started: [TBD]