EES-4677 Create Build Pipeline (#28)
EES-4677 Create PR and CI Build Pipelines
sambiramairelogic authored Dec 7, 2023
1 parent 5aa0620 commit add8fa0
Showing 46 changed files with 5,381 additions and 1,181 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -8,6 +8,8 @@ data/qdrant_storage
 __pycache__/
 *.py[cod]
 *$py.class
+data_ingestion_tests/test-output.xml
+response_automater_tests/test-output.xml

 # OS generated files #
 .DS_Store
@@ -18,7 +20,9 @@ __pycache__/
 ehthumbs.db
 Thumbs.db

-# Typescript #
+# Frontend #
 chatbot-ui/.next/
 node_modules
 tsconfig.tsbuildinfo
+chatbot-ui/coverage/
+chatbot-ui/junit.xml
1 change: 1 addition & 0 deletions .prettierrc
@@ -1,5 +1,6 @@
{
"trailingComma": "all",
"endOfLine": "lf",
"tabWidth": 2,
"semi": true,
"singleQuote": true
4 changes: 0 additions & 4 deletions .vscode/extensions.json

This file was deleted.

6 changes: 0 additions & 6 deletions .vscode/settings.json

This file was deleted.

2 changes: 2 additions & 0 deletions Pipfile
@@ -17,6 +17,8 @@ qdrant-client = "==1.5.4"

 [dev-packages]
 pytest = "==7.4.2"
+pytest-cov = "==4.1.0"
+pytest-azurepipelines = "==1.0.5"
 isort = "==5.12.0"
 black = "==23.9.1"
 flake8 = "==6.1.0"
1,456 changes: 806 additions & 650 deletions Pipfile.lock

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions README.md
@@ -6,18 +6,18 @@ This is a repository for a prototype of a chatbot for the Department for Educati

The app is powered by embeddings so that when a user inputs a query, the relevant parts of the knowledge base are returned and then the app calls the openai api to answer the question.

-The tech stack on the backend is the python framework fastapi and the vector database Qdrant.FastApi is a fast, modern framework for building APIs in python. For more information about FastApi visit their [documentation](https://fastapi.tiangolo.com/). Langchain is used to query the Qdrant and interact with the openai api. For more information about langchain visit their [documentation](https://python.langchain.com/en/latest/index.html). For more information on qdrant please visit their [documentation](https://qdrant.tech/documentation/).
+The tech stack on the backend is the python framework fastapi and the vector database Qdrant. FastApi is a fast, modern framework for building APIs in python. For more information about FastApi visit their [documentation](https://fastapi.tiangolo.com/). Langchain is used to query the Qdrant and interact with the openai api. For more information about langchain visit their [documentation](https://python.langchain.com/en/latest/index.html). For more information on qdrant please visit their [documentation](https://qdrant.tech/documentation/).

The frontend tech stack is next.js and typescript although this is subject to change.

## App structure

There are three projects contained within this repository, a next.js frontend UI project, a fastapi server for the data ingestion, and fastapi server for the backend which are in the `chatbot-ui`, `data_ingestion` and `response_automater` folders respectively.

-The fastapi server for data ingestion has various endpoints to build, rebuild and delete different parts of the vector database, qdrant. To build the database information is extracted from the content apis from the explore-education-statistics service and chunked into smaller units of text. Via the openai and qdrant apis these pieces of text are converted into vector embeddings and subsequently stored in the qdrant vector database. The endpoint to build the database is **.../api/maintenance/publications/build** which is contained in the **data_ingestion/routers/maintenance.py** file. This can be used to build or rebuild all the information from the latest publications in the qdrant vector database. There are also endpoints for building information relating to the methodologies and to delete the embeddings stored within the database contained in the same file. The other two files within the router directory, `publications.py` and `methodologies.py` have endpoints for updating a specific publication or methodologies within the qdrant database. For example, if there was a new release of attendance publication, a post request to the **.../pupil-attendance-in-schools/update** could be triggered.
+The fastapi server for data ingestion has various endpoints to build, rebuild and delete different parts of the vector database, qdrant. To build the database, information is extracted from the content apis from the explore-education-statistics service and chunked into smaller units of text. Via the openai and qdrant apis these pieces of text are converted into vector embeddings and subsequently stored in the qdrant vector database. The endpoint to build the database is **.../api/maintenance/publications/build** which is contained in the **data_ingestion/routers/maintenance.py** file. This can be used to build or rebuild all the information from the latest publications in the qdrant vector database. There are also endpoints for building information relating to the methodologies and to delete the embeddings stored within the database contained in the same file. The other two files within the router directory, `publications.py` and `methodologies.py` have endpoints for updating a specific publication or methodologies within the qdrant database. For example, if there was a new release of attendance publication, a post request to the **.../pupil-attendance-in-schools/update** could be triggered.
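
As a rough illustration of the chunking step this paragraph describes, the sketch below splits text into overlapping word-based chunks. It is not taken from the repository: the `chunk_text` name, the word-based splitting, and the `max_words` and `overlap` parameters are all assumptions made for the example, and the embedding and Qdrant upload steps are left out.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-based chunks.

    Hypothetical helper for illustration only; the real ingestion code
    may chunk by characters, tokens, or document structure instead.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap words is emitted.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("one two three four five", max_words=3, overlap=1):
        print(chunk)
```

Each chunk would then be sent to an embedding model and stored alongside its vector; the overlap is one common way to avoid cutting a sentence's context in half at a chunk boundary.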


-The latter fastapi server exposes the Qdrant, openai and langchain apis which en. This means when a user inputs a question into the app, the question is sent to the **.../api/chat** endpoint. Here the question is converted into a vector embedding. Based on the cosine similarity of this embedding with the embeddings in the vector database, the three most relevant chunks of the vector database are returned. How the api responds is governed by prompt template (contained in `utils.py`) and the `services.message_service.py`. The latter contains a send_message function which encompasses the logic for interacting with th qdrant, openai and langchain apis and allows the endpoint to send a response as an event stream.
+The latter fastapi server exposes the Qdrant, openai and langchain apis. This means when a user inputs a question into the app, the question is sent to the **.../api/chat** endpoint. Here the question is converted into a vector embedding. Based on the cosine similarity of this embedding with the embeddings in the vector database, the three most relevant chunks of the vector database are returned. How the api responds is governed by prompt template (contained in `utils.py`) and the `services.message_service.py`. The latter contains a send_message function which encompasses the logic for interacting with the qdrant, openai and langchain apis and allows the endpoint to send a response as an event stream.
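
The retrieval step described in this paragraph (embed the question, rank stored chunks by cosine similarity, return the three most relevant) can be sketched in plain Python. This is a toy illustration with hypothetical names (`cosine_similarity`, `top_k_chunks`) and tiny hand-written vectors; per the README, the service itself uses langchain to query Qdrant rather than ranking chunks in application code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_embedding: list[float],
    stored: list[tuple[str, list[float]]],
    k: int = 3,
) -> list[str]:
    """Return the text of the k stored chunks most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in stored
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    # Made-up two-dimensional "embeddings"; real ones have many more dimensions.
    store = [
        ("attendance", [1.0, 0.0]),
        ("exclusions", [0.0, 1.0]),
        ("absence", [1.0, 1.0]),
        ("unrelated", [-1.0, 0.0]),
    ]
    print(top_k_chunks([1.0, 0.0], store))
```

The returned chunks would then be placed into the prompt template so the language model answers from the retrieved context.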

## Prerequisites

@@ -32,8 +32,8 @@ The latter fastapi server exposes the Qdrant, openai and langchain apis which en

1. Clone the repo
```bash
-git clone https://github.com/joesharratt1229/EES_GPT.git
-cd EES_GPT
+git clone https://github.com/dfe-analytical-services/chatbot-prototype.git
+cd chatbot-prototype
```

2. Install [pnpm](https://pnpm.io) if you haven't already:
55 changes: 53 additions & 2 deletions chatbot-prototype.code-workspace
@@ -1,16 +1,67 @@
 {
   "folders": [
     {
-      "path": "."
+      "name": "Root",
+      "path": "./"
+    },
+    {
+      "name": "Chatbot UI",
+      "path": "./chatbot-ui"
+    },
+    {
+      "name": "Response Automater",
+      "path": "./response_automater"
+    },
+    {
+      "name": "Response Automater Tests",
+      "path": "./response_automater_tests"
+    },
+    {
+      "name": "Data Ingestion API",
+      "path": "./data_ingestion"
+    },
+    {
+      "name": "Data Ingestion API Tests",
+      "path": "./data_ingestion_tests"
     }
   ],
   "settings": {
+    "window.title": "Chatbot Prototype",
+    "[python]": {
+      "editor.defaultFormatter": "ms-python.black-formatter"
+    },
+    "[typescript]": {
+      "editor.defaultFormatter": "esbenp.prettier-vscode"
+    },
+    "editor.formatOnSave": true,
+    "editor.formatOnPaste": true,
+    "editor.formatOnSaveMode": "modificationsIfAvailable",
+    "editor.defaultFormatter": "esbenp.prettier-vscode",
+
     "search.exclude": {
       // Avoid polluting search results with lockfile content
       "Pipfile.lock": true,
       "pnpm-lock.yaml": true
     },
     // Ensure VSCode uses pnpm instead of npm
-    "npm.packageManager": "pnpm"
+    "npm.packageManager": "pnpm",
+    "jest.autoRun": "off",
+    "jest.disabledWorkspaceFolders": [
+      "Root",
+      "Response Automater",
+      "Response Automater Tests",
+      "Data Ingestion API",
+      "Data Ingestion API Tests"
+    ]
+  },
+  "extensions": {
+    "recommendations": [
+      "ms-python.python",
+      "orta.vscode-jest",
+      "esbenp.prettier-vscode",
+      "ms-python.black-formatter",
+      "ms-azuretools.vscode-docker",
+      "ms-azure-devops.azure-pipelines"
+    ]
   }
 }
19 changes: 18 additions & 1 deletion chatbot-ui/.eslintrc.json
@@ -1,3 +1,20 @@
 {
-  "extends": ["next/core-web-vitals", "plugin:prettier/recommended"]
+  "extends": [
+    "plugin:@typescript-eslint/recommended",
+    "plugin:react/recommended",
+    "plugin:react-hooks/recommended",
+    "plugin:jsx-a11y/recommended",
+    "plugin:import/typescript",
+    "plugin:prettier/recommended"
+  ],
+  "rules": {
+    "react/react-in-jsx-scope": "off",
+    "jsx-a11y/anchor-is-valid": "off",
+    "no-console": "warn"
+  },
+  "settings": {
+    "react": {
+      "version": "detect"
+    }
+  }
 }
15 changes: 15 additions & 0 deletions chatbot-ui/.vscode/tasks.json
@@ -0,0 +1,15 @@
+{
+  "version": "2.0.0",
+  "tasks": [
+    {
+      "label": "Run Chatbot UI Tests",
+      "type": "shell",
+      "command": "pnpm run test",
+      "group": "test",
+      "presentation": {
+        "reveal": "always",
+        "panel": "new"
+      }
+    }
+  ]
+}
8 changes: 8 additions & 0 deletions chatbot-ui/components/LoadingDots.test.tsx
@@ -0,0 +1,8 @@
+import { render } from '@testing-library/react';
+import LoadingDots from '@/components/LoadingDots';
+
+describe('Loading Dots', () => {
+  it('Renders', () => {
+    render(<LoadingDots color="red" />);
+  });
+});
30 changes: 30 additions & 0 deletions chatbot-ui/components/Page.test.tsx
@@ -0,0 +1,30 @@
+import { render, screen } from '@testing-library/react';
+import Page from '@/components/Page';
+
+describe('Page', () => {
+  it('Renders', () => {
+    render(<Page title={'Test Page Title'} />);
+
+    expect(screen.getByRole('main')).toBeInTheDocument();
+  });
+
+  it('Renders a title and caption if provided', () => {
+    render(<Page title="Test Page Title" caption="Test Page Caption" />);
+
+    expect(
+      screen.getByRole('heading', { name: 'Test Page Title' }),
+    ).toBeInTheDocument();
+
+    expect(screen.getByText('Test Page Caption')).toBeInTheDocument();
+  });
+
+  it('Renders children if provided', () => {
+    render(
+      <Page title={'Test Page Title'}>
+        <span>This is some child content</span>
+      </Page>,
+    );
+
+    expect(screen.getByText(/This is some child content/)).toBeInTheDocument();
+  });
+});
2 changes: 1 addition & 1 deletion chatbot-ui/components/Page.tsx
@@ -1,4 +1,4 @@
-import React, { ReactNode } from 'react';
+import { ReactNode } from 'react';
 import PageBanner from './PageBanner';
 import PageFooter from './PageFooter';
 import PageHeader from './PageHeader';
19 changes: 19 additions & 0 deletions chatbot-ui/components/PageBanner.test.tsx
@@ -0,0 +1,19 @@
+import { render, screen } from '@testing-library/react';
+import PageBanner from '@/components/PageBanner';
+
+describe('Page Banner', () => {
+  it('Renders', () => {
+    render(<PageBanner />);
+
+    expect(screen.getByText(/This is a new service/)).toBeInTheDocument();
+  });
+
+  it('Displays a link for providing feedback', () => {
+    render(<PageBanner />);
+
+    expect(screen.getByRole('link', { name: 'feedback' })).toHaveAttribute(
+      'href',
+      '#',
+    );
+  });
+});
2 changes: 0 additions & 2 deletions chatbot-ui/components/PageBanner.tsx
@@ -1,5 +1,3 @@
-import React from 'react';
-
 const PageBanner = () => {
   return (
     <div className="govuk-phase-banner">
45 changes: 45 additions & 0 deletions chatbot-ui/components/PageFooter.test.tsx
@@ -0,0 +1,45 @@
+import { render, screen } from '@testing-library/react';
+import PageFooter from '@/components/PageFooter';
+
+describe('Page Footer', () => {
+  it('Renders', () => {
+    render(<PageFooter />);
+
+    expect(screen.getByRole('contentinfo')).toBeInTheDocument();
+  });
+
+  it('Displays the expected links', () => {
+    render(<PageFooter />);
+
+    const expectedLinks: ExpectedLink[] = [
+      { name: 'Cookies', target: '#' },
+      { name: 'Privacy notice', target: '#' },
+      { name: 'Contact us', target: '#' },
+      { name: 'Accessibility statement', target: '#' },
+      { name: 'Glossary', target: '#' },
+      { name: 'Help and support', target: '#' },
+      {
+        name: 'Open Government Licence v3.0',
+        target:
+          'https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/',
+      },
+      {
+        name: '© Crown copyright',
+        target:
+          'https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/uk-government-licensing-framework/crown-copyright/',
+      },
+    ];
+
+    expectedLinks.forEach((link) => {
+      expect(screen.getByRole('link', { name: link.name })).toHaveAttribute(
+        'href',
+        link.target,
+      );
+    });
+  });
+});
+
+interface ExpectedLink {
+  name: string;
+  target: string;
+}
2 changes: 0 additions & 2 deletions chatbot-ui/components/PageFooter.tsx
@@ -1,5 +1,3 @@
-import React from 'react';
-
 interface Props {}

 const PageFooter = ({}: Props) => (
22 changes: 22 additions & 0 deletions chatbot-ui/components/PageHeader.test.tsx
@@ -0,0 +1,22 @@
+import { render, screen } from '@testing-library/react';
+import PageHeader from '@/components/PageHeader';
+
+describe('Page Banner', () => {
+  it('Renders', () => {
+    render(<PageHeader />);
+
+    expect(screen.getByRole('banner')).toBeInTheDocument();
+  });
+
+  it('Displays the expected links', () => {
+    render(<PageHeader />);
+
+    expect(
+      screen.getByRole('link', { name: 'Skip to main content' }),
+    ).toHaveAttribute('href', '#main-content');
+
+    expect(
+      screen.getByRole('link', { name: 'Explore education statistics' }),
+    ).toHaveAttribute('href', '#');
+  });
+});
2 changes: 0 additions & 2 deletions chatbot-ui/components/PageHeader.tsx
@@ -1,5 +1,3 @@
-import React from 'react';
-
 interface Props {}

 const PageHeader = ({}: Props) => (
29 changes: 29 additions & 0 deletions chatbot-ui/components/PageTitle.test.tsx
@@ -0,0 +1,29 @@
+import { render, screen } from '@testing-library/react';
+import PageTitle from '@/components/PageTitle';
+
+describe('Page Title', () => {
+  it('Renders', () => {
+    render(<PageTitle title="Test Page Title"></PageTitle>);
+
+    expect(
+      screen.getByRole('heading', { name: 'Test Page Title' }),
+    ).toBeInTheDocument();
+  });
+
+  it('Renders a caption if one is provided', () => {
+    render(
+      <PageTitle
+        title="Test Page Title"
+        caption="Test Caption Text"
+      ></PageTitle>,
+    );
+
+    expect(screen.getByText('Test Caption Text')).toBeInTheDocument();
+  });
+
+  it('Does not render a caption if none is provided', () => {
+    render(<PageTitle title="Test Page Title"></PageTitle>);
+
+    expect(screen.queryByText('Test Caption Text')).toBeNull();
+  });
+});
2 changes: 0 additions & 2 deletions chatbot-ui/components/PageTitle.tsx
@@ -1,5 +1,3 @@
-import React from 'react';
-
 interface Props {
   caption?: string;
   title: string;
17 changes: 17 additions & 0 deletions chatbot-ui/jest.config.mjs
@@ -0,0 +1,17 @@
+import nextJest from 'next/jest.js';
+
+const createJestConfig = nextJest({
+  dir: './',
+});
+
+/** @type {import('jest').Config} */
+const config = {
+  collectCoverageFrom: ['./**/*.{ts,tsx}', '!./**/*.d.ts'],
+  setupFilesAfterEnv: ['<rootDir>/setupTests.js'],
+  verbose: true,
+  testEnvironment: 'jest-environment-jsdom',
+  resetMocks: true,
+  reporters: ['default', 'jest-junit'],
+};
+
+export default createJestConfig(config);