
DTBase Docs

Summary and Table of Contents

DTBase consists of several subparts, such as the frontend and the backend, found in folders at the root level of the repository and under the dtbase folder. We list them here briefly and further below provide extensive documentation for each in turn.

DTBase as a Python Package

The root directory contains the Python package called dtbase that is installed if you do pip install .. After doing that you should be able to open a Python session from anywhere and do import dtbase. This also installs all the Python dependencies for all the subparts of the repo. The importing structure is such that all of the subparts are free to import from dtbase.core, but none of them should otherwise import from each other. E.g. frontend doesn't import anything from backend, and vice versa.

DTBase Backend

This is a FastAPI application, providing API endpoints for interacting with the database.

DTBase Frontend

This is a Flask application, providing a basic web interface. It communicates with the backend using HTTP requests, allowing users to insert sensors, locations, and other objects, manage users, browse model results, call services, and view time-series plots or data tables.

DTBase Services

Classes to help you develop your own services, such as BaseModel and BaseIngress.

DTBase Models

This is where the code for specific models is located.

DTBase Ingress

This is where the code for specific data ingress is located. Data ingress is the act of pulling in data from another source, such as an external API or database, and inserting it into your digital twin database via the backend.

DTBase Functions

Contains the code required to deploy services (data ingress or models) as Azure Functions.

DTBase Core

Utilities used by several of the other parts of the repository.

DTBase Infrastructure

The infrastructure-as-code configuration for Azure, using Pulumi.

DTBase Backend

Folder: dtbase/backend

The backend is the heart of DTBase. The frontend is just a pretty wrapper for the backend, and all the services are just scripts that call various backend API endpoints. All data is stored in the PostgreSQL database, and the backend is the only recommended way to access that database.

The backend is a web app implemented using FastAPI. It takes in HTTP requests and returns responses.

Code structure

  • run.sh. This is the script you call to run the FastAPI app.
  • create_app.py. A tiny script that calls main.create_app().
  • main.py. The module that defines how the FastAPI app is set up, its settings, endpoints, etc.
  • routers. The API divides into subsections, such as /user for user management and /sensor for sensor data. Each of these is implemented in a separate file in routers.
  • models.py. Whenever two files in routers use the same Pydantic model for some endpoint, that model is kept in models.py.
  • database. Everything related to direct access to the PostgreSQL database.
    • structure.py. Defines all the tables, their columns, and their constraints.
    • locations.py/models.py/services.py etc. Provide add/edit/delete functions for all the things stored in the database, such as locations, sensors, sensor data, models, model data, and users.
    • queries.py. More complex SQL queries and queries used by several files.
    • utils.py. Miscellaneous utilities for things like creating new database sessions. Most importantly, it has two module-level constants, DB_ENGINE and DB_SESSION_MAKER, which are the one-stop shop for all other modules whenever a connection to the database is needed.
  • exc.py. Custom exception classes used by various modules.
  • auth.py. Everything related to authenticating users. Authentication uses JSON Web Tokens (JWT). See below for how this affects using the API.
  • config.py. Various modes in which the database can be run, e.g. for development or debugging purposes.

API documentation

Documentation listing all the API endpoints and their payloads, return values, etc., is automatically generated by FastAPI. If you are developing/running locally, and your backend is running at http://localhost:5000, you can find these docs at http://localhost:5000/docs. Correspondingly for an Azure deployment it will be something like https://<your-azure-app-name>-backend.azurewebsites.net/docs.

Authentication

To be able to access any of the API endpoints you need an authentication token. You can get one from the /auth/login endpoint using a username and a password. Once you've obtained a token, you need to add it to the header of any other API calls you make, as a bearer token. So if /auth/login returned

{
    "access_token": "abc"
    "refresh_token": "xyz"
}

then you would call the other end points with the following in the header of the request:

Authorization: Bearer abc

If your access token expires, you can use the refresh token to get a new one for a while longer, by calling the /auth/refresh endpoint. This requires setting your header as above, but using the refresh token (xyz) rather than the access token (abc).
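
As a rough sketch, assuming the Python requests library and a backend running at http://localhost:5000, logging in and refreshing might look like the following. The exact HTTP methods and payload keys for /auth/login are assumptions here; the auto-generated API docs (see above) give the authoritative formats.

import requests

BACKEND_URL = "http://localhost:5000"  # assumption: a local development backend

# Log in to get the access and refresh tokens. The payload keys are assumed.
tokens = requests.post(
    f"{BACKEND_URL}/auth/login",
    json={"email": "<email address>", "password": "<password>"},
).json()

# Use the access token as a bearer token on any other endpoint.
headers = {"Authorization": f"Bearer {tokens['access_token']}"}

# When the access token expires, get a new one from /auth/refresh, using the
# refresh token in the Authorization header instead.
tokens = requests.post(
    f"{BACKEND_URL}/auth/refresh",
    headers={"Authorization": f"Bearer {tokens['refresh_token']}"},
).json()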

Locations

Locations can be defined using any combination of floating point, integer, or string variables. These variables, known as LocationIdentifiers, must be inserted into the database before an actual Location can be entered. The set of LocationIdentifiers that is sufficient to define a Location is called a LocationSchema. A Location will therefore have a LocationSchema, and one LocationXYZValue for each LocationIdentifier within that schema (where XYZ can be Float, Integer or String).

An example clarifies: Say you're making a digital twin of a warehouse. All locations in the warehouse are identified by which room they are in, and which rack and shelf in that room we are talking about. Room number, rack code, and shelf number would then be LocationIdentifiers, and the LocationSchema would simply say that to specify a location, these three variables need to be given. Room and shelf number might be integers, and rack code could be a string. Other examples of location schemas could be xyz coordinates, or longitude-latitude-altitude coordinates.
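
To make the warehouse example concrete, the data involved might conceptually look like the sketch below. This is illustrative only: the field names are assumptions, and the exact payloads the backend expects are documented in the auto-generated API docs.

# Conceptual sketch of the warehouse example; field names are assumptions.
location_identifiers = [
    {"name": "room_number", "datatype": "integer"},
    {"name": "rack_code", "datatype": "string"},
    {"name": "shelf_number", "datatype": "integer"},
]

# The LocationSchema just names the set of identifiers that pin down a location.
warehouse_schema = {
    "name": "room-rack-shelf",
    "identifiers": ["room_number", "rack_code", "shelf_number"],
}

# A concrete Location provides one value per identifier in its schema; each
# value is stored in the LocationIntegerValue or LocationStringValue table.
a_location = {
    "schema": "room-rack-shelf",
    "room_number": 12,
    "rack_code": "B",
    "shelf_number": 3,
}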

Sensors

The sensor data model is as follows. Every Sensor has a SensorType which in turn specifies the variable(s) it can measure - these are known as SensorMeasures. Each SensorMeasure specifies its datatype (float, int, string, or bool), and these are used to define the type of the corresponding SensorXYZReadings. A Sensor may also have a SensorLocation, which specifies a Location as defined above, and a time window (possibly open-ended) when the sensor was at that location.

For instance, weather station could be a SensorType, and it might record readings for three different SensorMeasures: temperature, humidity, and is-it-raining-right-now. The first two would be numbers, and the last one would be a boolean. These would go in the tables SensorFloatReading and SensorBooleanReading. You could then have two instances of this sensor type, i.e. two weather stations, associated with different locations.
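
As a sketch, the weather station example could be set up with payloads roughly like these. The endpoint names are the real ones used later in these docs; the field names in the sensor-type and sensor payloads are assumptions, so check the auto-generated API docs for the exact formats.

# Illustrative payloads only; field names are assumptions.
sensor_type = (
    "/sensor/insert-sensor-type",
    {
        "name": "weather-station",
        "measures": [
            {"name": "temperature", "units": "degrees Celsius", "datatype": "float"},
            {"name": "humidity", "units": "percent", "datatype": "float"},
            {"name": "is-raining", "units": "", "datatype": "boolean"},
        ],
    },
)

# One concrete sensor of that type; its readings then go in SensorFloatReading
# and SensorBooleanReading, keyed by measure name.
sensor = (
    "/sensor/insert-sensor",
    {"type_name": "weather-station", "unique_identifier": "weather-station-001"},
)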

Models

Model objects come associated with ModelMeasures, that are exactly analogous to SensorMeasures, i.e. they specify different quantities a model may output. The model outputs, which can again be floats, ints, strings, or booleans, are always associated with a ModelRun, which comes with a timestamp for when this run of the model happened. Each run is also associated with a ModelScenario, which is DTBase's way of keeping track of model parameters or other variations in how models can be run.

For instance, a model might be a weather forecast model, and a scenario might be which location we are forecasting weather for and which parameters the weather simulation uses. Running the model once would result in a single ModelRun, which could be associated with values for multiple ModelMeasures, such as forecasted temperature and forecasted humidity.

Users

Current user management is quite basic: Users can be created, deleted, and listed, and their passwords can be changed. Users are identified by an email address (there is no other notion of username), although currently we never actually send email to that address.

Currently all users have the same rights, including the right to create and delete users. This is simply because we haven't had time to implement a separation between admin users and regular users yet (issue).

The Default User

When starting a new deployment of a DTBase-based digital twin one encounters a chicken-and-egg dilemma: To be able to create users with the backend, one needs to first have a registered user (the /user/create-user endpoint requires a valid JWT token like every other endpoint). The way out of this is the default user. If one sets the environment variable DT_DEFAULT_USER_PASS and starts the backend, at startup time a user with the "email" default_user@localhost is created, with the given password. One can use this to log in and create some proper users. One should then unset the DT_DEFAULT_USER_PASS environment variable and restart the backend. This causes the default user to be deleted.
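
As a sketch of the bootstrap flow, assuming the requests library and a backend at http://localhost:5000 (the payload keys for /auth/login and /user/create-user are assumptions; see the API docs for the exact formats):

import requests

BACKEND_URL = "http://localhost:5000"  # assumption: a local development backend

# Log in as the default user, which exists only while DT_DEFAULT_USER_PASS is set.
tokens = requests.post(
    f"{BACKEND_URL}/auth/login",
    json={"email": "default_user@localhost", "password": "<value of DT_DEFAULT_USER_PASS>"},
).json()

# Use the default user's token to create a proper user.
requests.post(
    f"{BACKEND_URL}/user/create-user",
    json={"email": "<new user's email>", "password": "<new user's password>"},
    headers={"Authorization": f"Bearer {tokens['access_token']}"},
)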

Often it's handy to keep the default user around for development purposes, but one should be careful not to leave it on a live deployment with sensitive data stored.

DTBase Frontend

Folder: dtbase/frontend

The DTBase frontend is a web server, implemented using Flask, that serves a graphical user interface for the digital twin.

Anything the frontend does to manipulate the twin, it does by calling various backend endpoints. It is essentially a graphical interface for the backend, with some basic plotting.

The user interfaces of digital twins tend to be largely bespoke. Hence the DTBase frontend is quite basic, and we expect users to develop it further to serve their own needs, for instance by implementing data dashboards and visualisations. In developing DTBase, our philosophy has been to lead with the backend and to consider frontend features largely nice-to-have add-ons. Hence some parts of the frontend lag behind the backend in capabilities, and there are things you can currently only do through the backend.

The frontend is a mixture of Python (Flask), Typescript, CSS, and Jinja HTML templates. In this it differs from the rest of the codebase, which is pure Python. Consequently at the root of dtbase/frontend are configuration files for webpack, eslint, and prettier, plus a package.json that specifies our Typescript dependencies.

Some notable Typescript dependencies are

  • Bootstrap 5 for a grid layout of the pages.
  • Datatables for styling our many tables.
  • Chart.js for plotting.

Code structure

  • run.sh. This is the shell script that starts the web server. It does two things of note:
    • Runs the Typescript compiler, producing Javascript that can be included in the pages sent to the browser, using webpack.
    • Runs the Flask app. If the environment variable FLASK_DEBUG is set to 1 both of these are run in "watch mode", where they watch for updates to local files and recompile/restart as necessary. This is handy when developing.
  • frontend_app.py. A simple script that calls create_app from app/__init__.py.
  • app/__init__.py. Module for setting up the Flask app with all its settings and such.
  • app/home, app/locations, app/models, app/sensors, etc. Each of these has the code for the pages in that subsection of the site. They all have a routes.py that defines the Flask routes, and a templates folder for the Jinja templates.
  • app/base. Various resources shared by the different sections of the site. Most notably
    • templates. All the base Jinja templates for things like the sidebar, the login page, and error pages.
    • static. All the CSS and Typescript. See below for more details.
    • static/node_modules. A symbolic link to dtbase/frontend/node_modules. This way all the Typescript dependencies are under the static folder and thus easy to load and import from.
  • exc.py. Custom exception types used by the frontend.
  • config.py. Configuration options like debug mode and dev mode for the frontend.
  • user.py. Everything related to authentication on the frontend. We use the flask-login plugin to handle logins. Logging in simply means getting a JWT token from the backend, and a user is logged in as long as there's a valid JWT token associated with their session.
  • utils.py. Miscellaneous Python utilities.

Our Approach to Typescript and Javascript

The vast majority of client-side code is written in Typescript, and it should be in the /app/base/static/typescript folder as .ts files. Webpack, which gets run by run.sh when starting the frontend webserver, sorts out dependencies and transpiles the Typescript into .js files in the /app/base/static/javascript folder. There will be one .js file for every .ts file. The Jinja HTML templates can then include these transpiled Javascript files using <script> tags.

The only pure, non-typed Javascript one should ever write should be minimal amounts in <script> tags in the Jinja templates. The reason we do this at all is that Flask passes some data to the Jinja templates which needs to be passed on to functions we've written in Typescript. The typical usage pattern looks something like this. In the HTML template we have

{% block javascripts %}
{{ super() }}
<script src="{{ url_for('static', filename='javascript/sensor_list_table.js') }}"></script>
<script>
  const sensors_for_each_type = {{ sensors_for_each_type | tojson | safe }};
  window.addEventListener("DOMContentLoaded", (event) => {
    window.updateTable(sensors_for_each_type);
  });
</script>
{% endblock javascripts %}

The first <script> tag includes a file transpiled from sensor_list_table.ts. The second <script> tag includes a small snippet that

  1. Reads the data passed to us by Flask into a variable sensors_for_each_type.
  2. On page load calls the function window.updateTable defined in sensor_list_table.ts with sensors_for_each_type.

sensor_list_table.ts looks like this:

import { initialiseDataTable } from "./datatables"

export function updateTable(sensors_for_each_type) {
  // blahblah, bunch of things happen here
}

window.updateTable = updateTable

It imports from another module we've written using the ES6 import syntax and defines updateTable. It then effectively "exports" this function to be visible in the global scope, and thus usable in the above snippet in the Jinja template, by assigning it to window.updateTable. (It also exports it in the ES6 exports sense, so that other Typescript modules can use it.)

When reading the .ts files, keep in mind that anything assigned to a field of window is meant to be used in the templates. All imports/exports between the Typescript files should use the ES6 syntax. The Typescript files should never assume the presence of any global variables.

DTBase Services

Folder: dtbase/services

What Is a Service?

Services are DTBase's concept for things like scripts that do data ingress or run models: anything that one might run on demand or on a schedule and that interacts with the backend API programmatically.

The way DTBase views a service is a bit abstract but very simple: a service is a URL that DTBase may send an HTTP request to, possibly with a payload. Sending such a request means requesting the service to run. A URL might for instance be https://myownserver.me/please-run-my-best-model, and the payload might be the model parameters. The service is then expected to send requests back to the DTBase backend for things like writing model outputs or ingressed data.
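
To illustrate the contract, a standalone service could be as small as the sketch below: a web app with one endpoint that DTBase calls, which then writes its results back through the backend API. This is conceptual only; in practice you would normally subclass the helper classes described below, which handle authentication and request formatting for you.

# Conceptual sketch of a standalone service, here using FastAPI.
from fastapi import FastAPI

app = FastAPI()

@app.post("/please-run-my-best-model")
def please_run_my_best_model(payload: dict) -> dict:
    # ... run the model with the parameters in `payload` ...
    # ... then POST the outputs to the DTBase backend endpoints, e.g.
    #     /model/insert-model-run, using a valid bearer token ...
    return {"status": "run started"}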

Services are quite a new construct in DTBase (as of 2024-03-08), and remain work in progress. They can currently be created and run through both the backend and the frontend; see those sections for details on how this is implemented. There are three important features we are planning to build but haven't gotten to:

  • Scheduling services. Currently the only option is to "run now". One should be able to rather say "run every Wednesday morning using these parameters, and every Saturday morning using these other parameters".
  • Allow services to return non-trivial values. Currently any data that needs to be written into the database has to be written by the service calling the DTBase backend endpoints. This will always be a common way for a service to work, because services might take a while to run, but we would also like to support the service returning an HTTP response with the data. Currently we log the HTTP response, but don't do anything else with it.
  • Make it easier to send data from the database to a service. Currently all you can send to a service is a fixed set of parameters in the JSON payload. If the service wants any data from the backend, it needs to call an endpoint such as /sensor/sensor-readings. We would like instead to be able to configure the calling of the service so that this data could be part of the request payload.

How To Write a Service?

To write your own service, you need to write a piece of code that takes in an HTTP request and, in response, does its thing (ingress, modelling, whatever it is) and sends the output to the relevant DTBase backend endpoints. To make this easier, in base.py we have implemented three classes for you to subclass: BaseService, BaseModel, and BaseIngress. The latter two are subclasses of BaseService, and all three are in fact functionally equivalent; the only difference is in the documentation, where one is geared more towards implementing data ingress and the other towards implementing a model. They make this process easier by handling the correct formatting of requests to the DTBase backend, most importantly the authentication tokens.

For more details on how to use these classes, refer to the detailed docstrings of the classes and their methods. You can also see examples of how to use these classes in the models and ingress sections.

DTBase Models

Folder: dtbase/models

This folder hosts two general purpose timeseries forecasting models, ARIMA and HODMD. They work both as useful additions to many digital twins and as examples for how to implement a model that interfaces with DTBase.

The way to implement your own model is to use the BaseModel class as described below. We recommend also reading the services section, since BaseModel is just a subclass of BaseService, described there.

BaseModel

The BaseModel class has all the tools for interacting with the DTBase backend. It inherits from BaseService. For a custom model, the user should create their own custom model class inheriting from BaseModel. For example:

class CustomModel(BaseModel):
    """
    Custom model inheriting from the BaseModel class.
    """

Method: get_service_data

The user then needs to write a get_service_data method in CustomModel. This method should run the model and return the outputs in a particular format, which BaseModel will then submit to the DTBase backend. This might look something like

    def get_service_data(self, model_parameters):
        model = get_model()
        some_data = get_data()
        predictions = model.predict(model_parameters, some_data)
        return predictions

The structure of predictions, i.e. the return value of get_service_data, should be in the following format:

[(endpoint name, payload), (endpoint name, payload), etc.]

Here endpoint name is a string that is the name of a DTBase API endpoint, and payload is a dictionary or a list that is the payload that that endpoint expects. For models, the endpoints that likely need to be returned are:

  • /model/insert-model
  • /model/insert-model-scenario
  • /model/insert-model-measure
  • /model/insert-model-run

Even though calls to endpoints like insert-model and insert-model-measure only need to be run once, it is safe to make every run of the model call those endpoints. If the model/measure/scenario already exists in the database the backend will just ignore the attempt to write a duplicate, and return a 409 status code, which BaseModel handles for you.
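
Putting this together, a model's get_service_data might return something shaped like the sketch below. The endpoint names are the real ones listed above; the payload contents and the helper functions are placeholders, so consult the auto-generated API docs for the exact payload formats.

    def get_service_data(self, model_parameters):
        # run_my_model, build_scenario_payload and build_run_payload are
        # hypothetical helpers; the payload dicts shown are illustrative only.
        predictions = run_my_model(model_parameters)
        return [
            ("/model/insert-model", {"name": "my-model"}),
            ("/model/insert-model-scenario", build_scenario_payload(model_parameters)),
            ("/model/insert-model-measure", {"name": "forecast temperature",
                                             "units": "degrees Celsius",
                                             "datatype": "float"}),
            ("/model/insert-model-run", build_run_payload(predictions)),
        ]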

Calling the Model

The model can then be called like this:

cm = CustomModel()
cm(
    model_parameters=my_favourite_parameters,
    [email protected],
    dt_user_password="this is a very bad password",
)

The dt_user_email and dt_user_password arguments are user credentials that can be used to log into the DTBase backend. Any other keyword arguments, like model_parameters in this case, are passed on to the get_service_data function. Note that these have to be passed as keyword arguments; positional ones won't work.

The URL for the DTBase backend with which BaseModel communicates is set by an environment variable called DT_BACKEND_URL. So on Linux/Mac you would call your model script as something like

DT_BACKEND_URL="http://myownserver.runningdtbase.com" python my_very_own_model_running_script.py

DTBase Ingress

Folder: dtbase/ingress

This section details how to write your own data ingress in DTBase using OpenWeatherMap as an example.

BaseIngress

The BaseIngress class has all the general purpose tools for interacting with the backend. It inherits from BaseService. For a custom data ingress, the user should create their own custom data ingress class inheriting from BaseIngress. For example:

class CustomDataIngress(BaseIngress):
    """
    Custom ingress class inheriting from the BaseIngress class.
    """

Method: get_service_data

The user then needs to write a get_service_data method in CustomDataIngress. This method should extract data from a source (an online API, a file on disk, or anywhere else) and return the data:

    def get_service_data(self):
        # Assumes `import requests` at the top of the module; process_data is a
        # placeholder for whatever processing turns the response into payloads.
        api_url = 'example/online/api/endpoint'
        data = requests.get(api_url)
        return_value = process_data(data)
        return return_value

The structure of the return value should be as follows:

[(endpoint name, payload), (endpoint name, payload), etc.]

Here each endpoint name is a string for the name of a DTBase backend endpoint. Each payload should be in the specific format required by that endpoint. For more details about the backend endpoints see the backend section.

For example, if we would like to insert two different types of sensor readings, then the output of get_service_data should look something like this:

[
    (
        '/sensor/insert-sensor-readings',
        {
            "measure": {
                "name": <name of the first sensor measure>,
                "units": <units for the sensor measure>
            },
            "unique_identifier": <sensor unique identifier>,
            "readings": <list of readings>,
            "timestamps": <list of timestamps>
        },
    ),
    (
        '/sensor/insert-sensor-readings',
        {
            "measure": {
                "name": <name of the second sensor measure>,
                "units": <units for the sensor measure>
            },
            "unique_identifier": <sensor unique identifier>,
            "readings": <list of readings>,
            "timestamps": <list of timestamps>
        },
    ),
]

Calling the Ingress Class

The ingress class can then be called like this:

ingresser = CustomDataIngress()
ingresser(
    [email protected],
    dt_user_password="this is a very bad password",
)

The user credentials can be used to log in to the DTBase backend. Any extra keyword arguments passed when calling ingresser are passed onwards to get_service_data.

The URL for the DTBase backend with which BaseIngress communicates is set by an environment variable called DT_BACKEND_URL. So on Linux/Mac you would call your ingress script as something like

DT_BACKEND_URL="http://myownserver.runningdtbase.com" python my_very_own_ingress_script.py

Behind the scenes, calling ingresser

  1. Logs into the backend
  2. Runs the get_service_data method to extract data from a source
  3. Loops through the return value of get_service_data and posts it to the backend, as sketched below.
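
Conceptually (ignoring error handling and token refresh, which BaseService takes care of) the flow is roughly equivalent to this sketch:

# Rough sketch only; the real logic lives in BaseService. Assumes `requests`,
# and that backend_url, dt_user_email, dt_user_password and kwargs are defined.
tokens = requests.post(
    f"{backend_url}/auth/login",
    json={"email": dt_user_email, "password": dt_user_password},  # assumed keys
).json()
headers = {"Authorization": f"Bearer {tokens['access_token']}"}

for endpoint, payload in ingresser.get_service_data(**kwargs):
    requests.post(f"{backend_url}{endpoint}", json=payload, headers=headers)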

OpenWeatherMap Example

This section will now go through the ingress_weather example in detail.

The goal of the weather ingress is to extract data from the OpenWeatherMap API and enter it into our database via the backend. There are two different APIs depending on whether the user wants historical data or forecasts.

1. Define Payloads

Three constants are defined at the top of ingress_weather.py:

  • SENSOR_TYPE: Define the sensor type as detailed by the /sensor/insert-sensor-type API endpoint.
  • SENSOR_OPENWEATHERMAPHISTORICAL: Define the sensor as detailed by the /sensor/insert-sensor API endpoint. This sensor is for the historical data API.
  • SENSOR_OPENWEATHERMAPFORECAST: Same as previous sensor but for forecasting.

These constants are dictionaries and define the sensor type and sensor payloads. The payload for entering sensor-readings is built in the get_service_data method.

2. OpenWeatherDataIngress

We then write a custom class that inherits from BaseIngress. There are a number of _* methods that are used to handle different combinations of start and end dates given by the user. A lot of this complexity comes from there being two different APIs for historical and forecast data.

The important method is get_service_data. This method takes in dt_from and dt_to arguments to define when the user wants to extract information from the API. There is then some specific preprocessing to get the exact data we want from the API.

Finally, we return data in this format:

sensor_type_output = [("/sensor/insert-sensor-type", SENSOR_TYPE)]

sensor_output = [("/sensor/insert-sensor", sensor_payload)]

sensor_readings_output = [
    ("/sensor/insert-sensor-readings", payload) for payload in measure_payloads
]

return sensor_type_output + sensor_output + sensor_readings_output

Note that the get_service_data method must return a list of tuples structured as (endpoint name, payload) for the ingress method to integrate with the rest of DTBase.

3. Uploading data to the database

After writing your own custom get_service_data method, you can call the ingress class to do the ingress. For this to work the backend must be running; please check the developer docs for how to run the backend locally.

The ingress is run simply by calling

weather_ingress = OpenWeatherDataIngress()
weather_ingress(
    [email protected],
    dt_user_password="this is a very bad password",
    dt_from=dt_from,
    dt_to=dt_to,
)

Under the hood the class finds and runs the get_service_data method, and then calls the backend API to upload the data to the database. It handles authentication and error handling for you. The call accepts as keyword arguments any arguments required by get_service_data.

Note: The ingress uses environment variables for API keys and authentication, so ensure you have the correct variables set. For instance, for the OpenWeatherDataIngress the environment variables you'll need to set are DT_OPENWEATHERMAP_APIKEY, plus DT_OPENWEATHERMAP_LAT and DT_OPENWEATHERMAP_LONG for the latitude and longitude of the weather location. And of course DT_BACKEND_URL to specify where the backend is to be found (unless it's localhost, which is the default value).

The example_weather_ingress function shows how to use this code to ingress weather data.

DTBase Functions

Folder: dtbase/functions

DTBase treats a service as any callable API endpoint that sends data to the backend when called. These services do not need to be hosted as part of the same deployment as the database, backend, and frontend of DTBase. However, when implementing your own services, it is often handy to be able to host them as part of the same deployment. For that, we use Azure Functions, Azure's concept for serverless compute that runs on demand and releases resources once it's done running. We recommend reading a bit about Azure Functions before trying to understand this code in detail.

This folder holds all the code necessary for running various services as Azure functions. The code here should be minimal in functionality, and should merely implement the necessary glue bits to have an Azure function e.g. run a model or do ingress. The actual model or ingress code should be in their respective folders.
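
As a flavour of what such glue code can look like, here is a minimal sketch of an HTTP-triggered function using the Azure Functions Python v1 programming model (a folder containing function.json and __init__.py). It is not the actual DTBase code, and the import path is an assumption:

# __init__.py of a hypothetical function folder; not the actual DTBase code.
import azure.functions as func

from dtbase.ingress.ingress_weather import OpenWeatherDataIngress  # assumed path

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Pass the request payload (credentials, dt_from/dt_to, etc.) straight on
    # to the ingress class, which does the actual work.
    params = req.get_json()
    OpenWeatherDataIngress()(**params)
    return func.HttpResponse("Ingress run finished.", status_code=200)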

To test the functions locally, navigate to dtbase/functions and run

func start

You'll need the Azure Functions Core Tools for that. With Homebrew you can install them with

brew tap azure/functions
brew install azure-functions-core-tools@4

Examples

Currently there are three example functions in this folder: ARIMA, HODMD, and weather ingress. You can see how their __init__.py files handle interfacing with Azure and base your code on their examples. Any folder under dtbase/functions with a function.json file will be recognised as an Azure function and included in the Docker container built by the GitHub Action.

DTBase Core

Folder: dtbase/core

This folder holds all code that is used by more than one part of the DTBase package. The main parts of DTBase, such as frontend, backend, and services, should never import anything from one another. If they share any code, that code should be in core.

Currently the only things here are:

  • exc.py for some custom exception types we raise.
  • utils.py for miscellaneous utils, mostly for making calling the backend API endpoints smoother.
  • constants.py which reads in a large number of environment variables that are considered package-level constants. These include things like the URL for the backend and the password of the default user.

DTBase Infrastructure

Folder: infrastructure

DTBase has several semi-independent parts that talk to each other, as explained in the sections above. While you can run all of them locally on a single machine, for a more "production" style deployment you may want to host them on a cloud platform. To deploy DTBase on the cloud, you could create the various resources manually: a PostgreSQL database, a couple of web servers for the frontend and the backend, etc. However, for convenience, maintainability, and reproducibility you should probably rather define the resources you need using an infrastructure-as-code (IaC) tool. That means writing a piece of code that defines the cloud resources you need, various configuration options for them, and how they are connected, and letting the IaC tool create these resources for you. If you want to change a configuration option, like increasing your database disk allocation or adding a firewall rule, you can do this in your configuration file and tell the IaC tool to update the cloud resources correspondingly.

This folder has our IaC configuration using a tool called Pulumi. Pulumi configuration files are written in Python, in this case in a file called __main__.py. Global Pulumi configuration is in Pulumi.yaml. For every deployed instance of DTBase managed by Pulumi, which Pulumi calls a stack, there is also a file called Pulumi.name-of-stack-goes-here.yaml, which holds configuration options for that stack. You may have multiple stacks, for instance for development and production deployments.
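
To give a flavour, a Pulumi program is ordinary Python that reads per-stack configuration and declares resources. The sketch below is not DTBase's actual __main__.py, just a minimal example assuming the pulumi and pulumi-azure-native packages:

import pulumi
from pulumi_azure_native import resources

# Per-stack configuration, set with `pulumi config set`.
config = pulumi.Config()
location = config.get("location") or "uksouth"

# Declaring a resource; `pulumi up` creates or updates it on Azure.
resource_group = resources.ResourceGroup("dtbase-example-rg", location=location)

# Exported values can be read back with `pulumi stack output`.
pulumi.export("resource_group_name", resource_group.name)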

We would like to support multiple cloud platforms, but for historical reasons we currently only support Azure. Hopefully converting __main__.py to use e.g. AWS or GCloud instead should not be too hard.

Set-up Steps

To get a Pulumi stack of DTBase up and running, these are the steps to follow.

  1. Install Pulumi if you haven't yet. See https://www.pulumi.com/
  2. Create an Azure storage account. This storage will only be used to hold Pulumi backend state data. If you have multiple stacks they can all use the same storage account.
  3. Set AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY in .secrets/dtenv.sh. AZURE_STORAGE_ACCOUNT is the name of the storage account you created; AZURE_STORAGE_KEY can be found in the Access Keys of that account. You also need to add the line export AZURE_KEYVAULT_AUTH_VIA_CLI="true" to dtenv.sh if it's not there already.
  4. Create a blob storage container within the storage account.
  5. In a terminal run source .secrets/dtenv.sh and then pulumi login azblob://<NAME OF STORAGE CONTAINER>. Note that this affects your use of Pulumi system-wide; you'll have to log in to a different backend to manage different projects.
  6. Create an Azure Key Vault. This will hold an encryption key for Pulumi secrets. This, too, can be used by multiple Pulumi stacks.
  7. In the key vault, create an RSA key (yes, it has to be RSA rather than ECDSA).
  8. Give yourself Encrypt and Decrypt permissions on the key vault.

You are now ready to create a new stack. If you ever need to create a second stack, you will not need to repeat the above steps, only the ones below.

  1. Create a new Pulumi stack with pulumi stack init --secrets-provider="azurekeyvault://<NAME OF KEY VAULT>.vault.azure.net/keys/<NAME OF KEY>"
  2. Make sure you're in a Python virtual environment with Pulumi SDK installed (pip install .[infrastructure] should cover your needs).
  3. Set all the necessary configurations with pulumi config set and pulumi config set --secret. You'll find these in __main__.py, or you can keep adding them until pulumi up stops complaining. Do make sure to use --secret for any configuration variables whose values you are not willing to make public, such as passwords. You can make all of them --secret if you want to play it safe; there's no harm in that. These values are written to Pulumi.name-of-stack-goes-here.yaml, but if --secret is used they are encrypted with the key from your vault, and are unreadable gibberish to outsiders.
  4. Run pulumi up to stand up your new Pulumi stack.
  5. Optionally, you can set up continuous deployment for the webservers and Azure Functionapp. To do this for the frontend, select your frontend WebApp in the Azure Portal, navigate to Deployment Center, and copy the generated Webhook URL; then, head to Docker Hub, select the container used by the WebApp, and create a new webhook using the copied URL. You need to do this for each of the three WebApps: The frontend, the backend, and the function app. This makes it such that every time a new version of the container is pushed to Docker Hub (by e.g. the GitHub Action) the web servers automatically pull and run the new version.