Skip to content

GHC Cheat Sheet

Beeke-Marie Nelke edited this page Sep 24, 2023 · 50 revisions

Legend Set Up

Step 1: Gain access to your Workshop Studio AWS account

  1. Open AWS Workshop Studio at https://catalog.us-east-1.prod.workshops.aws/join?access-code=7a77-0eb03a-4e
  2. Choose your preferred sign-in method. For AWS guided events, choose Email OTP.
image
  1. Enter the code provided by the event organizer into the text box. You will usually find this code on a slide that is being shown, or a paper printout at your table. You can skip this step if the link under #1 already contains the access code.
image
  1. Read and agree to the Terms and Conditions by selecting I agree with the terms and conditions and choose Join Event.
image

Step 2: Access the console of your AWS account

  1. Access the console of your AWS account for the event by clicking the link in the sidebar.
image
  1. Go to the Legend GitHub Website in a new browser tab and click on the "Launch Stack" at the bottom of the page.
image

You will be taken to the Quick create stack page.

Step 3: Setup Legend Workbench

  1. Acknowledge and click Create stack.
image image

This takes a few minutes to complete.

Step 4: Launch Legend Workbench and begin exploration

After the CloudFormation stack completes, you are now ready to launch the Legend Workbench URL.

  1. Select outputs tab and click on Studio Url
image

Step 5: Choose Legend project and create workspace

  1. Choose the LegendStarterModel Legend project by clicking on the drop down.
image
  1. Hit the "+" symbol to create a workspace to start working in Legend. You can give the workspace any name you want.
image image

You are all set!

Data Modeling Guide

Edit the Relational Schema (Follow along)

First, let's create a new table in our relational schema so that we can bring in the new surface temperature dataset.

Build Joins (Follow along)

Now that the table schema for surface temperature is defined, we need to join it with the country dataset. We will add a join for the two tables into the relationship schema.

Create a New Class (Let's do this together)

Let's create a new SurfaceTemperature class.

  1. Make sure that the model package is highlighted/selected.
image
  1. Click on the "+" icon and select Class.
image
  1. Name the new Class SurfaceTemperature. All in one word, camelcase. Click "create" when done.
image
  1. Add 3 attributes to your class by hitting the "+" sign: country, year, temperatureChange. Please see screenshot for datatype and cardinality.
image
  1. Hit F9 or click on the Hammer icon at the bottom right to compile your new code.
image

Map the New Class (Let's do this together)

Now, we will map our new _SurfaceTemperature _ Class to our data.

  1. Navigate to the SustainabilityStatisticsMapping in the mapping package.
image
  1. Drag and drop the SurfaceTemperature Class in the Mapping Editor (where also the other Classes are listed). Hit "Create" in the dialog that pops up.
image
  1. Navigate to the store package, and select the SustainabilityStatisticsDatabase. Drag the store into the source Editor on the right-hand side. Select the surface_temperature_change table from the drop down that appears.
image
  1. Drag and drop the columns of the surface temperature datasets (on the right) to the respective attributes (on the left). See screenshot for details.
image
  1. Hit F9 or click on the Hammer icon at the bottom right to compile your new code.
image

Build the relationship between our 2 datasets

  1. Use the Country Class to build the relationship between ProductionBasedCarbonDioxideEmissions Class and the SurfaceTemperature Class. Navigate to the Country Class and add the surfaceTemperature attribute. See screenshot for details.
image
  1. Now update your Country Class Mapping. Navigate to the SustainabilityStatisticsMapping, select the Country Class and map the country_surface_temperature_join to the surfaceTemperature attribute.
image

Visualize in a Diagram (Let's do this together)

  1. Navigate to the "EmissionsModelDiagram" under the model package.
  2. You can now drag and drop the SurfaceTemperature class onto the diagram. You can see that the relationship to the Country class was added for you based on the association that we modeled earlier. Tip: You can move the class box and the relationship lines around to make the diagram pretty.
image

Mini User Guide for Data Queries

Launch queries

To create a query, right-click on any class.

image

Add attributes

Add attributes you want to query by dragging & dropping them into the "fetch structure panel".

image

Apply operations

Apply operations on your data, by clicking on the "sigma" symbol. Options include:

  • Sum: Adds up all values
  • Average: Returns the average
  • Count: Counts values
  • Distinct count: Returns distinct count value
image

Add filter

Add a filter to your query by dragging & dropping an attribute to the "filter" panel. Click on the drop down to see available filter options. Alternatively, you can add a filter by right-clicking on the table values. Note: the filter specs will not be persisted in the query.

image image

Sort query results

To sort query results, navigate to the "result set modifier" and add filter conditions. Alternatively, you can sort the results directly in the table by clicking on the column headers. Note: the filter specs will not be persisted in the query.

image image image

Remove duplicates

To remove duplicate values to create a unique set of values, click on "result set modifier" and check the "Eliminate duplicate rows" box.

image

Restrict number of rows

To specify in your query how many rows you'd like to have returned go to the "result set modifier" and enter the desired value under "Limit results".

image

Query Results

Query 1: Get the number of distinct countries that have production based carbon dioxide emission statistics. image

Query 2: Get the list of distinct countries that have production-based carbon dioxide emission statistics. image image

Get the top 10 countries with highest carbon dioxide emissions in 2021. image image

Query 3: Get top 10 countries by average emissions across all years available in the data set. image image

Resources

Data Model

Last Resort in case you get stuck during the exercises or your Legend instances crashes. Please refer to this document to see the entire data model code (including data queries). You can copy/paste this in your Studio project when hitting F8 to enter text mode and then F9 to compile. Make sure that the any other code is deleted before doing this.

Legend Website

https://legend.finos.org/

Deck

GraceHopper_Final.pdf