Add generation of SDMX objects from VTL Script #67

Open
javihern98 opened this issue Feb 10, 2025 · 0 comments
Labels
enhancement New feature or request

Overview

As part of the pysdmx integration, the goal is to generate the VTL objects of the SDMX Information Model from a VTL script.

Tasks to implement

These tasks will be performed in two parallel steps:

  • Serialization of an AST object into a VTL script string (to be done in "Add prettier method to API", #70).
  • Generation of pysdmx objects from AST objects. At some point, the code will call the function ast_to_str. The string it returns is what we will store in the "rulesetDefinition/operatorDefinition" or "expression" attributes of the generated pysdmx objects.

API function

Implement a new function called generate_sdmx that takes as input a VTL script (a full Transformation Scheme with Datapoint Rulesets, Hierarchical Rulesets and User Defined Operators, plus a set of Transformations).

Every generated SDMX object that requires them will use the same agency_id and version, both taken from the function arguments.

We need to ensure we can generate a Structures file in any supported format, using pysdmx.io.format.Format. We will write the file only if output_path is not None, similar to other functions.

The function will return a TransformationScheme object, or None.

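from pathlib import Path
from typing import Optional, Union

from pysdmx.io.format import Format
from pysdmx.model.vtl import TransformationScheme

def generate_sdmx(
    script: str,
    agency_id: str,
    version: str = "1.0",
    output_path: Optional[Union[str, Path]] = None,  # the file is written only when this is set
    format: Format = Format.STRUCTURE_SDMX_ML_2_1,  # output format of the Structures file
) -> Optional[TransformationScheme]:
    ...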
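
The "write only if output_path is not None" behaviour could look roughly like the sketch below. It is only an illustration: write_if_requested is a hypothetical helper name, and the already-serialized string stands in for whatever the pysdmx writer produces for the chosen Format.

```python
from pathlib import Path
from typing import Optional, Union


def write_if_requested(
    serialized: str, output_path: Optional[Union[str, Path]] = None
) -> Optional[Path]:
    """Write the serialized structures to output_path, or do nothing when it is None.

    Hypothetical helper: `serialized` stands in for the writer output.
    """
    if output_path is None:
        # No path given: the caller only wants the in-memory objects.
        return None
    path = Path(output_path)
    path.write_text(serialized, encoding="utf-8")
    return path
```

Returning the path (or None) keeps the helper easy to test without touching the rest of generate_sdmx.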

Internal functions

We will need to define a function called ast_to_sdmx that takes as input the Start element of the AST. The purpose of this function is to return the generated pysdmx objects.

def ast_to_sdmx(ast: Start, agency_id: str, version: str = "1.0") -> TransformationScheme:
    # Takes as input the very first element of the AST.
    # Each item in ast.children will be a DatapointRuleset, a HierarchicalRuleset, a UDO or a Transformation.
    ...

Note

Consider writing separate internal methods to generate a Ruleset, a UDO or a Transformation.
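
One possible shape for that dispatch is sketched below. All node classes here (AssignmentNode, UdoNode, DatapointRulesetNode, HierarchicalRulesetNode, StartNode) are hypothetical stand-ins defined only for this example; the real AST classes and their names may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for an AST node; `text` is its VTL source."""
    text: str


class AssignmentNode(Node):
    pass


class UdoNode(Node):
    pass


class DatapointRulesetNode(Node):
    pass


class HierarchicalRulesetNode(Node):
    pass


@dataclass
class StartNode:
    """Hypothetical stand-in for the Start element of the AST."""
    children: List[Node] = field(default_factory=list)


def classify_children(ast: StartNode) -> Dict[str, List[Node]]:
    """Group the top-level children by the kind of object each one will become."""
    groups: Dict[str, List[Node]] = {"rulesets": [], "udos": [], "transformations": []}
    for child in ast.children:
        if isinstance(child, (DatapointRulesetNode, HierarchicalRulesetNode)):
            groups["rulesets"].append(child)
        elif isinstance(child, UdoNode):
            groups["udos"].append(child)
        elif isinstance(child, AssignmentNode):
            groups["transformations"].append(child)
        else:
            raise TypeError(f"Unexpected top-level node: {type(child).__name__}")
    return groups
```

Each group can then be handed to its own helper, which keeps ast_to_sdmx itself short.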

Definition of pysdmx objects

Important

All of the objects below are part of pysdmx.model.vtl. Do not use any of the VTLEngine classes as the generated objects.

Link to definition (see docs on each object): https://github.com/bis-med-it/pysdmx/blob/develop/src/pysdmx/model/vtl.py

  • UserDefinedOperatorScheme (0..1): includes the UserDefinedOperator objects. Do not associate any ruleset_scheme with any UserDefinedOperatorScheme.
  • RulesetScheme (0..1): includes the Ruleset objects. Set Ruleset.rulesetType to indicate whether it is a "datapoint" or a "hierarchical" ruleset.
  • TransformationScheme: includes all of the above objects plus the set of Transformation objects (in the items attribute). For each Transformation, we will set the isPersistent attribute to True if the transformation has a Persistent Assignment (<-). The Transformation result will be the left child of the Assignment. The Transformation expression will be the remaining expression (right child of the Assignment) as a string.
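
As a rough illustration of the Transformation mapping described in the last bullet, the sketch below uses two hypothetical dataclasses (AssignmentSketch and TransformationSketch). They are not the pysdmx or engine classes and only mirror the three fields discussed above.

```python
from dataclasses import dataclass


@dataclass
class AssignmentSketch:
    """Hypothetical stand-in for an AST assignment node."""
    left: str   # name of the result (left child of the assignment)
    op: str     # ":=" for a temporary assignment, "<-" for a persistent one
    right: str  # right child, already rendered as a VTL string


@dataclass
class TransformationSketch:
    """Hypothetical stand-in mirroring the fields above, not the pysdmx class."""
    result: str
    expression: str
    is_persistent: bool


def to_transformation(node: AssignmentSketch) -> TransformationSketch:
    """Map an assignment to the result / expression / persistence triple."""
    return TransformationSketch(
        result=node.left,
        expression=node.right,
        is_persistent=(node.op == "<-"),
    )
```

In the real implementation the right-hand string would come from ast_to_str rather than being stored on the node.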

Important

The AST template used for this method should only take into account the first assignment and the UserDefinedOperator, DatapointRuleset and HierarchicalRuleset definitions. The remaining objects should be added as a string to the expression of each Transformation.

Important

Do not add a semicolon ";" at the end of the expression of each Transformation, to avoid issues with FMR.
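
If the serialized expression does end with a semicolon, a small clean-up step could remove it. The helper name strip_trailing_semicolon is hypothetical and the sketch assumes the expression is already a plain string.

```python
def strip_trailing_semicolon(expression: str) -> str:
    """Remove one trailing ';' (and surrounding whitespace) from an expression string."""
    cleaned = expression.rstrip()
    if cleaned.endswith(";"):
        cleaned = cleaned[:-1].rstrip()
    return cleaned
```

Only the final semicolon is touched, so semicolons inside string literals in the expression are left alone.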

Note

The agency_id and version for each object are part of the method arguments. The id of each object should be its first letter (or first and second letters) followed by a number.

Example: Transformation -> T1, RulesetScheme -> RS1
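
A minimal sketch of such an id generator follows. The prefix table contains only the two examples given above; prefixes for the other object types are not specified in this issue and would have to be agreed on. IdGenerator and PREFIXES are hypothetical names.

```python
from itertools import count
from typing import Dict, Iterator

# Only the two prefixes given in the example above; others are not specified here.
PREFIXES: Dict[str, str] = {"Transformation": "T", "RulesetScheme": "RS"}


class IdGenerator:
    """Hand out ids such as T1, T2, RS1, with one counter per object kind."""

    def __init__(self) -> None:
        self._counters: Dict[str, Iterator[int]] = {}

    def next_id(self, kind: str) -> str:
        prefix = PREFIXES[kind]  # raises KeyError for kinds without a known prefix
        counter = self._counters.setdefault(kind, count(1))
        return f"{prefix}{next(counter)}"


gen = IdGenerator()
print(gen.next_id("Transformation"))  # T1
print(gen.next_id("Transformation"))  # T2
print(gen.next_id("RulesetScheme"))   # RS1
```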

Important

Remaining VTL objects (CustomType, NamePersonalisation, VTLMapping) are not in the scope of this issue.

Example in SDMX 2.1 (pysdmx cannot read it because the parsers are missing, so use it only as a reference):
https://fmr.meaningfuldata.eu/sdmx/v2/structure/structure/MD/all/+/?format=sdmx-2.1&prettyPrint=true
