Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

LekiaAnonim · 2025-03-03T21:04:10Z

Motivation or Problem

Disparities were observed in the Ea, b, and A values for gas and surface reactions when comparing Cantera YAML files generated directly versus those converted from Chemkin files using the cantera.ck2yaml command. A script and corresponding tests have been written to facilitate this comparison.

Description of Changes

Added a new test directory containing a script for comparing the two different YAML files.
Created a test file for validating the comparison process.
Added test data to the YAML files located in the yaml_writer_data directory.
- The directly generated YAML file is located in /test/rmgpy/test_data/yaml_writer_data/cantera/.
- The YAML file converted from Chemkin files is located in /test/rmgpy/test_data/yaml_writer_data/chemkin/.

Testing

A test script was written to compare species names, the number of species, reaction parameters, and the number of reactions in the YAML files generated by both methods.

Reviewer Tips

To review this PR, run the tests in /RMG-Py/test/rmgpy/yaml_writer/test_yaml.py. Note: You may need to adjust the file paths in test_yaml.py to match your local setup.

…lder. So they don't over-write the ones that are being written directly.

This allows the Cantera yaml writer to live alongside it

Nora did most of the work. Richard did some cleanup. Nick fixed species_to_dict function to use correct parameter as rxn species. Co-authored-by: Nora Khalil <[email protected]> Co-authored-by: Richard West <[email protected]> Co-authored-by: Nicholas Tietje <[email protected]>

Although Atom is a subclass of Vertex, it adds an element attribute. This leads to more efficient Cython code (and quietens a warning in vscode).

Basing it on the Chemkin version. For now only evaluate it once, and include everything.

…ml file converted from chemkin files

Copilot

Pull Request Overview

This PR adds a suite of tests and supporting utilities to compare YAML files generated by two different methods (direct Cantera generation versus Chemkin conversion) to ensure consistency in reaction and species data.

Introduces new tests under test/rmgpy/yaml_writer/test_yaml.py for validating species counts, species names, phase-specific species counts, and reaction details.
Implements helper classes (YamlAnalyst and CompareYaml) in test/rmgpy/yaml_writer/compare_yaml_outputs.py to load, analyze, and compare the YAML files.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
test/rmgpy/yaml_writer/test_yaml.py	Adds test cases for comparing YAML outputs.
test/rmgpy/yaml_writer/compare_yaml_outputs.py	Introduces classes to load, parse, and compare YAML file content.

Copilot · 2025-05-22T02:29:58Z

test/rmgpy/yaml_writer/compare_yaml_outputs.py

+        for key, values in self.load_yaml_file().items():
+            if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]:
+                reactions_dict[key] = self.load_yaml_file()[key]
+            elif key == 'gas_reactions':
+                reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions']
+            elif key == 'surface_reactions':
+                reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions']


Repeatedly calling self.load_yaml_file() within the loop may lead to unnecessary file reads. Consider loading the YAML content once and caching it to improve performance.

Suggested change

for key, values in self.load_yaml_file().items():

if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]:

reactions_dict[key] = self.load_yaml_file()[key]

elif key == 'gas_reactions':

reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions']

elif key == 'surface_reactions':

reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions']

yaml_content = self.load_yaml_file() # Cache the YAML content

for key, values in yaml_content.items():

if key in [f"{phase['name']}-reactions" for phase in yaml_content['phases']]:

reactions_dict[key] = yaml_content[key]

elif key == 'gas_reactions':

reactions_dict['gas_reactions'] = yaml_content['gas_reactions']

elif key == 'surface_reactions':

reactions_dict['surface_reactions'] = yaml_content['surface_reactions']

Copilot · 2025-05-22T02:29:58Z

test/rmgpy/yaml_writer/compare_yaml_outputs.py

+        phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']]
+        phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']]
+        all_phase_names = set(phase_names1).union(set(phase_names2))
+        count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"], 
+                      'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"]
+                      }


Assuming that the gas and surface phases always appear as the first two elements in the phases list can be brittle. Consider matching phases by name to ensure robustness if the order changes.

Suggested change

phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']]

phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']]

all_phase_names = set(phase_names1).union(set(phase_names2))

count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"],

'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"]

}

phase_names1 = {phase['name']: idx for idx, phase in enumerate(self.yaml1.load_yaml_file()['phases'])}

phase_names2 = {phase['name']: idx for idx, phase in enumerate(self.yaml2.load_yaml_file()['phases'])}

count_diff = {}

for phase_name in ['gas', 'surface']:

if phase_name in phase_names1 and phase_name in phase_names2:

count_diff[phase_name] = count_per_phase1[f"specie_count_{phase_name}"] - count_per_phase2[f"specie_count_{phase_name}"]

else:

count_diff[phase_name] = None # Handle missing phases gracefully

rwest and others added 6 commits December 3, 2024 13:33

Save cantera files that were made from chemkin files to a separate fo…

d286485

…lder. So they don't over-write the ones that are being written directly.

Rename yml to yaml_rms, because it was specific to RMS

0adcae3

This allows the Cantera yaml writer to live alongside it

Minor: Fix a Cython declaration in a Reaction method.

becf208

Although Atom is a subclass of Vertex, it adds an element attribute. This leads to more efficient Cython code (and quietens a warning in vscode).

Reworking the Elements blocks in cantera yaml writer.

133a100

Basing it on the Chemkin version. For now only evaluate it once, and include everything.

add yaml writer test to compare yaml file generated by RMG and the ya…

649e2ac

…ml file converted from chemkin files

LekiaAnonim changed the title ~~add yaml writer test to compare yaml file generated by RMG and the ya…~~ Add yaml writer test to compare yaml file generated by RMG and the ya… Mar 3, 2025

rwest requested a review from Copilot May 22, 2025 02:29

Copilot AI reviewed May 22, 2025

View reviewed changes

rwest force-pushed the yaml_writer_addition branch from 133a100 to 5e45b13 Compare May 30, 2025 14:33

rwest closed this May 30, 2025

rwest mentioned this pull request May 30, 2025

Yaml writer addition #2800

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

Uh oh!

LekiaAnonim commented Mar 3, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI May 22, 2025

Uh oh!

Copilot AI May 22, 2025

Uh oh!

Uh oh!

-        phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']]
-        phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']]
-        all_phase_names = set(phase_names1).union(set(phase_names2))
-        count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"],
-                      'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"]
-                      }
+        phase_names1 = {phase['name']: idx for idx, phase in enumerate(self.yaml1.load_yaml_file()['phases'])}
+        phase_names2 = {phase['name']: idx for idx, phase in enumerate(self.yaml2.load_yaml_file()['phases'])}
+        count_diff = {}
+        for phase_name in ['gas', 'surface']:
+            if phase_name in phase_names1 and phase_name in phase_names2:
+                count_diff[phase_name] = count_per_phase1[f"specie_count_{phase_name}"] - count_per_phase2[f"specie_count_{phase_name}"]
+            else:
+                count_diff[phase_name] = None  # Handle missing phases gracefully

Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

Uh oh!

Conversation

LekiaAnonim commented Mar 3, 2025

Motivation or Problem

Description of Changes

Testing

Reviewer Tips

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI May 22, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI May 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!