Skip to content

Add yaml writer test to compare yaml file generated by RMG and the ya… #2770

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

LekiaAnonim
Copy link

Motivation or Problem

Disparities were observed in the Ea, b, and A values for gas and surface reactions when comparing Cantera YAML files generated directly versus those converted from Chemkin files using the cantera.ck2yaml command. A script and corresponding tests have been written to facilitate this comparison.

Description of Changes

  • Added a new test directory containing a script for comparing the two different YAML files.
  • Created a test file for validating the comparison process.
  • Added test data to the YAML files located in the yaml_writer_data directory.
    • The directly generated YAML file is located in /test/rmgpy/test_data/yaml_writer_data/cantera/.
    • The YAML file converted from Chemkin files is located in /test/rmgpy/test_data/yaml_writer_data/chemkin/.

Testing

A test script was written to compare species names, the number of species, reaction parameters, and the number of reactions in the YAML files generated by both methods.

Reviewer Tips

To review this PR, run the tests in /RMG-Py/test/rmgpy/yaml_writer/test_yaml.py. Note: You may need to adjust the file paths in test_yaml.py to match your local setup.

rwest and others added 6 commits December 3, 2024 13:33
…lder.

So they don't over-write the ones that are being written directly.
This allows the Cantera yaml writer to live alongside it
Nora did most of the work.
Richard did some cleanup.
Nick fixed species_to_dict function to use correct parameter as rxn species.

Co-authored-by:  Nora Khalil <[email protected]>
Co-authored-by: Richard West <[email protected]>
Co-authored-by:  Nicholas Tietje <[email protected]>
Although Atom is a subclass of Vertex, it adds an element
attribute. This leads to more efficient Cython code
(and quietens a warning in vscode).
Basing it on the Chemkin version. 
For now only evaluate it once, and include everything.
@LekiaAnonim LekiaAnonim changed the title add yaml writer test to compare yaml file generated by RMG and the ya… Add yaml writer test to compare yaml file generated by RMG and the ya… Mar 3, 2025
@rwest rwest requested a review from Copilot May 22, 2025 02:29
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a suite of tests and supporting utilities to compare YAML files generated by two different methods (direct Cantera generation versus Chemkin conversion) to ensure consistency in reaction and species data.

  • Introduces new tests under test/rmgpy/yaml_writer/test_yaml.py for validating species counts, species names, phase-specific species counts, and reaction details.
  • Implements helper classes (YamlAnalyst and CompareYaml) in test/rmgpy/yaml_writer/compare_yaml_outputs.py to load, analyze, and compare the YAML files.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
test/rmgpy/yaml_writer/test_yaml.py Adds test cases for comparing YAML outputs.
test/rmgpy/yaml_writer/compare_yaml_outputs.py Introduces classes to load, parse, and compare YAML file content.

Comment on lines +32 to +38
for key, values in self.load_yaml_file().items():
if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]:
reactions_dict[key] = self.load_yaml_file()[key]
elif key == 'gas_reactions':
reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions']
elif key == 'surface_reactions':
reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions']
Copy link
Preview

Copilot AI May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeatedly calling self.load_yaml_file() within the loop may lead to unnecessary file reads. Consider loading the YAML content once and caching it to improve performance.

Suggested change
for key, values in self.load_yaml_file().items():
if key in [f"{phase['name']}-reactions" for phase in self.load_yaml_file()['phases']]:
reactions_dict[key] = self.load_yaml_file()[key]
elif key == 'gas_reactions':
reactions_dict['gas_reactions'] = self.load_yaml_file()['gas_reactions']
elif key == 'surface_reactions':
reactions_dict['surface_reactions'] = self.load_yaml_file()['surface_reactions']
yaml_content = self.load_yaml_file() # Cache the YAML content
for key, values in yaml_content.items():
if key in [f"{phase['name']}-reactions" for phase in yaml_content['phases']]:
reactions_dict[key] = yaml_content[key]
elif key == 'gas_reactions':
reactions_dict['gas_reactions'] = yaml_content['gas_reactions']
elif key == 'surface_reactions':
reactions_dict['surface_reactions'] = yaml_content['surface_reactions']

Copilot uses AI. Check for mistakes.

Comment on lines +93 to +98
phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']]
phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']]
all_phase_names = set(phase_names1).union(set(phase_names2))
count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"],
'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"]
}
Copy link
Preview

Copilot AI May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that the gas and surface phases always appear as the first two elements in the phases list can be brittle. Consider matching phases by name to ensure robustness if the order changes.

Suggested change
phase_names1 = [phase['name'] for phase in self.yaml1.load_yaml_file()['phases']]
phase_names2 = [phase['name'] for phase in self.yaml2.load_yaml_file()['phases']]
all_phase_names = set(phase_names1).union(set(phase_names2))
count_diff = {'gas': count_per_phase1[f"specie_count_{phase_names1[0]}"] - count_per_phase2[f"specie_count_{phase_names2[0]}"],
'surface': count_per_phase1[f"specie_count_{phase_names1[1]}"] - count_per_phase2[f"specie_count_{phase_names2[1]}"]
}
phase_names1 = {phase['name']: idx for idx, phase in enumerate(self.yaml1.load_yaml_file()['phases'])}
phase_names2 = {phase['name']: idx for idx, phase in enumerate(self.yaml2.load_yaml_file()['phases'])}
count_diff = {}
for phase_name in ['gas', 'surface']:
if phase_name in phase_names1 and phase_name in phase_names2:
count_diff[phase_name] = count_per_phase1[f"specie_count_{phase_name}"] - count_per_phase2[f"specie_count_{phase_name}"]
else:
count_diff[phase_name] = None # Handle missing phases gracefully

Copilot uses AI. Check for mistakes.

@rwest rwest force-pushed the yaml_writer_addition branch from 133a100 to 5e45b13 Compare May 30, 2025 14:33
@rwest rwest closed this May 30, 2025
@rwest rwest mentioned this pull request May 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants