Update some of the data files #131

Open
1 of 3 tasks
CompRhys opened this issue Aug 27, 2024 · 0 comments
Labels: bug (Something isn't working), data (Data loading and processing)

Comments

CompRhys (Collaborator) commented Aug 27, 2024

Changes to make:

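  • Fix the NaN issue and the formula column in mp_energies, which avoids having to duplicate this workaround:
import pandas as pd
# DataFiles and Key are the project's own helpers (their imports are omitted here).
# na_filter=False stops pandas from converting strings such as "NaN" into missing values.
df_mp = pd.read_csv(DataFiles.mp_energies.path, na_filter=False)
# TODO get the real formula from the Composition rather than rename.
df_mp = df_mp.rename(columns={"formula_pretty": Key.formula, "nsites": Key.n_sites})
# Set the formula of these three entries to the literal string "NaN".
df_mp.loc[
    df_mp[Key.mat_id].isin(["mp-1080032", "mp-1179882", "mp-1009221"]), Key.formula
] = "NaN"
# Check that no formula is missing or empty.
assert len(df_mp[df_mp[Key.formula].isna() | (df_mp[Key.formula] == "")]) == 0
df_mp = df_mp.set_index(Key.mat_id)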
  • Replace the "wyckoff" and "iso_wyckoff" column names with the updated names "protostructure" and "prototype".
  • Add magmoms to the file at DataFiles.mp_trj_extxyz.path, as they seem to be missing and are needed for the MPTrj EDA. They can be grabbed from the results dict (see Restore MPTrj EDA to working order #130).
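
For context on the first item, here is a minimal sketch of how pandas treats the formula string "NaN" when reading a CSV. The two-row CSV and the material IDs in it are made up for illustration and are not taken from mp_energies; only `pd.read_csv` and its `na_filter` parameter are real.

```python
import io

import pandas as pd

# Hypothetical CSV: "mp-0000" and "mp-0001" are made-up IDs, not real entries.
csv_text = "material_id,formula_pretty\nmp-0000,NaN\nmp-0001,NaCl\n"

# Default parsing: "NaN" is one of pandas' default NA strings,
# so the formula is read as a missing value.
df_default = pd.read_csv(io.StringIO(csv_text))
n_missing_default = int(df_default["formula_pretty"].isna().sum())

# na_filter=False disables NA detection, so "NaN" survives as plain text.
df_raw = pd.read_csv(io.StringIO(csv_text), na_filter=False)
n_missing_raw = int(df_raw["formula_pretty"].isna().sum())

print(n_missing_default, n_missing_raw)
```

Note that `na_filter=False` only helps if the string is still present in the file. If the file on disk already holds an empty field for such an entry, it is read back as `""`, which is presumably why the snippet above reassigns the formula for three material IDs by hand.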

This is just a running list, to be actioned at some point in the future.
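
The column rename in the second item could look roughly like the sketch below. The pairing of old to new names ("wyckoff" to "protostructure", "iso_wyckoff" to "prototype") is an assumption based on the order they are listed in, and `rename_wyckoff_columns` is a hypothetical helper, not an existing function in the project. The actual column names in the data files may carry extra prefixes or suffixes.

```python
import pandas as pd

# Assumed pairing of old to new column names, in the order listed in the issue.
RENAMES = {"wyckoff": "protostructure", "iso_wyckoff": "prototype"}


def rename_wyckoff_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the old column names replaced; other columns are untouched."""
    return df.rename(columns=RENAMES)


# Toy frame with placeholder values, only to show the effect on the column names.
df_old = pd.DataFrame({"wyckoff": ["a"], "iso_wyckoff": ["b"], "n_sites": [2]})
df_new = rename_wyckoff_columns(df_old)
print(list(df_new.columns))
```

`DataFrame.rename` ignores mapping keys that are not present, so the helper is safe to run on files that have already been updated.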

janosh added the bug and data labels Sep 6, 2024