Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synthetic data smoke test. #75

Merged
merged 29 commits into from
Nov 1, 2024
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
bbde0a4
Model runs + draws in notebook, no data output
jf514 Oct 8, 2024
4661907
Configs etc - model not yet working.
jf514 Oct 25, 2024
d5cdd97
Disable energy
jf514 Oct 25, 2024
9baa1c7
Offset shape bugfix (#73)
charles-zhng Oct 20, 2024
a6b983f
Smoke test (#74)
jf514 Oct 25, 2024
7ed9f25
IT'S WORKING
jf514 Oct 25, 2024
d4145c5
Offset shape bugfix (#73)
charles-zhng Oct 20, 2024
8248778
Configs etc - model not yet working.
jf514 Oct 25, 2024
be2f1aa
Offset shape bugfix (#73)
charles-zhng Oct 20, 2024
935c922
Configs etc - model not yet working.
jf514 Oct 25, 2024
2e6e01c
Merge remote-tracking branch 'origin/main' into synthetic-data
jf514 Oct 26, 2024
45774d9
Fix weird merge.
jf514 Oct 26, 2024
e9c1a4f
Clean up synth_model config file.
jf514 Oct 28, 2024
f14910b
Remove TIME_BINS (which was a merge accident.)
jf514 Oct 28, 2024
f6df7e3
Fix smoke test.
jf514 Oct 28, 2024
7b16b5a
Fix smoke test.
jf514 Oct 28, 2024
b75b354
Clean up.
jf514 Oct 28, 2024
6f7da79
Fixed root optimization, but still some debug code.
jf514 Oct 28, 2024
5433344
Add root_kp_index
jf514 Oct 29, 2024
2405815
Forgot model yaml.
jf514 Oct 29, 2024
9a2a599
Reset rodent configs, enable synth config.
jf514 Oct 29, 2024
86d333e
Add synth_data smoke test.
jf514 Oct 29, 2024
84793e9
Missed data file.
jf514 Oct 29, 2024
82c1887
Clean up.
jf514 Oct 30, 2024
d9fe782
Add root opt keypoint to model configs + clean up.
jf514 Oct 30, 2024
418a1e9
Clean up.
jf514 Oct 30, 2024
19a30e0
CR feedback.
jf514 Nov 1, 2024
dc26558
Add synth data generation program.
jf514 Nov 1, 2024
dc881c3
Add comments.
jf514 Nov 1, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -99,4 +99,9 @@ jobs:
with:
fail_ci_if_error: true
verbose: false
token: ${{ secrets.CODECOV_TOKEN }}
token: ${{ secrets.CODECOV_TOKEN }}

# Test probably delete this
- name: Smoke Test
shell: bash -l {0}
run: python run_stac.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Smoke test script needs proper error handling and documentation

The script run_stac.py exists but has several issues:

  • No explicit error handling (try/except blocks) for potential runtime failures
  • Missing documentation about test expectations and success criteria
  • Uses hydra configuration but the config path (./configs) may not be relative to the workspace
  • No explicit exit codes to indicate test success/failure

Suggested improvements:

  • Add proper error handling around load_and_run_stac
  • Document test purpose and success criteria
  • Ensure config path is workspace-relative
  • Return appropriate exit codes for CI integration
🔗 Analysis chain

Smoke test implementation needs improvement.

Several concerns with the current implementation:

  1. The comment suggests this might be temporary - please clarify the intention
  2. The script path should be relative to the workspace
  3. Missing error handling and test result verification
  4. No documentation of what this smoke test verifies

Let's verify the script existence and its location:

Consider implementing the smoke test more robustly:

-      # Test probably delete this
-      - name: Smoke Test
-        shell: bash -l {0}
-        run: python run_stac.py
+      - name: Run E2E Smoke Test
+        shell: bash -l {0}
+        run: |
+          # Run synthetic E2E test and capture exit code
+          python ./tests/smoke/run_stac.py
+          if [ $? -ne 0 ]; then
+            echo "Smoke test failed!"
+            exit 1
+          fi

Let me check the content of the script and its error handling:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the smoke test script existence and location

# Test 1: Check if run_stac.py exists in the repository
echo "Checking for run_stac.py in the repository..."
fd -t f "run_stac.py"

# Test 2: Check if the script contains proper error handling
echo -e "\nChecking for error handling in run_stac.py..."
rg -l "try|except|raise|sys.exit" $(fd -t f "run_stac.py")

Length of output: 324


Script:

#!/bin/bash
# Check the content of run_stac.py to verify error handling and implementation
echo "Content of run_stac.py:"
cat run_stac.py

# Look for any documentation or comments explaining the test
echo -e "\nChecking for documentation comments:"
rg "^#" run_stac.py

Length of output: 876

🧰 Tools
🪛 yamllint

[error] 107-107: no new line character at the end of file

(new-line-at-end-of-file)

594 changes: 594 additions & 0 deletions Mat-to-Nwb-Synth-Data.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion configs/config.yaml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
defaults:
- stac: demo
- model: rodent
- model: synth
- _self_

##FLY_MODEL
Expand Down
58 changes: 58 additions & 0 deletions configs/model/synth.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@

MJCF_PATH: 'models/synth_model.xml'

# Frames per clip for transform.
N_FRAMES_PER_CLIP: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Setting N_FRAMES_PER_CLIP to 1 is inconsistent with other model configurations

The verification reveals that:

  • The synth_model.xml file exists in the correct location
  • All other model configurations in the codebase use significantly higher frame counts:
    • Test configs: 360 frames
    • Rodent model: 250 frames
    • Mouse model: 360 frames
    • Fly models: 300-581 frames

Additionally, the codebase enforces that N_FRAMES_PER_CLIP must divide evenly with the total number of mocap frames, suggesting this parameter is critical for temporal processing. Setting it to 1 would severely limit the model's ability to capture motion sequences.

🔗 Analysis chain

Verify model path and frame count configuration.

The single frame per clip (N_FRAMES_PER_CLIP: 1) seems unusually low for a synthetic model and might limit the model's effectiveness in capturing temporal dynamics.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the MJCF model file exists
if [ ! -f "models/synth_model.xml" ]; then
    echo "Error: MJCF model file not found at specified path"
    exit 1
fi

# Check if other synthetic models in the codebase use different frame counts
rg -g "*.yaml" "N_FRAMES_PER_CLIP:" configs/

Length of output: 310


Script:

#!/bin/bash
# Let's try a different approach to verify the model configuration

# Check if the models directory exists and list its contents
if [ -d "models" ]; then
    echo "Models directory contents:"
    ls -la models/
else
    echo "Models directory not found"
fi

# Search for any XML files that might be related to synthetic models
echo -e "\nSearching for XML model files:"
fd -e xml . 

# Look for other config files that might reference frame counts
echo -e "\nSearching for frame count configurations:"
rg "FRAMES_PER" .

# Check for any documentation about the synthetic model setup
echo -e "\nSearching for relevant documentation:"
fd README -t f

Length of output: 5513


# Tolerance for the optimizations of the full model, limb, and root.
# TODO: Re-implement optimizer loops to use these tolerances
FTOL: 5.0e-03
ROOT_FTOL: 1.0e-05
LIMB_FTOL: 1.0e-06

# Number of alternating pose and offset optimization rounds.
N_ITERS: 1
jf514 marked this conversation as resolved.
Show resolved Hide resolved

KP_NAMES:
- 'part_0'

# The model sites used to register the keypoints.
KEYPOINT_MODEL_PAIRS:
part_0: base

# The initial offsets for each keypoint in meters.
KEYPOINT_INITIAL_OFFSETS:
part_0: 0 0 0.01

TRUNK_OPTIMIZATION_KEYPOINTS:
-part_0

jf514 marked this conversation as resolved.
Show resolved Hide resolved
INDIVIDUAL_PART_OPTIMIZATION:
"T1R": ["base", "base"]

# Color to use for each keypoint when visualizing the results
KEYPOINT_COLOR_PAIRS:
base: 0 .5 1 1

# What is the size of the animal you'd like to register, relative to the model?
SCALE_FACTOR: 1

# Multiplier to put the mocap data into the same scale as the data. Eg, if the
# mocap data is known to be in millimeters and the model is in meters, this is
# .001
MOCAP_SCALE_FACTOR: 1

# If you have reason to believe that the initial offsets are correct for particular keypoints,
# you can regularize those sites using this with M_REG_COEF.
SITES_TO_REGULARIZE:
- part_0

RENDER_FPS: 200

N_SAMPLE_FRAMES: 1

# If you have reason to believe that the initial offsets are correct for particular keypoints,
# you can regularize those sites using _SITES_TO_REGULARIZE.
M_REG_COEF: 1

TIME_BINS: 0.02
8 changes: 4 additions & 4 deletions configs/stac/demo.yaml
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
fit_path: "demo_fit.p"
ik_only_path: "demo_ik_only.p"
data_path: "tests/data/test_rodent_mocap_1000_frames.mat"
fit_offsets_path: "synth_fit.p"
ik_only_path: "synth_ik_only.p"
data_path: "tests/data/save_data_AVG_2.nwb"

n_fit_frames: 1
jf514 marked this conversation as resolved.
Show resolved Hide resolved
skip_fit_offsets: False
ik_only: True
skip_ik_only: False

mujoco:
solver: "newton"
Expand Down
2 changes: 1 addition & 1 deletion configs/stac/stac.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
fit_path: "fit.p"
fit_offsets_path: "fit.p"
jf514 marked this conversation as resolved.
Show resolved Hide resolved
ik_only_path: "ik_only.p"
data_path: "tests/data/test_rodent_mocap_1000_frames.nwb"

Expand Down
4 changes: 2 additions & 2 deletions configs/stac/stac_fly_tethered.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
fit_path: "Fruitfly_fit.p"
transform_path: "transform_mocap_fly.h5"
fit_offsets_path: "Fruitfly_fit.p"
ik_only_path: "transform_mocap_fly.h5"
# This data is unpublished, need to speak to Elliot Abe about how to obtain it.
# A possible source is https://datadryad.org/stash/dataset/doi:10.5061/dryad.nzs7h44s4
# See the discussion here for more info:
Expand Down
4 changes: 2 additions & 2 deletions configs/stac/stac_fly_treadmill.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
fit_path: "fly_treadmill_fit.p"
transform_path: "transform_treadmill.p"
fit_offsets_path: "fly_treadmill_fit.p"
ik_only_path: "transform_treadmill.p"
# File is too large to commit
# DL from: https://datadryad.org/stash/dataset/doi:10.5061/dryad.mpg4f4r73
# Actual file: https://datadryad.org/stash/downloads/file_stream/3361804
Expand Down
2 changes: 1 addition & 1 deletion configs/stac/stac_mouse.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
fit_path: "fit_mouse.p"
fit_offsets_path: "fit_mouse.p"
ik_only_path: "ik_only_mouse.p"
data_path: "tests/data/test_mouse_mocap_3600_frames.h5"

Expand Down
259 changes: 259 additions & 0 deletions demos/create_synth_data.ipynb
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How was the data saved from this notebook? And were there any offsets applied to the synthetic keypoint? I wanted to test whether it finds the ground truth offset when given different initial offsets in the model.yaml file, but it always returns the initial offset without any changes.

This notebook doesn't run as is; the rendering is cell 3 is different for me from the one shared, and the last cell throw an error: KeyError: "Invalid name '0'. Valid names: ['base', 'world']"

Copy link
Contributor Author

@jf514 jf514 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's just not ready for that. I changed to the title to reflect that it's just a smoke test... ie it runs with out crashing.

This PR isn't to provide that level of functionality, sadly. I've created #81 to track the next steps. Please feel free to add any specific requests to that.

As for the data generation, I created a single frame of fake data (not even collected from an actual Mujoco run) just to give an output.

Also, if you think there are any comments that need to be added to the code to make this clear, also lmk.

Large diffs are not rendered by default.

23 changes: 23 additions & 0 deletions models/synth_model.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
<mujoco>
<option timestep=".001">
<!-- <flag energy="enable" contact="disable"/> -->
</option>

<default>
<joint type="hinge" axis="0 -1 0"/>
<geom type="capsule" size=".02"/>
</default>

jf514 marked this conversation as resolved.
Show resolved Hide resolved
<worldbody>
<light pos="0 -.4 1"/>
<camera name="fixed" pos="0 -1 0" xyaxes="1 0 0 0 0 1"/>
<body name="base" pos="0 0 .2">
<joint name="root"/>
<geom fromto="0 0 0 0 0 -.25" rgba="1 1 0 1"/>
<!-- <body name="3" pos="0 0 -.25">
<joint/>
<geom fromto="0 0 0 0 0 -.2" rgba="0 0 1 1"/>
</body> -->
</body>
</worldbody>
</mujoco>
1 change: 1 addition & 0 deletions stac_mjx/io.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ def load_data(cfg: DictConfig, base_path: Union[Path, None] = None):
f"Number of keypoint names ({len(kp_names)}) is not the same as the number of keypoints in data ({data.shape[1]})"
)

print("kpnames ", kp_names)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Consider using proper logging instead of print statements.

Debug print statements should be avoided in production code. Consider using Python's logging framework for better control over debug output.

Replace the print statement with a proper debug log:

-    print("kpnames ", kp_names)
+    import logging
+    logging.debug("Keypoint names: %s", kp_names)

This change offers several benefits:

  • Log level control (can be disabled in production)
  • Structured output format
  • Better integration with monitoring systems

Committable suggestion was skipped due to low confidence.


💡 Codebase verification

Print statement appears to be inconsistent with codebase's logging pattern

Looking at the context and print statements across the codebase:

  1. Most print statements in the codebase are used for meaningful progress updates (iterations, timings, errors) or warnings
  2. This particular print statement is a debug-style output without clear purpose
  3. The codebase doesn't have a consistent debug/verbose mode pattern
  4. The print statement appears in the middle of keypoint mapping logic without providing significant value

Recommendations:

  • Remove this print statement as it doesn't align with the codebase's logging pattern
  • If debugging is needed, consider adding proper logging with debug levels
  • For test visibility, use pytest's built-in capture system instead
🔗 Analysis chain

Verify if this print statement is required for E2E tests.

Given this PR's focus on synthetic E2E testing, please clarify if this print statement is intentionally added for test debugging purposes. If so, consider:

  1. Moving it behind a debug flag
  2. Adding it only in test configurations
  3. Using pytest's built-in capture system for test output

Let's check if this print statement is used in tests:


The previous search was too narrow. Let's:

  1. Get more context around the print statement
  2. Check for any debug configuration patterns
  3. Look for similar print statements to understand the logging pattern
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for test files that might rely on this output
rg -l "kpnames" tests/

Length of output: 22


Script:

#!/bin/bash
# Let's check the broader context of this print statement
rg -B 3 -A 3 "kpnames" .

# Also check if there are any test configurations or debug flags
rg -l "DEBUG|VERBOSE|debug_mode|verbose_mode" .

# Look for similar print statements in the codebase
rg "print\(" stac_mjx/

Length of output: 2462

model_inds = [
kp_names.index(src) for src, dst in cfg.model.KEYPOINT_MODEL_PAIRS.items()
]
Expand Down
12 changes: 6 additions & 6 deletions stac_mjx/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ def run_stac(
start_time = time.time()

# Getting paths
fit_path = base_path / cfg.stac.fit_path
fit_offsets_path = base_path / cfg.stac.fit_offsets_path
ik_only_path = base_path / cfg.stac.ik_only_path

xml_path = base_path / cfg.model.MJCF_PATH
Expand All @@ -69,21 +69,21 @@ def run_stac(
logging.info(f"Running fit. Mocap data shape: {fit_data.shape}")
fit_data = stac.fit_offsets(fit_data)

logging.info(f"saving data to {fit_path}")
io.save(fit_data, fit_path)
logging.info(f"saving data to {fit_offsets_path}")
io.save(fit_data, fit_offsets_path)

# Stop here if not doing ik only phase
if cfg.stac.skip_ik_only == 1:
logging.info("skipping ik_only()")
return fit_path, None
return fit_offsets_path, None
# FLY_MODEL: The elif below must be commented out for fly_model.
elif kp_data.shape[0] % cfg.model.N_FRAMES_PER_CLIP != 0:
raise ValueError(
f"N_FRAMES_PER_CLIP ({cfg.model.N_FRAMES_PER_CLIP}) must divide evenly with the total number of mocap frames({kp_data.shape[0]})"
)

logging.info("Running ik_only()")
with open(fit_path, "rb") as file:
with open(fit_offsets_path, "rb") as file:
fit_data = pickle.load(file)
offsets = fit_data["offsets"]

Expand All @@ -95,4 +95,4 @@ def run_stac(
)
io.save(ik_only_data, ik_only_path)

return fit_path, ik_only_path
return fit_offsets_path, ik_only_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Variable naming inconsistency detected across codebase

The search results reveal inconsistent variable naming:

  • fit_offsets_path is used in stac_mjx/main.py and configuration files
  • However, other files (run_stac.py, demos/) still use the old fit_path variable name

This inconsistency needs to be addressed in:

  • run_stac.py
  • demos/api_usage.ipynb
  • demos/run_stac_fly_model.py
🔗 Analysis chain

Verify the path variable usage across the codebase

The return statement correctly uses the renamed variable. Since this appears to be part of a larger renaming effort, let's verify the consistency across the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all references to fit_path have been updated to fit_offsets_path
# Search for any remaining instances of the old variable name
rg "fit_path" --type yaml --type python

# Search for the new variable name to confirm proper usage
rg "fit_offsets_path" --type yaml --type python

Length of output: 149


Script:

#!/bin/bash
# Search for any remaining instances of the old variable name without file type restrictions
rg "fit_path"

# Search for the new variable name to confirm proper usage
rg "fit_offsets_path"

Length of output: 1318

2 changes: 1 addition & 1 deletion stac_mjx/stac.py
Original file line number Diff line number Diff line change
Expand Up @@ -390,7 +390,7 @@ def _package_data(self, mjx_model, q, x, walker_body_sites, kp_data, batched=Fal
x = x.reshape(-1, x.shape[-1])
q = q.reshape(-1, q.shape[-1])
else:
offsets = self._offsets
offsets = self._offsets.reshape((-1, 3))

kp_data = kp_data.reshape(-1, kp_data.shape[-1])

Expand Down
2 changes: 1 addition & 1 deletion tests/configs/stac/test_stac.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
fit_path: "fit.p"
fit_offsets_path: "fit.p"
jf514 marked this conversation as resolved.
Show resolved Hide resolved
ik_only_path: "ik_only.p"
data_path: "tests/data/test_rodent_mocap_1000_frames.nwb"

Expand Down
Binary file added tests/data/save_data_AVG_2.nwb
Binary file not shown.
2 changes: 1 addition & 1 deletion tests/unit/test_main.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,5 +18,5 @@ def test_load_configs(config):
assert isinstance(cfg, DictConfig)

# Assert that the resulting configs contain the expected data
assert cfg.stac.fit_path == "fit.p"
assert cfg.stac.fit_offsets_path == "fit.p"
assert cfg.model.N_FRAMES_PER_CLIP == 360