Delta indices simplification #791

Open · wants to merge 12 commits into main
Conversation

IliaLarchenko
Contributor

What this does

This PR removes unnecessary back-and-forth conversion between delta_indices and delta_timestamps.

  1. In the model configuration file, we define delta_indices for actions and observations. For example, let's look at observations in Diffusion:
@property
def observation_delta_indices(self) -> list:
    return list(range(1 - self.n_obs_steps, 1))
  2. Then in datasets/factory.py we transform them into delta timestamps by dividing them by fps:
delta_timestamps[key] = [i / ds_meta.fps for i in cfg.observation_delta_indices]
  3. Eventually, in lerobot_dataset.py we use get_delta_indices from dataset/utils.py to transform them back by multiplying them by fps:
def get_delta_indices(delta_timestamps: dict[str, list[float]], fps: int) -> dict[str, list[int]]:
    delta_indices = {}
    for key, delta_ts in delta_timestamps.items():
        delta_indices[key] = [round(d * fps) for d in delta_ts]

    return delta_indices

In the end, we use delta_indices, which we set up at the very beginning.

Basically, we define delta_indices -> transform them to delta_timestamps -> transform them back to delta_indices.

All of this happens before we query data from the dataset. When the actual query happens we generate query_indices from delta_indices and also query_timestamps from delta_indices and use them for the actual data query.

I think the delta_indices -> delta_timestamps -> delta_indices transformation is completely unnecessary, because we define delta_indices in the policy and use delta_indices inside the dataset.
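The round trip can be reproduced in a few lines (a minimal sketch; the fps and n_obs_steps values are illustrative, not taken from any particular config):

```python
# Minimal sketch of the round trip described above (illustrative values).
fps = 10
n_obs_steps = 2

# 1. The policy config defines delta indices, Diffusion-style:
delta_indices = list(range(1 - n_obs_steps, 1))         # [-1, 0]

# 2. datasets/factory.py converts them to timestamps:
delta_timestamps = [i / fps for i in delta_indices]     # [-0.1, 0.0]

# 3. lerobot_dataset.py converts them straight back:
recovered = [round(t * fps) for t in delta_timestamps]  # [-1, 0]

assert recovered == delta_indices  # the two conversions cancel out
```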

I removed all unnecessary transformations, all tests related to functions that were doing these transformations, and fixed examples.

Removing these steps doesn't have much impact on performance, but it lets us clean up the code and opens up other opportunities: since we directly use the delta_indices defined in the policy, we can use something more flexible than a simple list.

How it was tested

This PR doesn't break anything that uses train.py; all policies work fine without any changes. All configs already use delta_indices, so nothing needs to change there.
However, it can break custom training pipelines, because LeRobotDataset now expects a delta_indices parameter instead of delta_timestamps. This is very easy to fix, as you can always get delta_indices by multiplying delta_timestamps by fps.
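For a custom pipeline, that migration is essentially a one-liner (a sketch; the helper name and the example dict are hypothetical, not part of the codebase):

```python
# Hypothetical migration helper for a custom pipeline: turn an existing
# delta_timestamps dict into the delta_indices dict the new interface expects.
def timestamps_to_indices(delta_timestamps: dict[str, list[float]], fps: int) -> dict[str, list[int]]:
    return {key: [round(t * fps) for t in ts] for key, ts in delta_timestamps.items()}

delta_timestamps = {"observation.state": [-0.1, 0.0], "action": [0.0, 0.1, 0.2]}
delta_indices = timestamps_to_indices(delta_timestamps, fps=10)
# {'observation.state': [-1, 0], 'action': [0, 1, 2]}
```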

How to checkout & try? (for the reviewer)

You can try to train any policy, e.g.:

python lerobot/scripts/train.py \
    --output_dir=outputs/train/diffusion_pusht \
    --policy.type=diffusion \
    --dataset.repo_id=lerobot/pusht \
    --seed=100000 \
    --env.type=pusht \
    --batch_size=64 \
    --steps=200000 \
    --eval_freq=25000 \
    --save_freq=25000 \
    --wandb.enable=true

Collaborator

@aliberts left a comment


The best code is no code.
This looks good, thanks! Left a few minor comments.

@Cadene I know you had some opinion about keeping the spec in delta_timestamps rather than delta_indices. I think the simplification here is worth it; besides, right now we don't really have any use case or workflow where delta_timestamps is necessary over delta_indices, AFAIK.

On a side note, I'm curious about how we should handle unsynced features like cameras with different fps, or variable refresh rate sensors. This is not directly an issue with using indices over timestamps but related nonetheless. Happy to get your opinion about this @IliaLarchenko @Cadene

@IliaLarchenko
Contributor Author

We still have a main dataset-level fps and timestamps for each index, and we check that all of them are valid and within tolerance bounds:

# Check timestamps
timestamps = torch.stack(self.hf_dataset["timestamp"]).numpy()
episode_indices = torch.stack(self.hf_dataset["episode_index"]).numpy()
ep_data_index_np = {k: t.numpy() for k, t in self.episode_data_index.items()}
check_timestamps_sync(timestamps, episode_indices, ep_data_index_np, self.fps, self.tolerance_s)

We do the same check when we save an episode:

check_timestamps_sync(
    episode_buffer["timestamp"],
    episode_buffer["episode_index"],
    ep_data_index_np,
    self.fps,
    self.tolerance_s,
)

Then for actually querying video frames we use timestamps:

if len(self.meta.video_keys) > 0:
    current_ts = item["timestamp"].item()
    query_timestamps = self._get_query_timestamps(current_ts, query_indices)
    video_frames = self._query_videos(query_timestamps, ep_idx)
    item = {**video_frames, **item}

And when we actually decode video we check it again:

is_within_tol = min_ < tolerance_s
assert is_within_tol.all(), (
    f"One or several query timestamps unexpectedly violate the tolerance ({min_[~is_within_tol]} > {tolerance_s=})."
    "It means that the closest frame that can be loaded from the video is too far away in time."
    "This might be due to synchronization issues with timestamps during data collection."
    "To be safe, we advise to ignore this item during training."
    f"\nqueried timestamps: {query_ts}"
    f"\nloaded timestamps: {loaded_ts}"
    f"\nvideo: {video_path}"
    f"\nbackend: {backend}"
)

@Cadene
Collaborator

Cadene commented Mar 2, 2025

Thanks for your work. Indeed our code can be improved and simplified.

However, I didn't understand your argument for changing the interface from timestamps to indices. I agree it simplifies things downstream, but to me it comes at the expense of expressivity. Maybe the solution is to simplify the code downstream instead of upstream; for instance, the policy should define timestamps, not indices.

Here is my argument for keeping timestamps as input: they are more expressive than indices. In this example, if we were to use indices, even with the comment, we would have a hard time understanding what -75, -50, -25, etc. mean. Instead, we would have to map these indices back to timestamps, because timestamps are more meaningful to us.

    # loads 6 state vectors: 1.5 s, 1 s, 0.5 s, 200 ms, and 100 ms before, plus the current frame
    "observation.state": [-1.5, -1, -0.5, -0.20, -0.10, 0],  # <-- timestamps
    "observation.state": [-75, -50, -25, -10, -5, 0],  # <-- indices (at 50 fps)

It's also important to reason in timestamps because, at 10 fps, an index of -10 corresponds to -1 second, but at 50 fps it corresponds to -200 ms, which is not at all the same temporal context for your model. By forcing the dataset interface to be in timestamps, it's easier to avoid this mistake.
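The fps dependence can be made concrete in a couple of lines (a minimal sketch; index_to_seconds is a hypothetical helper, and the numbers are illustrative):

```python
# Illustrative: the same index means very different times at different fps.
def index_to_seconds(index: int, fps: int) -> float:
    return index / fps

print(index_to_seconds(-10, fps=10))  # -1.0 -> one full second of history
print(index_to_seconds(-10, fps=50))  # -0.2 -> only 200 ms of history
```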

I think there are some issues in our API, but the best way to make the right decisions is to make progress on other features, such as training on multiple datasets with various fps. That will automatically raise some issues regarding timestamps and indices.

As of now, I would prefer to not make this change yet.

What do you think?

@IliaLarchenko
Contributor Author

I don’t have a strong preference for indices over timestamps.

Currently, timestamps are only used in examples with a manual training loop. In practice, if you train a model with any policy, you specify indices, and inside the dataset, you use indices to query items.

Regardless of the approach, you need to account for FPS when specifying either delta_indices or delta_timestamps. Timestamps are more intuitive in your example, but multiplying them by FPS makes them just as clear:

"observation.state": [dt * fps for dt in [-1.5, -1, -0.5, -0.20, -0.10, 0]]

For observations, timestamps feel more natural, while for actions, indices make more sense. For example, if you want to predict the next 2 seconds of actions, you have to account for FPS anyway:

  • Using timestamps:
    [i / fps for i in range(fps * 2)]
  • Using indices:
    range(fps * 2)

So ultimately, they’re not that different, but my approach skips a couple of intermediate steps.
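The equivalence of the two spellings can be checked directly (a sketch with an illustrative fps):

```python
fps = 10
horizon_s = 2  # predict the next 2 seconds of actions

as_timestamps = [i / fps for i in range(fps * horizon_s)]  # timestamp spelling
as_indices = list(range(fps * horizon_s))                  # index spelling

# Converting the timestamp form back with fps recovers the index form exactly.
assert [round(t * fps) for t in as_timestamps] == as_indices
```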

For multi-dataset scenarios, this gets even trickier. If one dataset has 10 FPS and another has 50 FPS, it's unclear what the best way to handle them is.

My initial goal was to remove unnecessary conversions and allow policies to define more flexible delta_indices as an iterable. This way, you could use something like the iterator below to sample different indices dynamically:

import random

class RandomizedDeltaIndices:
    def __iter__(self):
        return iter([
            random.randint(-25, -15),
            random.randint(-15, -5),
            *range(-4, 1),
        ])

However, this approach can be a bit messy. An alternative is passing a delta_indices post-processing function, like in this commit: fb6e9e7. But defining a function as a dataset parameter isn't ideal either. I don't have a perfect solution yet.
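One possible shape for that post-processing alternative (a hypothetical sketch, not the API from the linked commit; SketchDataset and its parameters are invented for illustration):

```python
import random

# Hypothetical sketch (names invented): the dataset keeps a static delta_indices
# spec and applies an optional callable each time indices are needed.
def randomize_history(indices):
    # Jitter the two oldest observation offsets; keep the recent window fixed.
    return [random.randint(-25, -15), random.randint(-15, -5), *range(-4, 1)]

class SketchDataset:
    def __init__(self, delta_indices, postprocess=None):
        self.delta_indices = delta_indices
        self.postprocess = postprocess

    def indices_for_item(self):
        if self.postprocess is not None:
            return self.postprocess(self.delta_indices)
        return self.delta_indices

ds = SketchDataset(delta_indices=[-2, -1, 0], postprocess=randomize_history)
print(ds.indices_for_item())  # a freshly randomized index list on every call
```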

@Cadene
Collaborator

Cadene commented Mar 2, 2025

A thought: for delta timestamps given as input to a dataset that don't match its fps, we could do some interpolation for state and action, and return the closest frames for the vision modalities.

Thus, in the multi-dataset case, we can keep the same delta timestamps for both.

Then for training your DOT policy with some random delta timestamps, we can provide a "data augmentation" callable function to the dataset.

In any case, I think the delta timestamps interface to LeRobotDataset is to be preferred over delta indices.
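That interpolation idea could look roughly like this (a hypothetical sketch; interp_state and nearest_frame are invented helpers, and the 10 fps data is illustrative):

```python
import bisect

# Hypothetical sketch: serve delta timestamps that don't match the dataset fps
# by linearly interpolating state/action and snapping vision to nearest frames.
def interp_state(query_t, frame_ts, values):
    """Linear interpolation of a recorded scalar signal at an off-grid time."""
    j = bisect.bisect_right(frame_ts, query_t)
    j = min(max(j, 1), len(frame_ts) - 1)
    t0, t1 = frame_ts[j - 1], frame_ts[j]
    v0, v1 = values[j - 1], values[j]
    return v0 + (v1 - v0) * (query_t - t0) / (t1 - t0)

def nearest_frame(query_t, frame_ts):
    """Index of the closest recorded frame, for vision modalities."""
    return min(range(len(frame_ts)), key=lambda i: abs(frame_ts[i] - query_t))

fps = 10
frame_ts = [i / fps for i in range(10)]      # a 10 fps dataset: 0.0, 0.1, ...
states = [t * t for t in frame_ts]           # stand-in for recorded states

print(interp_state(0.25, frame_ts, states))  # ~0.065, halfway between 0.2**2 and 0.3**2
print(nearest_frame(0.27, frame_ts))         # 3 (closest frame is at t=0.3)
```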

@IliaLarchenko
Contributor Author

Got it, so I can close this PR. Later, when the DOT policy is integrated, I can try to come up with a nice solution for delta_indices augmentations.
