bug: DIDEC dataset does not load #980

Open
2 tasks
SiQube opened this issue Feb 27, 2025 · 0 comments
Labels
bug Something isn't working essential important

Comments

@SiQube
Member

SiQube commented Feb 27, 2025

Current Behavior

DIDEC dataset does NOT load.

Expected Behavior

The DIDEC dataset loads into memory without errors.

Minimum acceptance criteria

Failure Information (for bugs)

Loading the DIDEC dataset fails because there is currently no parser for BeGaze datasets.

Steps to Reproduce

Install pymovements from source:

git clone git@github.com:aeye-lab/pymovements
cd pymovements
pip install -e .

Then run the following script:

import pymovements as pm


dataset = pm.Dataset('DIDEC', path='data')
dataset.download()
dataset.load()

The dataset downloads and extracts successfully, but loading then fails with:

polars.exceptions.ComputeError: could not parse `# Message: UE-keypress N` as dtype `f64` at column 'column_4' (column number 4)

<details>
<summary>more details here:</summary>

```console
Traceback (most recent call last):
  File "/home/siqube/lab/aeye-lab/pymovements/tt.py", line 11, in <module>
    dataset.load()
  File "/home/siqube/lab/aeye-lab/pymovements/src/pymovements/dataset/dataset.py", line 131, in load
    self.load_gaze_files(
  File "/home/siqube/lab/aeye-lab/pymovements/src/pymovements/dataset/dataset.py", line 207, in load_gaze_files
    self.gaze = dataset_files.load_gaze_files(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/src/pymovements/dataset/dataset_files.py", line 257, in load_gaze_files
    gaze_df = load_gaze_file(
              ^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/src/pymovements/dataset/dataset_files.py", line 374, in load_gaze_file
    gaze_df = from_csv(
              ^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/src/pymovements/gaze/io.py", line 218, in from_csv
    gaze_data = pl.read_csv(file, **read_csv_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/venv/lib/python3.12/site-packages/polars/_utils/deprecation.py", line 92, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/venv/lib/python3.12/site-packages/polars/_utils/deprecation.py", line 92, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/venv/lib/python3.12/site-packages/polars/_utils/deprecation.py", line 92, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/venv/lib/python3.12/site-packages/polars/io/csv/functions.py", line 503, in read_csv
    df = _read_csv_impl(
         ^^^^^^^^^^^^^^^
  File "/home/siqube/lab/aeye-lab/pymovements/venv/lib/python3.12/site-packages/polars/io/csv/functions.py", line 649, in _read_csv_impl
    pydf = PyDataFrame.read_csv(
           ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: could not parse `# Message: UE-keypress N` as dtype `f64` at column 'column_4' (column number 4)

The current offset in the file is 163557 bytes.

You might want to try:
- increasing infer_schema_length (e.g. infer_schema_length=10000),
- specifying correct dtype with the dtypes argument
- setting ignore_errors to True,
- adding # Message: UE-keypress N to the null_values list.

Original error: remaining bytes non-empty
```

</details>

`skip_rows` was abused to skip the header, but the header size differs from file to file, which presumably led to silent failures before. The error itself is therefore an improvement: it makes the problem visible.
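
To illustrate the point about varying header sizes, here is a minimal sketch of how the header length could be detected per file instead of being hard-coded. It is not the pymovements implementation. It assumes that metadata lines start with `##`, that the first line after them holds the column names, that fields are tab-separated, and that message rows can be recognized by a field starting with `# Message:` (as in the error above). The function names and the tiny example file are made up for illustration.

```python
from __future__ import annotations


def count_header_lines(lines: list[str], prefix: str = '##') -> int:
    """Count the leading metadata lines (assumed to start with `prefix`)."""
    count = 0
    for line in lines:
        if not line.startswith(prefix):
            break
        count += 1
    return count


def split_samples_and_messages(
    lines: list[str],
    sep: str = '\t',
    marker: str = '# Message:',
) -> tuple[list[str], list[list[str]], list[list[str]]]:
    """Split the lines of one file into column names, sample rows, and message rows."""
    n_header = count_header_lines(lines)
    # Drop the metadata header and any blank lines.
    body = [line.rstrip('\n') for line in lines[n_header:] if line.strip()]
    if not body:
        return [], [], []

    columns = body[0].split(sep)
    samples: list[list[str]] = []
    messages: list[list[str]] = []
    for line in body[1:]:
        fields = line.split(sep)
        # Message rows carry text where sample rows carry numbers.
        if any(field.startswith(marker) for field in fields):
            messages.append(fields)
        else:
            samples.append(fields)
    return columns, samples, messages


# Made-up miniature file; real DIDEC files will look different.
example = (
    '## [BeGaze]\n'
    '## Sample Rate:\t250\n'
    'Time\tType\tTrial\tL POR X [px]\n'
    '100\tSMP\t1\t512.3\n'
    '104\tMSG\t1\t# Message: UE-keypress N\n'
    '108\tSMP\t1\t513.0\n'
).splitlines(keepends=True)

columns, samples, messages = split_samples_and_messages(example)
```

The per-file result of `count_header_lines` could then be passed as `skip_rows` to `pl.read_csv`, and the message rows kept separately (for example as events) rather than being parsed as `f64`. Whether a dedicated BeGaze parser should work this way is a design decision for the maintainers.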

## Context

* Project Version / Commit: 26747e8beef6751c74453932d6ac2447052fd703
* Operating System: Ubuntu 24.04

## Checklist

- [x] I am running the latest version
- [x] I checked the documentation and found no answer
- [x] I checked to make sure that this issue has not already been filed
- [x] I have provided sufficient information for the team
@SiQube SiQube added the bug Something isn't working label Feb 27, 2025
@SiQube SiQube changed the title from "DIDEC dataset does not load" to "bug: DIDEC dataset does not load" Feb 27, 2025
@dkrako dkrako added the essential important label Mar 6, 2025