TypeError: 'float' object cannot be interpreted as an integer #2224

OlegBEZb · 2024-12-10T19:38:55Z

Description

I run kedro viz, and as soon as I click on one of the persisted datasets, the application fails with TypeError: 'float' object cannot be interpreted as an integer

Context

This issue makes all subsequent datasets unavailable for preview. In addition, there is no direct way to tell which column of the dataset persisted as Parquet causes the failure.

Steps to Reproduce

Create a pipeline with one node that produces a dataframe. The dataframe may contain columns of questionable cleanliness, but it is definitely serialisable to Parquet. Running the pipeline with kedro run doesn't throw any errors, the dataset exists in the data folder, and it is easily readable using catalog.load.
Run kedro viz.
Click on the dataset in the application.

Expected Result

Preview is available

Actual Result

Long error starting with

ERROR    Exception in ASGI application                                                           httptools_impl.py:414
                                                                                                                                          
                             ╭───────────────────────── Traceback (most recent call last) ─────────────────────────╮                      
                             │ /Users/username/Library/Caches/pypoetry/virtualenvs/venv_name │                      
                             │ 3.11/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py:409 in   │                      
                             │ run_asgi

and ending with

/Users/username/Library/Caches/pypoetry/virtualenvs/venv_name │                      
                             │ 3.11/lib/python3.11/site-packages/pydantic/type_adapter.py:527 in dump_python       │                      
                             │                                                                                     │                      
                             │   524 │   │   Returns:                                                              │                      
                             │   525 │   │   │   The serialized object.                                            │                      
                             │   526 │   │   """                                                                   │                      
                             │ ❱ 527 │   │   return self.serializer.to_python(                                     │                      
                             │   528 │   │   │   instance,                                                         │                      
                             │   529 │   │   │   mode=mode,                                                        │                      
                             │   530 │   │   │   by_alias=by_alias,                                                │                      
                             ╰─────────────────────────────────────────────────────────────────────────────────────╯                      
                             TypeError: 'float' object cannot be interpreted as an integer

Your Environment

Include as many relevant details as possible about the environment you experienced the bug in:

Web browser system and version: Version 131.0.6778.86 (Official Build) (arm64)
Operating system and version: macOS 14.7.1 (23H222)
NodeJS version used (if relevant):
Kedro version used (if relevant): 0.19.10
Python version used (if relevant): 3.11.10

Checklist

Include labels so that we can categorise your issue


datajoely · 2024-12-11T09:18:31Z

I'm not sure if this is the same issue, but I remember that infinity is a valid float in Python but not in JSON, so it's possible something related is going on.
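To illustrate the infinity point, here is a minimal sketch using only the standard library json module. It is not taken from Kedro-Viz and only shows the general Python/JSON mismatch. Whether this is what happens inside the preview endpoint is an assumption.

```python
import json
import math

value = math.inf  # a perfectly valid Python float

# By default, Python's json module emits the non-standard token "Infinity".
lenient = json.dumps(value)

# With strict compliance requested, the same value is rejected.
try:
    json.dumps(value, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(lenient)
print(strict_error)
```

Strict JSON parsers, such as JSON.parse in a browser, reject the Infinity token, so a value like this can pass through Python and still break a consumer.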

OlegBEZb · 2024-12-11T17:18:32Z

@datajoely thank you very much for your answer! I've investigated my dataset and found that the issue occurs with a timestamp column of dtype='datetime64[ns]' that contains NaT values.

Here is the minimal dataframe which reproduces the error on my side:

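import pandas as pd

# One valid timestamp and one NaT are enough to trigger the error in the preview
test_df = pd.DataFrame({
    'timestamp_col': [pd.Timestamp('2024-12-11 18:00:00'), pd.NaT]
})
test_df.to_parquet('data/01_raw/dataset_name.parquet')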

Produces exactly the same error:

TypeError: 'float' object cannot be interpreted as an integer
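Since the error message doesn't name the offending column, a small helper can narrow it down. The sketch below is one possible way to do it; the function name columns_with_nat and the extra columns in the sample dataframe are hypothetical and are not part of Kedro or Kedro-Viz. It assumes NaT in datetime columns is the only trigger, which is all I've confirmed so far.

```python
import pandas as pd


def columns_with_nat(df: pd.DataFrame) -> list[str]:
    """Return names of datetime-like columns that contain at least one NaT."""
    # Select both timezone-naive and timezone-aware datetime columns.
    datetime_cols = df.select_dtypes(include=["datetime", "datetimetz"]).columns
    return [col for col in datetime_cols if df[col].isna().any()]


# Hypothetical sample data: one column with NaT, one clean, one non-datetime.
test_df = pd.DataFrame({
    "timestamp_col": [pd.Timestamp("2024-12-11 18:00:00"), pd.NaT],
    "clean_col": [pd.Timestamp("2024-12-11 18:00:00"), pd.Timestamp("2024-12-12 18:00:00")],
    "value": [1.0, 2.0],
})

print(columns_with_nat(test_df))
```

Running this on the output of catalog.load for each dataset should point to the columns worth checking before opening the preview.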

datajoely · 2024-12-12T13:14:40Z

So that's interesting - but does that mean you were able to save the data that Kedro-Viz was trying to read?

OlegBEZb · 2024-12-12T16:14:55Z

@datajoely, yes. The Kedro pipeline saves it without any changes. I've also tried saving such a dataframe using pure pandas, as in the example above, and both work. I haven't specified any Parquet configuration, but I have fastparquet and pyarrow installed in the venv, so it should use pyarrow with snappy compression by default. I haven't looked in depth at whether this could cause any issues.

OlegBEZb added the Issue: Bug Report label Dec 10, 2024

rashidakanchwala added the Community label Dec 10, 2024
