Tilize/untilize compute kernels are buggy #16860

Open
nardoTT opened this issue Jan 17, 2025 · 0 comments
Labels: bug (Something isn't working), LLK, P1
Describe the bug
The following tilize/untilize compute kernels seem to be buggy for the uint32 data type (a sketch of their typical invocation pattern follows the list):

  • tilize_block in tt_metal/include/compute_kernel_api/tilize.h
  • untilize_block in tt_metal/include/compute_kernel_api/untilize.h

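For reference, both kernels are normally driven from a compute kernel's main loop via the circular-buffer API. The sketch below shows the usual tilize pattern; the CB indices (0 for input, 16 for output) and the block-size compile-time args are placeholders chosen for illustration, not taken from the issue:

// Minimal sketch of a tilize compute-kernel loop (CB indices and args are illustrative).
#include <cstdint>
#include "compute_kernel_api/tilize.h"

namespace NAMESPACE {
void MAIN {
    constexpr uint32_t cb_in = 0;    // row-major input CB (placeholder index)
    constexpr uint32_t cb_out = 16;  // tilized output CB (placeholder index)
    uint32_t per_core_block_cnt = get_compile_time_arg_val(0);
    uint32_t per_core_block_tile_cnt = get_compile_time_arg_val(1);

    tilize_init(cb_in, per_core_block_tile_cnt, cb_out);

    for (uint32_t b = 0; b < per_core_block_cnt; ++b) {
        cb_wait_front(cb_in, per_core_block_tile_cnt);      // wait for a row-major block
        cb_reserve_back(cb_out, per_core_block_tile_cnt);   // reserve space for the tilized block

        tilize_block(cb_in, per_core_block_tile_cnt, cb_out);  // the call this issue reports as buggy for uint32

        cb_push_back(cb_out, per_core_block_tile_cnt);
        cb_pop_front(cb_in, per_core_block_tile_cnt);
    }
}
}  // namespace NAMESPACE

untilize_block is used in the same wait/reserve/convert/push/pop pattern, just in the opposite direction (tiled in, row-major out).
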
To Reproduce
The following tests use the kernels above and fail with assertion errors (imports are shown for completeness):

@pytest.mark.parametrize("shape", [[32, 288]])
def test_untilize_with_unpadding_uint32(shape, device):
    torch.manual_seed(2005)
    input_a = torch.randint(1, 64, shape, dtype=torch.int32)
    input_tensor = ttnn.from_torch(input_a, device=device, layout=ttnn.TILE_LAYOUT, dtype=ttnn.uint32)
    output_tensor = ttnn.untilize_with_unpadding(input_tensor, [3,279])
    output_tensor = ttnn.to_torch(output_tensor)
    assert_with_pcc(input_a[:4, :280], output_tensor)

@pytest.mark.parametrize("shape", [[32, 512]])
def test_untilize_uint32(shape, device):
    torch.manual_seed(2005)
    input_a = torch.randint(1, 64, shape, dtype=torch.int32)
    input_tensor = ttnn.from_torch(input_a, device=device, layout=ttnn.TILE_LAYOUT, dtype=ttnn.uint32)
    output_tensor = ttnn.untilize(input_tensor)
    output_tensor = ttnn.to_torch(output_tensor)
    assert_with_pcc(input_a, output_tensor)

@pytest.mark.parametrize("shape", [[15, 15]])
def test_tilize_with_val_padding_uint32(shape, device):
    torch.manual_seed(2005)
    input_a = torch.randint(0, 64, shape, dtype=torch.int32)
    input_tensor = ttnn.from_torch(input_a, device=device, layout=ttnn.ROW_MAJOR_LAYOUT, dtype=ttnn.uint32)
    output_tensor = ttnn.tilize_with_val_padding(input_tensor, [32,32], 70)
    output_tensor = ttnn.to_torch(output_tensor)
    # output is padded to [32, 32]; compare only the original (unpadded) region
    assert_with_pcc(input_a, output_tensor[:shape[0], :shape[1]])


@pytest.mark.parametrize("shape", [[32, 32]])
def test_tilize_uint32(shape, device):
    torch.manual_seed(2005)
    input_a = torch.randint(0, 64, shape, dtype=torch.int32)
    input_tensor = ttnn.from_torch(input_a, device=device, layout=ttnn.ROW_MAJOR_LAYOUT, dtype=ttnn.uint32)
    output_tensor = ttnn.tilize(input_tensor)
    output_tensor = ttnn.to_torch(output_tensor)
    assert_with_pcc(input_a, output_tensor)

Additional context
Note: to test the uint32 type in tilize, you need to add DataType::UINT32 to this line:

input_tensor_a.get_dtype() == DataType::BFLOAT16 or input_tensor_a.get_dtype() == DataType::FLOAT32,

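For illustration, the relaxed dtype check could look roughly like this; the surrounding TT_FATAL call and its error message are paraphrased here, not copied from the source:

// Sketch only: allow uint32 through the tilize dtype validation (message is paraphrased).
TT_FATAL(
    input_tensor_a.get_dtype() == DataType::BFLOAT16 or
        input_tensor_a.get_dtype() == DataType::FLOAT32 or
        input_tensor_a.get_dtype() == DataType::UINT32,
    "Tilize expects bfloat16/float32 input (uint32 allowed here only for this repro)");
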
nardoTT added the bug label on Jan 17, 2025
ntarafdar added the LLK and P1 labels on Jan 20, 2025