Reading Table Metadata: NumRows is not populated #339

Open
coxley opened this issue Jul 16, 2024 · 0 comments
Labels
bug Something isn't working

coxley commented Jul 16, 2024

What happened?

When reading a table's metadata via the Go SDK, NumRows appears to always be zero, even if rows have recently been inserted.

I see that this was brought up in #41 and reportedly fixed in #46 (which corresponds to version 0.1.12), but it doesn't work even when pinning to that version. #249 mentions a similar issue but didn't include enough detail to reproduce, so it stalled.

The SDK entry-point is here: https://pkg.go.dev/cloud.google.com/go/bigquery#Table.Metadata

Explicitly setting a TableMetadataView, such as BasicMetadataView or FullMetadataView, doesn't change the outcome.
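One way to narrow down whether the zero comes from the emulator's response or from the SDK's decoding might be to fetch the table resource over HTTP and look at the raw `numRows` field. The BigQuery REST API encodes `numRows` as a JSON string in the tables.get response. The sketch below uses only the standard library; the helper names `parseNumRows` and `fetchNumRows`, the `EMULATOR_URL` variable, and the URL layout (whether the emulator needs a `/bigquery/v2` prefix before `/projects/...`) are assumptions for illustration, not something confirmed above.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
)

// tableResource holds the one field of the tables.get response we care about.
// A pointer distinguishes "field missing" from "field present but zero".
type tableResource struct {
	NumRows *string `json:"numRows"`
}

// parseNumRows (hypothetical helper) extracts numRows from a raw table
// resource body. present is false when the field is absent from the JSON.
func parseNumRows(body []byte) (n uint64, present bool, err error) {
	var tr tableResource
	if err := json.Unmarshal(body, &tr); err != nil {
		return 0, false, fmt.Errorf("decode table resource: %w", err)
	}
	if tr.NumRows == nil {
		return 0, false, nil
	}
	n, err = strconv.ParseUint(*tr.NumRows, 10, 64)
	if err != nil {
		return 0, true, fmt.Errorf("parse numRows %q: %w", *tr.NumRows, err)
	}
	return n, true, nil
}

// fetchNumRows (hypothetical helper) GETs the table resource from base.
// Assumption: base already contains any path prefix the server expects.
func fetchNumRows(ctx context.Context, base, project, dataset, table string) (uint64, bool, error) {
	u := fmt.Sprintf("%s/projects/%s/datasets/%s/tables/%s", base,
		url.PathEscape(project), url.PathEscape(dataset), url.PathEscape(table))
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return 0, false, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, false, err
	}
	if resp.StatusCode != http.StatusOK {
		return 0, false, fmt.Errorf("unexpected status %s: %s", resp.Status, body)
	}
	return parseNumRows(body)
}

func main() {
	// EMULATOR_URL is a hypothetical variable, e.g. http://localhost:9050
	base := os.Getenv("EMULATOR_URL")
	if base == "" {
		fmt.Println("set EMULATOR_URL to the emulator's HTTP address")
		return
	}
	n, present, err := fetchNumRows(context.Background(), base, "project", "main", "data")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("numRows present=%t value=%d\n", present, n)
}
```

If the field is missing or "0" in the raw response, the problem is most likely on the emulator side rather than in the Go SDK's decoding.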

What did you expect to happen?

I'd expect the NumRows value in metadata to match the behavior of normal BigQuery.

How can we reproduce it (as minimally and precisely as possible)?

Below is a fully reproducible example using a standard Go test and testcontainers to start the emulator. It expects Docker to be available.

To set up the test, do the following:

# Create temporary directory and go module for dependency fetching
cd $(mktemp -d)
go mod init test

# Write the test file below into this directory

go mod tidy
go test .

// main_test.go (any file name ending in _test.go works)
package main

import (
	"context"
	"fmt"
	"sync"
	"testing"

	"cloud.google.com/go/bigquery"
	"github.com/stretchr/testify/require"
	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

const (
	bqPort      = "9050/tcp"
	testProject = "project"
)

// startBigQuery spins up a test container and blocks until it's ready. Only one
// container can be started per test binary.
var startBigQuery = sync.OnceValues(func() (testcontainers.Container, error) {
	ctx := context.Background()
	return testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: testcontainers.ContainerRequest{
			Image: "ghcr.io/goccy/bigquery-emulator:latest",
			Cmd: []string{
				"--project=" + testProject,
			},
			ExposedPorts: []string{bqPort},
			WaitingFor:   wait.ForExposedPort(),
		},
		Started: true,
	})
})

func TestBreak(t *testing.T) {
	ctx := context.Background()

	// Startup emulator and create client
	container, err := startBigQuery()
	require.NoError(t, err)

	addr, err := container.Endpoint(ctx, "")
	require.NoError(t, err)

	endpoint := "http://" + addr
	t.Logf("bigquery endpoint: %s", endpoint)
	client, err := bigquery.NewClient(
		ctx,
		testProject,
		option.WithoutAuthentication(),
		option.WithEndpoint(endpoint),
	)
	require.NoError(t, err)
	t.Log("bigquery client: connected")

	// Bootstrap BQ datasets
	ds := client.Dataset("main")
	err = ds.Create(ctx, nil)
	require.NoError(t, err)
	t.Logf(
		"main dataset created: %s",
		ignoreErr(ds.Identifier(bigquery.StandardSQLID)),
	)

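	t.Cleanup(func() {
		// The dataset still contains the "data" table at cleanup time, so
		// delete it together with its contents.
		require.NoError(t, ds.DeleteWithContents(ctx))
	})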

	table := ds.Table("data")
	err = table.Create(ctx, &bigquery.TableMetadata{
		Schema: bigquery.Schema{
			&bigquery.FieldSchema{Name: "value", Type: bigquery.NumericFieldType},
		},
	})

	require.NoError(t, err)
	t.Logf(
		"table created: %s",
		ignoreErr(table.Identifier(bigquery.StandardSQLID)),
	)

	// Insert 10 rows with value:n set
	rows := []saver{}
	for n := range 10 {
		rows = append(rows, saver{n})
	}
	inserter := table.Inserter()
	require.NoError(t, inserter.Put(ctx, rows))
	t.Log("rows: inserted")

	// Query all rows with SQL and confirm that the expected number exists
	q := client.Query(fmt.Sprintf("SELECT * FROM %s", ignoreErr(table.Identifier(bigquery.StandardSQLID))))

	it, err := q.Read(ctx)
	require.NoError(t, err)

	var cnt int
	for {
		row := []bigquery.Value{}
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		require.NoError(t, err)
		cnt++
	}
	require.EqualValues(t, 10, cnt)

	// Query table metadata and assert that the number of rows matches
	md, err := table.Metadata(ctx)
	require.NoError(t, err)
	// testify expects (expected, actual); this is the assertion that fails
	require.EqualValues(t, 10, md.NumRows)
}

type saver struct {
	value int
}

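// Save implements bigquery.ValueSaver. It uses a value receiver so that the
// elements of a []saver satisfy the interface, not only *saver.
func (s saver) Save() (map[string]bigquery.Value, string, error) {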
	return map[string]bigquery.Value{"value": s.value}, "", nil
}

func ignoreErr[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}
	return v
}

Anything else we need to know?

No response

coxley added the bug label Jul 16, 2024