Improve documentation around calling .free #522

Open
drewbitt opened this issue Apr 25, 2024 · 4 comments

@drewbitt

Reproduction

import { tableFromJSON, tableToIPC } from "apache-arrow";
import * as Parquet from "parquet-wasm";

// Sample data
const testData = [
  { id: 1, name: "John" },
  { id: 2, name: "Jane" },
];

// Create an Arrow table from the test data
const arrowTable = tableFromJSON(testData);
console.log(arrowTable);

// Create a Parquet Table from the Arrow table
const wasmTable = Parquet.Table.fromIPCStream(tableToIPC(arrowTable, "stream"));
console.log(wasmTable);

// Write the Parquet table to a buffer
const writerProperties = new Parquet.WriterPropertiesBuilder().build();
const parquetData = Parquet.writeParquet(wasmTable, writerProperties);

// Attempt to free the Parquet Table
wasmTable.free();
Output

tsx json-parquet-2.ts

Table {
  schema: Schema {
    fields: [ [Field], [Field] ],
    metadata: Map(0) {},
    dictionaries: Map(1) { 0 => [Utf8] },
    metadataVersion: 4
  },
  batches: [ RecordBatch { schema: [Schema], data: [Data] } ],
  _offsets: Uint32Array(2) [ 0, 2 ]
}
Table { __wbg_ptr: 2369000 }
/Users/drewbitt/Repos/Pantomath/benchmarking/node_modules/parquet-wasm/node/parquet_wasm.js:3359
    throw new Error(getStringFromWasm0(arg0, arg1));
          ^

Error: null pointer passed to rust
    at module.exports.__wbindgen_throw (/Users/drewbitt/Repos/x/benchmarking/node_modules/parquet-wasm/node/parquet_wasm.js:3359:11)
    at wasm://wasm/014c002a:wasm-function[6573]:0x405d03
    at wasm://wasm/014c002a:wasm-function[6574]:0x405d10
    at wasm://wasm/014c002a:wasm-function[3297]:0x3a06de
    at wasm://wasm/014c002a:wasm-function[4074]:0x3c8bd1
    at Table.free (/Users/drewbitt/Repos/x/benchmarking/node_modules/parquet-wasm/node/parquet_wasm.js:2095:14)
    at <anonymous> (/Users/drewbitt/Repos/x/benchmarking/json-parquet-2.ts:23:11)
    at Object.<anonymous> (/Users/drewbitt/Repos/x/benchmarking/json-parquet-2.ts:23:16)
    at Module._compile (node:internal/modules/cjs/loader:1376:14)
    at Object.S (/Users/drewbitt/.local/share/mise/installs/npm-tsx/4.7.1/lib/node_modules/tsx/dist/cjs/index.cjs:1:1292)

Node.js v20.11.0

I'm not very familiar with this space, so let me know if this is expected for some reason. Thanks!

@kylebarron
Owner

kylebarron commented Apr 25, 2024

Yeah... this part can be confusing. The tl;dr is that writeParquet frees the table itself. We should probably clarify this in the function's docstring.

Functions exported from Rust through wasm-bindgen can take their inputs either by reference or by value, and taking an input by value consumes the input object. Here, writeParquet takes the input table by value, so it consumes the table's data and a later call to .free() on that table fails.

You can always inspect the __wbg_ptr property of a wasm object to see whether its data has been freed. If the pointer is 0, it is a null pointer and the data has already been freed.

> let wasm = require('parquet-wasm/node')
> let properties = new wasm.WriterPropertiesBuilder().build()
undefined
> properties.__wbg_ptr
2621480
> properties.free()
undefined
> properties.__wbg_ptr
0
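
The __wbg_ptr check above can be wrapped in a small guard so that .free() is only called on objects that still own their data. This is a minimal sketch, assuming the generated classes keep exposing __wbg_ptr (an internal wasm-bindgen detail that may change between versions). WasmOwned, isFreed, freeIfAlive, consume, and the FakeTable stand-in are hypothetical names for illustration, not part of parquet-wasm.

```typescript
// Shape shared by wasm-bindgen generated classes, as observed in this thread.
// `__wbg_ptr` is an internal field, so relying on it is an assumption.
interface WasmOwned {
  __wbg_ptr: number;
  free(): void;
}

// True when the underlying wasm memory has already been released or consumed.
function isFreed(obj: WasmOwned): boolean {
  return obj.__wbg_ptr === 0;
}

// Calls free() only if the object still owns its data.
// Returns true if free() was called, false if there was nothing to free.
function freeIfAlive(obj: WasmOwned): boolean {
  if (isFreed(obj)) return false;
  obj.free();
  return true;
}

// Hypothetical stand-in that mimics the behavior described in this thread,
// so the sketch runs without parquet-wasm installed.
class FakeTable implements WasmOwned {
  __wbg_ptr = 2369000;
  free(): void {
    if (this.__wbg_ptr === 0) throw new Error("null pointer passed to rust");
    this.__wbg_ptr = 0;
  }
}

// Mimics a by-value export such as writeParquet: it consumes its input.
function consume(table: FakeTable): void {
  table.__wbg_ptr = 0;
}

const table = new FakeTable();
consume(table);
// The table was consumed, so the guard skips free() instead of throwing.
const freedNow = freeIfAlive(table);
```

With the real library, the same guard would apply to the table passed to writeParquet. The simpler fix for the reproduction above is to drop the wasmTable.free() call, since writeParquet has already consumed the table.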

@drewbitt
Author

Thank you! That was helpful

I think adding that to the docstring and not erroring when this happens - stopping all execution - would be nice to have. A console.warn would be more suitable.

@kylebarron
Owner

> not erroring when this happens - stopping all execution - would be nice to have

That's not something I can control. The error is thrown by the bindings that Rust's wasm-bindgen generates automatically.
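
Since the throw comes from the generated bindings, any softer behavior has to live on the caller's side. This is a minimal sketch of that idea, assuming the double-free failure surfaces as an ordinary thrown Error, as in the stack trace above. Freeable, freeOrWarn, and the FakeHandle stand-in are hypothetical names used only for illustration.

```typescript
// Anything with a free() method, e.g. a wasm-bindgen generated object.
interface Freeable {
  free(): void;
}

// Calls free() and downgrades a thrown error to a console warning.
// Returns true if free() completed, false if it threw.
function freeOrWarn(obj: Freeable, label = "wasm object"): boolean {
  try {
    obj.free();
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    console.warn(`Could not free ${label}: ${message}`);
    return false;
  }
}

// Hypothetical stand-in that throws on a second free(), so the sketch
// runs without parquet-wasm installed.
class FakeHandle implements Freeable {
  private freed = false;
  free(): void {
    if (this.freed) throw new Error("null pointer passed to rust");
    this.freed = true;
  }
}

const handle = new FakeHandle();
const first = freeOrWarn(handle, "table");  // free() succeeds
const second = freeOrWarn(handle, "table"); // free() throws; a warning is logged
```

Note that this wrapper swallows every error raised by free(), not only the null-pointer case, so it can hide unrelated bugs.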

@drewbitt closed this as not planned on Apr 25, 2024
@kylebarron
Owner

Let's keep this open as a reminder to improve the documentation here

@kylebarron reopened this on Apr 25, 2024
@kylebarron added this to the 0.7.0 milestone on Aug 12, 2024
@kylebarron changed the title from "Crash on .free()" to "Improve documentation around calling .free" on Aug 12, 2024