This repository has been archived by the owner on Sep 28, 2022. It is now read-only.

need to fix perf issue with integer import in importbatch.go #263

Open
jaffee opened this issue Mar 19, 2020 · 0 comments

jaffee commented Mar 19, 2020

Detailed in a TODO comment as usual.

		// TODO(jaffee) I think this may be very inefficient. It looks
		// like we're copying the `ids` and `values` slices over
		// themselves (an O(n) operation) for each nullIndex so this
		// is effectively O(n^2). What we could do is iterate through
		// ids and values each once, while simultaneously iterating
		// through nullindices and keeping track of how many
		// nullIndices we've passed, and so how far back we need to
		// copy each item.
		//
		// It was a couple weeks ago that I wrote this code, and I
		// vaguely remember thinking about this, so I may just be
		// missing something now. We should benchmark on what should
		// be a bad case (an int field which is mostly null), and see
		// if the improved implementation helps a lot.

Now I've actually run into it:

		// Update: I ran into this on a largish batch size (4M) with a
		// very small percentage of nils (0.5%) - was very obvious in
		// the CPU profile
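
The single-pass approach described in the TODO could look roughly like the sketch below. It is not the code from importbatch.go: `compactNonNull` is a hypothetical helper name, and it assumes `ids` and `values` have the same length and that `nullIndices` is sorted ascending, contains no duplicates, and only holds in-range positions. Each element is moved at most once, so the work is O(n) rather than one O(n) copy per null index.

```go
package main

import "fmt"

// compactNonNull is a hypothetical helper (not taken from importbatch.go).
// It removes the entries at the positions listed in nullIndices from both
// ids and values in a single pass, reusing the backing arrays.
//
// Assumptions: len(ids) == len(values); nullIndices is sorted ascending,
// has no duplicates, and every index is < len(ids).
func compactNonNull(ids []uint64, values []int64, nullIndices []uint64) ([]uint64, []int64) {
	w := 0 // next write position in ids/values
	n := 0 // position of the next null index we have not passed yet
	for r := range ids {
		if n < len(nullIndices) && nullIndices[n] == uint64(r) {
			// r is a null position: skip it and remember that we passed it.
			n++
			continue
		}
		// Shift the item back by the number of nulls passed so far (r - w).
		ids[w] = ids[r]
		values[w] = values[r]
		w++
	}
	return ids[:w], values[:w]
}

func main() {
	ids := []uint64{10, 11, 12, 13, 14}
	values := []int64{1, 0, 3, 0, 5}
	ids, values = compactNonNull(ids, values, []uint64{1, 3})
	fmt.Println(ids, values) // [10 12 14] [1 3 5]
}
```

Benchmarking this against the current code on a mostly-null int field, and on the 4M batch with 0.5% nils mentioned above, would show whether the change is worth making.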