-
Notifications
You must be signed in to change notification settings - Fork 19
[DB-40533] Add support for new type VECTOR(<dim>,DOUBLE) #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Add encode/decode support for new VECTOR datatype. A result column of VECTOR(<dim>, DOUBLE) shows as type datatype.VECTOR_DOUBLE in the metadata description. It is returned as datatype.Vector with subtype datatype.Vector.DOUBLE which can be used as a list. For a parameter to be detected as VECTOR(<dim>,DOUBLE) it also needs to use the datatype.Vector datatype. Alternatively, binding a string is also possible - which currently includes binding a list of numbers which will then be converted to string. This commit also adds tests for the described behavior.
- remove whitespace line - add documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider my comments on detecting the server version in the test.
Otherwise it looks OK to me.
pynuodb/datatype.py
Outdated
@@ -149,10 +151,37 @@ def __cmp__(self, other): | |||
return -1 | |||
|
|||
|
|||
class Vector(list): | |||
"""A specific type for SQL VECTOR(<dim>, DOUBLE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not quite right, is it? A Vector element doesn't have to be a DOUBLE (I mean, it does now but it might not in the future)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. It is supposed to be usable with other element types as well. Will fix that.
tests/nuodb_types_test.py
Outdated
# only activate this tests if tested against version 8 or above | ||
cursor.execute("select cast(substring_index(release_ver, '.', 1) as int)" | ||
" from system.nodes limit 1") | ||
row = cursor.fetchone() | ||
database_major_version = row[0] | ||
if database_major_version < 8: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a problem with this, per se.
I think it would be cleaner to obtain this information once somewhere in conftest.py and make it available there, for example maybe in the DATABASE_FIXTURE since that already creates a connection.
The current DATABASE_FIXTURE is the connection arguments but you could add this there and then extract it again in the nuodb_base.NuoBase._setup().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into that. If it isn't extremely hard (which I doubt) I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed that.
tests/nuodb_types_test.py
Outdated
cursor.execute("select cast(substring_index(release_ver, '.', 1) as int)" | ||
" from system.nodes limit 1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be safer and easier to use geteffectiveversion() and compare based on that integer value, rather than parsing the version string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just didn't know we have that function. Much better idea to use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to that function now. The version numbering makes it harder to see this is about version 8, but appart from that it is more readable and less brittle that way.
- correct documentation for VECTOR type - check system version using GETEFFECTIVEPLATFORMVERSION() - add system version information to separate field in test class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few small cleanups you can implement, or not, as you like. Thanks for this work!!
tests/conftest.py
Outdated
effective_version = row[0] | ||
system_information['effective_version'] = effective_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little funny to me to assign a new local variable effective_version, just to then re-assign it again... I spent some seconds looking for where else this new variable was used. But it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that style of coding with names for stuff. But this time I might have gone too far. Will shorten it a bit :-)
tests/nuodb_base.py
Outdated
|
||
# Verify the database is up and has a running TE | ||
dbname = database['database'] | ||
dbname = database['connect_args']['database'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use self.conect_args['database']
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to make this a simple substitution for review purposes. Will change that, there is no benefit to doing it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datatype.__all__ should include Vector.
As user of pynuodb would need/want to use pynuodb.Vector not pynuodb.datatype.Vector
Good point. |
Add encode/decode support for new VECTOR datatype.
A result column of VECTOR(, DOUBLE) shows as type datatype.VECTOR_DOUBLE in the metadata description.
It is returned as datatype.Vector with subtype
datatype.Vector.DOUBLE which can be used as a list.
For a parameter to be detected as VECTOR(,DOUBLE) it also needs to use the datatype.Vector datatype. Alternatively, binding a string is also possible - which currently includes binding a list of numbers which will then be converted to string.
This commit also adds tests for the described behavior.