Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use #exec_insert over #exec_query because there's no field selections #36

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

daemonsy
Copy link

What's up?

Hi there,

Thanks for the gem! It's awesome. While I was trying it, I got an error on calling worker.save!.

NoMethodError: undefined method `fields' for nil:NilClass
	from /Users/saw/.gem/ruby/2.3.6/gems/activerecord-3.2.22.9/lib/active_record/connection_adapters/mysql2_adapter.rb:219:in `exec_query'
	from (irb):2
	from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands/console.rb:47:in `start'
	from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands/console.rb:8:in `start'
	from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands.rb:41:in `<top (required)>'
	from script/rails:6:in `require'
	from script/rails:6:in `<main>'

Investigation

It seems that the basic skeleton code of Connection#exec_query expects a result set that has field selections. However, since we're doing a bulk insert, it doesn't seem like this will work.

def exec_query(sql, name = 'SQL', binds = [])
  result = execute(sql, name)
  ActiveRecord::Result.new(result.fields, result.to_a)
end

However, there is an Connection#exec_insert method that does not expect return results.

I suspect exec_query is used because of return_primary_keys where we call a SELECT after the insert.

What this does

  • Use exec_query when there primary keys are required
  • Use exec_insert otherwise

@daemonsy
Copy link
Author

Hi @jamis a small bump, hopefully you will have a chance to take a look in the coming days. Have a good weekend.

Copy link
Collaborator

@mberlanda mberlanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @daemonsy! Did you manage to reproduce it on rails 4 and 5 as well?

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb#L115-L118

def exec_insert(sql, name = nil, binds = [], pk = nil, sequence_name = nil)
  sql, binds = sql_for_insert(sql, pk, nil, sequence_name, binds)
  exec_query(sql, name, binds)
end

AFAIK return primary key is supported for postgresql adapters only : https://github.com/jamis/bulk_insert/blob/master/README.md#return-primary-keys-postgresql-postgis

Would it work for mysql2 adapter?

@daemonsy
Copy link
Author

It seems that at least it didn't error on the mysql2 adapter.

I'll test it on Rails 4/5 and also Postgres and let you know what I find

@mberlanda
Copy link
Collaborator

It should be fine for postgresql I guess.
For MySQL is a little bit more complicated since you cannot emulate the behavior of postgresql's returning out of the box:

Returning equivalent in MySQL
https://stackoverflow.com/q/9027008/5687152

For a single insertion it would make sense to retrieve LAST_INSERT_ID().

However for bulk insert is more complicates:

https://stackoverflow.com/q/7333524/5687152

InnoDB guarantees sequential numbers for AUTO INCREMENT when doing bulk inserts, provided innodb_autoinc_lock_mode is set to 0 (traditional) or 1 (consecutive). Consequently you can get the first ID from LAST_INSERT_ID() and the last by adding ROW_COUNT()-1.

Even if we would like to implement this combination gets quite hard in cases like the one I am facing with a MySQL DB where the autoincrement offset is changing according to the table.

More details on the keywords from an old (and still used a lot) MySQL version doc:

https://dev.mysql.com/doc/refman/5.6/en/information-functions.html#function_last-insert-id

I am not excluding that there is a smart and straightforward way to make pg logic available in this cases. (E.g. it could make sense to use this combination if you can easily retrieve the autoincrement offset and you retrieve the row diff and the last id within the same transaction as the insert)

@jamis
Copy link
Owner

jamis commented Aug 1, 2018

Sorry to be slow to chime in here. Yeah, the "return primary keys" is Postgres-only, and as @mberlanda has pointed out, it's non-trivial to implement in MySQL (or, AFAIK, other databases in general).

I think the right way forward, here, is to only use exec_query when @return_primary_keys is true, and the adapter is Postgres. Otherwise, it should use execute or exec_insert.

(Interestingly, the default implementation of exec_insert calls exec_query...though perhaps not in the MySQL adapter?)

@mberlanda
Copy link
Collaborator

mberlanda commented Aug 2, 2018

I have checked the current implementation of exec_query for MySQL adapter on the ActiveRecord master branch:

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/mysql/database_statements.rb

def exec_query(sql, name = "SQL", binds = [], prepare: false)
  if without_prepared_statement?(binds)
    execute_and_free(sql, name) do |result|
      ActiveRecord::Result.new(result.fields, result.to_a) if result
    end
  else
    exec_stmt_and_free(sql, name, binds, cache_stmt: prepare) do |_, result|
      ActiveRecord::Result.new(result.fields, result.to_a) if result
    end
  end
end

This is preventing from raising NoMethodError: undefined method 'fields' for nil:NilClass. However, since no else is provided the result returned will be nil.

I will suggest to create a custom error and raise when return primary_key is provided with adapters other than pg. I can do a quick pr!

@daemonsy
Copy link
Author

daemonsy commented Aug 2, 2018

Hey @mberlanda @jamis sorry I haven't dug into the code, I see that Mauro is on top of it already. To take a step back, the issue I uncovered was that undefined method fields' for nil:NilClass` and it is caused bulk insert returning nothing (mysql) and that caused the error instantiating the result set.

At that point, my thinking is just that exec_query is used when you except results and exec_insert doesn't expect results.

Following Mauro's investigation and reading the code for the different adapters, that seems to be case. In PG's case, exec_query just naively instantiates results. In MySQL's case (copied from above), it has a guard clause.

**NB: PG's exec_insert seem to have something that deals with the return_primary_key option. I'll take a look. **

I agree with @jamis 's comment, we should just use the two methods accordingly. I can update the PR to make sure it works with both PG + (return primary keys) and MySQL.

exec_query

PG

def exec_query(sql, name = "SQL", binds = [], prepare: false)
  execute_and_clear(sql, name, binds, prepare: prepare) do |result|
    types = {}
    fields = result.fields
    fields.each_with_index do |fname, i|
      ftype = result.ftype i
      fmod  = result.fmod i
      types[fname] = get_oid_type(ftype, fmod, fname)
    end
    ActiveRecord::Result.new(fields, result.values, types)
  end
end

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L80

MySQL

def exec_query(sql, name = "SQL", binds = [], prepare: false)
  if without_prepared_statement?(binds)
    execute_and_free(sql, name) do |result|
      ActiveRecord::Result.new(result.fields, result.to_a) if result
    end
  else
    exec_stmt_and_free(sql, name, binds, cache_stmt: prepare) do |_, result|
      ActiveRecord::Result.new(result.fields, result.to_a) if result
    end
  end
end

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/mysql/database_statements.rb#L31

@daemonsy
Copy link
Author

Sorry I haven't acted on this. Will try to rally sometime to do it within this week.

@mberlanda
Copy link
Collaborator

ok! Don't worry! I hope I will have the time to finish soon #37 to allow testing against the real database without mocking the adapter. In this case we should be able to validate that this has the expected behaviour for mysql2 adapter 😄

Copy link
Collaborator

@mberlanda mberlanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @daemonsy I updated the test matrix to validate any future change against database instances. Could you please validate if this change is still needed in v1.9.0?
In case, you may rebase your change and it should be safe after CI passes. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants