
MDEV-35475 Assertion `!rec_offs_nth_extern(offsets1, n)' failed in cmp_rec_rec_simple_field #3656

Open · wants to merge 1 commit into base: 10.11

Conversation
Thirunarayanan (Member)

  • The Jira issue number for this PR is: MDEV-35475

Description

Problem:

During a bulk insert operation, InnoDB wrongly stores primary key fields in externally stored off-page blobs. This leads to an assertion failure when the records are compared.

Solution:

row_merge_buf_blob(): Store the primary key fields inline. Store variable-length field data externally, based on the row format of the table.

How can this PR be tested?

./mtr innodb.alter_copy_bulk innodb.insert_into_empty

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@CLAassistant

CLA assistant check: Thank you for your submission! Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@dr-m (Contributor) left a comment

Is any part of this applicable to ALTER TABLE…ALGORITHM=INPLACE in 10.5 or 10.6? Did you try to modify the test case to exercise that code path?

Resolved review threads:
mysql-test/suite/innodb/r/insert_into_empty,4k.rdiff (outdated)
storage/innobase/row/row0merge.cc (outdated)
@Thirunarayanan force-pushed the 10.11-MDEV-35475 branch 3 times, most recently from 5f9333a to 124ae9b on November 26, 2024 at 08:36
@dr-m (Contributor) left a comment

I think that this had better be tested also with ROW_FORMAT=REDUNDANT and ROW_FORMAT=COMPACT, possibly by means of innodb_default_row_format.inc.

Resolved review threads:
storage/innobase/row/row0merge.cc (outdated)
storage/innobase/row/row0merge.cc
storage/innobase/row/row0merge.cc (outdated)
…p_rec_rec_simple_field

Problem:
=======
InnoDB wrongly stores primary key fields in externally
stored off-page blobs during a bulk insert operation.
This leads to an assertion failure.

Solution:
========
row_merge_buf_blob(): Should store the primary key fields
inline. Store the variable length field data externally
based on the row format of the table.

row_merge_buf_write(): check whether the record size exceeds
the maximum record size.

row_merge_copy_blob_from_file(): Construct the tuple based on
the variable-length fields.
  dberr_t err= os_file_write(
    IORequestWrite, "(bulk insert)", blob_file->fd,
-   field->data, blob_file->offset, len);
+   (byte*)field->data + local_len, blob_file->offset,
Contributor comment:
static_cast<const byte*>(field->data) + local_len would be safer.

Comment on lines +1076 to 1087

+ memcpy(data, field->data, local_len);
  /* Write zeroes for first 8 bytes */
- memset(data, 0, 8);
+ memset(data + local_len, 0, 8);
  /* Write offset for next 8 bytes */
- mach_write_to_8(data + 8, val);
+ mach_write_to_8(data + local_len + 8, val);
  /* Write length of the blob in 4 bytes */
- mach_write_to_4(data + 16, len);
- blob_file->offset+= field->len;
+ mach_write_to_4(data + local_len + 16, field->len - local_len);
+ blob_file->offset+= (field->len - local_len);
  blob_file->n_rec++;
- dfield_set_data(field, data, BTR_EXTERN_FIELD_REF_SIZE);
+ dfield_set_data(field, data,
+                 local_len + BTR_EXTERN_FIELD_REF_SIZE);
  dfield_set_ext(field);
Contributor comment:
We could invoke dfield_set_data() and dfield_set_ext() at the start and then do data += local_len to reduce the diff and to possibly simplify the code, in case the common subexpression elimination (CSE) phase of the compiler might miss the repeated data + local_len.

Comment on lines 1114 to 1118

  for (ulint i= 0; i < n_fields; i++)
  {
-   dfield_t *field= &entry->fields[i];
+   dfield_t *field= &fields[i];
-   if (dfield_is_null(field) || field->len <= 2000)
+   if (i < index->n_uniq || dfield_is_null(field))
      continue;
Contributor comment:
Could we just adjust the initial value of the loop variable?

for (auto i= index->first_user_field(); i < n_fields; i++)

Comment on lines +1162 to +1164
if (!blob_heap) {
blob_heap = mem_heap_create(100);
}
Contributor comment:
How was the initial size 100 bytes determined? If it is too small, a subsequent mem_heap_alloc() might just waste that block and allocate a new one. If it is too large, the end of the block would go wasted.

Comment on lines 3562 to 3566
{
const uint blob_prefix= dict_table_has_atomic_blobs(index->table)
? 0
: REC_ANTELOPE_MAX_INDEX_COL_LEN;
for (ulint i = 0; i < dtuple_get_n_fields(tuple); i++)
Contributor comment:
Could the loop start as follows?

  for (auto i= index->first_user_field(); i < tuple->n_fields; i++)

No clustered index field before that may be stored externally.
