Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to helgrind error for accessing pending_work_item_count_ptr #395

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nishudeshu
Copy link
Contributor

The pending_work_item_count_ptr is not used when threadpool_schedule_work is used. pending_work_item_count_ptr is always initialized to NULL in threadpool_create and when pending_work_item_count_ptr is initialized to NULL in reallocate task array, an exlusive lock is used. So, removing the NULL assignment in the threadpool_schedule_work.

@nishudeshu
Copy link
Contributor Author

/azp run Azure-C-Pal

Copy link

azure-pipelines bot commented Oct 22, 2024

Azure Pipelines successfully started running 1 pipeline(s).

#Resolved

@nishudeshu
Copy link
Contributor Author

nishudeshu commented Oct 22, 2024

@nishudeshu nishudeshu enabled auto-merge (squash) October 22, 2024 23:00
@@ -237,7 +237,7 @@ MOCKABLE_FUNCTION(, int, threadpool_schedule_work, THANDLE(THREADPOOL), threadpo

**SRS_THREADPOOL_LINUX_07_048: [** If reallocating the task array fails, `threadpool_schedule_work` shall fail and return a non-zero value. **]**

**SRS_THREADPOOL_LINUX_07_049: [** `threadpool_schedule_work` shall initialize `pending_work_item_count_ptr` with `NULL` then copy the work function and work function context into insert position in the task array and assign `0` to the return variable to indicate success. **]**
**SRS_THREADPOOL_LINUX_07_049: [** `threadpool_schedule_work` shall copy the work function and work function context into insert position in the task array and assign `0` to the return variable to indicate success. **]**
Copy link
Member

@dcristoloveanu dcristoloveanu Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assign 0 to the return variable to indicate success

and return 0. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this recurring and it seems that it is not going away, so shall wait for it. We can discuss in person what the problem is, but I think it is better to deal with it once and for all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dan,

Thanks for the comments. The changes you are mentioning was done in the old commits on request of other reviewers.
https://github.com/Azure/c-pal/pull/373/files/04ed1648d62fe6742bd7056d17c3a43a8175e18a#r1775937979
Please check above link to see the comments.

Since this story is really to unblock anyone using the old threadpool_schedule_work or someone facing failing tests, I sent the code review with minimal changes.

Can we address your other comments in the threadpool_create_work_item work that we will have to do later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nishudeshu The short answer is no, we should address these comments now. Kicking them down the road does not buy us anything and there is a high chance they would remain like this in the code, which would be less than ideal.

Strictly on the 2 points:

  1. assign 0 to the return variable to indicate success => I have explained why we think 'shall return 0' is better. Changing this is trivial and we should not spend lots of energy in debating this. I assume you understood the reasoning already in the previous code review, but if that's not the case, I can explain it again.

  2. The comments about pending_work_item_count_ptr in the code make it clear that even if you fix the current helgrind failure, other issues exist and they will surface again. Let's address them now, when the context is in front of us!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcristoloveanu

  1. I can do that. I have tagged the previous comments above where I was asked to make the changes since returning 0 or NULL indicates an immediate return which may not be always true in case of a cleanup before return at the end of the code.
  2. What other issues exist?

Copy link
Member

@dcristoloveanu dcristoloveanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

volatile_atomic int32_t *pending_work_item_count_ptr;

A comment next to this indicating what it represents would be very good.
I started at reviewing the change which removes the setting of this field to NULL, but I realized I do not really get what the field contains.


Refers to: linux/src/threadpool_linux.c:75 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

                    /* Codes_SRS_THREADPOOL_LINUX_05_040: [ threadpool_work_func shall save pending_work_item_count_ptr is not NULL in is_pending_work_item_count_ptr_not_null. ]*/

Honestly speccing this level of detail where we say we set a local variable to a certain value is extreme and not really helping. We want to spec at level of the behavior of the public APIs of the module, not at the internals of variable assignment.

We can discuss about it in person tomorrow.


Refers to: linux/src/threadpool_linux.c:222 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

                        /* Codes_SRS_THREADPOOL_LINUX_05_045: [ threadpool_work_func shall send wake up signal to single listener for the address in pending_work_item_count_ptr. ]*/

What guarantees that between line 223 (when we computed that pending_work_item_count_ptr is not NULL) and line 232 (when we're accessing it), the value of pending_work_item_count_ptr has not changed?


Refers to: linux/src/threadpool_linux.c:233 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

                        interlocked_decrement(threadpool->task_array[current_index].pending_work_item_count_ptr);

We mark with (void) the fact that we ignore on purpose return values in the code. This is one such case where interlocked_decrement returns a value, but we ignore it by design.


Refers to: linux/src/threadpool_linux.c:232 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

                    /* Codes_SRS_THREADPOOL_LINUX_05_042: [ If the is_pending_work_item_count_ptr_not_null is TRUE then: ]*/

TRUE us a Windows #define, when using bool one uses true and false (lower case).
In any case, please see the other comment that I don't think we should spec how we set this local variable and how we check it's value. Too much coupling here to implementation details.


Refers to: linux/src/threadpool_linux.c:226 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

dcristoloveanu commented Oct 23, 2024

                interlocked_increment(&threadpool_work_item->pending_work_item_count);

(void)


Refers to: linux/src/threadpool_linux.c:914 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@dcristoloveanu
Copy link
Member

                interlocked_increment(&threadpool_work_item->pending_work_item_count);

Here and everywhere applicable.


In reply to: 2430570839


Refers to: linux/src/threadpool_linux.c:914 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@nishudeshu
Copy link
Contributor Author

                        /* Codes_SRS_THREADPOOL_LINUX_05_045: [ threadpool_work_func shall send wake up signal to single listener for the address in pending_work_item_count_ptr. ]*/

At line 223, we are checking if pending_work_item_count_ptr is NULL or not. It is NULL for the threadpool_schedule_work which is an old implementation.
It is NOT NULL for the new implementation aka threadpool_schedule_work_item which is guaranteed to be not NULL because that function assigns it a value at line 915 in an exclusive lock.


In reply to: 2430567049


Refers to: linux/src/threadpool_linux.c:233 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@nishudeshu
Copy link
Contributor Author

                    /* Codes_SRS_THREADPOOL_LINUX_05_040: [ threadpool_work_func shall save pending_work_item_count_ptr is not NULL in is_pending_work_item_count_ptr_not_null. ]*/

Done


In reply to: 2430565113


Refers to: linux/src/threadpool_linux.c:222 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

@nishudeshu
Copy link
Contributor Author

                        interlocked_decrement(threadpool->task_array[current_index].pending_work_item_count_ptr);

Done


In reply to: 2430567973


Refers to: linux/src/threadpool_linux.c:232 in d2f4b01. [](commit_id = d2f4b01, deletion_comment = False)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants