LG-15655: Offload reCAPTCHA annotations to worker job (Part 1 of 2) #11883
changelog: Internal, Performance, Offload reCAPTCHA annotations to worker job
It is considered a good practice to keep the job code light - https://github.com/toptal/active-job-style-guide?tab=readme-ov-file#business-logic-in-jobs. Doesn't it make sense to do so and call the
app/jobs/recaptcha_annotate_job.rb
    key: -> { "#{self.class.name}-#{queue_name}-#{arguments.last[:assessment_id]}" },
  )

  def perform(assessment_id:, reason: nil, annotation: nil)
Noticed this in RecaptchaAnnotator as well - why do we allow this to be nil if we are always supplying a reason?
The thought process here was that both are optional arguments in code because both are optional parameters of the annotation API.
I don't have a strong feeling to not make it required based on current usage, though I could imagine that it might cause confusion down the line if someone didn't realize they didn't have to provide a reason.
I hear you, but I feel we should take a stand here - I can't imagine sending an annotation without a reason. I remember struggling with this when implementing the previous story.
Hmm, one of the values in https://cloud.google.com/recaptcha/docs/reference/rest/v1/projects.assessments/annotate#reason is REASON_UNSPECIFIED: "Unspecified reason. Do not use." (sic)
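To make the trade-off in this thread concrete, here is a minimal sketch of the two signatures under discussion. `annotate_optional`, `annotate_required`, and `missing_reason_error` are hypothetical stand-ins, not the actual job or `RecaptchaAnnotator` code. The only point is that Ruby rejects a call that omits a required keyword argument, so a missing reason fails loudly at the call site instead of flowing through as `nil`.

```ruby
# Hypothetical stand-ins for the two signatures discussed above; neither is
# the real RecaptchaAnnotateJob#perform.

# Optional reason: a caller can omit it and nil flows through silently.
def annotate_optional(assessment_id:, reason: nil, annotation: nil)
  { assessment_id: assessment_id, reason: reason, annotation: annotation }
end

# Required reason: Ruby raises ArgumentError if the caller omits it.
def annotate_required(assessment_id:, reason:, annotation: nil)
  { assessment_id: assessment_id, reason: reason, annotation: annotation }
end

# Returns the error message raised when the required keyword is missing,
# or nil if the call unexpectedly succeeds.
def missing_reason_error
  annotate_required(assessment_id: 'assessment-1') # illustrative ID
  nil
rescue ArgumentError => e
  e.message
end
```

With the optional form, `annotate_optional(assessment_id: 'assessment-1')` succeeds with a `nil` reason. With the required form, the same call raises before any request could be made.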
That's a good thought. I think we'd have to find another way to generate the logged hash value, but that should be reasonable to inline at the analytics call site I think.
I am actually not seeing anything in https://github.com/bensheldon/good_job?tab=readme-ov-file#concurrency-controls about the last one winning. Can you point me to it?
The "last wins" isn't overtly documented, but it's the behavior I'm expecting out of the
On closer testing, it actually appears to be the opposite behavior of what we want: Namely, I tested by pausing the worker job execution while I accumulated multiple annotations (initiating and passing 2FA), checking the results by:
After some closer examination of the documentation and code internals of Allowed default behavior:
Behavior with
I also tested this locally, verifying that jobs are processed in order and that subsequent enqueued jobs don't start until after the first job has finished. I tested this with a bit of hacky debugging code, combined with the approach in the previous comment of "queueing up" jobs before starting
diff --git a/app/jobs/recaptcha_annotate_job.rb b/app/jobs/recaptcha_annotate_job.rb
index 9ac0cc218b..09b476e905 100644
--- a/app/jobs/recaptcha_annotate_job.rb
+++ b/app/jobs/recaptcha_annotate_job.rb
@@ -13,2 +13,5 @@ class RecaptchaAnnotateJob < ApplicationJob
   def perform(assessment_id:, reason:, annotation: nil)
+    puts 'run - %s - %s' % [reason, assessment_id]
+    sleep 1
+    puts 'after run - %s - %s' % [reason, assessment_id]
     RecaptchaAnnotator.annotate(assessment_id:, reason:, annotation:)
With this and a breakpoint at
Another scenario I'll want to test, prompted by
Scenario:
🎫 Ticket
LG-15655
🛠 Summary of changes
Implements a new worker job to send annotations for Google reCAPTCHA Enterprise.
The intent is to avoid these external API requests slowing user response times in the main application. Since these annotations are already treated as "fire and forget", they're a prime candidate to worker-ize, particularly since we don't rely on any result of the job.
The primary challenge with this is ensuring that annotations for a particular assessment are handled in order in a "last wins" approach. We don't necessarily need all annotations to be sent, as long as the last annotation enqueued is the one to be sent. This is achieved using GoodJob's concurrency helpers with a maximum enqueue_limit of 1 (see docs).
The changes here only include the implementation of the job, to avoid deployment issues. A follow-up pull request will update the RecaptchaAnnotator class to enqueue jobs.
📜 Testing Plan
Verify that build passes.
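The summary asks for "last wins", while testing in the conversation found that the enqueue limit appeared to do the opposite. The difference between the two policies can be sketched with a toy in-memory queue. This is not GoodJob and does not use its API: `ToyQueue`, its `policy:` option, and `surviving_reason` are hypothetical names, the `default` queue name in the key is an assumption, and the two reason strings are only illustrative values for the initiate-then-pass 2FA sequence used in testing. The sketch assumes, in line with that observation, that a limit-based policy drops the newer job when one is already enqueued for the same key.

```ruby
# Toy in-memory queue; NOT GoodJob. It only models "at most one job may be
# enqueued per concurrency key" under two different policies.
class ToyQueue
  attr_reader :jobs

  def initialize(policy:)
    @policy = policy # :reject_new or :replace_existing
    @jobs = {}       # concurrency key => job arguments
  end

  # Returns true if the job was stored, false if it was dropped.
  def enqueue(key:, reason:)
    # Limit reached under :reject_new, so the newer job is dropped.
    return false if @jobs.key?(key) && @policy == :reject_new

    @jobs[key] = { reason: reason }
    true
  end
end

# Enqueues two annotations for the same assessment and reports which one
# is still waiting in the queue.
def surviving_reason(policy)
  queue = ToyQueue.new(policy: policy)
  key = 'RecaptchaAnnotateJob-default-assessment-1' # illustrative key value
  queue.enqueue(key: key, reason: 'INITIATED_TWO_FACTOR')
  queue.enqueue(key: key, reason: 'PASSED_TWO_FACTOR')
  queue.jobs[key][:reason]
end

puts surviving_reason(:reject_new)       # the first enqueued job is kept
puts surviving_reason(:replace_existing) # the last enqueued job is kept
```

Under `:reject_new` the earlier annotation is the one that would be sent, which is why a plain enqueue limit does not give "last wins" on its own.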