How to model batch email sending #7678
-
I am trying to figure out how to handle a medium-sized notification requirement. I need to create a notification module that sends emails to about 1000-1500 people regularly. My current thinking is that I should create a grain that takes a list of email addresses and the message itself, and then activate a reminder to work off the list, sending emails until it is done.
My thinking is that I can just call its method and it will return immediately, because it does nothing but update the internal state of the grain. The reminder can then do the actual work of sending the emails. Will this work? Are there any problems with this approach?
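To make the question concrete, here is a rough sketch of the flow I have in mind. It's plain Python rather than Orleans C#, and all names (`BatchSender`, `enqueue`, `on_tick`) are made up for illustration; in the real system `enqueue` would be the grain method and `on_tick` the reminder callback:

```python
from collections import deque

class BatchSender:
    """Toy model of the proposed pattern: the public method only records
    work and returns immediately; a periodic tick (the Orleans reminder)
    drains the queue in small slices."""

    def __init__(self, batch_size=50):
        self.queue = deque()
        self.batch_size = batch_size
        self.sent = []  # stands in for the real SMTP send

    def enqueue(self, recipients, message):
        # Fast path: just update internal state, no I/O.
        self.queue.extend((r, message) for r in recipients)

    def on_tick(self):
        # Called by the reminder; sends one slice per tick so each
        # invocation stays short.
        for _ in range(min(self.batch_size, len(self.queue))):
            recipient, message = self.queue.popleft()
            self.sent.append((recipient, message))  # real code: SMTP call
        return len(self.queue)  # remaining work; 0 => cancel the reminder
```

So a caller would do `sender.enqueue(addresses, body)` and return, while successive ticks work the queue down.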
Replies: 3 comments
-
What are the behaviour restrictions of the target SMTP sender? Does it have a rate limit, for example?
-
Yes, it will have a rate limit. The same limitation also applies to the SMS gateway and the WhatsApp API.
-
TL;DR: Then your idea is a fair starting point. Using a singleton grain that manages the batch work makes it easier to also manage the sending rate, and even to parallelize it without overflowing the gateway allowance. Keep it simple, make it work, and evolve from there.

Note that sending emails is 99% serialization cost and 1% logic cost or less, so that's what we want to optimize in the Orleans cluster. If the number of emails your system needs to send is low enough not to care about this, then ignore the wall of text below.

Long story:

It will be beneficial if the email gateway you are using returns some known error code with a Retry-After header on overflow, without banning the sender after a few immediate attempts (regular SMTP doesn't do this, but HTTP APIs may). If so, instead of a singleton, you can use a [StatelessWorker] grain.
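Honoring a server-supplied Retry-After is simple enough to sketch. This is a hypothetical Python helper, not Orleans code; `send_once` stands in for whatever call your gateway client exposes, and is assumed to return `None` on success or a wait time in seconds on overflow:

```python
import time

def send_with_retry_after(send_once, max_attempts=5, sleep=time.sleep):
    """Retry a send, waiting exactly as long as the gateway instructs.

    `send_once` is assumed to return None on success, or the gateway's
    Retry-After value (seconds) on overflow. Backing off as instructed
    avoids hammering the gateway and risking a sender ban.
    """
    for _ in range(max_attempts):
        retry_after = send_once()
        if retry_after is None:
            return True
        sleep(retry_after)  # wait the server-specified interval
    return False  # gave up after max_attempts
```

The `sleep` parameter is injectable only so the logic is testable; a grain would use a non-blocking delay instead of `time.sleep`.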
If the gateway doesn't provide a Retry-After header, some linear or exponential backoff logic can work as well, so long as the sender doesn't get banned for retrying while overflown.

Another scaling alternative to the above is to use a small range of random keys with a regular grain type, to spread a few fixed instances around the cluster, and let callers randomly target one of them. Doing this allows the grain instances to use Orleans persistence, which is something stateless workers cannot use. We don't save on the serialization cost by doing this, but at least we avoid hotspots.

I won't suggest the option of having one grain instance per address, due to the unnecessary data redundancy and activation overkill for such batch-like work.

In addition:
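The "small range of random keys" idea above amounts to something like this. Again a Python sketch with invented names (`SHARD_COUNT`, `pick_shard_key`); in Orleans the returned key would be passed to `GetGrain`:

```python
import random

SHARD_COUNT = 8  # hypothetical: a small, fixed range of grain keys

def pick_shard_key(rng=random):
    """Callers pick one of a few fixed keys at random.

    Orleans then spreads those few grain activations across the cluster,
    avoiding a single hotspot while keeping the activation count tiny
    (unlike one grain per email address).
    """
    return f"email-sender-{rng.randrange(SHARD_COUNT)}"
```

With only a handful of keys, each shard still sees enough traffic to batch effectively, and each can keep its own persisted queue.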
If your email gateway does not offer safe back-pressure without the risk of the sender being banned, then you'll need to control the rate upfront. That's easy to do with the singleton approach, less so with the other approaches, as you'll need some other singleton grain to share up-to-date rate usage among all the workers. Alternatively, you can rely on a well-calculated Orleans timer period (e.g. [allowance] / [shards]) to make the shards rate-control themselves.
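That last `[allowance] / [shards]` calculation can be spelled out. A sketch, with hypothetical names, assuming each shard sends a fixed batch per timer tick:

```python
def shard_timer_period_seconds(allowance_per_minute, shard_count, batch_size=1):
    """Minimum timer period per shard so the combined send rate stays
    within the gateway allowance.

    If the gateway allows `allowance_per_minute` sends overall, and the
    work is split over `shard_count` shards each sending `batch_size`
    messages per tick, each shard may tick at most this often.
    """
    per_shard_per_minute = allowance_per_minute / shard_count
    ticks_per_minute = per_shard_per_minute / batch_size
    return 60.0 / ticks_per_minute
```

For example, a 600/minute allowance split over 4 shards sending 10 emails per tick gives each shard a 4-second timer period; no shared state is needed as long as the shard count and allowance are fixed.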