
Add Buffer debloating blog post #524

Open · wants to merge 6 commits into base: asf-site

Conversation


@akalash akalash commented Apr 5, 2022

No description provided.


@smattheis smattheis left a comment


Section 1 (What is the article about?)

  • "instabilities", do you mean "imbalances" or "varying conditions" (instabilities refer rather to failures than to stable conditions)
  • why "minimum possible overhead", do you mean "minimum latency" (typical duo is "maximum throughput" and/vs "minimum latency")
  • simplify example: instead of record1, record2 etc. It's not clear to me what the example should illustrate.
  • What is the problem that buffer debloating tries to solve? (Also it is not mentioned that "buffer debloating" refers to network buffers and how they can bloat.)

Section 2

  • For Java non-networking/Netty-affiliated people, make clear that the term "buffer" is used here both as a data structure to store records and as the unit that is transferred via the network.
  • I don't understand "network buffer is a good tread off". You mean trade-off ... a buffer cannot be a trade-off.

Section 3 (What else can go wrong?)

  • Here, you describe the actual problem. This is too late. The problem (on some abstract level) should be described in the summary at the top. Up to here, I actually don't know what it's all about. Then, step by step, break it down and describe aspects like the relation to backpressure. At the finest level, describe the concrete mechanics as an example but explain on an abstract level. This makes it possible to understand. (Example: The problem is X. The root cause of X is Y and how it interacts with Z. For example, as illustrated, if y1 and y2 happen, z1 and z2 cause y3 ... and finally X.)

Section 4 & 5

  • Seems okay so far. Maybe some details which we can address later.

General:

  • Remove title "What is this article about?"
  • The text is very verbose. It should be more concise.
  • Add spaces before parentheses
  • Don't use "you" form if not really necessary and in this text "you" is not necessary (instead speak about the application or the system not about the reader "you")
  • Also don't switch between forms (I vs. you), avoid both

@akalash
Author

akalash commented Apr 12, 2022

"instabilities", do you mean "imbalances" or "varying conditions" (instabilities refer rather to failures than to stable conditions)

Let's discuss it later, but I think "instabilities" is the more correct word, since I mean situations where the conditions/environment are not stable (for example, constantly losing and re-establishing network connections).

why "minimum possible overhead", do you mean "minimum latency" (typical duo is "maximum throughput" and/vs "minimum latency")

Latency in general is not so important here. The latency matters for aligned checkpoints, and memory usage matters for unaligned checkpoints. So I am trying to find the correct word for both of these cases. Perhaps something like "maximum throughput" vs. "minimum effective memory" (or "memory overhead").

For Java non-networking/Netty affiliated people, make clear that the term buffer is here used as a data structure to store records but also as the unit that is transfered via network.

Isn't that exactly what is described in "Network buffer"? -- Flink gathers records into the network buffer, which is the smallest unit of data sent to subtasks. Perhaps it makes sense to rephrase it, but I think the explanation is exactly as you described.

@smattheis

With "instabilities" you are in fact right, because the parameters are not stable. Nevertheless, I find it confusing, because instability usually refers to some "collapse" scenario, which is not the main aspect here, right? I find "varying conditions" a bit more precise here, because varying conditions require adaptation, which is what buffer debloating tries to achieve.

"gather records" and "smallest unit of sending data" is a bit confusing/non-technical. For example: Flink buffers outbound records into so-called network buffers, which are, roughly speaking, byte arrays of limited size and represent the smallest unit of data that is then sent via the network.
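To make the reviewer's description concrete, here is a minimal sketch of the idea that a network buffer is a fixed-capacity byte container and the smallest unit of transfer. This is purely illustrative; the class and function names are invented for this example and do not reflect Flink's actual implementation.

```python
# Illustrative sketch (not Flink's actual code): serialized records are
# packed into fixed-size byte buffers; a full buffer is "sent" as one
# unit, mirroring how a network buffer is the smallest unit transferred.
class NetworkBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = bytearray()

    def try_append(self, record: bytes) -> bool:
        """Append a serialized record if it still fits into this buffer."""
        if len(self.data) + len(record) > self.capacity:
            return False  # buffer full, must be flushed (sent) first
        self.data.extend(record)
        return True


def buffer_records(records, capacity):
    """Pack serialized records into buffers of at most `capacity` bytes."""
    sent = []
    buf = NetworkBuffer(capacity)
    for r in records:
        if not buf.try_append(r):
            sent.append(bytes(buf.data))  # flush the full buffer
            buf = NetworkBuffer(capacity)
            buf.try_append(r)
    if buf.data:
        sent.append(bytes(buf.data))  # flush the remainder
    return sent
```

With a 4-byte capacity, three 2-byte records end up as two transfer units, which is exactly the "smallest unit of data" intuition being discussed.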


@smattheis smattheis left a comment


I like it, and the structure looks really good. I have some comments below, and I think the rest is then a bit of tuning formulations. :)

Summary

  • Looks good.

Network stack

  • Looks good.

Buffer debloating

  • Introduce the term "buffer debloating" earlier in this section, e.g., after the explanation about "time instead of amount of data".
  • Question here: For the case of backpressure and no backpressure, what is exactly meant? Is it backpressure to the upstream task or from downstream task? I assume the first, but I would suggest to make this explicit here.

How does it work?

  • I really like that you open up the section from a top-level perspective: two parameters, current implementation adjusts via buffer size only.
  • Is it important to note that, in theory, one could adjust both parameters? Or is there a conceptual reason to adjust only buffer size? If not, I would remove that sentence.
  • What I would add here is the implementation detail that it doesn't change the actual buffer size but limits the usage of each buffer to some adaptable maximum. As a consequence, it does not reduce the actual heap memory usage but reduces the amount of data and, hence, comes with the mentioned benefits.
  • Maybe introduce numbered lists to describe steps, e.g.:
    1. Consistently calculate throughput.
    2. Calculate buffer size
    3. Adapt smoothly.
  • I would remove the last sentence "It is why some specific cases can be ...". I understand that you introduce the next section, but it's not necessary. The headline of the next section does that.
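The three steps listed above (measure throughput, derive a buffer size, adapt smoothly) can be sketched roughly as follows. This is an illustrative assumption of how such a loop could look: the function name, the exact formula, the smoothing factor, and the size bounds are all invented for the example and are not Flink's actual implementation.

```python
# Rough sketch of the debloating steps discussed above (illustrative
# only; names, formula details, and constants are assumptions, not
# Flink's actual code).
def debloated_buffer_size(throughput_bytes_per_s: float,
                          target_time_s: float,
                          num_buffers: int,
                          prev_size: float,
                          alpha: float = 0.5,
                          min_size: int = 256,
                          max_size: int = 32 * 1024) -> int:
    # 1. Throughput is assumed to be measured continuously elsewhere.
    # 2. Size buffers so that all in-flight data can be consumed within
    #    the configured target time.
    desired = throughput_bytes_per_s * target_time_s / num_buffers
    # 3. Adapt smoothly: an exponential moving average avoids abrupt
    #    jumps in buffer size between measurements.
    smoothed = alpha * desired + (1 - alpha) * prev_size
    # Clamp to sane bounds so a single outlier cannot collapse or
    # explode the buffer size.
    return int(min(max(smoothed, min_size), max_size))
```

Note that, as the review points out, the real mechanism does not resize the underlying byte arrays; it only caps how much of each buffer may be used.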

Usage

  • Looks good, but why is the 'Multiple inputs and unions' section from the linked manual missing?
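For context on the 'Usage' section being reviewed: enabling buffer debloating in Flink comes down to a couple of configuration options. The keys below are taken from the Flink documentation (available since Flink 1.14); the target value shown is its documented default, used here purely as an example.

```yaml
# flink-conf.yaml — enable adaptive buffer sizing (buffer debloating)
taskmanager.network.memory.buffer-debloat.enabled: true
# Target time for a subtask to consume its in-flight data (default: 1s)
taskmanager.network.memory.buffer-debloat.target: 1s
```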

@akalash
Author

akalash commented Apr 29, 2022

Question here: For the case of backpressure and no backpressure, what is exactly meant? Is it backpressure to the upstream task or from downstream task? I assume the first, but I would suggest to make this explicit here.

I didn't get the question. It is backpressure to the upstream task from the downstream task (there is no "or" there). So it means that the current task cannot proceed because its output buffers are full. Maybe I need to clarify something, but I don't really understand what exactly.

Is it important to note that, in theory, one could adjust both parameters? Or is there a conceptual reason to adjust only buffer size? If not, I would remove that sentence.

I think it is important to note, since if the user knows about the two parameters (buffer size, number of buffers), the obvious question is what buffer debloating changes and how. Ideally, we should change both of these parameters, but for simplicity (at least in the first implementation) we decided to change only the buffer size. Perhaps we can just simplify this sentence to something like: "Currently, buffer debloating manages memory usage by adjusting only the size of the network buffers, while the number of network buffers always remains constant."

Looks good but why is 'Multiple inputs and unions' section from the linked manual missing?

We actually already implemented that, so we should fix the original documentation instead. At least we need to check what was already implemented. I will do that.

@smattheis

To checkpointing:

Question here: For the case of backpressure and no backpressure, what is exactly meant? Is it backpressure to the upstream task or from downstream task? I assume the first, but I would suggest to make this explicit here.

I didn't get the question. It is backpressure to the upstream task from the downstream task (there is no "or" there). So it means that the current task cannot proceed because its output buffers are full. Maybe I need to clarify something, but I don't really understand what exactly.

You're right, I find my question confusing myself. What I actually mean is: why do you distinguish between backpressure and no-backpressure scenarios here? That's not so intuitive to me, because the benefits could theoretically (I admit only theoretically and probably never in practice) be visible even without backpressure, i.e., if input buffers have a high fill level while never being completely full, such that buffer debloating still limits the data in the buffers and reduces checkpoint time and checkpoint size likewise. Nevertheless, if I get your idea right, I would suggest adding a little transition to make your intuition explicit, e.g.:

The benefits of buffer debloating are best visible when backpressure occurs. The reason is that no backpressure means that the downstream task is processing input faster than (or equally fast as) data arrives such that its input buffer is barely filled and, consequently, buffer debloating has almost no visible effect.

This is the opposite when backpressure occurs as it means input buffers are frequently filled. Since buffer debloating limits the amount of data in the input buffer, we observe the following benefits with regard to checkpointing:

...

To buffer debloating mechanism:

I think it is important to note, since if the user knows about the two parameters (buffer size, number of buffers), the obvious question is what buffer debloating changes and how. Ideally, we should change both of these parameters, but for simplicity (at least in the first implementation) we decided to change only the buffer size.

I would add a side phrase with exactly that.

For simplicity, buffer debloating currently only caps the maximal used buffer size ...

and at the end of the paragraph mention

Nevertheless, the benefits of buffer debloating with regard to checkpointing are effective as described before.

Other comments:

I think you accidentally reverted some changes, e.g., in the summary: 2f5be59


@smattheis smattheis left a comment


I think what remains are some typos and grammar issues. Besides that, we need a few corrections in the GIFs:

  • buffer-debloat.gif: Keep the arrows with buffer size update longer because they disappear too fast.
  • checkpoint-barrier.png: It's a static image. Is that as you wanted it?
  • simple_problem.gif is currently included twice: first in img/blog/ and second in content/img/blog. Also, the label "Network/Local connection" oscillates between having and not having a line break. And the label "Waiting for network" is not at the same height for the subtasks.

@MartijnVisser
Contributor

@akalash Didn't you want to ultimately complete this blog post?

@akalash
Author

akalash commented Feb 24, 2023

@akalash Didn't you want to ultimately complete this blog post?

Oh, that is a good point. I surely have to find time for this.
