Create chunk-size-calculation.md #2874
Open: gerner-spryker wants to merge 26 commits into master from chunk-size-calculator
Changes from 20 commits
3335413 Create chunk-size-calculation.md (gerner-spryker)
456bbd0 Update chunk-size-calculation.md (gerner-spryker)
d288115 Update chunk-size-calculation.md (gerner-spryker)
d514e15 Update chunk-size-calculation.md (gerner-spryker)
626e70d Update queue.md (gerner-spryker)
fdb0dae Update queue.md (gerner-spryker)
27497a7 Create basic-chunk-size-calculation.md (gerner-spryker)
e550f46 Update queue.md (gerner-spryker)
b33dbc9 Update chunk-size-calculation.md (gerner-spryker)
6eaa48d Update basic-chunk-size-calculation.md (gerner-spryker)
e70e817 Create advanced-chunk-size-calculation.md (gerner-spryker)
5eb3f5d Update basic-chunk-size-calculation.md (gerner-spryker)
70263e7 Update basic-chunk-size-calculation.md (gerner-spryker)
382e7be Update advanced-chunk-size-calculation.md (gerner-spryker)
3873b5c Update basic-chunk-size-calculation.md (gerner-spryker)
b37ebe8 Update advanced-chunk-size-calculation.md (gerner-spryker)
67996e8 Update chunk-size-calculation.md (gerner-spryker)
fd41ee1 Update basic-chunk-size-calculation.md (gerner-spryker)
935bc8b Update advanced-chunk-size-calculation.md (gerner-spryker)
365c0da Create expert-chunk-size-calculation.md (gerner-spryker)
ae480d3 Update expert-chunk-size-calculation.md (gerner-spryker)
1fb17d3 Merging chunk-size-calculator documents (gerner-spryker)
a5bdb53 Update chunk-size-calculation.md (gerner-spryker)
6510679 Update chunk-size-calculation.md (gerner-spryker)
b42e6be Update chunk-size-calculation.md (gerner-spryker)
15bd7f2 Update chunk-size-calculation.md (gerner-spryker)
51 changes: 51 additions & 0 deletions
.../backend-development/data-manipulation/queue/advanced-chunk-size-calculation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
--- | ||
title: Advanced Chunk Size Calculation | ||
description: Provides an overview of the advanced chunk size calculation | ||
last_updated: Oct 25, 2024 | ||
template: concept-topic-template | ||
redirect_from: | ||
- /docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.html | ||
related: | ||
- title: Basic Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/basic-chunk-size-calculation.html | ||
- title: Expert Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/expert-chunk-size-calculation.html | ||
- title: Queue | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/queue.html | ||
|
||
--- | ||
|
||
## Advanced Chunk Size Calculation | ||
|
||
The **Advanced Chunk Size Calculator** builds upon the Basic level by letting developers fine-tune chunk sizes based on custom hardware limitations and custom performance metrics. | ||
|
||
The **Advanced Chunk Size Calculator** is available [here](link to google spreadsheet). | ||
|
||
### Problem Overview | ||
|
||
While the **Basic Chunk Size Calculator** offers a starting point, it doesn’t account for the nuances of resource allocations (like container sizes, CPU, and memory limits) or performance-sensitive variables such as application warm-up time and event message size. The **Advanced Chunk Size Calculator** addresses these issues by incorporating these additional metrics, enabling a more precise configuration that enhances stability and performance in production environments. | ||
|
||
### Input Parameters | ||
|
||
To calculate the correct queue chunk sizes, developers must provide the following information based on the specific production environment: | ||
- **Scheduler and Worker Setup**: Provide details on any non-standard configurations, such as environments with multiple containers or distinct worker distributions within containers, if your scheduler setup differs from the boilerplate defaults. | ||
- **Resource Configuration**: Provide information on your hardware setup, including instance types, CPU, and memory limits for services like Persistence, Storage, Search, or Message Broker, to allow the **Advanced Calculator** to optimize chunk sizes based on actual resource availability. | ||
- **Detailed Product Configuration**: Provide specific metrics related to products, the highest-traffic entity. This supports more precise chunk sizing for products without requiring an in-depth understanding of Publish & Synchronize. | ||
- **Event and Message Processing Metrics**: Provide expected event processing metrics, including deviations from default settings such as message trigger rates, custom application warm-up times, event size limits, and data division rate multipliers, to enable configuration adjustments that align with real-world performance. | ||
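
The effect of application warm-up time on chunk sizing can be illustrated with a small Python sketch. It assumes a simple model in which each chunk pays a fixed warm-up cost once and a constant processing time per message. Both the model and the names (`effective_throughput`, `warmup_seconds`, `seconds_per_message`) are illustrative assumptions, not the calculator's actual formula.

```python
def effective_throughput(chunk_size: int, warmup_seconds: float, seconds_per_message: float) -> float:
    """Messages processed per second when one chunk is handled per task run.

    Assumed model: total time = warm-up time + chunk_size * per-message time.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total_seconds = warmup_seconds + chunk_size * seconds_per_message
    return chunk_size / total_seconds


# Larger chunks amortize the fixed warm-up cost over more messages.
for size in (10, 100, 1000):
    rate = effective_throughput(size, warmup_seconds=2.0, seconds_per_message=0.01)
    print(f"chunk_size={size}: {rate:.1f} messages/second")
```

Under this model, throughput grows with chunk size but flattens out as the per-message time starts to dominate, while memory use keeps growing with the chunk. That trade-off is one reason warm-up time and message size are worth providing as inputs.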
|
||
### Output | ||
|
||
Once the required data is entered into the **Basic** and **Advanced Chunk Size Calculators**, the calculator computes the optimal chunk size for each queue. Developers then need to configure these queue chunk sizes in the project to match the calculated values. | ||
|
||
> For instructions on how to set up chunk sizes for the queues, [click here](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/queue.html#configuration-for-chunk-size). | ||
|
||
|
||
### Important Notes | ||
|
||
- The **Advanced Chunk Size Calculator** lets developers further tune chunk sizes based on custom hardware limitations and custom performance metrics. | ||
- For systems that require individual configuration of queues and detailed customization of message setups, consider using the **Expert Chunk Size Calculator**. | ||
- Always ensure that the chunk sizes provided by the calculator are properly configured to avoid system performance issues. | ||
|
||
--- | ||
|
||
For more detailed information about the different levels of the **Chunk Size Calculator**, see the [overview here](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.html). |
54 changes: 54 additions & 0 deletions
...dev/backend-development/data-manipulation/queue/basic-chunk-size-calculation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
--- | ||
title: Basic Chunk Size Calculation | ||
description: Provides an overview of the basic chunk size calculation | ||
last_updated: Oct 25, 2024 | ||
template: concept-topic-template | ||
redirect_from: | ||
- /docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.html | ||
related: | ||
- title: Advanced Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/advanced-chunk-size-calculation.html | ||
- title: Expert Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/expert-chunk-size-calculation.html | ||
- title: Queue | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/queue.html | ||
|
||
--- | ||
|
||
## Basic Chunk Size Calculation | ||
|
||
The **Basic Chunk Size Calculator** helps developers configure the correct chunk sizes for a project based on the traffic and data patterns in the system. This tool simplifies the setup process for out-of-the-box and lightly customized web shops, ensuring that the system can handle high-traffic entities efficiently without over-consuming resources. | ||
|
||
The **Basic Chunk Size Calculator** is available [here](link to google spreadsheet). | ||
|
||
### Problem Overview | ||
|
||
In an e-commerce environment, certain business entities generate a large volume of update events due to frequent refreshes and high data volume. These **high-traffic entities** account for the majority of the traffic within the **publish and synchronize** process. Misconfiguring chunk sizes for these entities can lead to inefficient resource consumption, system lags, or overloads. The **Basic Chunk Size Calculator** offers a straightforward way to address this by determining the appropriate chunk size for each queue based on the production environment’s data profile. | ||
|
||
### Input Parameters | ||
|
||
To calculate the correct queue chunk sizes, developers must provide the following information based on their specific production environment: | ||
|
||
- **High Traffic Entities**: Provide the total count of each high-traffic entity (e.g., products, prices, offers) across all stores, and estimate the daily refresh rate (percentage or count of entities updated daily) for ongoing system operations. | ||
- **Stores and Locales**: Provide the total number of stores in the system and the maximum number of supported locales across all stores, as these factors impact chunk size calculation for data distribution. | ||
> For more information on stores and locales in our system, [click here](https://docs.spryker.com/docs/pbc/all/dynamic-multistore/202410.0/base-shop/dynamic-multistore-feature-overview.html). | ||
- **Publish and Synchronize Setup**: The **publish and synchronize** process handles entity data updates, and the worker setup plays a crucial role in determining how this is managed. Developers need to specify how project workers are set up in relation to stores. | ||
> For more information on workers, tasks, and how they are related to stores, [click here](https://docs.spryker.com/docs/pbc/all/dynamic-multistore/202410.0/base-shop/dynamic-multistore-feature-overview.html). | ||
- **Number of Tasks Per Worker**: Provide the **number of tasks per worker**. This value is essential for calculating how resources are distributed among tasks. The calculator offers no further guidance for determining this number, as it is specific to each setup. | ||
> For more information on workers, tasks, and how they are related to stores, [click here](https://docs.spryker.com/docs/pbc/all/dynamic-multistore/202410.0/base-shop/dynamic-multistore-feature-overview.html). | ||
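
To make the inputs above concrete, here is a minimal Python sketch of how they might combine into a rough daily message volume. Multiplying the updated entities by the number of stores and locales is an assumption made for illustration, not the spreadsheet's actual formula, and the names `HighTrafficEntity` and `estimate_daily_messages` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HighTrafficEntity:
    """One high-traffic entity type (hypothetical structure)."""
    name: str
    total_count: int           # total entities across all stores
    daily_refresh_rate: float  # fraction of entities updated per day, 0.0 to 1.0


def estimate_daily_messages(entity: HighTrafficEntity, stores: int, max_locales: int) -> int:
    """Rough upper bound on daily update messages for one entity type.

    Assumption: every updated entity is denormalized once per store and locale.
    """
    if not 0.0 <= entity.daily_refresh_rate <= 1.0:
        raise ValueError("daily_refresh_rate must be between 0 and 1")
    updated_per_day = entity.total_count * entity.daily_refresh_rate
    return round(updated_per_day * stores * max_locales)


# Example: 100,000 products, 10% refreshed daily, 3 stores, 2 locales.
products = HighTrafficEntity(name="products", total_count=100_000, daily_refresh_rate=0.1)
print(estimate_daily_messages(products, stores=3, max_locales=2))
```

An estimate like this only shows why entity counts, refresh rates, stores, and locales are all requested; the recommended chunk sizes themselves come from the calculator.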
|
||
### Output | ||
|
||
Once the required data is entered into the **Basic Chunk Size Calculator**, it will compute the optimal chunk sizes for each queue used by the system. These queues handle different business entities, and setting the right queue chunk size ensures efficient processing and resource allocation. Developers will need to configure these queue chunk sizes. | ||
|
||
> For instructions on how to set up chunk sizes for the queues, [click here](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/queue.html#configuration-for-chunk-size). | ||
|
||
### Important Notes | ||
|
||
- The **Basic Chunk Size Calculator** is designed for systems that follow a standard, out-of-the-box configuration. If your system is more customized, consider using the **Advanced** or **Expert Chunk Size Calculator** for fine-tuning. | ||
- This calculator only requires a basic understanding of the system's entity data and store structure. For more complex metrics like memory usage or container performance, the advanced calculators may be necessary. | ||
- Always ensure that the chunk sizes provided by the calculator are properly configured to avoid system performance issues. | ||
|
||
--- | ||
|
||
For more detailed information about the different levels of the **Chunk Size Calculator**, see the [overview here](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.html). |
66 changes: 66 additions & 0 deletions
docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
--- | ||
title: Chunk Size Calculation | ||
description: Describes the challenges and solutions of selecting proper chunk sizes for project requirements | ||
last_updated: Oct 25, 2024 | ||
template: concept-topic-template | ||
redirect_from: | ||
- /docs/dg/dev/backend-development/data-manipulation/queue/queue.html#concepts | ||
related: | ||
- title: Basic Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/basic-chunk-size-calculation.html | ||
- title: Advanced Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/advanced-chunk-size-calculation.html | ||
- title: Expert Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/expert-chunk-size-calculation.html | ||
|
||
--- | ||
|
||
## Chunk Size Calculation | ||
|
||
In an e-commerce framework, selecting the correct queue chunk size for processing data is critical for ensuring optimal performance and efficient resource usage. The challenge arises from the diversity of business entities, stores, and locales, each with different memory and CPU requirements for denormalization. Furthermore, the frequency of updates, data sizes, and the specific configurations of various system services (e.g., Redis, Elasticsearch, RabbitMQ) make it difficult to determine the appropriate chunk size for each queue. | ||
|
||
Without proper queue chunk size configuration, projects can either overconsume resources, leading to crashes or lag, or underutilize resources, resulting in slow performance. This is where the **Chunk Size Calculator** comes in: it helps developers fine-tune queue chunk sizes based on the specific characteristics of the project. | ||
|
||
### What is the Chunk Size Calculator? | ||
|
||
The **Chunk Size Calculator** is a tool that helps developers determine the appropriate queue chunk sizes for processing data across different queues in a resource-efficient way. It ensures that memory and CPU usage are optimized and prevents the system from being overwhelmed during data processing tasks. This tool is designed to handle the variability in entity sizes, stores, locales, and update frequencies, giving developers confidence that the project will run smoothly in production environments. | ||
|
||
### The Three Levels of the Chunk Size Calculator | ||
|
||
The **Chunk Size Calculator** is divided into three levels: **Basic**, **Advanced**, and **Expert**. Each level is designed to accommodate different degrees of project complexity and customization. Below is an overview of each: | ||
|
||
#### 1. Basic Chunk Size Calculator | ||
|
||
The **Basic Chunk Size Calculator** is designed for small to medium B2C projects with minimal customization. It assumes that the default configuration of business entities, stores, and locales is sufficient, and that the resource consumption patterns are predictable. | ||
|
||
With the Basic calculator, developers only need to provide a minimal set of inputs, such as store configuration and high-traffic entity counts. The calculator uses these inputs to recommend chunk sizes for each queue. This is ideal for developers who are working with out-of-the-box setups and need a simple, reliable way to configure the project. | ||
|
||
**When to use**: This is the default starting point for any project. If your project has not been heavily customized, this calculator will give you the necessary queue chunk sizes with minimal effort. | ||
|
||
Find more details on the [Basic Chunk Size Calculation](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/basic-chunk-size-calculation.html) page. | ||
|
||
#### 2. Advanced Chunk Size Calculator | ||
|
||
The **Advanced Chunk Size Calculator** builds upon the Basic level, requiring developers to have a deeper understanding of the services that make up the project. In addition to understanding the basic chunk size concepts, developers need to account for services such as Persistence, Storage, Search, Message Broker, and Scheduler. | ||
|
||
The Advanced calculator addresses resource allocations or performance-sensitive variables, enabling a more precise configuration that enhances stability and performance in production environments. While the calculator provides recommendations, developers will need to input detailed configuration data to fine-tune the project. | ||
|
||
**When to use**: The Advanced calculator is suited for projects that have been moderately customized or when developers need more precise control over performance and resource usage. | ||
|
||
Find more details on the [Advanced Chunk Size Calculation](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/advanced-chunk-size-calculation.html) page. | ||
|
||
#### 3. Expert Chunk Size Calculator | ||
|
||
The **Expert Chunk Size Calculator** is designed for highly customized projects where one or more entities deviate significantly from the norm. This includes scenarios where entities are unusually large or where the project handles massive data volumes that require frequent updates. | ||
|
||
In this case, developers need a deep understanding of how the project’s components interact, including the scheduler’s workers and the memory distribution across workers and tasks. The Expert calculator gives full visibility into all performance metrics, allowing developers to tweak each queue individually. | ||
|
||
Despite the complexity, the calculator still ensures that the entire system remains balanced, preventing one queue from consuming too many resources at the expense of others. | ||
|
||
**When to use**: The Expert calculator is reserved for production environments with complex, heavily customized setups that require fine-tuning of every performance metric. | ||
|
||
Find more details on the [Expert Chunk Size Calculation](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/expert-chunk-size-calculation.html) page. | ||
|
||
### Summary | ||
|
||
The **Chunk Size Calculator** provides developers with a powerful tool for optimizing their system’s resource consumption. Each level of the calculator is designed to address different use cases, from simple out-of-the-box configurations to highly customized, complex environments. Developers are encouraged to start with the **Basic Chunk Size Calculator** and, if necessary, progress to the **Advanced** or **Expert** calculators as the complexity of their project grows. |
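
The "When to use" guidance for the three levels can be summarized as a small decision helper. This Python sketch is only one possible reading of that guidance; the two boolean inputs and the names `CalculatorLevel` and `choose_calculator_level` are simplifying assumptions, not part of the calculator itself.

```python
from enum import Enum


class CalculatorLevel(Enum):
    BASIC = "basic"
    ADVANCED = "advanced"
    EXPERT = "expert"


def choose_calculator_level(
    has_custom_hardware_or_metrics: bool,
    has_heavily_customized_entities: bool,
) -> CalculatorLevel:
    """Pick a starting calculator level from two coarse project traits.

    Assumption: heavily customized entities take priority over custom
    hardware or performance metrics when both apply.
    """
    if has_heavily_customized_entities:
        return CalculatorLevel.EXPERT
    if has_custom_hardware_or_metrics:
        return CalculatorLevel.ADVANCED
    # Default starting point for any project.
    return CalculatorLevel.BASIC


print(choose_calculator_level(False, False).value)
```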
63 changes: 63 additions & 0 deletions
...ev/backend-development/data-manipulation/queue/expert-chunk-size-calculation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
--- | ||
title: Expert Chunk Size Calculation | ||
description: Provides an overview of the expert chunk size calculation | ||
last_updated: Oct 25, 2024 | ||
template: concept-topic-template | ||
redirect_from: | ||
- /docs/dg/dev/backend-development/data-manipulation/queue/chunk-size-calculation.html | ||
related: | ||
- title: Basic Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/basic-chunk-size-calculation.html | ||
- title: Advanced Chunk Size Calculation | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/advanced-chunk-size-calculation.html | ||
- title: Queue | ||
link: docs/dg/dev/backend-development/data-manipulation/queue/queue.html | ||
|
||
--- | ||
|
||
## Expert Chunk Size Calculation | ||
|
||
### Overview | ||
|
||
The **Expert Chunk Size Calculator** is designed for developers working with heavily customized entities within the project. Whether it's a single entity or multiple entities, when these entities are significantly customized in terms of size, relationships, or data complexity, this calculator provides the granular control needed to fine-tune each queue's performance. | ||
|
||
As the complexity of an entity increases, so does the denormalization time, which can slow down the entire system. Developers using the **Expert Chunk Size Calculator** must have a solid understanding of how containerization works in the project, how resources like memory and CPU are distributed among containers, workers, and tasks, as well as the limitations of the receiving side services (such as search and storage) and the provider systems (like the database). The message broker, which delivers messages and imposes throughput limits, is also a critical component of the overall system architecture that needs to be considered. | ||
|
||
The **Expert Chunk Size Calculator** is available [here](link to google spreadsheet). | ||
|
||
### Problem Statement | ||
|
||
In highly customized systems, Basic or Advanced queue chunk size configurations may not suffice. Complex entities with large data sets and relationships demand more fine-tuned control over how tasks are processed, how resources are allocated, and how messages are handled. The **Expert Chunk Size Calculator** is needed to provide detailed, queue-by-queue configuration for developers who need to optimize the project's performance under these conditions. | ||
|
||
### Input Parameters | ||
|
||
The **Expert Chunk Size Calculator** requires a wide range of detailed inputs to properly configure chunk sizes. Developers need to provide in-depth information about the production environment, including: | ||
|
||
- **Entity Customization**: The size and cardinality of the entities, which affect how much memory and CPU are consumed during the denormalization process. | ||
- **Message Handling**: Specific configuration data regarding the size of messages that will be processed by the system and the limits imposed by the message broker and receiving systems. | ||
|
||
The expert calculator offers the ability to set individual performance and resource consumption metrics for each queue, making it possible to precisely optimize the entire **publish and synchronize** process. | ||
|
||
### Output | ||
|
||
The result of the **Expert Chunk Size Calculator** is a set of optimized queue chunk sizes for each individual queue in the project. | ||
|
||
> For instructions on how to set up chunk sizes for the queues, [click here](https://docs.spryker.com/docs/dg/dev/backend-development/data-manipulation/queue/queue.html#configuration-for-chunk-size). | ||
|
||
### Important Notes | ||
|
||
- The **Expert Chunk Size Calculator** is intended for projects that have significant customizations at the entity level. If your system follows a more standard setup, consider using the **Basic** or **Advanced Chunk Size Calculators**. | ||
- This calculator requires an in-depth understanding of how system components interact, including containerization, message brokers, search and storage, and resource distribution across workers and tasks. | ||
- Use this calculator when the system requires individual configuration of queues and detailed customization of message handling. | ||
|
||
### Additional Knowledge Required | ||
|
||
To effectively use the **Expert Chunk Size Calculator**, developers must have a strong grasp of several key concepts related to resource management and system architecture. | ||
|
||
#### 1. Container-Worker-Task Resource Relationship | ||
|
||
This section will cover how resources (memory, CPU, etc.) are allocated between containers, workers, and tasks. It will explain how container boundaries are defined and the importance of understanding how these resources are distributed across the system to maintain healthy processing. | ||
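
As a rough illustration of the container-worker-task relationship, the following Python sketch derives a memory-bound chunk size for one task. It assumes that container memory is split evenly across workers and tasks and that each message costs a constant amount of memory; these assumptions and all names (`max_chunk_size_for_memory`, `base_memory_per_task_mb`, and so on) are hypothetical and do not reproduce the calculator's formula.

```python
def max_chunk_size_for_memory(
    container_memory_mb: float,
    workers_per_container: int,
    tasks_per_worker: int,
    memory_per_message_mb: float,
    base_memory_per_task_mb: float = 0.0,
) -> int:
    """Largest chunk size that fits one task's share of container memory.

    Assumptions: memory is split evenly across workers and tasks, and each
    message in a chunk costs a constant amount of memory.
    """
    if workers_per_container <= 0 or tasks_per_worker <= 0:
        raise ValueError("worker and task counts must be positive")
    if memory_per_message_mb <= 0:
        raise ValueError("memory_per_message_mb must be positive")
    per_task_budget_mb = container_memory_mb / (workers_per_container * tasks_per_worker)
    available_mb = per_task_budget_mb - base_memory_per_task_mb
    if available_mb <= 0:
        return 0  # the task's fixed overhead already exhausts its share
    return int(available_mb // memory_per_message_mb)


# Example: a 2048 MB container, 2 workers, 4 tasks each, 56 MB fixed
# overhead per task, and 0.5 MB per message.
print(max_chunk_size_for_memory(2048, 2, 4, 0.5, base_memory_per_task_mb=56))
```

A bound like this only caps the chunk size from the memory side; CPU time, message broker throughput, and the limits of the receiving services still need to be considered separately.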
|
||
#### 2. Publish and Synchronize Queues | ||
|
||
This section will explain how queues work in the publish and synchronize middleware, how they process multiple entities, and how factors like entity size and denormalization times impact CPU and memory consumption. Understanding these relationships is key to optimizing each queue’s performance through the expert calculator. |