feat(article): aded article exploring asynchronous programming 1 #390

Merged
merged 2 commits into from
Sep 11, 2023
@@ -0,0 +1,215 @@
Python provides a range of tools and libraries for asynchronous and concurrent programming, designed to cater to different needs and levels of complexity. Understanding these tools is essential for developing applications that are both efficient and responsive. Let's look at each of them in detail.

== Multithreading

Introduction to Multithreading
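Multithreading allows concurrent execution of multiple threads within a single process. Threads share the same memory space, which makes exchanging data between them cheap. In CPython, threads are best suited to I/O-bound tasks, where most of the time is spent waiting rather than computing.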

Threading Module
The threading module facilitates working with threads in Python. Let's create a simple example to demonstrate multithreading:

[, code]
----
import threading

def print_numbers():
    for i in range(1, 6):
        print(f"Number {i}")

def print_letters():
    for letter in 'abcde':
        print(f"Letter {letter}")

if __name__ == "__main__":
    thread1 = threading.Thread(target=print_numbers)
    thread2 = threading.Thread(target=print_letters)

    thread1.start()  # Start the thread to run the 'print_numbers' function concurrently
    thread2.start()  # Start another thread to run the 'print_letters' function concurrently

    thread1.join()  # Wait for thread1 to complete its task before moving on
    thread2.join()  # Wait for thread2 to complete its task before moving on


# Number 1
# Letter a
# Letter b
# Number 2
# Number 3
# Number 4
# Number 5
# Letter c
# Letter d
# Letter e
----

It's important to understand that while you have two threads (#thread1# and #thread2#) that are executing concurrently, the order in which they execute their tasks is not guaranteed. This lack of ordering is due to the nature of threading and how operating systems schedule threads for execution. Here's why you see the output in a seemingly random order:

1. Thread Scheduling: The operating system's thread scheduler determines when and in what order threads run. Threads can be preempted and paused at any time, and the scheduler decides which thread to execute next based on factors like thread priorities and time slices.

2. Non-Atomic Print Operation: A call to print is not a single, uninterrupted action. It typically writes the text and the trailing newline to the output stream as separate steps, and another thread can run between those steps, which can result in interleaved output.

For example, consider this possible order of execution:

1. #thread1# starts and prints "Number 1".
2. #thread2# starts and prints "Letter a" and "Letter b".
3. #thread1# continues and prints "Number 2", "Number 3", "Number 4", "Number 5".
4. #thread2# continues and prints "Letter c", "Letter d", "Letter e".
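
The nondeterministic ordering described above can also be observed without relying on console output. The following is a minimal sketch, not part of the original example: the `record` and `run_once` helpers and the shared `events` list are hypothetical names introduced for illustration, and a `threading.Lock` guards the list so that each append is a single, uninterrupted step.

```python
import threading

events = []                      # shared log of what each thread did (hypothetical name)
events_lock = threading.Lock()   # protects 'events' from concurrent modification


def record(label, items):
    """Append one entry per item while holding the lock."""
    for item in items:
        with events_lock:
            events.append(f"{label} {item}")


def run_once():
    """Run both workers and return the order in which entries were logged."""
    events.clear()
    t1 = threading.Thread(target=record, args=("Number", range(1, 6)))
    t2 = threading.Thread(target=record, args=("Letter", "abcde"))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return list(events)


if __name__ == "__main__":
    print(run_once())  # how the two sequences interleave may differ between runs
```

Each thread's own entries always appear in order ("Number 1" before "Number 2"), but how the two sequences interleave is left to the scheduler.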

=== Example: Downloading Files Concurrently with Multithreading
In this example, we'll use multithreading to download files concurrently from different URLs:

[, code]
----

import threading
import requests

def download_file(url, filename):
    response = requests.get(url)
    response.raise_for_status()  # Fail loudly instead of saving an error page
    with open(filename, "wb") as file:
        file.write(response.content)
    print(f"Downloaded {filename}")

if __name__ == "__main__":
    urls = [
        "https://github.com/wkhtmltopdf/wkhtmltopdf/archive/refs/tags/0.12.6.zip",
        "https://github.com/wkhtmltopdf/wkhtmltopdf/archive/refs/tags/0.12.2.1.zip",
        "https://github.com/wkhtmltopdf/wkhtmltopdf/archive/refs/tags/0.12.3.1.zip",
    ]
    threads = []
    for i, url in enumerate(urls):
        # The URLs point to zip archives, so save them with a .zip extension
        thread = threading.Thread(target=download_file, args=(url, f"file_{i}.zip"))
        thread.start()
        threads.append(thread)

    for thread in threads:
        thread.join()  # Wait for every download to finish


# Possible output (the completion order may vary between runs):
# Downloaded file_2.zip
# Downloaded file_1.zip
# Downloaded file_0.zip
----

If you observed that downloading files without using threads (i.e., in a sequential or synchronous manner) took less time than downloading them with threads, there are a few potential reasons for this counterintuitive behavior:

1. Global Interpreter Lock (GIL): CPython has a Global Interpreter Lock (GIL) that allows only one thread to execute Python bytecode at a time, even on multi-core processors. The GIL is released while a thread waits on blocking I/O, so for pure downloads it is seldom the bottleneck. It does limit threads whenever the work includes CPU-bound steps (for example, processing the downloaded data), and in that case threads add overhead without adding parallelism.

2. Network Bound: If the download speed is limited by the network bandwidth, using multiple threads might not lead to a significant improvement because the bottleneck is the network speed, not the CPU. In such cases, the overhead of managing multiple threads can outweigh any potential gains.

3. Thread Overhead: Creating and managing threads in Python comes with some overhead. If the tasks are relatively simple, such as downloading files, the overhead of creating and managing threads can outweigh the benefits of concurrency.

4. Resource Contention: When using threads, there can be contention for system resources like CPU and memory. If the system becomes saturated with threads, context switching and resource contention may slow down the overall performance.

5. Thread Management: The example code provided for multithreading may not be optimized for maximum concurrency. In a real-world scenario, optimizing thread management, such as using thread pools or asyncio for I/O bound operations, can yield better results.
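
The thread pool mentioned in the last point can be sketched with `concurrent.futures.ThreadPoolExecutor` from the standard library. This is an illustrative sketch under stated assumptions, not a benchmark of the download example: `fake_download`, `run_sequential`, `run_threaded`, and `timed` are hypothetical helpers, and `time.sleep` stands in for the network wait so the comparison runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(name, delay=0.2):
    """Stand-in for a network request; sleeping releases the GIL like blocking I/O."""
    time.sleep(delay)
    return f"Downloaded {name}"


def run_sequential(names):
    """Process the items one after another."""
    return [fake_download(name) for name in names]


def run_threaded(names, max_workers=3):
    """Process the items on a small pool of reusable worker threads."""
    # executor.map returns results in input order, regardless of completion order
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_download, names))


def timed(func, *args):
    """Return the function's result together with its wall-clock duration."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    names = ["file_a", "file_b", "file_c"]  # hypothetical file names
    _, seq_time = timed(run_sequential, names)
    _, thr_time = timed(run_threaded, names)
    print(f"sequential: {seq_time:.2f}s, thread pool: {thr_time:.2f}s")
```

With a simulated wait, the pooled version should finish in roughly the time of the slowest single task. Real downloads may behave differently when bandwidth, rather than latency, is the limiting factor.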

== Multiprocessing

Introduction to Multiprocessing
Multiprocessing allows parallel execution of multiple processes, each with its own memory space. It's suitable for CPU-bound tasks.

Multiprocessing Module
The multiprocessing module supports multiprocessing in Python. Let's create an example demonstrating multiprocessing:

[, code]
----
import multiprocessing

def worker(number):
    result = number * number
    print(f"Result: {result}")

if __name__ == "__main__":
    processes = []
    for i in range(1, 6):
        process = multiprocessing.Process(target=worker, args=(i,))
        processes.append(process)
        process.start()

    for process in processes:
        process.join()  # Wait for every process to finish


# Result: 4
# Result: 9
# Result: 1
# Result: 25
# Result: 16
----

Explanation: In this multiprocessing example, we're creating multiple processes to perform a CPU-bound task, which is calculating the square of a number. However, the order in which the results are printed may not necessarily match the order of the input values.

This is because the individual processes run concurrently and independently of each other. They may complete their tasks in a different order, depending on factors like the CPU's availability and scheduling. As a result, the printed results can appear in a random or unordered fashion.

Multiprocessing is ideal for parallelizing CPU bound tasks to leverage multiple CPU cores effectively. However, it doesn't guarantee a specific order of execution or results, as the processes run in parallel and their completion times can vary.
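
When the results are needed in input order, one option is to return values from the workers instead of printing them. The sketch below uses `multiprocessing.Pool.map`, which distributes the inputs across worker processes and returns the results in the order of the inputs. The `square` and `squares_in_order` names are hypothetical and not part of the original example.

```python
import multiprocessing


def square(number):
    """CPU-bound stand-in: return the square instead of printing it."""
    return number * number


def squares_in_order(numbers, processes=2):
    """Compute squares in worker processes and return them in input order."""
    # Pool.map splits the work across processes but preserves input order
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(square, numbers)


if __name__ == "__main__":
    print(squares_in_order(range(1, 6)))  # [1, 4, 9, 16, 25]
```

The processes may still finish in any order internally. Only the returned list is ordered, because the pool reassembles the results by input position.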

== Event Loop (Asyncio)
Introduction to Asynchronous I/O
Asynchronous I/O enables non-blocking concurrency. The event loop manages tasks, making it suitable for I/O-bound operations.

Asyncio Module
The asyncio module provides tools for asynchronous programming. Let's create an example to illustrate asyncio:

[, code]
----
import asyncio

async def print_numbers():
    for i in range(1, 6):
        print(f"Number {i}")
        await asyncio.sleep(1)  # Yield to the event loop for 1 second

async def print_letters():
    for letter in 'abcde':
        print(f"Letter {letter}")
        await asyncio.sleep(1)  # Yield to the event loop for 1 second

async def main():
    # create_task schedules each coroutine on the event loop
    task1 = asyncio.create_task(print_numbers())
    task2 = asyncio.create_task(print_letters())

    await task1
    await task2

if __name__ == "__main__":
    asyncio.run(main())



# Number 1
# Letter a
# Number 2
# Letter b
# Number 3
# Letter c
# Number 4
# Letter d
# Number 5
# Letter e
----

In the code, you are using asyncio to create two asynchronous tasks (#print_numbers# and #print_letters#) and running them concurrently. Each task includes an #await asyncio.sleep(1)# statement, which effectively suspends the execution of the task for 1 second before continuing.

1. The #main# coroutine is executed when you run the program.

2. Inside main, you create two tasks: #task1# (for print_numbers) and #task2# (for print_letters). These tasks are started concurrently.

3. #task1# starts executing the print_numbers coroutine. It prints "Number 1" and then hits the #await asyncio.sleep(1)# line. While it sleeps, the event loop continues.

4. While #task1# is suspended, #task2# starts executing the print_letters coroutine on the same thread. It prints #"Letter a"# and then awaits for 1 second.

5. After 1 second, task1 resumes execution, printing #"Number 2"# and then sleeping again.

6. #task2# also resumes after 1 second, printing #"Letter b"# and then sleeping.

7. This pattern continues until both #task1# and #task2# have completed their respective loops. The await statements within each coroutine introduce pauses, allowing the other task to make progress while the first one sleeps.

As a result, you see an interleaved output where #"Number"# and #"Letter"# lines are mixed together because both tasks are running concurrently and asynchronously, each yielding to the event loop during the #await asyncio.sleep(1)# calls. This allows for a more responsive and non-blocking execution of tasks in an event-driven manner, which is one of the key benefits of asyncio.
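
The effect of yielding to the event loop can also be measured. The following is a minimal sketch, assuming two hypothetical coroutines that only sleep: `asyncio.gather` runs them concurrently and returns their results in argument order, so the total time is close to the longest delay rather than the sum of both. The name `wait_and_return` and the delays are illustrative choices, not part of the original example.

```python
import asyncio
import time


async def wait_and_return(label, delay):
    """Hypothetical stand-in for an I/O-bound operation."""
    await asyncio.sleep(delay)  # yields control to the event loop
    return label


async def main():
    start = time.perf_counter()
    # gather runs both coroutines concurrently and keeps results in argument order
    results = await asyncio.gather(
        wait_and_return("numbers", 0.2),
        wait_and_return("letters", 0.3),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Both coroutines run on a single thread. The overlap comes from each one handing control back to the event loop while it waits.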


== Conclusion

In this article, we have introduced the principles of asynchronous programming in Python using three primary methods: multithreading, multiprocessing, and asyncio. We have discussed how each approach caters to different scenarios, such as speeding up tasks that involve input/output operations, handling CPU-intensive tasks, or efficiently managing network operations. As you continue your journey into the world of asynchronous programming in Python, these fundamental techniques will serve as tools for creating responsive and high-performing applications.

In upcoming articles of the "Mastering Asynchronous Programming in Python" series, we will delve deeper into strategies, best practices, and real-world applications of these asynchronous Python programming techniques. Whether you want to optimize web scraping processes, develop web services, or improve data processing pipelines, the series aims to provide the knowledge and skills needed for this kind of Python development. Stay tuned for more content and practical examples.
@@ -0,0 +1,12 @@
{
"title": "Exploring Asynchronous Programming Approaches in Python (Mastering Asynchronous Programming in Python)",
"order": 81,
"domains": ["dev_quality_assurance"],
"language": "en",
"bgImg": "assets/articles/exploring-asynchronous-programming-approaches-in-python-mastering-asynchronous-programming-in-python/Main_Asynchronous_programming.png",
"authorImg": "assets/articles/exploring-asynchronous-programming-approaches-in-python-mastering-asynchronous-programming-in-python/Erik_Sultanaliev.png",
"author": "Erik Sultanaliev",
"position": "Software Developer",
"date": "Mon Sep 11 2023 10:45:55 GMT+0000 (Coordinated Universal Time)",
"seoDescription": "Python Concurrency Techniques"
}