This repository was archived by the owner on Nov 14, 2024. It is now read-only.

Commit f4af4e5

execution engine
1 parent f8dc476 commit f4af4e5

File tree: 1 file changed, +73 −77 lines changed


_posts/2024-06-14-remote-code-execution-engine.md

@@ -1,37 +1,33 @@
 ---
-title: "Distributed Remote Code Execution Engine"
+title: "A Distributed Code Execution Engine with Scala and Pekko"
 date: 2024-06-14
 header:
   image: "https://res.cloudinary.com/dkoypjlgr/image/upload/f_auto,q_auto:good,c_auto,w_1200,h_300,g_auto,fl_progressive/v1715952116/blog_cover_large_phe6ch.jpg"
 tags: [scala,pekko,pekko-http,pekko-stream,pekko-cluster,docker,docker-compose,scala3]
-excerpt: "Practical guide to building the distributed remote code execution engine in Scala and Pekko"
+excerpt: "A practical guide to building a distributed remote code execution engine in Scala and Pekko"
 ---
 
 _by [Anzori (Nika) Ghurtchumelia](https://github.com/ghurtchu)_
 
-## 1. Introduction
-
-After a long hiatus, I am back with renewed passion and energy, eager to delve deeper into the Scala ecosystem. This time, I am committed to building something tangible and useful with the tools available. Let's embark on this exciting journey of exploration and learning together.
+{% include video id="1uP6FTUn8_E" provider="youtube" %}
 
-The greatest benefit of small side projects is the unique knowledge boost which can potentially be handy later in career.
+## 1. Introduction
 
-In this article we will attempt to build the remote code execution engine - the backend platform for websites such as [Hackerrank](https://hackerrank.com), [Leetcode](https://leetcode.com) and others.
+In this article we will attempt to build a remote code execution engine - the backend platform for websites such as [Hackerrank](https://hackerrank.com), [LeetCode](https://leetcode.com) and others.
 
-If, for some reason you're unfamiliar with the websites mentioned above, the basic usage flow is described below:
+For such a platform, the basic usage flow is:
 - Client sends code
 - Backend runs it and responds with output
 
-There you go, sounds simple, right?
+Sounds simple, right? Right?...
 
-Right, right...
+Can you imagine how many things can go wrong here? The possibilities for failure are endless! However, we should address at least some of them.
 
-Can you imagine how many things can go wrong here? It's the devil smirking in the corner, knowing, that the possibilities for failure are endless, however, we should address at least some of them.
-
-To give you a quick idea: a separate blog post can be written only about the security, not to mention scalability, extensibility and a few other compulsory properties to make it production ready.
+We could probably write a separate blog post about the security, scalability, extensibility and a few other properties needed to make it production ready.
 
 The goal isn't to build the best one, nor is it to compete with the existing ones.
 
-Put simply, the goal of this project is to get familiar with `Pekko` and its modules such as `pekko-http`, `pekko-stream`, `pekko-cluster` and a few interesting concepts revolving around actor model concurrency, such as:
+Put simply, the goal of this project is to get familiar with [Apache Pekko](https://pekko.apache.org) (the open-source fork of Akka) and its modules such as `pekko-http`, `pekko-stream`, `pekko-cluster` and a few interesting concepts around actor model concurrency, such as:
 - cluster nodes and formation
 - cluster aware routers
 - remote worker actors
@@ -43,9 +39,11 @@ Put simply, the goal of this project is to get familiar with `Pekko` and its mod
 
 Let's get started then, shall we?
 
+> _Hey, it's Daniel here. Apache Pekko is great, and Nika has done a great job showcasing its features in a compact project we can explore exhaustively in this article. If you need to get started with Pekko, I've covered Pekko (and Akka) in a comprehensive bundle of courses about [Akka/Pekko Actors](https://rockthejvm.com/p/akka-essentials), [Akka/Pekko Streams](https://rockthejvm.com/p/akka-streams), and [Akka/Pekko HTTP](https://rockthejvm.com/p/akka-http), all of which are used in this article. Check them out if you're interested._
+
 ## 2. Project Structure
 
-I recommend checking out [the project on GitHub](https://github.com/ghurtchu/braindrill/) and following along that way.
+To make the best of this article, I recommend checking out [the project on GitHub](https://github.com/ghurtchu/braindrill/) and following the code while reading, as many things will make sense along the way.
 
 We will use `Scala 3.4.1`, `sbt 1.9.9`, `docker`, `pekko` and its modules to complete our project.
 
@@ -64,7 +62,8 @@ The initial project skeleton looks like the following:
 - `Dockerfile` blueprint for running the app inside container
 - `README.md` instructions for setting up the project locally
 
-Nothing fancy, let's move on `build.sbt`:
+Let's start with `build.sbt`:
+
 ```scala
 ThisBuild / scalaVersion := "3.4.1"
 
@@ -120,14 +119,13 @@ addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.1") // SBT plugin for using
 
 ## 3. Project Architecture
 
-After a few iterations I came up with the architecture that can be horizontally scaled, if required.
-Ideally, such projects must be scaled easily as long as the load is increased.
+After a few iterations I came up with an architecture that can be horizontally scaled, if required. Ideally, such projects should scale easily as the load increases.
 
-For that we use tools such as Kubernetes or other container orchestration platforms. To make local development and deployment simpler we'll be using docker containers. More precisely we'll be using `docker-compose` to run a few containers together so that they form the cluster.
+For that we use tools such as Kubernetes or other container orchestration platforms. To make local development and deployment simpler we'll be using Docker containers. More precisely, we'll be using `docker-compose` to run a few containers together so that they form the cluster.
 
-`docker-compose` doesn't support scalability out of the box because it's static, it means that we can't magically add new `worker` node to the running system. Again, for that we'd use Kubernetes, but it is out of the scope of this project.
+`docker-compose` doesn't support scalability out of the box because it's static, meaning we can't magically add new `worker` nodes to the running system. Again, for that we'd use Kubernetes, but it is out of the scope of this project.
 
-We have a `master` node and its role is to be the task distributor among `worker` nodes. 
+We have a `master` node and its role is to be the task distributor among `worker` nodes.
 
 `http` is exposed on `master` node, acting as a gateway to outside world.
 
@@ -517,8 +515,8 @@ transformation {
   load-balancer = 3
 }
 ```
-Here, it simply means that each node will have 32 worker actors and master node will have 3 load balancer actors.
-In real world, choosing those numbers would depend on multiple variables that must be collected and analyzed in production.
+Here, it means that each node will have 32 worker actors and the master node will have 3 load balancer actors.
+In the real world, choosing those numbers would depend on multiple variables that must be collected and analyzed in production.
 In my opinion, those numbers are optimized based on empirical evidence rather than theoretical results.
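Elsewhere the code reads these values defensively, e.g. `Try(cfg.getInt("transformation.workers-per-node")).getOrElse(50)`, so a missing key falls back to a default. A minimal Python sketch of that defaulting pattern (the dict stands in for the HOCON config; names are illustrative):

```python
# Illustrative config lookup with a default, mirroring
# Try(cfg.getInt("transformation.workers-per-node")).getOrElse(50).
# The dict below stands in for the parsed HOCON config file.
cfg = {"transformation.workers-per-node": 32, "transformation.load-balancer": 3}

def get_int(key: str, default: int) -> int:
    try:
        return int(cfg[key])            # Try(cfg.getInt(key))
    except (KeyError, TypeError, ValueError):
        return default                  # .getOrElse(default)

print(get_int("transformation.workers-per-node", 50))  # 32
print(get_int("transformation.missing-key", 50))       # 50
```

The point of the fallback is that a node can still boot with sane defaults when the config file is incomplete.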
 
 ### 4.2 Serialization
@@ -877,34 +875,34 @@ object CodeExecutor {
       msg match {
         case In.Execute(compiler, file, dockerImage, replyTo) =>
           ctx.log.info(s"{}: executing submitted code", self)
-          val asyncExecuted: Future[In.Executed] = for
+          val asyncExecuted: Future[In.Executed] = for {
             // timeout --signal=SIGKILL 2 docker run --rm --ulimit cpu=1 --memory=20m -v engine:/data -w /data rust rust /data/r.rust
             ps <- run(
               "timeout",
               "--signal=SIGKILL",
               "2", // 2 second timeout which sends SIGKILL if exceeded
               "docker",
               "run",
               "--rm", // remove the container when it's done
               "--ulimit", // set limits
               "cpu=1", // 1 processor
               "--memory=20m", // 20 M of memory
               "-v", // bind volume
               "engine:/data",
               "-w", // set working directory to /data
               "/data",
               dockerImage,
               compiler,
               s"${file.getPath}"
             )
             // error and success channels as streams
             (successSource, errorSource) = src(ps.getInputStream) -> src(ps.getErrorStream)
             ((success, error), exitCode) <- successSource
               .runWith(readOutput) // join success, error and exitCode
               .zip(errorSource.runWith(readOutput))
               .zip(Future(ps.waitFor))
             _ = Future(file.delete) // remove file in the background to free up the memory
-          yield In.Executed(
+          } yield In.Executed(
             output = if success.nonEmpty then success else error,
             exitCode = exitCode,
             replyTo = replyTo
@@ -913,7 +911,7 @@ object CodeExecutor {
       ctx.pipeToSelf(asyncExecuted) {
         case Success(executed) =>
          ctx.log.info("{}: executed submitted code", self)
-          executed.exitCode match
+          executed.exitCode match {
            case 124 | 137 =>
              In.ExecutionFailed(
                "The process was aborted because it exceeded the timeout",
@@ -925,6 +923,7 @@ object CodeExecutor {
                replyTo
              )
            case _ => In.ExecutionSucceeded(executed.output, replyTo)
+          }
        case Failure(exception) =>
          ctx.log.warn("{}: execution failed due to {}", self, exception.getMessage)
          In.ExecutionFailed(exception.getMessage, replyTo)
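The `case 124 | 137` branch covers the two exit codes GNU coreutils `timeout` produces for an overrunning child: 124 when the limit expires on the default signal path, and 137 (128 + 9) when the child is killed with SIGKILL, as in the `--signal=SIGKILL` invocation above. A small Python sketch of the same classification (the helper name is hypothetical, not part of the project):

```python
# Hypothetical sketch of the exit-code handling in CodeExecutor:
# 124 = GNU `timeout` expired (default SIGTERM path),
# 137 = 128 + 9, i.e. the child process was killed with SIGKILL.
def classify_exit(exit_code: int, output: str) -> str:
    if exit_code in (124, 137):
        return "The process was aborted because it exceeded the timeout"
    return output  # non-timeout runs return whatever the program printed

print(classify_exit(137, ""))
print(classify_exit(0, "42"))  # 42
```

Mapping both codes to one user-facing message keeps the API response independent of which signal ultimately terminated the sandboxed process.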
@@ -1103,26 +1102,27 @@ object ClusterSystem {
     val node = cluster.selfMember
     val cfg = ctx.system.settings.config
 
-    if node hasRole "worker" then
+    if (node hasRole "worker") {
       val numberOfWorkers = Try(cfg.getInt("transformation.workers-per-node")).getOrElse(50)
       // actor that sends StartExecution message to local Worker actors in a round robin fashion
       val workerRouter = ctx.spawn(
         behavior = Routers
           .pool(numberOfWorkers) {
             Behaviors
               .supervise(Worker().narrow[StartExecution])
               .onFailure(SupervisorStrategy.restart)
           }
           .withRoundRobinRouting(),
         name = "worker-router"
       )
       // actors are registered to the ActorSystem receptionist using a special ServiceKey.
       // All remote worker-routers will be registered to ClusterBootstrap actor system receptionist.
       // When the "worker" node starts it registers the local worker-router to the Receptionist which is cluster-wide
       // As a result "master" node can have access to remote worker-router and receive any updates about workers through worker-router
       ctx.system.receptionist ! Receptionist.Register(Worker.WorkerRouterKey, workerRouter)
+    }
 
-    if node hasRole "master" then
+    if (node hasRole "master") {
       given system: ActorSystem[Nothing] = ctx.system
 
       given ec: ExecutionContextExecutor = ctx.executionContext
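The pool router above fans `StartExecution` messages out to `numberOfWorkers` local workers in round-robin order. A tiny Python sketch of that routing discipline (illustrative only, not the Pekko API):

```python
from itertools import cycle

# Illustrative round-robin router over a fixed worker pool,
# mirroring Routers.pool(numberOfWorkers).withRoundRobinRouting().
workers = [f"worker-{n}" for n in range(4)]
next_worker = cycle(workers)

def route(task: str) -> str:
    # each incoming task goes to the next worker in turn, wrapping around
    return next(next_worker)

print([route(f"task-{i}") for i in range(5)])
# ['worker-0', 'worker-1', 'worker-2', 'worker-3', 'worker-0']
```

Round-robin is a reasonable default here because code submissions are roughly uniform in cost; a smarter router would weight by current load.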
@@ -1139,21 +1139,21 @@ object ClusterSystem {
           name = s"load-balancer-$n"
         )
       }
-
+      }
+
       val route =
         pathPrefix("lang" / Segment) { lang =>
           post {
             entity(as[String]) { code =>
               val loadBalancer = Random.shuffle(loadBalancers).head
               val asyncResponse = loadBalancer
                 .ask[ExecutionResult](StartExecution(code, lang, _))
                 .map(_.value)
                 .recover(_ => "something went wrong")
-
               complete(asyncResponse)
             }
           }
         }
 
       val host = Try(cfg.getString("http.host")).getOrElse("0.0.0.0")
       val port = Try(cfg.getInt("http.port")).getOrElse(8080)
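The route above picks one load balancer at random per request and degrades any failure to a plain fallback string. A Python sketch of that request path (the `ask` stand-in is hypothetical; only the random pick and the `.recover` fallback mirror the Scala code):

```python
import random

# Illustrative request handler: pick a random load balancer, ask it to run
# the code, and degrade to a generic message on any failure
# (mirrors Random.shuffle(loadBalancers).head and .recover(_ => "something went wrong")).
load_balancers = [f"load-balancer-{n}" for n in range(3)]

def ask(balancer: str, code: str, lang: str) -> str:
    # hypothetical stand-in for the actor ask; there is no cluster in this sketch
    raise RuntimeError("no cluster available")

def handle(code: str, lang: str) -> str:
    balancer = random.choice(load_balancers)
    try:
        return ask(balancer, code, lang)
    except Exception:
        return "something went wrong"

print(handle("print(1)", "python"))  # something went wrong
```

Swallowing the error into a generic message keeps the HTTP API total: the client always gets a 200 with some body, never a hung request.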
@@ -1332,11 +1332,12 @@ object Simulator extends App {
 
   private def percentile(data: ArrayBuffer[Long], p: Double): Long =
     if data.isEmpty then 0
-    else
+    else {
       val sortedData = data.sorted
       val k = (sortedData.length * (p / 100.0)).ceil.toInt - 1
 
       sortedData(k)
+    }
 
   enum Code(val value: String) {
     case MemoryIntensive extends Code(Python.MemoryIntensive)
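The `percentile` helper above uses the nearest-rank method: sort the samples, take `k = ceil(n * p/100) - 1`, and return the element at `k`. A Python sketch of the same computation, assuming the same convention:

```python
import math

# Nearest-rank percentile, mirroring the Scala helper:
# sort the samples, k = ceil(n * p/100) - 1, return sortedData(k).
def percentile(data: list[int], p: float) -> int:
    if not data:
        return 0
    sorted_data = sorted(data)
    k = math.ceil(len(sorted_data) * (p / 100.0)) - 1
    return sorted_data[k]

latencies_ms = [120, 80, 100, 95, 250]
print(percentile(latencies_ms, 90))  # 250
print(percentile(latencies_ms, 50))  # 100
```

Nearest-rank always returns an actual observed sample (no interpolation), which is usually what you want when reporting p95/p99 latencies from a load simulator.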
@@ -1465,8 +1466,3 @@ The design choices made in this project ensure that our remote code execution en
 Building this distributed system with Scala 3 and Apache Pekko has been an enlightening experience. We've harnessed the power of actor-based concurrency, cluster management, and containerization to create a resilient and secure remote code execution engine. This project exemplifies how modern technologies can be integrated to solve complex problems in a scalable and efficient manner.
 
 Whether you're looking to implement a similar system or seeking insights into distributed computing with Scala and Pekko, we hope this blog post has provided valuable knowledge and inspiration.
-
-Additionally, you can check out:
-- [Video demo](https://www.youtube.com/watch?v=sMlJC7Kr330) which includes running the `Simulator.scala`
-
-Thank you for following along!

0 commit comments
