README.md
+16 -12
@@ -1,12 +1,16 @@
-# LoHan ICDE 2025 Artifact
+# LoHan

-This artifact provides a guide to replicate the primary experiments in this paper. You can follow this repository to reproduce the experimental results about LoHan's maximum trainable model sizes, batch sizes and throughput in our paper. The documentation and auto-run script mainly focus on reproducing results in Subsection V-B and you can adjust the code to reproduce results in other sections.
+LoHan is a <ins>Lo</ins>w-cost <ins>H</ins>igh-perform<ins>an</ins>ce framework for large model fine-tuning. This repository now includes efficient data-parallel fine-tuning code (Ratel, ICDE 2025), and more exciting features are coming soon!

-## Environment Setup
+## Ratel ICDE 2025 Artifact

-### SSD Configuration
+This artifact provides a guide to replicating the primary experiments in our paper. You can follow this repository to reproduce the experimental results on Ratel's maximum trainable model sizes, batch sizes, and throughput. The documentation and auto-run script focus mainly on reproducing the results in Subsection V-B; you can adjust the code to reproduce results in other sections.

-LoHan aggregates the I/O bandwidth of multiple SSDs by configuring a RAID array for efficient model states and activation offloading. Therefore, we provide a script to configure this array.
+### Environment Setup
+
+#### SSD Configuration
+
+Ratel aggregates the I/O bandwidth of multiple SSDs by configuring them as a RAID array, which enables efficient offloading of model states and activations. We therefore provide a script to configure this array.

 First, modify `make_raid.sh` to meet your needs. The script in this repo configures the drives `/dev/nvme0n1` to `/dev/nvme11n1` into an array. Adjust line 23 to change the drives you want to set up.

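As a rough illustration of what such a configuration step involves, the sketch below builds the device list for `/dev/nvme0n1` through `/dev/nvme11n1` and prints an `mdadm` command instead of running it. This is not the repository's `make_raid.sh`: the RAID level (0), the array name `/dev/md0`, and the helper names `nvme_devices` and `raid_command` are assumptions made for illustration only.

```shell
# Hypothetical sketch; not the repository's make_raid.sh.
# Assumptions: RAID 0 striping and an array named /dev/md0.

# Print the NVMe namespaces with indices from $1 to $2, one per line.
nvme_devices() {
    i="$1"
    while [ "$i" -le "$2" ]; do
        printf '/dev/nvme%sn1\n' "$i"
        i=$((i + 1))
    done
}

# Print (do not execute) an mdadm command for drives $1..$2.
# Creating an array destroys existing data on the member drives,
# so this sketch stays a dry run.
raid_command() {
    devices=$(nvme_devices "$1" "$2" | tr '\n' ' ')
    count=$(nvme_devices "$1" "$2" | wc -l | tr -d ' ')
    echo "mdadm --create /dev/md0 --level=0 --raid-devices=$count ${devices% }"
}

# The 12 drives named in the README.
raid_command 0 11
```

Changing the two arguments of `raid_command` plays the same role as editing the drive range in the real script.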
@@ -16,10 +20,10 @@ After configuring the script, you can run the script to set up the RAID array. Y
# If there are different CUDA versions, you should specify the CUDA version
@@ -31,15 +35,15 @@ pip install six==1.16.0
 pip install scikit-learn
 ```

-## Running LoHan
+### Running Ratel

-We provide a script to run LoHan. You can adjust the script to reproduce the results.
+We provide a script to run Ratel. You can adjust the script to reproduce the results.

 ```shell
 bash run.sh
 ```

-### Limiting the Memory Size
+#### Limiting the Memory Size

 Experiments in Subsection V-B require adjusting the main memory capacity. Instead of physically adding or removing DRAM modules, you can pin part of the main memory with huge pages so that Ratel cannot use it.

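The number of huge pages to reserve follows from simple arithmetic: the memory to pin divided by the huge page size. The sketch below assumes a 2048 kB huge page size; the helper name `hugepages_for_gib` is hypothetical and not part of this repository. The result could then be written to the standard Linux `vm.nr_hugepages` sysctl as root.

```shell
# Hypothetical helper: number of huge pages needed to pin $1 GiB,
# given a huge page size of $2 kB (defaults to 2048 kB).
hugepages_for_gib() {
    gib="$1"
    page_kb="${2:-2048}"
    echo $(( gib * 1024 * 1024 / page_kb ))
}

# Pinning 2 GiB (2097152 kB) with 2048 kB pages takes 1024 pages.
hugepages_for_gib 2

# To apply (requires root; left as a comment so the sketch has no side effects):
#   sysctl -w vm.nr_hugepages="$(hugepages_for_gib 2)"
```

Pinning more memory leaves less DRAM available to Ratel, which is how a smaller main memory capacity can be emulated without touching the hardware.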
@@ -71,11 +75,11 @@ Hugepagesize: 2048 kB
 Hugetlb: 2097152 kB
 ```

-### Benchmark Results
+#### Benchmark Results

 Please refer to [evaluation_data.md](evaluation_data.md) for the raw evaluation data from our paper, which may help with your reproduction.

-## Acknowledgement
+### Acknowledgement

 Some of the code in this project is modified from the [DeepSpeed](https://github.com/microsoft/DeepSpeed) repository. We appreciate the contributions of the original repository authors.