DeepSeek V2/V3 implementation refactored to allow non-MLA and MLA #20178

Job Run time
11m 48s
2m 34s
1m 3s
2m 34s
2m 59s
5m 37s
4m 43s
2m 40s
2m 19s
1m 23s
11m 8s
2m 18s
9m 26s
11m 20s
11m 35s
4m 14s
4m 8s
38m 40s
1m 7s
6m 2s
1m 46s
5m 5s
5m 25s
5m 32s
6m 26s
13m 10s
8m 11s
6m 25s
4m 30s
10m 47s
0s
7m 14s
6m 19s
4m 31s
0s
6m 17s
6m 19s
3m 10s
2m 26s
2m 2s
0s
Total: 4h 3m 13s