fix bug: P2B task graph edge bind lbi #8052

Merged
chengtbf merged 33 commits into master from dev_cc_fix_task_edge
Apr 28, 2022

Conversation

@chengtbf chengtbf commented Apr 19, 2022

fix: Oneflow-Inc/libai#232

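Root cause: in swin, the Variable Op is followed by a B2P boxing, and that boxing inserts a zero boxing task node. However, in NaiveB2PSubTskGphBuilder, the task edges built for B2P never had the lbi added to them. Because none of the earlier cases included a unit test for B2P, this bug stayed hidden until now.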
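
To make the failure mode concrete, here is a minimal toy sketch in Python. It is not OneFlow code: the `Lbi` and `TaskEdge` classes, the `build_b2p_edges` helper, the node names, and the two-edge topology are all hypothetical and only illustrate the idea of a task edge that is created without being told which blob it carries.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Lbi:
    """Hypothetical stand-in for a logical blob id (op name + blob name)."""
    op_name: str
    blob_name: str


@dataclass
class TaskEdge:
    """Hypothetical task edge that records which blobs it carries."""
    src: str
    dst: str
    lbis: Set[Lbi] = field(default_factory=set)

    def add_lbi(self, lbi: Lbi) -> None:
        self.lbis.add(lbi)


def build_b2p_edges(lbi: Lbi, bind_lbi: bool) -> List[TaskEdge]:
    """Connect variable -> zero_boxing -> consumer (simplified, assumed topology).

    bind_lbi=False mimics the bug: the edges exist, but no blob is bound to them.
    """
    edges = [
        TaskEdge("variable", "zero_boxing"),
        TaskEdge("zero_boxing", "consumer"),
    ]
    if bind_lbi:
        for edge in edges:
            edge.add_lbi(lbi)
    return edges


def edges_missing_lbi(edges: List[TaskEdge]) -> List[TaskEdge]:
    """Return every edge that has no blob bound to it."""
    return [edge for edge in edges if not edge.lbis]


weight = Lbi("swin_variable", "out")
buggy = build_b2p_edges(weight, bind_lbi=False)
fixed = build_b2p_edges(weight, bind_lbi=True)
print("unbound edges before fix:", len(edges_missing_lbi(buggy)))
print("unbound edges after fix:", len(edges_missing_lbi(fixed)))
```

A check like `edges_missing_lbi` is the kind of assertion a B2P unit test could make; the actual test added for this fix, if any, may look quite different.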

@chengtbf chengtbf requested a review from oneflow-ci-bot April 19, 2022 08:11
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 19, 2022 17:15
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 20, 2022 00:25
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 128.8ms (= 12880.8ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 140.7ms (= 14069.8ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 140.7ms / 128.8ms)

OneFlow resnet50 time: 80.6ms (= 8055.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.2ms (= 8623.1ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.07 (= 86.2ms / 80.6ms)

OneFlow resnet50 time: 56.7ms (= 11332.4ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 59.4ms (= 11888.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.05 (= 59.4ms / 56.7ms)

OneFlow resnet50 time: 43.8ms (= 8768.9ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 46.4ms (= 9270.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.06 (= 46.4ms / 43.8ms)

OneFlow resnet50 time: 39.0ms (= 7791.9ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 36.0ms (= 7194.9ms / 200, input_shape=[1, 3, 224, 224])
❌ Relative speed: 0.92 (= 36.0ms / 39.0ms)

OneFlow swin dataloader time: 0.259s (= 51.770s / 200, num_workers=1)
PyTorch swin dataloader time: 0.152s (= 30.376s / 200, num_workers=1)
Relative speed: 0.587 (= 0.152s / 0.259s)

OneFlow swin dataloader time: 0.068s (= 13.500s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.266s / 200, num_workers=4)
Relative speed: 0.612 (= 0.041s / 0.068s)

OneFlow swin dataloader time: 0.038s (= 7.505s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.538s / 200, num_workers=8)
Relative speed: 0.605 (= 0.023s / 0.038s)

✔️ OneFlow resnet50 time: 136.2ms (= 13624.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 167.1ms (= 16706.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 167.1ms / 136.2ms)

OneFlow resnet50 time: 91.6ms (= 9163.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.4ms (= 10443.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.14 (= 104.4ms / 91.6ms)

OneFlow resnet50 time: 64.7ms (= 12947.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.9ms (= 15788.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 78.9ms / 64.7ms)

OneFlow resnet50 time: 54.5ms (= 10893.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.9ms (= 14373.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 71.9ms / 54.5ms)

OneFlow resnet50 time: 49.2ms (= 9840.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.6ms (= 13321.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 66.6ms / 49.2ms)
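
As a reading aid for the numbers above: each per-iteration time is the total time divided by the iteration count, and the relative speed is the PyTorch per-iteration time divided by the OneFlow one, so values above 1 mean OneFlow was faster. The sketch below reproduces that arithmetic for two of the reported rows; the function names are made up for illustration and are not part of the CI script.

```python
def per_iteration(total: float, iterations: int) -> float:
    """Average time per iteration, in the same unit as the total."""
    return total / iterations


def relative_speed(pytorch_time: float, oneflow_time: float) -> float:
    """Ratio above 1 means OneFlow took less time than PyTorch."""
    return pytorch_time / oneflow_time


# resnet50, input_shape=[16, 3, 224, 224], totals in ms over 100 iterations
oneflow_ms = per_iteration(12880.8, 100)
pytorch_ms = per_iteration(14069.8, 100)
resnet_ratio = round(relative_speed(pytorch_ms, oneflow_ms), 2)

# swin dataloader, num_workers=4, totals in seconds over 200 iterations
oneflow_s = per_iteration(13.500, 200)
pytorch_s = per_iteration(8.266, 200)
dataloader_ratio = round(relative_speed(pytorch_s, oneflow_s), 3)

print(resnet_ratio, dataloader_ratio)
```

The dataloader ratio is computed from the unrounded totals, which is why it differs slightly from dividing the rounded 0.041s by 0.068s.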

@github-actions

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@github-actions

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8052/

@github-actions

Speed stats:
GPU Name: NVIDIA GeForce GTX 1080 

❌ OneFlow resnet50 time: 129.5ms (= 12946.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.6ms (= 14361.3ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.11 (= 143.6ms / 129.5ms)

OneFlow resnet50 time: 78.1ms (= 7812.5ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.0ms (= 8397.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.07 (= 84.0ms / 78.1ms)

OneFlow resnet50 time: 54.7ms (= 10943.4ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 56.2ms (= 11244.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.03 (= 56.2ms / 54.7ms)

OneFlow resnet50 time: 43.2ms (= 8636.0ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 41.5ms (= 8305.2ms / 200, input_shape=[2, 3, 224, 224])
❌ Relative speed: 0.96 (= 41.5ms / 43.2ms)

OneFlow resnet50 time: 39.2ms (= 7845.5ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 39.6ms (= 7926.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.01 (= 39.6ms / 39.2ms)

OneFlow swin dataloader time: 0.247s (= 49.386s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.226s / 200, num_workers=1)
Relative speed: 0.612 (= 0.151s / 0.247s)

OneFlow swin dataloader time: 0.068s (= 13.639s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.343s / 200, num_workers=4)
Relative speed: 0.612 (= 0.042s / 0.068s)

OneFlow swin dataloader time: 0.038s (= 7.606s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.568s / 200, num_workers=8)
Relative speed: 0.601 (= 0.023s / 0.038s)

❌ OneFlow resnet50 time: 144.9ms (= 14494.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 169.9ms (= 16990.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 169.9ms / 144.9ms)

OneFlow resnet50 time: 102.0ms (= 10201.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 110.9ms (= 11088.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.09 (= 110.9ms / 102.0ms)

OneFlow resnet50 time: 75.2ms (= 15038.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 88.7ms (= 17738.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 88.7ms / 75.2ms)

OneFlow resnet50 time: 64.3ms (= 12850.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.2ms (= 15044.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.17 (= 75.2ms / 64.3ms)

OneFlow resnet50 time: 55.5ms (= 11106.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.1ms (= 13828.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.25 (= 69.1ms / 55.5ms)

@github-actions

CI failed when running job: cuda-module. PR label automerge has been removed

@chengtbf chengtbf removed the request for review from oneflow-ci-bot April 28, 2022 01:30
@chengtbf chengtbf enabled auto-merge (squash) April 28, 2022 01:31
@chengtbf chengtbf disabled auto-merge April 28, 2022 01:31
@chengtbf chengtbf enabled auto-merge (squash) April 28, 2022 01:36
@chengtbf chengtbf disabled auto-merge April 28, 2022 01:36
@chengtbf chengtbf merged commit 16c3a44 into master Apr 28, 2022
@chengtbf chengtbf deleted the dev_cc_fix_task_edge branch April 28, 2022 01:42
Labels
bug graph graph mode
Development

Successfully merging this pull request may close these issues.

Swin graph 3D parallelism: error when acc grad is enabled
4 participants