AssertionError in assert rois.size(0) > 0 #42

Open · zhanwenchen opened this issue Feb 27, 2023 · 0 comments
Labels: bug (Something isn't working), gsc (Old Codebase)
Assignees: zhanwenchen

zhanwenchen commented Feb 27, 2023

Seen on lab2 (gsc run 20230226220526_vctree_semantic_sgcls_4GPU_lab1_1e3) and also on lab3 (run 20230226220656_vctree_semantic_predcls_4GPU_lab1_1e3/):

```
no rels_new for rel_og=has
no rels_new for rel_og=with
no rels_new for rel_og=in front of
no rels_new for rel_og=in front of
3209: Augmentation: 1 => 1
3209: Augmentation: 1 => 1
no rels_new for rel_og=on
no rels_new for rel_og=on
no rels_new for rel_og=on
no rels_new for rel_og=with
no rels_new for rel_og=on
no rels_new for rel_og=on
3209: Augmentation: 1 => 1
Traceback (most recent call last):
  File "/home/pct4et/gsc/tools/relation_train_net.py", line 665, in <module>
    main()
  File "/home/pct4et/gsc/tools/relation_train_net.py", line 650, in main
    train(cfg, local_rank, args.distributed, logger, experiment)
  File "/home/pct4et/gsc/tools/relation_train_net.py", line 327, in train
    loss_dict = model(images, targets)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1008, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 969, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 76, in forward
    _, result, detector_losses = self.roi_heads(features, proposals, targets, logger, boxes_global=boxes_global)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 69, in forward
    x, detections, loss_relation = self.relation(features, detections, targets, logger, boxes_global=boxes_global)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/relation_head/relation_head.py", line 99, in forward
    union_features = self.union_feature_extractor(features, proposals, rel_pair_idxs)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/relation_head/roi_relation_feature_extractors.py", line 99, in forward
    union_vis_features = self.feature_extractor.pooler(x, union_proposals) # union_proposals: 16 * [651..., 650..., 110...] # union_vis_features: torch.Size([5049, 256, 7, 7]) # TODO: need to borrow pooler's 5 layers to 1 reduction. so have a global union feature pooler
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/poolers.py", line 142, in forward
    assert rois.size(0) > 0
AssertionError
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64810 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64811 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64812 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 3 (pid: 64813) of binary: /home/pct4et/miniconda3/envs/gsc/bin/python
Traceback (most recent call last):
  File "/home/pct4et/miniconda3/envs/gsc/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==1.12.1', 'console_scripts', 'torchrun')())
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
    run(args)
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
```
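
From the log, the augmentation step repeatedly fails to find replacement relations (`no rels_new for rel_og=...`), and at least one image is left with a single pair (`Augmentation: 1 => 1`). A plausible reading (an assumption, not yet verified against the augmentation code) is that some image on rank 3 ends up with an empty `rel_pair_idxs` tensor, so the list of union boxes handed to `self.feature_extractor.pooler` flattens to zero rows and `assert rois.size(0) > 0` in poolers.py fires. Below is a minimal sketch of that failure mode and one possible guard; the shapes and the skip/fallback policy are assumptions, not the actual gsc code:

```python
import torch

# Sketch of the suspected failure mode (assumed shapes; not the actual gsc
# code). Per image, rel_pair_idxs holds a (num_pairs, 2) tensor of indices
# into that image's proposals. If augmentation drops every pair, it has 0 rows.
rel_pair_idxs = [torch.empty((0, 2), dtype=torch.int64)]  # one image, no pairs

# The pooler flattens per-image union boxes into a single (N, 4) tensor
# (in the spirit of convert_to_roi_format). With no pairs anywhere, N == 0.
union_boxes_per_image = [
    torch.zeros((idx.size(0), 4), dtype=torch.float32) for idx in rel_pair_idxs
]
rois = torch.cat(union_boxes_per_image, dim=0)
print(rois.size(0))  # 0 -> would trip `assert rois.size(0) > 0`

# One possible (hypothetical) guard upstream of the union feature extractor:
# only keep images that still have at least one pair, and skip the step or
# fall back to the un-augmented pairs when the whole batch loses everything.
if all(idx.size(0) == 0 for idx in rel_pair_idxs):
    print("no relation pairs left after augmentation; skip or use original pairs")
```

Note that under DDP any per-rank skip would have to be coordinated across workers (e.g., all ranks agree to skip the iteration), since a lone crashing rank takes the other three down, which matches the SIGTERM fan-out in the elastic output above.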