Replies: 1 comment 2 replies
It seems you do not configure
I tried to run the command:

```bash
bash ./tools/benchmarks/mmdetection/mim_dist_train_c4.sh configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712ls.py work_dirs/selfsup/densecl_resnet50_8xb32-coslr-200e_in1k/epoch_200.pth 1
```
And the config I used is:
"
base = 'mmdet::pascal_voc/faster-rcnn_r50-caffe-c4_ms-18k_voc0712.py'
data_preprocessor = dict(
type='DetDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_size_divisor=32)
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
backbone=dict(
frozen_stages=-1,
norm_cfg=norm_cfg,
norm_eval=False,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
roi_head=dict(
shared_head=dict(
type='ResLayerExtraNorm',
norm_cfg=norm_cfg,
norm_eval=False,
style='pytorch'),
bbox_head=dict(num_classes=2)))
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='RandomChoiceResize',
scales = [(666, 240), (666, 256), (666,272), (666, 288),
(666, 304), (666, 320), (666, 336), (666, 352),
(666, 368), (666, 384), (666, 400)],
keep_ratio=True),
dict(type='RandomFlip', prob=0.5),
dict(type='PackDetInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(666, 400), keep_ratio=True),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor'))
]
dataset_type = 'VOCDataset'
data_root = '/media/ls/disk1/DOTA/VOCdevkit/'
train_dataloader = dict(
batch_size=2,
num_workers=1,
sampler=dict(type='InfiniteSampler', shuffle=True),
dataset=dict(
delete=True,
type='VOCDataset',
data_root=data_root,
ann_file='VOC2007/ImageSets/Main/trainval.txt',
data_prefix=dict(sub_data_root='VOC2007/'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
pipeline=train_pipeline,
))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline,data_root=data_root,))
test_dataloader = val_dataloader
train_cfg = dict(delete=True, type='EpochBasedTrainLoop', max_epochs=24, val_interval=4)
#max_iter = 824
param_scheduler = [
dict(
type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
end=1000),
dict(
type='MultiStepLR',
begin=0,
end=24,
by_epoch=True,
milestones=[16, 22],
gamma=0.1)
]
val_evaluator = dict(type='VOCMetric', metric='mAP', eval_mode='11points')
test_evaluator = val_evaluator
default_hooks = dict(checkpoint=dict(by_epoch=True, interval=4))
log_processor = dict(by_epoch=True)
custom_imports = dict(
imports=['mmselfsup.evaluation.functional.res_layer_extra_norm'],
allow_failed_imports=False)
"
However, the training process stays stuck in epoch 1 the whole time and never moves on to the next epoch, as the log below shows.
The log keeps printing lines like "mmengine - INFO - Epoch(train) [1][2400/824]": the iteration counter "2400" is already far past the 824 iterations that make up one epoch, so the run should long since have moved on to "Epoch(train) [2]".
I went through the config file, but I couldn't find what causes this.
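The only explanation I can think of so far (just a guess on my side, not verified): the base `ms-18k` config uses an iteration-based schedule and keeps `InfiniteSampler` in `train_dataloader`, while I switched `train_cfg` to `EpochBasedTrainLoop`. If the infinite sampler never lets the dataloader run out, the epoch-based loop would never see the end of epoch 1, which would match the counter running past 824. The change I would try next (untested):

```python
# Untested guess: use an epoch-style sampler instead of InfiniteSampler so
# that EpochBasedTrainLoop can actually reach the end of an epoch.
# InfiniteSampler never exhausts the dataloader, which suits the base
# config's iteration-based 18k schedule but not an epoch-based training loop.
train_dataloader = dict(
    sampler=dict(type='DefaultSampler', shuffle=True))
```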
May I get some advice? Thanks in advance.