I hit an error when running Train (Step 1) in Colab #83

Open
@tengfei86

#@title Train (Step 1)

GPUS=0
NAME="scene_0"
EXP_NAME="base"
ROOT_DIRECTORY=f"all_sequences/{NAME}/{NAME}"
MODEL_SAVE_PATH=f"ckpts/all_sequences/{NAME}"
LOG_SAVE_PATH=f"logs/test_all_sequences/{NAME}"
WEIGHT_PATH=f"ckpts/all_sequences/{NAME}/{EXP_NAME}/{NAME}.ckpt"
CONFIG_DIRECTORY=f"configs/{NAME}/{EXP_NAME}.yaml"
MASK_DIRECTORY=f"all_sequences/{NAME}/{NAME}_masks_0 all_sequences/{NAME}/{NAME}_masks_1"
FLOW_DIRECTORY=f"all_sequences/{NAME}/{NAME}_flow"

!python train.py --root_dir $ROOT_DIRECTORY \
                 --model_save_path $MODEL_SAVE_PATH \
                 --log_save_path $LOG_SAVE_PATH \
                 --gpus $GPUS \
                 --encode_w --annealed \
                 --config $CONFIG_DIRECTORY \
                 --exp_name $EXP_NAME \
                 --mask_dir $MASK_DIRECTORY \
                 --flow_dir $FLOW_DIRECTORY

'''
2023-12-27 09:43:01.998540: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-27 09:43:01.998604: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-27 09:43:02.000020: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-12-27 09:43:03.068253: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/content/CoDeF/train.py", line 557, in <module>
    main(hparams)
  File "/content/CoDeF/train.py", line 552, in main
    trainer.fit(system, ckpt_path=hparams.ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 102, in launch
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 950, in _run
    call._call_setup_hook(self)  # allow user to setup lightning_module in accelerator environment
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 94, in _call_setup_hook
    _call_lightning_module_hook(trainer, "setup", stage=fn)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/content/CoDeF/train.py", line 244, in setup
    self.val_dataset = dataset(split='val', **kwargs)
  File "/content/CoDeF/datasets/video_dataset.py", line 33, in __init__
    self.read_meta()
  File "/content/CoDeF/datasets/video_dataset.py", line 99, in read_meta
    input_image = cv2.imread(all_images_path[0])
IndexError: list index out of range
'''
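
The `IndexError` is raised at `all_images_path[0]`, which suggests the list of input frames was empty when `read_meta` ran. One likely cause is that `ROOT_DIRECTORY` does not exist or contains no images. Below is a minimal sketch for checking this from a Colab cell before training. It assumes the frames are stored as `.png`/`.jpg` files directly under `all_sequences/{NAME}/{NAME}`. That is an assumption: how `video_dataset.py` actually collects the frames has not been verified here. `list_frames` and `report` are hypothetical helpers, not part of CoDeF.

```python
from pathlib import Path

# Assumed image extensions; adjust if the dataset uses a different format.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_frames(root_dir):
    """Return the sorted image files directly inside root_dir ([] if it is missing)."""
    root = Path(root_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def report(name="scene_0"):
    """Print whether the directories used in the training cell exist and hold data."""
    base = Path("all_sequences") / name
    frames = list_frames(base / name)
    print(f"ROOT_DIRECTORY {base / name}: {len(frames)} image(s) found")
    # The mask and flow directories are only checked for existence.
    for sub in (f"{name}_masks_0", f"{name}_masks_1", f"{name}_flow"):
        path = base / sub
        status = "exists" if path.is_dir() else "MISSING"
        print(f"{path}: {status}")
    return len(frames)


if __name__ == "__main__":
    # Run from the CoDeF repository root (for example /content/CoDeF in Colab).
    report("scene_0")
```

If the report shows zero images for `ROOT_DIRECTORY`, the frames were probably not extracted or uploaded to the expected location, or the notebook's working directory is not the repository root.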
