I tried to use DCP + FSDP for training, but I got an error when calling optimizer.step(). I found the relevant unit test in PyTorch: test_e2e_save_and_load.py ...
from torch.testing._internal.common_utils import TEST_WITH_ASAN
from torch.testing._internal.inductor_utils import GPU_TYPE, RUN_CPU, RUN_GPU

# Make the helper files ...