[perf] feat: support meta device init and parallel load for fsdp (#123)

This PR supports:

- meta device init (which keeps the shared parameters)
- parallel pre-trained weight init for FSDP from huggingface checkpoint

---------

Co-authored-by: zhiqi.0 <zhiqi.0@bytedance.com>