[bugfix] PRIME: fix filter_overlong_prompts & incorrect padding side & use xformers (#570)
### Description

- Fix the `filter_overlong_prompts` setting in PRIME.
- Fix the incorrect padding side for Qwen in PRIME. When I use the PRIME recipe to train Qwen-series models, I get: "*ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to call tokenizer.padding_side = 'left' before tokenizing the input.*" So I set `use_cache = False` when calling the model to compute the output logits (see the first sketch below).
- Fix a CUDA error with vLLM v0.6.3. When running PRIME, I sometimes hit "*CUDA error: an illegal memory access was encountered*". Following https://github.com/vllm-project/vllm/issues/10389, I set `VLLM_ATTENTION_BACKEND=XFORMERS` (see the second sketch below).
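A minimal sketch of the `use_cache` fix, assuming a Hugging Face `AutoModelForCausalLM`; the `compute_logits` wrapper and tensor names are illustrative, not the actual PRIME code. Since PRIME only scores sequences (no generation), the KV cache is unnecessary, and disabling it skips the Flash Attention check that rejects right-padded batches for Qwen2:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def compute_logits(model, input_ids, attention_mask):
    """Score right-padded sequences for log-prob computation.

    Passing use_cache=False avoids the Flash Attention batched-generation
    path that raises the padding_side='right' ValueError for Qwen2; for
    pure scoring (no decoding) the KV cache is not needed anyway.
    """
    with torch.no_grad():
        outputs = model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            use_cache=False,  # key change: skip the cache and the padding-side check
        )
    return outputs.logits
```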
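And a sketch of the vLLM workaround; `VLLM_ATTENTION_BACKEND` is vLLM's own environment variable, but the surrounding script and model name are illustrative. The variable must be set before the vLLM engine is created:

```python
import os

# Must be set before vLLM initializes its engine: forces the xFormers
# attention backend instead of FlashAttention, which triggered
# "CUDA error: an illegal memory access was encountered" on v0.6.3
# (see https://github.com/vllm-project/vllm/issues/10389).
os.environ["VLLM_ATTENTION_BACKEND"] = "XFORMERS"

from vllm import LLM, SamplingParams  # import after setting the env var

llm = LLM(model="Qwen/Qwen2-7B-Instruct")  # model name is illustrative
params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, world"], params)
```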