We have supported this feature for legacy models and qwen2 since [PR #92](https://github.com/alibaba/ChatLearn/pull/92), but models in the mcore format might break.