Conversation
|
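I'd like to ask: to fine-tune the model on a new voice, should only the spk_emb matrix be trained, or do spk_emb and the GPT-related modules need to be trained together? |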
|
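@gafield-liu The training results are indeed not great; the training parameters probably need tuning. What I have now was written rather casually.
It looks like a module for extracting speech embeddings is missing here. With random initialization, fine-tuning a voice does not give good results~ |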
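For readers following the question above, here is a minimal sketch of what "training only spk_emb" could look like in PyTorch. `TinyTTS`, its `gpt` layer, and the zero-initialized `spk_emb` are hypothetical stand-ins, not ChatTTS classes, and the zero initialization is only a placeholder (the comment above suggests starting from an extracted speech embedding instead). The sketch does not claim which strategy gives better quality; it only shows how to freeze everything except one parameter.

```python
import torch
from torch import nn


class TinyTTS(nn.Module):
    """Hypothetical toy model; NOT the ChatTTS architecture."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.gpt = nn.Linear(dim, dim)                 # stands in for the GPT modules
        self.spk_emb = nn.Parameter(torch.zeros(dim))  # placeholder speaker embedding

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gpt(x + self.spk_emb)


torch.manual_seed(0)
model = TinyTTS()

# Freeze every parameter, then re-enable gradients for the speaker embedding only.
for p in model.parameters():
    p.requires_grad_(False)
model.spk_emb.requires_grad_(True)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

gpt_weight_before = model.gpt.weight.clone()

# One dummy optimization step on random data.
loss = model(torch.randn(4, 8)).pow(2).mean()
loss.backward()
optimizer.step()

print(trainable)  # only the speaker embedding is listed
```

To train the GPT-related modules as well, one would leave `requires_grad` enabled on those parameters and pass them to the optimizer, possibly with a smaller learning rate.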
|
Thank you so much for your hard work and the fine-tuning. I found this project just a day ago, and I’m happy to say I was able to fine-tune without any errors using DVAE and GPTSpeakers. Today I tried the new update (Merge branch '2noise'). Fine-tuning DVAE worked fine, but I got an error when trying to fine-tune GPT. Here’s the error message I get: ChatTTS\utils\finetune\model.py", line 204, in get_hidden_states_and_labels. I really appreciate all your work and would be grateful for any help with this error. Thanks again for your time! |
|
@fumiama Hi, just a status update that I've just got plenty of free time to work on this PR. Will have updates these days. I'll continue working on improving the training performance. |
Appreciated. I will do it on your next push, once you fix the test. |
|
@fumiama The reason for the failure is the test file import What's your suggestion about compatibility? Shall we still support python<3.12 and use Overall, my code requires
Well, if there's nothing MUST require |
|
@fumiama I suggest deprecating support for Python 3.8, which doesn't support native typing. As a reference, PyTorch has required python>=3.9 since 2.5.
Maybe you should use |
|
Will revert to Python 3.8 style later. My current code relies heavily on match,
Thanks for your understanding. Maybe you can split this PR into independent parts and open a separate PR for each part as soon as it is complete. That would avoid the upstream-sync work caused by a long-running modification. |
|
I found that in dvae.py, @torch.inference_mode() is used to decorate the forward() function (line 260). However, forward() is also called during fine-tuning, which leaves the gradient broken when loss.backward() runs. Specifically, when I fine-tune the decoder module, decoder_mel_specs is the hidden mel output, but decoder_mel_specs.requires_grad is False. So I would like to know whether I should remove @torch.inference_mode() from dvae.py or keep it there. |
|
@1803170327 I haven’t worked on this project for a long time, but I remember that this branch does not yet reproduce good performance, so there may well be code issues in it. (I think the maintainers have refactored the code a lot; I don’t know if my branch is still compatible.) I would appreciate it if you could fix it and reproduce some nice results. Feel free to fork and continue the task. |
|
I vaguely remember running into that issue a long time ago, and I did the same thing as you. Otherwise, no gradient is passed backwards.
Thanks for your reply. I will try to remove @torch.inference_mode() and do more experiments. |
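The gradient problem discussed above can be reproduced in isolation. The following is a minimal sketch, assuming only standard PyTorch behavior; `ToyDecoder` and its two methods are hypothetical stand-ins and not the actual code in dvae.py. It shows that a forward pass decorated with `@torch.inference_mode()` returns a tensor with `requires_grad == False`, while the same computation without the decorator stays connected to the autograd graph.

```python
import torch
from torch import nn


class ToyDecoder(nn.Module):
    """Hypothetical stand-in for a decoder; NOT the ChatTTS DVAE."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

    @torch.inference_mode()
    def forward_decorated(self, x: torch.Tensor) -> torch.Tensor:
        # Runs in inference mode: no autograd graph is recorded.
        return self.proj(x)

    def forward_plain(self, x: torch.Tensor) -> torch.Tensor:
        # Same computation, but gradients can flow back to self.proj.
        return self.proj(x)


torch.manual_seed(0)
model = ToyDecoder()
x = torch.randn(2, 4)

frozen = model.forward_decorated(x)
trainable = model.forward_plain(x)

print(frozen.requires_grad, frozen.is_inference())  # detached inference tensor
print(trainable.requires_grad)                       # still part of the graph

# backward() works only on the plain output.
trainable.sum().backward()
has_grad = model.proj.weight.grad is not None

# Calling backward() on the decorated output raises a RuntimeError.
try:
    frozen.sum().backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True
```

One possible design choice, if the decorator is still wanted for inference, is to keep an undecorated forward path for fine-tuning and wrap inference calls in `with torch.inference_mode():` at the call site instead.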
|
Feel free to change the code. Always open to receive PRs. |
Add fine-tuning scripts. The commands are provided at the top of each file.
There are a few items to note:
utils directory or put scripts into examples folder). https://github.com/2noise/ChatTTS/blob/0bef943d192cd1dd4067f83e16a93f19889b9a87/ChatTTS/utils/finetune/dataset.py
cc @fumiama