Hello authors, and thank you very much for open-sourcing Divot. I have a question about DynamiCrafter.
The paper says the learnable tokens are trained with the help of a diffusion model, and the diffusion-related class in the released code is DynamiCrafter. However, DynamiCrafter is only referenced by the two config files Divot_detokenizer_stage1.yaml and Divot_detokenizer_stage2.yaml, and those configs are in turn only used by the eval scripts. Meanwhile, the ContinuousLVLM_Video_Comp_Gen class at src/models_clm/models.py:208 contains this line:
```python
input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```

in the following context:

```python
if has_image_input:
    video_embeds_comp = video_embeds.reshape(bz, -1, video_embeds.shape[-1])
    video_embeds_in = self.input_resampler(video_embeds_comp)
    video_embeds_in = video_embeds_in.reshape(bz, num_clips, -1, video_embeds_in.shape[-1])
    input_embeds[ids_cmp_mask] = video_embeds_in[embeds_cmp_mask].reshape(-1, video_embeds_in.shape[-1])
elif not self.freeze_input_resampler:
    video_embeds_comp_fake = torch.randn(bz, self.input_resampler.num_queries, self.input_resampler.input_dim).to(input_embeds.device, dtype=input_embeds.dtype)
    video_embeds_in_fake = self.input_resampler(video_embeds_comp_fake)
    input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
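My reading of that `0.0 * video_embeds_in_fake` branch is that it looks like the common trick for keeping an otherwise-unused module's parameters in the autograd graph (e.g. so DistributedDataParallel doesn't error on parameters that receive no gradient), rather than any actual tokenizer training. A minimal, self-contained sketch of that pattern, with a made-up `resampler` standing in for `self.input_resampler` (an illustration only, not code from this repo):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for self.input_resampler; not Divot code.
resampler = nn.Linear(64, 32)

input_embeds = torch.randn(2, 10, 32, requires_grad=True)
has_image_input = False

if not has_image_input:
    # Push random dummy data through the unused module and add its output
    # scaled by 0.0: input_embeds is numerically unchanged, but the
    # resampler's parameters now participate in backward, so DDP records
    # a (zero) gradient for them instead of flagging unused parameters.
    out_fake = resampler(torch.randn(2, 10, 64))
    input_embeds = input_embeds + 0.0 * out_fake

input_embeds.sum().backward()
print(resampler.weight.grad.abs().max())  # tensor(0.) -- zero, but not None
```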
So my question is: do the training scripts simply not include any code for training or fine-tuning the tokenizer, i.e., is the LLM fine-tuned further on top of an already-trained tokenizer and detokenizer?
Many thanks.