Question about DynamiCrafter #3

Open
Charles-ux-bit opened this issue Dec 28, 2024 · 0 comments


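Hello authors, and thank you very much for open-sourcing Divot. I have a question about DynamiCrafter.

The paper says the learnable tokens are learned with the help of a diffusion model, and the diffusion-related class in the released code is DynamiCrafter. However, as far as I can see, DynamiCrafter is only referenced by two config files, Divot_detokenizer_stage1.yaml and Divot_detokenizer_stage2.yaml, and those two configs are only used by the eval scripts. Meanwhile, in the ContinuousLVLM_Video_Comp_Gen class (src/models_clm/models.py, line 208), there is this line: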

```python
input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```

It appears in this context:

```python
if has_image_input:
    video_embeds_comp = video_embeds.reshape(bz, -1, video_embeds.shape[-1])
    video_embeds_in = self.input_resampler(video_embeds_comp)
    video_embeds_in = video_embeds_in.reshape(bz, num_clips, -1, video_embeds_in.shape[-1])
    input_embeds[ids_cmp_mask] = video_embeds_in[embeds_cmp_mask].reshape(-1, video_embeds_in.shape[-1])
elif not self.freeze_input_resampler:
    video_embeds_comp_fake = torch.randn(bz, self.input_resampler.num_queries, self.input_resampler.input_dim).to(input_embeds.device, dtype=input_embeds.dtype)
    video_embeds_in_fake = self.input_resampler(video_embeds_comp_fake)
    input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
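To check my reading of the `0.0 *` term, here is a minimal standalone sketch of what it does numerically. The `nn.Linear` resampler, the tensor shapes, and the out-of-place `torch.cat` are hypothetical stand-ins of mine, not the Divot code. My guess is that the pattern exists to keep the resampler in the autograd graph on batches without video input (for example, to avoid unused-parameter errors in distributed training), but that intent is an assumption and is not confirmed by the repository.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the input resampler (the real module differs).
resampler = nn.Linear(8, 4)
num_queries = 3

# Stand-in for text-only input embeddings: (batch, seq_len, dim).
input_embeds = torch.randn(2, 5, 4, requires_grad=True)
before = input_embeds.detach().clone()

# Random fake features go through the resampler and are scaled by 0.0.
fake_features = torch.randn(2, num_queries, 8)
fake_out = resampler(fake_features)

# Out-of-place equivalent of the quoted in-place update.
patched = torch.cat(
    [input_embeds[:, :num_queries] + 0.0 * fake_out,
     input_embeds[:, num_queries:]],
    dim=1,
)

# The embeddings are numerically unchanged by the added term.
values_unchanged = torch.equal(patched, before)

patched.sum().backward()

# The resampler still receives a gradient, but that gradient is all zeros.
grad_is_defined = resampler.weight.grad is not None
grad_is_zero = bool(torch.all(resampler.weight.grad == 0))

print(values_unchanged, grad_is_defined, grad_is_zero)
```

If this reading is right, the `elif` branch never updates the resampler in a meaningful way, and the resampler only receives real gradients on batches where `has_image_input` is true.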

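So my question is: do the training scripts not include any code for training or fine-tuning the tokenizer? Do they only further fine-tune the LLM on top of the existing tokenizer and detokenizer?

Many thanks.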
