Hello authors, and thank you very much for open-sourcing Divot. I have a question about DynamiCrafter.
The paper says the learnable tokens are trained with the help of a diffusion model, and the diffusion-related class in the released code is DynamiCrafter. However, DynamiCrafter is only referenced by the two config files Divot_detokenizer_stage1.yaml and Divot_detokenizer_stage2.yaml, and those configs are in turn only used by the eval scripts. Meanwhile, the ContinuousLVLM_Video_Comp_Gen class at src/models_clm/models.py:208 contains this line:
```python
input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```

in the following context:

```python
if has_image_input:
    video_embeds_comp = video_embeds.reshape(bz, -1, video_embeds.shape[-1])
    video_embeds_in = self.input_resampler(video_embeds_comp)
    video_embeds_in = video_embeds_in.reshape(bz, num_clips, -1, video_embeds_in.shape[-1])
    input_embeds[ids_cmp_mask] = video_embeds_in[embeds_cmp_mask].reshape(-1, video_embeds_in.shape[-1])
elif not self.freeze_input_resampler:
    video_embeds_comp_fake = torch.randn(bz, self.input_resampler.num_queries, self.input_resampler.input_dim).to(input_embeds.device, dtype=input_embeds.dtype)
    video_embeds_in_fake = self.input_resampler(video_embeds_comp_fake)
    input_embeds[:, :self.input_resampler.num_queries] = input_embeds[:, :self.input_resampler.num_queries] + 0.0 * video_embeds_in_fake
```
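My reading of that `0.0 * video_embeds_in_fake` branch is that it looks like the common trick for keeping an otherwise-unused module's parameters in the autograd graph (e.g. so DistributedDataParallel doesn't error on parameters that receive no gradient), rather than any actual tokenizer training. A minimal, self-contained sketch of that pattern, with a made-up `resampler` standing in for `self.input_resampler` (an illustration only, not code from this repo):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for self.input_resampler; not Divot code.
resampler = nn.Linear(64, 32)

input_embeds = torch.randn(2, 10, 32, requires_grad=True)
has_image_input = False

if not has_image_input:
    # Push random dummy data through the unused module and add its output
    # scaled by 0.0: input_embeds is numerically unchanged, but the
    # resampler's parameters now participate in backward, so DDP records
    # a (zero) gradient for them instead of flagging unused parameters.
    out_fake = resampler(torch.randn(2, 10, 64))
    input_embeds = input_embeds + 0.0 * out_fake

input_embeds.sum().backward()
print(resampler.weight.grad.abs().max())  # tensor(0.) -- zero, but not None
```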
So my question is: do the training scripts simply not include any code for training or fine-tuning the tokenizer, i.e., is the LLM fine-tuned further on top of an already-trained tokenizer and detokenizer?
Many thanks.