Where did the features of the datasets come from? #39
Comments
I read the paper again and noticed that you use video features extracted from an I3D model. Could you please tell me which extractor you used: the two-stream I3D pretrained on Kinetics, or the one pretrained on miniKinetics?
You can check this issue; it may have the answer you want. #34
Hi~ thanks for your reply. I've skimmed your suggestion and noticed that one of the responders posted a repo for a feature extractor. Have you tried it before? If I use a video shot by myself, can I obtain a feature with shape (2048, n)? If not, that's okay, thanks a lot anyway~
Hi. To be honest, I've just recorded the data and am going to try this feature extractor. I'm also concerned about whether the feature extractor repo still works, because the last commit was about 5 years ago (cry). I hope we can keep in touch and compare notes once we have some results.
Fighting~
Wow, I'm also facing this problem now. If either of you has a breakthrough, please come back and share it! Thank you very much!
Hi! Glad to hear that you are trying to do the same thing as me. I just want to share my latest progress with you. I have tried a new repo that also has a function for extracting features. The only difference is that its output dim is [n, 768], so you need to transpose it to satisfy ASFormer's input requirement. If you want to use this method, see ttlmh/Bridge-Prompt#3 for more detail. Hope this helps!
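For anyone following along, here is a minimal sketch of that transpose step (the file name is a placeholder, not a file from the dataset), turning (n, 768) features into the (feature_dim, T) layout the segmentation models load:

```python
import numpy as np

# Placeholder file name: an (n_frames, 768) feature matrix from the extractor
feat = np.load("video_0001.npy")

# Transpose to (768, n_frames) so it matches the (feature_dim, T) .npy layout
feat_t = feat.T.astype(np.float32)
np.save("video_0001_transposed.npy", feat_t)
```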
By the way, how did you transform your output? For example by adding a fully connected layer? But doesn't that mean the model needs to be trained again?
For the first question: to be frank, I didn't try that repo any more. From your description, I think you can try to keep the downsample rate consistent during training so that the dims match; if the video is very long, it definitely needs downsampling to decrease the feature matrix size. For the second question: yes, the model needs to be trained from scratch. I simply modified the action segmentation model's input dim. Alternatively, you could add a projection layer at the front, load the pre-trained parameters for the rest of the model, and fine-tune. Hope to hear about your feedback and success!
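A rough sketch of that projection idea, assuming a PyTorch segmentation model whose first layer expects 2048-dim features (the class and argument names here are hypothetical, not from the released code):

```python
import torch
import torch.nn as nn

class ProjectedSegmenter(nn.Module):
    """Wrap a pre-trained segmenter with a learned 768 -> 2048 projection."""
    def __init__(self, segmenter: nn.Module, in_dim: int = 768, model_dim: int = 2048):
        super().__init__()
        self.proj = nn.Conv1d(in_dim, model_dim, kernel_size=1)  # trained from scratch
        self.segmenter = segmenter  # rest of the model, loaded from a checkpoint

    def forward(self, x):
        # x: (batch, in_dim, T) frame-wise features
        return self.segmenter(self.proj(x))  # (batch, model_dim, T) goes into the segmenter
```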
Hi! I have the same problem.
I'll try to use https://github.com/yiskw713/video_feature_extractor.
Hi, thanks for the suggestion! To be honest, I didn't know about this feature extractor until I read your reply. I've been running experiments for some time with the extractor I mentioned in my previous answer, and I found that the features it extracts are not good enough: the accuracy isn't as high as reported in the paper. I will try your repo, and thank you again.
Thank you for sharing your status!
Hi! I am also working on the feature extraction process and looking forward to your results. If you get any results with this repo, I hope you can share them with us. Thank you very much!
Hi, @Youthfeng123 @habakan @shenjiyuan123. I have also been exploring I3D lately and found that the code from the repos you shared (code1 and code2) works for me. @Youthfeng123, regarding the shape (2048, n): my assumption is that when you input 21 video frames (as mentioned here) with a frame size of (224, 224), i.e., an input of size (-1, 3, 21, 224, 224), you get a shape of (-1, 1024, 2, 1, 1) as the output of the last AvgPool3d.
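My reading of that, as a sketch (this is an assumption about how 2048 arises, not a confirmed pipeline): each stream's (-1, 1024, 2, 1, 1) pooled output is averaged over the remaining temporal dim, and the RGB and flow vectors are concatenated per frame window:

```python
import torch

# Stand-ins for the AvgPool3d outputs of the RGB and flow I3D streams
rgb_out  = torch.randn(1, 1024, 2, 1, 1)
flow_out = torch.randn(1, 1024, 2, 1, 1)

rgb_feat  = rgb_out.mean(dim=2).flatten(1)    # (1, 1024)
flow_feat = flow_out.mean(dim=2).flatten(1)   # (1, 1024)

window_feat = torch.cat([rgb_feat, flow_feat], dim=1)  # (1, 2048) for one frame window
# Sliding this window over the whole video and stacking gives (T, 2048);
# transposing then yields the (2048, T) matrices found in the dataset.
```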
Thanks for your explanation of the tensor's shape, it helps a lot! May I ask which optical flow extractor you used? I used this repo to extract optical flow from videos; since I would like to try more methods, would you please share yours? Thanks a lot! @littlesi789 @shenjiyuan123 @habakan
Hi, @Youthfeng123. In the I3D paper, the authors used TV-L1 to extract optical flow; you can find code implementations from others. Please be cautious with my explanation above: I could not find whether the authors used RGB, optical flow, or both in the MS-TCN paper (please correct me if they did mention it). In the original I3D paper, the two streams are averaged at the final prediction stage, and the predictions are arrays of length 400. So the explanation above for 2048 is my assumption, and unfortunately I cannot find a way right now to reproduce their extracted features.
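In case it helps, a minimal TV-L1 sketch assuming opencv-contrib-python is installed (the factory function has moved between OpenCV versions, so check your build); the clip-to-[-20, 20]-and-rescale step follows the common I3D preprocessing and is not something confirmed in this repo:

```python
import cv2
import numpy as np

# Placeholder frame files; in practice iterate over consecutive video frames.
prev = cv2.imread("frame_0000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)

tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()   # needs opencv-contrib-python
flow = tvl1.calc(prev, curr, None)                # (H, W, 2): x and y displacement

# Common I3D-style preprocessing: truncate to [-20, 20] and rescale to [-1, 1]
flow = np.clip(flow, -20, 20) / 20.0
```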
@littlesi789 @Youthfeng123
May I ask whether any of you managed to extract video features with shape (2048 x T)? I cannot install OpenCV 2.4.13 for the Kinetics feature extraction repo.
@littlesi789 @shenjiyuan123 @bqdeng Do you mind answering me :'<
Hi guys, I am currently facing the same problem. I would like to know if anyone has made this repo work: https://github.com/ahsaniqbal/Kinetics-FeatureExtractor/tree/master. I would be really grateful if you could give some advice.
Hi @KarolyneFarfan, I used another repo to extract features; you can see my project at: https://github.com/XuanHien304/E2E-Action-Segmentation
Hi there, after reading the paper and code, I found that MS-TCN takes video features as input. I then loaded a feature file into a numpy variable and saw that the data is a matrix with shape (2048, n).
Here is my confusion: can the features in the datasets be transformed back into video, or are they features extracted from some other backbone? If so, which extractor was used?
Looking forward to your reply.
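For reference, a quick way to check what the released feature files look like (the path is a placeholder for one of the .npy files in the dataset's features folder):

```python
import numpy as np

# Placeholder path; point it at any .npy file from the features folder.
feat = np.load("features/video_01.npy")
print(feat.shape, feat.dtype)   # expected: (2048, n_frames)
```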