[ENH] TimeXer model from thuml #1797

Merged 41 commits on Jun 5, 2025

Conversation

PranavBhatP
Contributor

@PranavBhatP PranavBhatP commented Mar 14, 2025

Description

This PR works on #1793 and aims to align and implement the TimeXer model within PTF's design.

Checklist

  • Linked issues (if existing)
  • Amended changelog for large changes (and added myself there as contributor)
  • Added/modified tests
  • Used pre-commit hooks when committing to ensure that code is compliant with hooks. Install hooks with pre-commit install.
    To run hooks independent of commit, execute pre-commit run --all-files

Make sure to have fun coding!

codecov bot commented Mar 16, 2025

Codecov Report

Attention: Patch coverage is 89.83516% with 37 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@7305d46).

Files with missing lines                                  Patch %   Lines
pytorch_forecasting/models/timexer/_timexer.py            82.97%    24 Missing ⚠️
pytorch_forecasting/models/timexer/sub_modules.py         93.78%    10 Missing ⚠️
...ch_forecasting/models/timexer/_timexer_metadata.py     93.54%    2 Missing ⚠️
pytorch_forecasting/tests/_conftest.py                    50.00%    1 Missing ⚠️

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1797   +/-   ##
=======================================
  Coverage        ?   85.71%           
=======================================
  Files           ?       68           
  Lines           ?     6580           
  Branches        ?        0           
=======================================
  Hits            ?     5640           
  Misses          ?      940           
  Partials        ?        0           
Flag Coverage Δ
cpu 85.71% <89.83%> (?)
pytest 85.71% <89.83%> (?)

@PranavBhatP
Contributor Author

PranavBhatP commented Mar 21, 2025

@fkiraly the implementation of TimeXer seems to be working fine with PTF in its current design, and all the functionality highlighted in the thuml paper works here as well. I will proceed with the PTF v2.0 design suggested by @phoeenniixx and hopefully come up with an interface to integrate more of these models.

@PranavBhatP PranavBhatP marked this pull request as ready for review March 22, 2025 19:00
@fkiraly fkiraly moved this from PR in progress to PR under review in Dec 2024 - Mar 2025 mentee projects Mar 24, 2025
@fkiraly
Collaborator

fkiraly commented Apr 4, 2025

FYI @agobbifbk, @benHeid

@agobbifbk

A couple of comments:

  • I would add somewhere in the code that this is a port from THUML (credits?). Maybe it is there and I didn't see it.
  • The architecture seems to be aligned with the one reported by THUML.
  • I have some doubts about line 450 of _timexer.py, prediction = prediction.repeat(1, 1, 1, num_quantiles): it seems that you replicate the prediction instead of increasing the output shape of the architecture at the beginning.
  • I see an unticked box for multi-output prediction. Does this model support multi-output? And multi-output with quantile loss?
  • I don't have enough knowledge to review the connection between data and model.
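
To make the concern about line 450 concrete: repeating a point forecast along the last axis gives every quantile slot the same value, so the predicted interval has zero width. Below is a minimal sketch that uses nested Python lists as a stand-in for tensors. The helper name repeat_last_axis is hypothetical; it only mimics the effect of prediction.repeat(1, 1, 1, num_quantiles) on an input shaped (batch, length, channels, 1) and is not taken from the PR.

```python
def repeat_last_axis(prediction, num_quantiles):
    """Copy the single value on the last axis num_quantiles times.

    prediction is nested lists shaped (batch, length, channels, 1).
    Hypothetical helper, used only to illustrate the effect of repeating.
    """
    return [
        [[channel * num_quantiles for channel in step] for step in sample]
        for sample in prediction
    ]


# One sample, two time steps, one target channel, one value per channel.
point_forecast = [[[[1.5]], [[2.0]]]]
repeated = repeat_last_axis(point_forecast, num_quantiles=3)

# Every quantile slot holds the same number, so lower == median == upper.
interval_widths = [
    max(channel) - min(channel)
    for sample in repeated
    for step in sample
    for channel in step
]
```

Here interval_widths contains only zeros, which is why widening the output head (so that each quantile gets its own learned value) is the alternative raised in the review.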

@PranavBhatP
Contributor Author

Hi @agobbifbk thanks for the review!

I would add somewhere in the code that this is a port from THUML (credits?). Maybe it is there and I didn't see it.

Yes, I think I forgot to credit the original authors. Will do :).

I have some doubts about line 450 of _timexer.py, prediction = prediction.repeat(1, 1, 1, num_quantiles): it seems that you replicate the prediction instead of increasing the output shape of the architecture at the beginning.

I had similar questions about how this model would handle quantile predictions. The original architecture doesn't seem to handle this, so I just decided to patch it up with this line of code (which might be a bad approach). I am not sure what changes I should make to fix this. Could you help me here?

I see an unticked box for multi-output prediction. Does this model support multi-output? And multi-output with quantile loss?

It is multi-output indeed; the _forecast_multi method is native to the tslib package. As for multi-output with quantile loss, as you mentioned in the previous point, there is some difficulty in handling this, and we need to make changes for that.

@agobbifbk

I had similar questions about how this model would handle quantile predictions. The original architecture doesn't seem to handle this, so I just decided to patch it up with this line of code (which might be a bad approach). I am not sure what changes I should make to fix this. Could you help me here?

Sure, usually it is sufficient to increase the number of output channels. Suppose the model ends with something of shape BxLxC, where B is the batch size, L is the output length and C is the number of target variables. You have two options. The first is to initialize the model with a different value of C: usually there are 3 quantiles (0.05, 0.5, 0.95), so you force the model to give 3C output channels instead of C. In this case you need to check how the quantile loss is implemented and the shape it expects. The other approach, the one we use in DSIPTS, is that all the models produce an output of shape BxLxCxM, where M is 1 for a standard loss or 3 for quantile loss.

My suggestion is to start from the implementation of the quantile loss and check its definition (in DSIPTS there is a multi-output version of it, which just sums the contribution of each channel) and then play with the output shape of the model!

Let me know if it is sufficient to finish the job :-)
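
As an illustration of the BxLxCxM layout described above, here is a minimal sketch of a multi-output quantile (pinball) loss that sums the contribution of each channel. It is written against plain nested lists rather than tensors, and it is not the DSIPTS or PTF implementation: the function names are hypothetical, and averaging over batch, time steps and quantiles within a channel before summing is an assumption made for this example.

```python
def pinball(y_true, y_pred, q):
    """Pinball (quantile) loss for one observation and one quantile level."""
    error = y_true - y_pred
    return max(q * error, (q - 1) * error)


def multi_output_quantile_loss(target, prediction, quantiles):
    """Average the pinball loss within each channel, then sum over channels.

    target:     nested lists shaped (B, L, C)
    prediction: nested lists shaped (B, L, C, M) with M == len(quantiles)
    """
    n_channels = len(target[0][0])
    total = 0.0
    for c in range(n_channels):
        channel_losses = [
            pinball(target[b][t][c], prediction[b][t][c][m], q)
            for b in range(len(target))
            for t in range(len(target[b]))
            for m, q in enumerate(quantiles)
        ]
        # Assumption: mean within a channel, sum across channels.
        total += sum(channel_losses) / len(channel_losses)
    return total


quantiles = [0.05, 0.5, 0.95]
target = [[[10.0, 0.0]]]  # B=1, L=1, C=2
prediction = [[[[8.0, 10.0, 12.0], [0.0, 0.0, 0.0]]]]  # B=1, L=1, C=2, M=3
loss = multi_output_quantile_loss(target, prediction, quantiles)
```

With a standard (non-quantile) loss the same layout would simply have M equal to 1, which is what keeps the output shape uniform across loss types.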

@PranavBhatP
Contributor Author

Hi @agobbifbk

Does it make sense to let it always be 4-dimensional (and, when quantile loss is not used, just have 1 as the last dimension)? There are a lot of if-else branches in the last forward loop, so it can be difficult to understand where each piece of information is; maybe a comment on the dimension structure would also help to read what is happening.

I have added comments to better explain the shape of the final output tensors. Do let me know if more changes are required.

I think that the frequency parameter should be removed and we need to account for categorical data (in your implementation it seems that the categorical variables are not used).

Since this change would probably affect all the future tslib models that I would add to the library, I think it would be better if we defer these changes to a separate issue (after this PR is merged)?

@agobbifbk

OK, it seems reasonable to close this and open a new one. You wrote:

(batch_size, prediction_length, n_quantiles)

It seems to me that there is a missing dimension (it should be 4, one for the number of output channels), isn't it?

@PranavBhatP
Contributor Author

PranavBhatP commented May 20, 2025

It seems to me that there is a missing dimension (it should be 4, one for the number of output channels), isn't it?

(batch_size, prediction_length, n_quantiles) is the shape of each individual target tensor within the list prediction = [prediction[..., i, :] for i in target_indices], which contains the tensors of all the targets to be predicted. The length of the list is the number of output channels here.

Maybe the language of the comment is confusing.
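
The relationship between the 4-dimensional output and the list of per-target tensors can be sketched as follows. Nested Python lists stand in for tensors here, and both helper names (split_by_target and shape) are hypothetical; the list comprehension only mirrors what prediction[..., i, :] selects for each target index.

```python
def split_by_target(prediction, target_indices):
    """Return one (batch, length, quantiles) nested list per target index.

    prediction is nested lists shaped (batch, length, channels, quantiles).
    """
    return [
        [[step[i] for step in sample] for sample in prediction]
        for i in target_indices
    ]


def shape(nested):
    """Shape of a regular nested list, e.g. [[1, 2], [3, 4]] gives (2, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# batch=2, length=3, channels=2, quantiles=3, filled with zeros.
prediction = [
    [[[0.0] * 3 for _ in range(2)] for _ in range(3)] for _ in range(2)
]
per_target = split_by_target(prediction, target_indices=[0, 1])
```

In this sketch the channel dimension is not lost: it becomes the length of the returned list, while each element keeps the (batch_size, prediction_length, n_quantiles) shape mentioned in the code comment.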

@PranavBhatP
Contributor Author

Hi @fkiraly , this PR is ready to be merged.

@PranavBhatP PranavBhatP requested a review from fkiraly May 22, 2025 11:19
@fkiraly fkiraly changed the title [ENH] Implementing TimeXer model from thuml [ENH] TimeXer model from thuml Jun 4, 2025
Collaborator

@fkiraly fkiraly left a comment

updated from main - it seems like tests are failing.

Possibly due to the recent changes to the metadata layer (datamodule method)?

@fkiraly fkiraly moved this from PR under review to PR in progress in May - Sep 2025 mentee projects Jun 5, 2025
@fkiraly fkiraly moved this from PR in progress to PR under review in May - Sep 2025 mentee projects Jun 5, 2025
@fkiraly fkiraly added the enhancement New feature or request label Jun 5, 2025
@fkiraly fkiraly merged commit 0e1ac53 into sktime:main Jun 5, 2025
31 of 33 checks passed
@github-project-automation github-project-automation bot moved this from PR under review to Done in Dec 2024 - Mar 2025 mentee projects Jun 5, 2025
@github-project-automation github-project-automation bot moved this from PR under review to Done in May - Sep 2025 mentee projects Jun 5, 2025
Labels: enhancement (New feature or request), new network
3 participants