Using dist.mode instead of logits.argmax. More compact. #1066

arnaujc91 · 2024-02-29T16:27:36Z

Changed all occurrences where an action is picked deterministically:

from: using the outputs of the actor network.
to: using the mode of the PyTorch distribution.

This was agreed with @MischaPanch.

Please make sure everything is alright.
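
To illustrate why the two approaches select the same action, here is a minimal, dependency-free sketch (not code from this PR). `CategoricalSketch`, `softmax`, and `argmax` are hypothetical helpers and the logits are made up. In PyTorch the corresponding expressions would be `logits.argmax(dim=-1)` and `torch.distributions.Categorical(logits=logits).mode`, assuming a PyTorch version recent enough to provide the `mode` property.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


class CategoricalSketch:
    """Hypothetical stand-in for a categorical distribution built from logits."""

    def __init__(self, logits):
        self.probs = softmax(logits)

    @property
    def mode(self):
        # The mode is the most probable category.
        return argmax(self.probs)


logits = [0.1, 2.0, -1.0]  # made-up actor output for a single state

action_before = argmax(logits)                  # old: argmax of the raw logits
action_after = CategoricalSketch(logits).mode   # new: mode of the distribution
```

Softmax preserves the ordering of the logits, so the most probable category is the one with the largest logit, and both expressions agree as long as the maximum is unique.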

tianshou/policy/modelfree/discrete_sac.py

MischaPanch · 2024-02-29T17:23:31Z

@arnaujc91 a test fails due to cryptic float/rounding errors. Could you round to, say, 5 digits to make sure the test succeeds?
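
The kind of rounding suggested here could look like the following sketch. The values are made up, since the failing test and its numbers are not shown in this thread. `math.isclose` is included as a tolerance-based alternative from the standard library.

```python
import math

expected = 0.3
computed = 0.1 + 0.2  # differs from 0.3 in the last bits of the float

# An exact comparison is fragile under floating-point arithmetic.
exact_match = computed == expected

# Rounding both sides to 5 digits removes the spurious difference.
rounded_match = round(computed, 5) == round(expected, 5)

# Alternative: compare within an absolute tolerance instead of rounding.
close_match = math.isclose(computed, expected, abs_tol=1e-5)
```

Rounding can still disagree when two nearly equal values fall on opposite sides of a rounding boundary, which is one reason a tolerance-based comparison is often preferred in tests.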

arnaujc91 · 2024-03-02T11:20:16Z

> @arnaujc91 a test fails due to cryptic float/rounding errors. Could you round to, say, 5 digits to make sure the test succeeds?

Please check test modifications.

tianshou/policy/modelfree/redq.py

codecov-commenter · 2024-03-02T11:32:17Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.17%. Comparing base (7c970df) to head (a9a664d).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1066      +/-   ##
==========================================
+ Coverage   88.16%   88.17%   +0.01%     
==========================================
  Files         100      100              
  Lines        8176     8172       -4     
==========================================
- Hits         7208     7206       -2     
+ Misses        968      966       -2

| Flag | Coverage Δ |
|---|---|
| unittests | `88.17% <100.00%> (+0.01%)` ⬆️ |

MischaPanch · 2024-03-02T12:25:04Z

@arnaujc91 pls resolve the conflicts with master

arnaujc91 · 2024-03-02T12:29:52Z

> @arnaujc91 pls resolve the conflicts with master

done

changed all the occurrences where an action is selected deterministically

- **from**: using the outputs of the actor network.
- **to**: using the mode of the PyTorch distribution.

---------

Co-authored-by: Arnau Jimenez <[email protected]>

using dist.mode instead of logits.argmax. More compact.

91623cc

MischaPanch reviewed Feb 29, 2024

tianshou/policy/modelfree/discrete_sac.py

Fixed docu and tests.

74e10f2

arnaujc91 commented Mar 2, 2024

tianshou/policy/modelfree/redq.py

Merge branch 'master' into feature/deterministic_eval

e0ad23a

Arnau Jimenez added 2 commits March 2, 2024 14:00

Fixing PEP8 error.

45faec2

Fixing PEP8 error. Removed unused import pytest.

a9a664d

MischaPanch merged commit 1aee41f into thu-ml:master Mar 2, 2024
4 of 5 checks passed
