Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification on use of weight argument in as_epidist_marginal_model #512

Open
athowes opened this issue Jan 28, 2025 · 4 comments · May be fixed by #515
Open

Clarification on use of weight argument in as_epidist_marginal_model #512

athowes opened this issue Jan 28, 2025 · 4 comments · May be fixed by #515
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@athowes
Copy link
Collaborator

athowes commented Jan 28, 2025

PR #509 adds support for users providing a column which contains counts / weights. Docs are:

A column name to use for weighting the data in the likelihood. Default is NULL. Internally this is used to define the 'n' column of the returned object.

Compatible with epidist_linelist_data (which doesn't naturally support weighted linelist / aggregate, hence I suppose #508).

Some notes:

  • We had back and forth on design suggesting that aggregation needed to be internal to the marginal model. I guess that isn't a blocker because aggregation still occurs in the marginal model, but there is no harm in having this additional aggregatation. Essentially it's the user saying "all this data is the same" which can never be non-optimal (unless they are wrong). This then gets further aggregated up if it's possible to.
  • Likely want examples demonstrating this feature e.g. in vignette

So I guess scope of this issue could add docs to note that the marginal model will perform further aggregation if it is possible to given the formula etc. that you specify. Or that could be best left to longer form vignette.

@athowes
Copy link
Collaborator Author

athowes commented Jan 28, 2025

I think for the docs rather than only refer to "weighting" I might mention that this would usually be counts of a particular linelist item (i.e. connecting it up to the data rather than abstractly it's a weighted likelihood -- I'm unsure under what circumstance someone would be weighting in a way unconnected to counts of an observation).

@seabbs seabbs added the documentation Improvements or additions to documentation label Jan 28, 2025
@kgostic
Copy link
Collaborator

kgostic commented Jan 29, 2025

I agree the documentation isn't super clear, but I think it can be polished up with some edits to the details, and an example. I don't know that a full vignette is needed.

@athowes
Copy link
Collaborator Author

athowes commented Jan 29, 2025

Agree that a full vignette is not needed for this alone. That said, I do think we should have a vignette / paper about the marginal model (and its differences as compared with the latent model) and it'd naturally be included there about how to use this argument.

@kgostic
Copy link
Collaborator

kgostic commented Jan 29, 2025

I'm going to take a stab at a good-enough fix now.

@kgostic kgostic self-assigned this Jan 29, 2025
@kgostic kgostic linked a pull request Jan 29, 2025 that will close this issue
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants