alternatives to doccer? #27

Open
drammock opened this issue Jun 7, 2024 · 4 comments

@drammock

drammock commented Jun 7, 2024

One thing I hoped to talk about at the summit but didn't manage to was doccer (SciPy's internal tool for docstring deduplication). MNE-Python adopted/adapted doccer many years ago, and it helped us find and fix many outdated/inaccurate docstrings.

The problem we're facing is that

  1. the docstrings can't be easily read in the source code.
  2. the docstrings are only filled in when the package is imported, so static analyzers like pyright never see the filled-in text. As a result, the hover, tooltip, and tab-completion popups in VS Code show the same cryptic docstring placeholders as the source files, instead of the filled-in parameter descriptions.

Problem 1 alone wouldn't be so bad (arguably an advantage, as it reduces scrolling past screens and screens of docstring between snatches of actual code), but combined with problem 2 it has left some of our devs in a perpetually frustrated state.
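For readers who haven't used this kind of tool, here is a minimal sketch of the general technique (placeholder substitution at import time). It is not doccer's actual API; `SHARED`, `fill_doc`, and `plot` are hypothetical names, and the parameter text is made up for illustration.

```python
# Hypothetical table of shared snippets; wording is illustrative only.
# The continuation line is pre-indented so it lines up inside a docstring
# (real tools typically handle indentation for you).
SHARED = {
    "picks": "picks : list of int\n        Channels to include.",
}


def fill_doc(func):
    """Fill %(name)s placeholders in a docstring when the module is imported."""
    if func.__doc__:  # docstrings are None under ``python -OO``
        func.__doc__ = func.__doc__ % SHARED
    return func


@fill_doc
def plot(picks=None):
    """Plot the data.

    Parameters
    ----------
    %(picks)s
    """


# The source file (and anything that reads it statically) only contains
# "%(picks)s"; the expanded text exists only on the imported object.
print(plot.__doc__)
```

Both problems follow directly from this design: the decorator runs at import time, so a text editor or a static analyzer only ever sees the placeholder.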

My questions are:

  1. have SciPy devs found good workarounds to the problems I mention above?
    • One solution I already know is "have an ipython terminal open in your IDE, and if you need to read a docstring, use ? (like mne.what.ever?)" but I'm interested in other approaches
  2. how are other packages besides SciPy and MNE dealing with param descriptions (or other aspects of docstrings) that are repeated across many parts of their codebase (i.e., how do they keep them in sync)?
@thomasjpfan
Member

In scikit-learn, we have tests to enforce constraints on the docstring parameters:

https://github.com/scikit-learn/scikit-learn/blob/ea1e8c4b216d4b1e21b02bafe75ee1713ad21079/sklearn/tests/test_docstring_parameters.py#L79-L80

I never pushed for a dynamic way to fill in the docstrings, because of the issues with static analyzers.

@drammock
Author

> In scikit-learn, we have tests to enforce constraints on the docstring parameters:

Thanks @thomasjpfan. We have similar tests in MNE-Python (probably we copied ours from sklearn) but like your tests, they don't enforce much about the content of the parameter descriptions, mostly just ensuring that they exist and that they come in the same order as the function signature.
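To make concrete what this kind of test checks, here is a rough sketch, assuming numpydoc-style docstrings. It is not the sklearn or MNE-Python test code; `documented_params`, `check_param_order`, and `scale` are hypothetical names, and the parser only handles the simple `name : type` form.

```python
import inspect


def documented_params(func):
    """Return parameter names listed in a numpydoc-style Parameters section."""
    lines = (inspect.getdoc(func) or "").splitlines()
    names, in_params = [], False
    for i, line in enumerate(lines):
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if underline and set(underline) == {"-"}:
            # A section header: only "Parameters" switches collection on.
            in_params = line.strip() == "Parameters"
            continue
        # Parameter entries are unindented "name : type" lines.
        if in_params and line and not line[0].isspace() and " : " in line:
            names.append(line.split(" : ")[0].strip())
    return names


def check_param_order(func):
    """True if documented parameters match the signature, in the same order."""
    return documented_params(func) == list(inspect.signature(func).parameters)


def scale(data, factor=1.0):
    """Scale the data.

    Parameters
    ----------
    data : list of float
        Values to scale.
    factor : float
        Multiplier.

    Returns
    -------
    list of float
        Scaled values.
    """
    return [x * factor for x in data]


print(check_param_order(scale))
```

Note that nothing here looks at the description text itself, which is exactly the gap mentioned above.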

> I never pushed for a dynamic way to fill in the docstrings, because of the issues with static analyzers.

It seems like the only option that both works with static analysis and preserves consistency across the API would be to go back to hard-coding all our docstrings in the source files, maintaining a mapping somewhere saying "the param description for picks (or axes or whatever) should be identical across this list of functions", and then asserting that in a test.
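A rough sketch of what that mapping plus test might look like, assuming numpydoc-style docstrings written out in full in the source. All names here (`param_descriptions`, `SHARED_PARAMS`, `inconsistent_params`, `plot`, `crop`, `export`) are hypothetical, and the parser is deliberately simplistic.

```python
import inspect


def param_descriptions(func):
    """Map each documented parameter to its full entry (type line + text)."""
    lines = (inspect.getdoc(func) or "").splitlines()
    found, current, in_params = {}, None, False
    for i, line in enumerate(lines):
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if underline and set(underline) == {"-"}:
            # Section header: only collect inside "Parameters".
            in_params = line.strip() == "Parameters"
            current = None
            continue
        if not in_params or set(line.strip()) == {"-"}:
            continue
        if line and not line[0].isspace():
            # Unindented line starts a new "name : type" entry.
            current = line.split(" : ")[0].strip()
            found[current] = [line.strip()]
        elif current is not None and line.strip():
            found[current].append(line.strip())
    return {name: "\n".join(text) for name, text in found.items()}


def inconsistent_params(shared):
    """Return names whose description differs among the listed functions."""
    problems = []
    for name, funcs in shared.items():
        texts = {param_descriptions(f).get(name) for f in funcs}
        if len(texts) > 1:
            problems.append(name)
    return problems


def plot(picks=None):
    """Plot the data.

    Parameters
    ----------
    picks : list of int
        Channels to include.
    """


def crop(picks=None):
    """Crop the data.

    Parameters
    ----------
    picks : list of int
        Channels to include.
    """


def export(picks=None):
    """Export the data.

    Parameters
    ----------
    picks : list of int
        Channels to export.
    """


# The mapping: "picks" must read identically in every listed function.
SHARED_PARAMS = {"picks": [plot, crop]}

print(inconsistent_params(SHARED_PARAMS))
print(inconsistent_params({"picks": [plot, export]}))
```

In a test suite, the assertion would simply be that `inconsistent_params(SHARED_PARAMS)` is empty. The cost is that the mapping itself has to be maintained by hand.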

@cbrnr

cbrnr commented Jun 11, 2024

To be honest, I'd prefer almost anything over doccer expansion at this point. Besides static analyzers, the benefit of being able to just read docstrings in the source cannot be overstated.

@betatim

betatim commented Jun 12, 2024

I am also not a fan of "docstrings with holes in them" aka things that aren't fully readable by opening the source file in a simple text editor.

A random idea: maybe generating the docstring once and writing it to the source file is something to investigate? It could be a tool that you run on a new class/function. For example, to generate a docstring for scipy.foo.bar you'd run python generate_docstring.py scipy.foo.bar and it would spit out a (more or less) ready-to-go string that people can add to the code.

That way the initial docstring would be consistent with the rules.

You could even imagine something like python generate_docstring.py --check scipy.foo.bar, which creates the docstring and diffs it against what is actually present. You could then extend it even further using something like libcst, so that it adds the docstring to the source code file automatically.
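A very rough sketch of what such a script could look like, using only the standard library. No such `generate_docstring.py` exists as far as this thread goes; the `SHARED` table, the function names, and the sample `crop` function are all assumptions, and it only generates the summary line and the Parameters section (so `--check` would report a diff for any docstring with further sections). The libcst step is left out.

```python
import argparse
import difflib
import inspect
import pkgutil

# Hypothetical shared descriptions; a real project would keep its own table.
SHARED = {
    "picks": ("list of int", "Channels to include."),
}


def generate_docstring(func):
    """Build a numpydoc-style docstring skeleton from a function signature."""
    # Reuse the existing summary line if there is one.
    summary = (inspect.getdoc(func) or "One-line summary.").splitlines()[0]
    lines = [summary, "", "Parameters", "----------"]
    for name in inspect.signature(func).parameters:
        type_, desc = SHARED.get(name, ("TODO", "TODO"))
        lines += [f"{name} : {type_}", f"    {desc}"]
    return "\n".join(lines)


def check_docstring(func):
    """Return a unified diff of existing vs. generated docstring ('' if equal)."""
    current = (inspect.getdoc(func) or "").splitlines()
    expected = generate_docstring(func).splitlines()
    diff = difflib.unified_diff(
        current, expected, "current", "generated", lineterm=""
    )
    return "\n".join(diff)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("target", help="dotted path, e.g. package.module.func")
    parser.add_argument("--check", action="store_true")
    args = parser.parse_args(argv)
    func = pkgutil.resolve_name(args.target)  # Python 3.9+
    output = check_docstring(func) if args.check else generate_docstring(func)
    print(output)
    # Non-zero exit status when --check finds a difference.
    return 1 if args.check and output else 0


def crop(picks=None):
    """Crop the data.

    Parameters
    ----------
    picks : list of int
        Channels to keep (outdated wording).
    """


print(check_docstring(crop))
```

Saved as a script, it would be wired up with `raise SystemExit(main())` under the usual `if __name__ == "__main__":` guard, and the `--check` mode could run in CI so that drift shows up as a failing diff rather than a silently inconsistent docstring.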
