-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Group 03: nameforme - Py #20
Comments
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
UsabilityReviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 1 Review Comments
Very practical and interesting package! The GitHub repo is well organized and followed best practices. Great work team! |
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
UsabilityReviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 2 hrs Review Comments
find_old_name('M', limit = 5) find_old_name('1990s', 'M', limit = 5) Then I figured out the following is the correct way to use this function Perhaps this can be made clear in the documentation.
|
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
UsabilityReviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: ~ 2 hours Review Comments(Review based on the latest commit on First, congratulations on the releasing of the package. It is a fun, simple (yet cool!) idea to use this dataset to suggest baby names. What I like most is the code is clean and neat. I can see that other reviewers have already given excellent comments over the functionality aspect over the package. I would therefore would like to focus solely on the code-side of things instead. (Some comments are similar to what I have made in the R package, which can be viewed here.) 1. Consider following PEP 8 and other Python styleguides. I know that this is not strictly required, and the linting routine is not even in our CI/CD workflows, but a consistent code style can be much easier for others to contribute and build on. (We are lucky that in Python we have relatively consistent code style, unlike JavaScript.) For example, by using
It might even be better to integrate this into CI/CD to trigger for every PR. 2. Consider DRYing your code, instead of Writing Every Time. An example would be the snippet of loading the CSV dataset: url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv"
raw_df = pd.read_csv(url) This piece of code appears for 4 times in the def load_raw_data():
# You may also do data wrangling here, e.g., dropping unnecessary columns
url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv"
return pd.read_csv(url) While the line count does not seem to decrease much, the real benefit is that when we want to, say, change the upstream dataset which may have a different structure, it would be easier. 3. Consider enhancing your test cases, especially the conditional guards. Looking for 100% line coverage for all projects is impractical, but it is always beneficial to cover edge cases. For example, currently the line coverage report shows that we are missing 7 lines, which are related to error handling cases (e.g., throwing exception when the type is not expected, out of range). It would be beneficial that this kind of behaviors to be documented in form of a working test case. 4. I realized this when I tried out the df = data[data["tp"]==tp]
if sex == "uni":
F = df[df["sex"] == 'F']["name"]
M = df[df["sex"] == "M"]["name"]
uni_df = list(set(F).intersection(set(M)))
if len(uni_df) < 10:
return uni_df
else:
return np.random.choice(uni_df, limit, replace=False).tolist()
else:
df = df[df["sex"] == sex]["name"]
return np.random.choice(df,limit,replace=False).tolist() It could be simplified, rewritten, with an added guard as: df = data[data["tp"] == tp]
if sex.upper() == "M" or sex.upper() == "F":
df = df[df["sex"] == sex.upper()]
elif sex != "uni":
warnings.warn("Invalid sex specified, assuming unisex")
return np.random.choice(df["name"],limit,replace=False).tolist() This can allow the 5. It might be best if the parameter can be unified. For example:
Alternatively, it might even be better to use 6. The said function is defined as
It would be better to use consistent variable naming, with documentation review in place. 7. Consider using Python libraries provide a lot of specific type of exceptions for different occasions. For instance, this snippet from if not (sex == 'F' or sex == 'M' or sex == 'f' or sex == 'm'):
raise Exception("sex should be either 'F'/'f' or 'M'/'m'.") Overall, I enjoy reading the code of your project, and despite hiccups, your documentation is good. Great work! (Reviewed using pyOpenSci review template |
Package ReviewPlease check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
DocumentationThe package includes all the following forms of documentation:
Readme file requirements
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
UsabilityReviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: Review Comments
|
Submitting Author: Daniel Cairns (@DanielCairns), Eyre Hong (@eyrexh), Bruce Wu (@BruceUBC), Zilong Yi (@ZilongYi)
All current maintainers: (Daniel Cairns (@DanielCairns), Eyre Hong (@eyrexh), Bruce Wu (@BruceUBC), Zilong Yi (@ZilongYi))
Package Name: nameforme
One-Line Description of Package: A helper python package that can be used to generate names. This could be used to come up with baby names, character names, pseudonyms, etc.
Repository Link: https://github.com/UBC-MDS/nameforme
Version submitted: v1.1.0
Editor: Daniel Cairns (@DanielCairns), Eyre Hong (@eyrexh), Bruce Wu (@BruceUBC), Zilong Yi (@ZilongYi)
Reviewer: @ranjitprakash1986, @Althrun-sun, @netsgnut, @kellywujy
Archive: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD
Description
A helper python package that can be used to generate names based on the dateset. This could be used to come up with baby names, character names, pseudonyms, etc.
Source Data: Contains baby names born in the United States for each year from 1880 to 2017, and the number of children of each sex given each name. Names must appear at least 5 times in each year to be included. (Source: http://www.ssa.gov/oact/babynames/limits.html)
This package is similar to the existing names package by Trey Hunner (last updated in 2014), however, our uses a more recent dataset (with names up to 2017), and more options for users to customize what type of names to generate, including the ability to generate similar sounding names.
You can check the online documentation of this package on readthedocs.
Scope
For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Who is the target audience and what are scientific applications of this package?
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or
@tag
the editor you contacted:Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication options
JOSS Checks
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.Note: Do not submit your package separately to JOSS
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Code of conduct
Please fill out our survey
submission and improve our peer review process. We will also ask our reviewers
and editors to fill this out.
P.S. *Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.
The text was updated successfully, but these errors were encountered: