Group 8 - sentimentanalyzerR #44

Open
12 of 23 tasks
ranjitprakash1986 opened this issue Feb 4, 2023 · 4 comments

Comments

@ranjitprakash1986


name: Submit software for review
about: Use this template to submit software for review


Submitting Author Name: Eric Tsai, Ranjitprakash Sundaramurthi, Ziyi Chen, Tanmay Agarwal
Submitting Author Github Handle: @erictsai1208, @ranjitprakash1986, @zchen156, @tanmayag97
Repository: https://github.com/UBC-MDS/sentimentanalyzerR
Version submitted: 1.1.0
Submission type: Standard
Editor: TBD
Reviewer 1: Roan Raina
Reviewer 2: Jenit Jain
Reviewer 3: Manvir Kohli
Reviewer 4: Crystal Geng


Description

When a survey asks for written comments, it is often tedious to read through every response to extract useful information or just to get a quick summary. This package summarizes responses quickly to give a general idea of the sentiment of the comments. This is useful, for example, when a PR team wants to know the overall sentiment toward a company or when instructors want to know the overall sentiment toward a course. The goal is to provide a quick, easily interpretable summary by combining results from a pre-trained Python natural language processing package with visualizations.

Scope

  • Please indicate which category or categories from our package fit policies this package falls under:

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):
    The package processes text data and converts it into visualizations in the form of a histogram and a wordcloud, along with a Likert scale metric.

  • Who is the target audience and what are scientific applications of this package?
    Professionals and scholars interested in visualization of text data collected in the form of reviews or surveys can benefit from the package. Data analysis and insight is the primary intent behind the package.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
    While there are R packages that serve similar objectives, this package brings together different visualizations tailored specifically for processing review or survey sentiments.

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
    NA

  • Explain reasons for any pkgcheck items which your package is unable to pass.
    NA

Technical checks

Confirm each of the following by checking the box.

This package:

Code of conduct

@brabbit61

brabbit61 commented Feb 7, 2023

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing:

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

An amazing package guys! Great job. Truly simplifies the functionality of assessing text sentiments and is something we can actually use in practice! I do have some comments that I hope will improve the package even further and make it more robust.

  • Since vader is a very active repository whose machine learning models and codebase are constantly updated, you might want to replace the test_that function call in the test-aggregate_sentiment_score.R file. A model update could return a different sentiment score for the test string used by the function, which would break your test cases.
  • It might be more user-friendly if you add an argument to the generate_wordcloud function that specifies for which sentiment to generate a wordcloud (positive, negative or neutral). This will increase the speed of execution of the function and return only the necessary wordclouds.
  • It is unclear what the valence scores actually mean in the aggregate_sentiment_score function.
  • I can see the wordcloud plots being created in the RStudio environment but they are not being returned by the function. Implementing this will allow the user to use the plots however they like in downstream applications.
  • One last thing is to make the package more robust by achieving 100% code coverage through the test cases.
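
The first suggestion above (tests that survive a model update) can be sketched as follows. This is only an illustration in base R, not the package's actual test code: the helper name `check_compound` is hypothetical, and it assumes the score under test is a single compound value on the [-1, 1] scale, with the conventional ±0.05 cut-offs for positive and negative labels. Instead of pinning an exact score, it checks only the range and the expected polarity.

```r
# Hypothetical helper (not part of sentimentanalyzerR): checks a compound
# sentiment score without depending on its exact value.
# Assumption: `score` is a single numeric on the [-1, 1] compound scale.
check_compound <- function(score,
                           expected = c("positive", "negative", "neutral"),
                           threshold = 0.05) {
  expected <- match.arg(expected)
  stopifnot(is.numeric(score), length(score) == 1, !is.na(score))

  # The score must stay inside the valid compound range.
  in_range <- score >= -1 && score <= 1

  # Map the score to a polarity label using the assumed cut-offs.
  label <- if (score >= threshold) {
    "positive"
  } else if (score <= -threshold) {
    "negative"
  } else {
    "neutral"
  }

  in_range && identical(label, expected)
}
```

Inside a test_that block, an assertion such as `expect_true(check_compound(score, "positive"))` would then keep passing as long as a clearly positive test string is still scored as positive, even if the exact number shifts between model versions.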

@manvirsingh96

manvirsingh96 commented Feb 7, 2023

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 1 hour

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

  1. Great job guys! The function is really cool and can be extremely useful if someone wants to do some quick text analysis.
  2. I believe you could explain a bit more about how your package compares to wordcloud and tidytext. Does it combine functionalities of those two packages, or are you building on existing functionality of the packages? This may help users decide which of the three packages (sentimentanalyzerR, wordcloud, or tidytext) will best serve their purpose.
  3. There are no examples in the Readme file on how to use the functions. It would help the user to understand the usage of the function better if you add examples.
  4. The Readme file mentions that your package has a function get_aggregated_sentiment_score, but there is no such function in the package. I believe the function is actually called aggregate_sentiment_score. See the screenshot below.

Screenshot 2023-02-07 132800.

  5. The function generate_wordcloud outputs 3 plots. It could also be beneficial to add a title to each of the wordclouds so that, after indexing the list generated by generate_wordcloud, the user knows for which sentiment the word cloud has been generated (positive, negative or neutral).
  6. The names and GH usernames/emails of the contributors/authors can be added to the Readme file itself so that it is easier for users to find contact information of the authors in case they need help or wish to reach out for any other reason.
  7. A final suggestion would be to try to achieve a code coverage of 100% by adding tests that exercise all the lines in the code.

@THF-d8

THF-d8 commented Feb 8, 2023

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 1 hour

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

  1. Good job for putting together such a useful and versatile package. I really like the fact that you can output a wordcloud from the input text, which is a very helpful way of visualizing all the keywords from the text.
  2. The documentation might be clearer if the functions were described in more detail and a sample usage for each function were listed in the README file or a separate usage file. In addition, a sample plot could be added for each function in the usage section.
  3. It might also be helpful if all the dependencies and their corresponding versions used in this package were specified in the README file.
  4. For function aggregate_sentiment_score, more tests can be added such as testing if the data type of the column is of string type, in addition to checking if the column name is a string. The same can also apply to the function generate_wordcloud.
  5. I was not able to find a link to the vignettes in the README file. It might be a good idea to put a copy of the link so people can have access to the full documentation html.
  6. There might be too many branches in this repo which might be worth deleting, especially those used for only generating the vignettes or minor fixes.
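
The type check suggested in item 4 can be sketched as follows. This is a minimal base R illustration, not the package's actual validation code: the helper name `validate_text_column` and its error messages are hypothetical, and it assumes the functions take a data frame plus a column name.

```r
# Hypothetical helper (not part of sentimentanalyzerR): validates that `col`
# names an existing character column of the data frame `df`.
validate_text_column <- function(df, col) {
  if (!is.data.frame(df)) {
    stop("`df` must be a data frame.", call. = FALSE)
  }
  # The column name itself must be a single string.
  if (!is.character(col) || length(col) != 1) {
    stop("`col` must be a single string.", call. = FALSE)
  }
  if (!col %in% names(df)) {
    stop("Column `", col, "` was not found in `df`.", call. = FALSE)
  }
  # The column contents must be text, not numbers or factors.
  if (!is.character(df[[col]])) {
    stop("Column `", col, "` must be of type character.", call. = FALSE)
  }
  invisible(TRUE)
}
```

Tests for aggregate_sentiment_score and generate_wordcloud could then pass a numeric column and assert that an error is raised, which covers the data-type case in addition to the existing check on the column name.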

@roanraina

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing:

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

  • no CONTRIBUTING file despite the README implying there are contributing guidelines.
  • no URL for bug reports in the DESCRIPTION file
  • might be nice to add function documentation for tests functions
  • I disagree with comments above that mention a lack of examples/function documentation in the README. The vignette is easy to find in the documentation, along with the function docstrings, which are well done and include examples.
  • I disagree that 100% code coverage needs to be achieved, but looking at the code coverage outputs, it seems likert_scale.R and aggregate_sentiment_score.R have the lowest coverage scores overall at 62.5% and 71.43%, respectively. Looking at the reasons, it seems simple tests for edge cases / error cases could easily improve code coverage substantially.
  • I agree with THF-d8, best practice would be to close development branches that are no longer active.
  • Regarding package fit, are there really no packages that visualize sentiment? A quick google search returns this package https://github.com/Lissy93/twitter-sentiment-visualisation
