Removal of duplicate names in contributors list #1421

Open
lindaokorie27 opened this issue Apr 5, 2021 · 8 comments · May be fixed by #1546

Comments

@lindaokorie27

Some people appear twice in the contributors list, usually once under their full name and again elsewhere in the list under their GitHub username. To show a more accurate list of contributors, the duplicates should be tracked down and removed.

@somya104
Contributor

somya104 commented Apr 5, 2021

Hey @LindieK ! I am an Outreachy Applicant and would like to work on this issue.

@lindaokorie27
Author

Hi @somya104, I'm an Outreachy applicant too, and I'm currently working on this issue. You can look through the site for things that need improvement and create an issue suggesting the improvement, provided no one else has raised it before.

@somya104
Contributor

somya104 commented Apr 5, 2021

Okay, thank you so much!

@patricoferris
Contributor

@LindieK there is very little we can actually do about this. The only solution afaict is to revert to the original command, which only displayed authors if their author name was of the form <FIRSTNAME> <LASTNAME>. This has the benefit of removing these duplicates, but the downside of not displaying those who use their username. What do you think?

@lindaokorie27
Author

@patricoferris I see what you mean. I looked through the commit list and noticed that, besides people who did not add a last name, some use multiple accounts to contribute, and some authored changes but did not commit them themselves. In those cases, the list shows their GitHub username instead of their first and last name.

Would it be possible to use the author's email as a test to see if it refers to the same person? This would reduce the duplicates to those who use multiple accounts to contribute.

@patricoferris
Contributor

@LindieK that sounds like a good idea and it might work (I would still only render one of the names used) -- give it a go and see if we cut down on the duplications :))

@lindaokorie27
Author

@patricoferris I tried running git log --format="%aE %aN" | sort | uniq to see how much the duplicates would be cut down. I noticed a few things:

  1. Because some commits were made by people other than the authors of the changes, the email listed for the author is GitHub's auto-generated address.
  2. The uniq command compares whole lines, names included, so some results share the same email address but have slightly different names (sometimes the difference is a single letter).

Here is a screenshot of the results, with two different instances of the duplicates circled:

[Screenshot: contributors]

I only saw a meaningful reduction when I left the author name out of the git log call.
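One way to compare on the email alone while still rendering a single name per person is to filter on the first field instead of the whole line. This is only a sketch: the function name dedupe_by_email is made up for illustration, and it assumes the "email name" layout produced by the format string above, with no spaces inside the address.

```shell
# Sketch: keep one line per author email.
# Reads "email name..." lines on stdin and prints the first line seen
# for each address (compared case-insensitively), dropping later ones.
dedupe_by_email() {
  awk '!seen[tolower($1)]++'
}

# Possible usage inside a repository:
#   git log --format="%aE %aN" | dedupe_by_email | sort
```

The name that survives is whichever one git log prints first for that address, so this does not pick a preferred spelling, and it does nothing for people who contribute from several addresses.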

@lindaokorie27
Author

@patricoferris I read an article on using a .mailmap file to alias authors in Git. Can we do that here?
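For reference, a rough sketch of what such a file could look like. Every name and address below is a placeholder rather than a real contributor, and the helper write_example_mailmap is hypothetical; the real entries would have to come from the duplicates found in the log.

```shell
# Sketch: write an example .mailmap (all entries are placeholders).
# Usage: write_example_mailmap [path]   (defaults to ./.mailmap)
write_example_mailmap() {
  cat > "${1:-.mailmap}" <<'EOF'
# Canonical Name <canonical@email> Name-as-committed <email-as-committed>
Jane Doe <jane@example.org> janedoe <12345+janedoe@users.noreply.github.com>
# Canonical Name <email-as-committed> fixes only the name for that address
Jane Doe <jane@example.org>
EOF
}
```

The %aN and %aE placeholders in the command I ran above are documented as respecting .mailmap, so with a file like this at the repository root the same git log call should already show the canonical name and email.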

lindaokorie27 linked a pull request Apr 20, 2021 that will close this issue