Move data to wiki data #49

Open
tobiasdiez opened this issue Dec 19, 2019 · 12 comments
@tobiasdiez
Member

Wiki data also contains quite a few abbreviations:
https://query.wikidata.org/embed.html#SELECT%20DISTINCT%20%3Fname%20%3FISO_4_abbreviation%20WHERE%20%7B%0A%20%20%3Fjournal%20wdt%3AP31%20wd%3AQ5633421%3B%0A%20%20%20%20%20%20%20%20%20%20%20wdt%3AP1160%20%3FISO_4_abbreviation.%0A%20%20%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22.%0A%20%20%20%20%3Fjournal%20rdfs%3Alabel%20%3Fname.%0A%20%20%7D%0A%7D

query

SELECT DISTINCT ?name ?ISO_4_abbreviation WHERE {
  ?journal wdt:P31 wd:Q5633421;
           wdt:P1160 ?ISO_4_abbreviation.
  
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?journal rdfs:label ?name.
  }
}
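
For anyone who wants to run the query above outside the browser, here is a minimal Python sketch. It assumes the public Wikidata SPARQL endpoint at `https://query.wikidata.org/sparql` and its JSON result format; the function names and the `User-Agent` string are made up for illustration and are not part of any JabRef script.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# Same query as posted in this issue.
QUERY = """
SELECT DISTINCT ?name ?ISO_4_abbreviation WHERE {
  ?journal wdt:P31 wd:Q5633421;
           wdt:P1160 ?ISO_4_abbreviation.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?journal rdfs:label ?name.
  }
}
"""


def build_url(query: str) -> str:
    """Return the GET URL that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def parse_bindings(payload: dict) -> list:
    """Extract (journal name, abbreviation) pairs from a SPARQL JSON result."""
    rows = []
    for binding in payload["results"]["bindings"]:
        name = binding.get("name", {}).get("value")
        abbreviation = binding.get("ISO_4_abbreviation", {}).get("value")
        # Skip rows where either variable is unbound.
        if name and abbreviation:
            rows.append((name, abbreviation))
    return rows


def fetch_abbreviations(query: str = QUERY) -> list:
    """Run the query over HTTP (needs network access) and parse the result."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical identifier; pick one that describes your own script.
        headers={"User-Agent": "abbrv-wikidata-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_abbreviations()` should return a list of `(name, abbreviation)` tuples. The parsing is kept separate from the HTTP call so it can be tried on a saved response without hitting the endpoint.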
@cmgoodall

Hi, I had a couple of questions regarding this issue. Are you able to further explain what is meant by 'Add abbreviations from wiki data'? I'm hoping to better understand what is required before taking this issue

@tobiasdiez tobiasdiez changed the title Add abbreviations from wiki data Move data to wiki data Oct 16, 2023
@tobiasdiez
Member Author

After thinking about this more, I'm now of the opinion that we should just retire our custom abbreviation collection and migrate to wiki data. That way, we can profit from other people's work and, conversely, give back to a larger community. So in particular:

@JabRef/developers any opinions?

@koppor
Member

koppor commented Oct 20, 2023

In general, this is a good idea.

  • How much coverage does WikiData have currently in comparison to our list? - In Python, this should be easily doable IMHO.

Does WikiData distinguish between data sources? We distinguish between IEEE, MathSciNet, and others (https://github.com/JabRef/abbrv.jabref.org/blob/main/journals/README.md). I am not sure whether there are intersecting abbreviations and whether users want to choose their list (we had that a few years ago).

We only hold the copyright for some of the data. Most of the data comes from external sources. For instance, we update MathSciNet automatically: https://github.com/JabRef/abbrv.jabref.org/blob/main/scripts/update_mathscinet.py

  • In case WikiData has 80% coverage, I opt for just closing this repository. If WikiData has less coverage, this needs more thought.

The process would be:

  1. Thoroughly identify each data source
  2. Contact the data source and ask them to add their data to WikiData.
  3. Keep working to ensure that the external sources stay available in WikiData.
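
The coverage question raised above could be answered with a rough sketch like the following. It assumes each of our lists is a CSV file with the full journal name in the first column, and that the WikiData names have already been collected into a set; the function names and the normalization rules are illustrative assumptions, not an existing script in this repository. It matches on journal names only and does not check whether the abbreviations agree.

```python
import csv
import io
import unicodedata


def normalize(name: str) -> str:
    """Case-fold, drop accents, and collapse whitespace so near-identical titles match."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())


def read_journal_names(csv_text: str, delimiter: str = ",") -> set:
    """Collect normalized journal names from the first column of a CSV list."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return {normalize(row[0]) for row in reader if row and row[0].strip()}


def coverage(our_names: set, wikidata_names: set) -> float:
    """Fraction of our journal names that also appear in the WikiData set."""
    if not our_names:
        return 0.0
    return len(our_names & wikidata_names) / len(our_names)


# Tiny made-up sample; real input would be one of the journal list files.
sample = "Acta Physica,Acta Phys.\nRevue Économique,Rev. Écon.\nNature,Nature\nScience,Science\n"
ours = read_journal_names(sample)
theirs = {normalize(n) for n in ["acta  physica", "Revue Economique", "Nature", "Cell"]}
print(f"coverage: {coverage(ours, theirs):.0%}")  # prints "coverage: 75%"
```

A result at or above 0.8 would correspond to the 80% threshold mentioned above. Stricter or looser normalization will change the number, so it is worth reporting which rules were used.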

@JasonXuDeveloper

JasonXuDeveloper commented Oct 9, 2024

@Siedlerchr hello, I would like to work on this task; one thing to ask is if we have a Wikimedia account (so we can push data to wiki data and download it from there in the main repo).

Also, what data specifically needs to be pushed to Wikimedia?

P.S. I am taking this issue as a university project. Could you assign it to me? Thanks

@calixtus
Member

calixtus commented Oct 9, 2024

This issue is free to take, but it requires some work before you start hacking the jabref sources.

Please mind the last comment of @koppor. The first step is to compare the existing coverage of wiki data and our repository. Then ask the sources whether they want to add their data to wiki data, or do it yourself (be aware of possible copyright issues!). Finally, modify our abbreviation list generator to import wiki data sources. Start with step 1.

Happy coding.

@tobiasdiez
Member Author

Alternatively, just put all our manual data in wikidata (leave the auto-generated ones for now), and then add an automatic pull to get all data from wikidata. Afterwards, investigate how to handle the automatically generated data from mathscinet etc.

@calixtus
Member

calixtus commented Oct 9, 2024

Do we have any manually created data at all?

@tobiasdiez
Member Author

Only 4 of the lists are maintained automatically: https://github.com/JabRef/abbrv.jabref.org/blob/main/.github/workflows/refresh-journal-lists.yml

@koppor
Member

koppor commented Oct 9, 2024

> Only 4 of the lists are maintained automatically: https://github.com/JabRef/abbrv.jabref.org/blob/main/.github/workflows/refresh-journal-lists.yml

To contributors: First update https://github.com/JabRef/abbrv.jabref.org/blob/main/journals/README.md with a hint of whether this was manually created externally or via GitHub actions. If via GitHub actions, add a link to the workflow. You need to add a new column for that!

@koppor
Member

koppor commented Oct 13, 2024

@JasonXuDeveloper Note that this issue also involves contacting contributors and making sure that the licenses are followed. See #49 (comment) for some proposals for the steps.

@koppor
Member

koppor commented Oct 13, 2024

@JasonXuDeveloper You can also start implementing an importer for the WikiData abbreviations. Write a Python program that creates a .csv file.
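
A minimal sketch of the CSV-writing half of such an importer might look like this. It assumes the abbreviations are already available as `(name, abbreviation)` pairs and that the target format is two comma-separated columns; the function name and the sorting and de-duplication choices are illustrative, not a requirement from this repository.

```python
import csv
import io


def write_abbreviation_csv(rows, out, delimiter=","):
    """Write unique (name, abbreviation) pairs to `out`, sorted by journal name.

    Returns the number of rows written.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    seen = set()
    written = 0
    for name, abbreviation in sorted(rows, key=lambda r: (r[0].casefold(), r[1])):
        if (name, abbreviation) in seen:
            continue  # drop exact duplicates
        seen.add((name, abbreviation))
        writer.writerow([name, abbreviation])
        written += 1
    return written


# Example with made-up rows; in practice they would come from the WikiData query.
buffer = io.StringIO()
count = write_abbreviation_csv(
    [("Nature", "Nature"), ("Acta Physica, Series A", "Acta Phys. A"), ("Nature", "Nature")],
    buffer,
)
print(count)  # prints 2
print(buffer.getvalue(), end="")
```

Using the `csv` module rather than joining strings by hand means journal names that contain the delimiter get quoted automatically. To write a real file, open it with `open(path, "w", newline="", encoding="utf-8")` and pass the file object as `out`.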

All in all, we cannot push all abbreviations to WikiData, only the ones we created. For instance, the ones downloaded from MathSciNet cannot be published to WikiData.

After JabRef/jabref#10557 is solved, it should be easy for users to work with abbreviation lists.

@JasonXuDeveloper

@koppor Thanks for your information, I will take a look at this approximately next week

@koppor koppor moved this from Free to take to Assigned in Candidates for University Projects Oct 17, 2024