Translations #26
Comments
Yes, it will help a lot too. As you are already doing a query, I guess a language picker that changes something in the endpoint the game uses would be enough.
As all the game data is currently loaded from a single file at start, I think the best approach might be to provide language-specific versions of this file.

Approach 0: Instead of having a language-specific file, fetch the data of the Wikidata item each time a card is shown to see if Wikidata (at the moment) contains the desired translations. I'm not sure which endpoints can be accessed directly by the game in the browser, but e.g. these would seem to work: https://www.wikidata.org/wiki/Special:EntityData/Q42.json and https://query.wikidata.org/bigdata/ldf?subject=wd:Q42

Approach 1: For each card (Wikidata item) in the original data file, replace the original label, description and Wikipedia article title (in English) by ones in the desired language from the same Wikidata item. However, they might not be available or they might be unsuitable (contain the answer or have a mistake).

Approach 2: Generate a new set of cards appropriate in the desired language e.g. by tweaking https://github.com/tom-james-watson/wikitrivia-generator.

EDIT: Approach 3: Generate a new set of cards dynamically from frontend by calling a suitable Sparql endpoint such as QLever. https://qlever.cs.uni-freiburg.de/wikidata/
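Approach 1 could be sketched roughly as follows. This is only an illustration, not the generator's actual code: the card field names (`label`, `description`, `wikipedia_title`) and the helper `localize_card` are assumptions, and the sample entity uses hand-written placeholder values in the layout of a single entity from `Special:EntityData` (not data fetched from Wikidata). It also assumes the Wikipedia site key is simply the language code followed by `wiki`, which does not hold for every language.

```python
from typing import Optional


def localize_card(card: dict, entity: dict, lang: str) -> Optional[dict]:
    """Return a copy of `card` with text fields in `lang`, or None if any is missing.

    `entity` is one Wikidata entity object (labels, descriptions, sitelinks).
    The card field names are assumptions about the data file format.
    """
    label = entity.get("labels", {}).get(lang, {}).get("value")
    description = entity.get("descriptions", {}).get(lang, {}).get("value")
    # Assumption: the site key is "<lang>wiki", e.g. "rowiki" for Romanian.
    title = entity.get("sitelinks", {}).get(f"{lang}wiki", {}).get("title")
    if not (label and description and title):
        return None  # translation not available; the card should be skipped
    return {**card, "label": label, "description": description, "wikipedia_title": title}


# Placeholder data for illustration only (not real Wikidata content).
sample_entity = {
    "labels": {"ro": {"language": "ro", "value": "Etichetă exemplu"}},
    "descriptions": {"ro": {"language": "ro", "value": "descriere exemplu"}},
    "sitelinks": {"rowiki": {"site": "rowiki", "title": "Titlu exemplu"}},
}
sample_card = {
    "id": "Q0",
    "label": "Example label",
    "description": "example description",
    "wikipedia_title": "Example title",
    "year": 1900,
}

localized = localize_card(sample_card, sample_entity, "ro")
missing = localize_card(sample_card, sample_entity, "fi")
```

Returning `None` when any field is missing covers the "might not be available" case; detecting unsuitable text (a description that contains the answer, for example) would still need a separate filter.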
I like Approach 2 the most. Approaches 0 and 1 are for me:
I'll try Approach 2 in Romanian to see how it goes. Edit: I take back liking Approach 2 after seeing the 73GB data source. I will still give it a try, but don't have high hopes.
@nicolaes 👍 Perhaps we can find the necessary people who can make this happen together. To make approach 2 easier, I found some initial discussion on reimplementing it based on queries against a Sparql endpoint. In my experience, the official Sparql endpoint does not have the performance needed, but QLever (and/or Virtuoso) might be able to answer all the queries we need. Here's a quick test that finds about 9000 results that might be suitable for Romanian cards: https://qlever.cs.uni-freiburg.de/wikidata/30kMrq?exec=true See also: tom-james-watson/wikitrivia-generator#6 and tom-james-watson/wikitrivia-generator#8
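For readers who don't know SPARQL, a language-parameterised query of this general kind might be built like this. This is a sketch under stated assumptions, not the query behind the link above: the helper names are hypothetical, the choice of `P571` (inception) as the date property is only an example, and the API URL is an assumption about where QLever accepts queries.

```python
from urllib.parse import urlencode

# Assumption: QLever's Wikidata instance accepts queries at this URL.
QLEVER_API = "https://qlever.cs.uni-freiburg.de/api/wikidata"


def build_card_query(lang: str, date_property: str = "P571", limit: int = 100) -> str:
    """Build a SPARQL query for items that have a date, label and description in `lang`."""
    # Reject anything that is not a plain language code or property ID,
    # since both values are interpolated into the query text.
    if not lang.replace("-", "").isalnum():
        raise ValueError(f"unexpected language code: {lang!r}")
    if not date_property.isalnum():
        raise ValueError(f"unexpected property ID: {date_property!r}")
    return "\n".join([
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
        "PREFIX wikibase: <http://wikiba.se/ontology#>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "PREFIX schema: <http://schema.org/>",
        "SELECT ?item ?label ?description ?date ?sitelinks WHERE {",
        f"  ?item wdt:{date_property} ?date .",
        "  ?item wikibase:sitelinks ?sitelinks .",
        "  ?item rdfs:label ?label .",
        "  ?item schema:description ?description .",
        f'  FILTER(LANG(?label) = "{lang}")',
        f'  FILTER(LANG(?description) = "{lang}")',
        "}",
        "ORDER BY DESC(?sitelinks)",
        f"LIMIT {int(limit)}",
    ])


def build_request_url(query: str) -> str:
    """Return a GET URL that sends `query` to the assumed QLever endpoint."""
    return f"{QLEVER_API}?{urlencode({'query': query})}"


romanian_query = build_card_query("ro")
romanian_url = build_request_url(romanian_query)
```

Switching languages then only means passing a different code, e.g. `build_card_query("fi")`.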
@tuukka Thanks for the idea. I appreciate the effort to put together the Romanian version. I don't know SPARQL, so I am playing around with the link you provided.
I gave QLever a few tries, then I dropped it. I made progress on processing the raw data source, and now have ~1000 usable entries for Romanian.
Since I don't have many cards, I will account for the scenario where there aren't any relevant cards to show.
@nicolaes I hadn't thought of the possibility of creating a set of cards dynamically based on a Sparql query. I've added it as "Approach 3" in my original list. At a glance, an advantage would be that the data would update automatically, but a disadvantage would be that two games couldn't be guaranteed to be played with the same set of cards. I have reported the QLever crash to its developers. I hope it's something they can easily fix, as QLever is very performant in general. Do you know why you got just 10% of the number of cards compared to English? For example, is it because the Romanian labels are missing, the filter words match more often, or the view counts are lower?
Update: here's a query for QLever that returns all suitable Wikidata items and their required attributes sorted by sitelinks count (pageviews is not available for queries). You can change |
Some really interesting discussion here! @nicolaes - yeah, unfortunately the wikitrivia-generator process as it stands is slow. I think SPARQL is definitely the future. Also, something like the example @tuukka has worked on shows how easy the SPARQL approach would make it to internationalize. The discussion of how to work out the details of the SPARQL approach should be kept to tom-james-watson/wikitrivia-generator#6.
@tuukka sorry for the late reply, I messed up my notifications. About the low count of Romanian entities: it's because not all pages are translated and I didn't adjust the view count thresholds correctly (e.g. I reduced them by 40x compared to English, while there are 60x fewer Romanian speakers). PS: the top hit from the SPARQL query in Romanian is the wiki page of Russia 🤔
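To make the threshold arithmetic concrete, here is a small sketch. The English threshold below is a hypothetical number chosen only for illustration (the real value is not given in this thread), and scaling purely by speaker count is a simplifying assumption.

```python
def scaled_threshold(english_threshold: float, reduction_factor: float) -> float:
    """Divide the English view count threshold by a reduction factor."""
    return english_threshold / reduction_factor


# Hypothetical English threshold, for illustration only.
ENGLISH_THRESHOLD = 12_000

# Threshold after the 40x reduction mentioned above.
used_threshold = scaled_threshold(ENGLISH_THRESHOLD, 40)
# Threshold if the reduction matched the 60x difference in speakers.
proportional_threshold = scaled_threshold(ENGLISH_THRESHOLD, 60)

# The 40x reduction leaves a bar 1.5 times higher than the proportional one,
# so fewer Romanian pages pass the filter.
ratio = used_threshold / proportional_threshold
```

Whatever the actual English threshold is, the ratio between the two results stays at 60/40 = 1.5.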
Can you support other languages? You can get the correct labels from Wikidata.