Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:05 pm
by Nidan
chl wrote: Fri Nov 08, 2024 1:26 pm
Regarding accented letter search, note that in, e.g., Swedish, the letters å, ä, and ö are not accented versions of a and o, but separate letters, and searching for "a" should not match those letters, or vice versa. Of course I have not tried the new search and am not even using the Swedish localisation, but this is a common error that exists in many other search engines, from what I've noticed.
Likewise in German, in cases where ä, ö, ü, ß aren't supported they're supposed to be written as ae, oe, ue, ss.
--------
@Hrusa: How have you handled the one case (that I'm aware of) where case mapping differs from the "usual"? Turkish i/İ and ı/I instead of i/I.
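To make the two cases above concrete, here is a minimal Python sketch. It is not how Factorio handles this; the function names, the `locale` parameter and the locale codes are assumptions made up for illustration. It shows a lowercase step that special-cases the Turkish dotted/dotless i, and a German helper that produces the ae/oe/ue/ss fallback spelling as an extra search key.

```python
def lower_for_search(text: str, locale: str = "en") -> str:
    """Lowercase text for matching, with the Turkish/Azerbaijani i rule.

    The `locale` argument is a hypothetical parameter for this sketch.
    """
    if locale in ("tr", "az"):
        # In Turkish, capital I (U+0049) pairs with dotless ı (U+0131),
        # and capital İ (U+0130) pairs with ordinary i (U+0069).
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    return text.lower()


# German fallback spellings used when umlauts and ß are unavailable.
GERMAN_FALLBACK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_search_keys(text: str) -> set[str]:
    """Return both the lowercased name and its ae/oe/ue/ss spelling."""
    lowered = text.lower()
    return {lowered, lowered.translate(GERMAN_FALLBACK)}
```

A search would then compare the query against every key, so typing "foerderband" still finds "Förderband".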
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:08 pm
by BattleFluffy
I almost hate to say this, because I've got really used to the name.... but....
In English, we would generally say "Iron Rod", and not "Iron Stick".
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:17 pm
by bithack
For Hebrew search, I would not be surprised if it's already implemented, but the letters צ=ץ, נ=ן, פ=ף, מ=ם, כ=ך should be treated as equivalent, since each pair is a letter and the form it takes at the end of a word.
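A minimal sketch of that folding in Python, assuming the search simply maps each final form to its regular form on both the query and the item name before comparing. The names `FINAL_TO_BASE` and `fold_hebrew_finals` are invented for this example. Code points are written as escapes to avoid bidirectional-text confusion in the source.

```python
# Map each word-final letter form to its regular form.
FINAL_TO_BASE = str.maketrans({
    "\u05DA": "\u05DB",  # final kaf   -> kaf
    "\u05DD": "\u05DE",  # final mem   -> mem
    "\u05DF": "\u05E0",  # final nun   -> nun
    "\u05E3": "\u05E4",  # final pe    -> pe
    "\u05E5": "\u05E6",  # final tsadi -> tsadi
})


def fold_hebrew_finals(text: str) -> str:
    """Replace final-form letters so partial-word queries can match."""
    return text.translate(FINAL_TO_BASE)


def hebrew_contains(query: str, name: str) -> bool:
    """Substring match that ignores the final/regular distinction."""
    return fold_hebrew_finals(query) in fold_hebrew_finals(name)
```

With this, a query that stops in the middle of a word (and so ends in a regular-form letter) still matches a name where that letter would only appear in the middle of the full word, and vice versa.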
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:17 pm
by Brathahn
.
P̷̧̪̯͔̪͈͉̖̲̣͖̐̋̓̓̏̅̀͛̈́̉͋̍̃͐͑͘̕̚l̸̛̛̼̜̖̣̞̻͕̆̋̂̓̉̌̌̀͊͆͘ě̸̡̻̠͙̙̼̹̭̱̘͚̘̰̰͓̠̹̺̥̳̦̟͉̦̱̠̥̟̻̥̱̦͐̒̅͆̔̂̃̈͊͛́̕͜ͅͅͅà̴̛̘̖̹̘͙̖̖͎̋̔̌͜ͅś̸̨̛̘̪͎͎̰̦̼́̍̔͊̊͋̿̿͗̏͋̈́͂̐̒̈́͒̆̽̔̅̋̎͑͝͠͠͝ȩ̶̢̡̥̺̥̭̤͖̣͔̜̪͙̜̤̱̫̤̻̱͇͓̰̝͚̤̖̩̖̙̥̪̀̊͑̈͋̌̃͗̾̈͒̾̈́̌͋̒̂͒͊̏̆̿̕̕͜͜͠͠ͅ ̶̨̛̱̏͆̽͋̊͗̏͛͛̀̾̊̊͆̉̀̈́̈̀̑́͝͝͠ą̷̢̛̛̛̗̫͕̰̱̠̬͉̩̬̘̠̻̙̯͚̜̹͕̟̹͇̝̠͓̗̈́̏̽̂̎͂̇́̉͒̌̃́̏̾͒̑͐̓̄̀̔̏̓́̋͌̑̕̕̚ḑ̷̨̧̡̛̫̝̝̯̙̟̱̙̳̯̫̰̘͉̲̰̻͔̥͉͚̩̖̰̿̈͛̒͛͊͂̔͋͗̀̆͑̃͋̆̑̐̑̊͊̀̅͑̉͊̊̈̇̕̚͜͝͠d̴̢̡̨̡̞̩̠̲͚̳̲̟̬̥͓̻̣͖̼̲̘̦̟̝̲̥̹̖͕̤̺͙̪͛̿͋͑͌̈́͂̈́̋̽͐́͐̈͑̄͘̕̕͘͜͜ ̷̡̡̡̛̛̛̙̤̤̠̟͚̘̮͙͙̟͇̞́̿̌͋̎̍́̄̊̾̈́͗́̈͒̃̓̍͐͛̈́̀͆̈́̚͘͜͝s̵̨̨̳̘̩͕̟̪͔̥͖̳̙̼̞̲̤͉̠͔̦͙̦͎̓͌́̋̅̈́͊̕̕͘ͅu̸̡̨̖͉̫̜̟͔̘̩͙͎̲̠̝̿̌ͅṕ̴͈̮͉͈͑̏̃̏̂̾̐̿̎̌͗͒͛̎̄́̀̔̒͌̕̕̚̚͝ͅp̶̧̧̧̧̧̳̭͇̝͔̬̱̳̗̭͈̣̪̰͔͎̼̼͚̼̼͖͇͈͚̹͛́͐̋̿͐̈͗͒̓͊́̊̏̑͂̌̚͠ǫ̵̧̧͔͉̹̻̠̭̲͙̩͈͓̜̭̼͖̱̘̹͚̙͍̦̣͉͖̺͓̳̇̈́̍͆̅͜͜͠͠ͅͅr̸̡̙͈̯̹̭̩̝̟͂̐̈͆̿̓̏̆́̏̊t̵̛̩͙̗͖̼̱͕͔̯̫̤̜͉͕̦̦̮̣̰͇̙̗̪̜̫̞̳̝̞̖̦̱̻̝̗͖̽̈́̀̄̃̿̀́̏̈́͛̓͂͌̚̕͜͝ ̷̨̢̪̻͉̦͍̥̰̺͇̖̮͔̝̮͇̟̰̣̱̱̜̼̗̬͓̝̭̿̋̂̏̂̃̋̈́̋̽͒̍̎̄̑̏̚̚̕͝f̴̛̤̟̞͚̹͎̺̈̇̉̆͆̂̌̀̉̅̌́͊͂̏͂̅̇̅̕͘͝ͅó̷̡̨̢̨͎̭̗̲̝̩̰͔͙͈̘͇͔̜̭̯̤̠͓̩̮̯̱̣̬̖̼͕̥̜̽̑̅͗̓́̃̒̊̀̅͛͋̀̋̈́͛̆͑͑͗̋̓̕͘͝͠͠ŕ̴̨̧̻̦̭̖̱͕̝͍̼͎̫͇̳̤̺͉̥̞͈̦̯̼̔̋̈͆̀̔̉̿̏̀͛̆̔͘ ̷̧̡̢̛̳͚̟̼̖̱͙̝̭̜̠̬͎͉͈̼͔͗̂̀́̄̐̑́̚̕̕Z̷̦̜̰̋á̶̧̢͖̲̻͈̟̜͓͎̝̦̰̰̩͚̲̬̱͓͉̯͔̟͍̫̥̹͓̓̒͛̾̿͌͐̀̑̐̋̍̓͂̏̍̓͆̇̈́̋͗͊̇́͘͠͝ļ̷̢͈͕̤̩̗͔̝̻̠͉̼̜́̈́̒̓̓̂͑̑͐̒̔̓̈́̈́̋͌̂̈̊̏̚͝͠g̸̢̡̨̢̧̖̗̻͙̹͕̻͇͕͎̯͖̹̥̳̻̬͙̫̭̪̞̲̩͎͋̈́̋́̽̊̽̑̔͊̓͂͑̾̈́̊̄͜͠͝ͅó̸̧̧̡͈̣̫͉̥͙͎̼̦͍͇̱̰̥͎̜͖͉͉͉̦̹̮͎̯͌̾̍̀̐̏̿̈́̀̓͜͜͝.̴̧̡̛̛̭̪̤̲̦̩̺̠̯̰̦̜̦͉̺̩̫̪̬̜͚̙͚̜̙̦̫̳͕͕̞̔̎̈̃͛̌͛̄̈́̆͒̑͌̾͛́̉̿̃̑͒̀̓̐̚͘͜͜͝͠͠͠͠
̶̳̤̖̺͐̈́͌̀̿̽̽͝ͅT̴̢̘̹̗̩̫̮͓͗̽̍͆̑̾̆̇̒͋͗̏͂́̈͒̀̀̓̌̌͋͆̔̾̈́̒͑͗͑͂̚̚̕͘͝h̴̛͇̰͕̣̞̰̺̔̌̾͂͂̓͆̎̐́̊̉͝e̶̢̻͇̜̒͋́͋̆̐̓̈́̃̒̒͊͗̓̎̃͊̍̑̅͂̈͊̓͘̕̚͘͠ ̵̡̛̬̤͙͉̞̜͙̞̱̞̱̠͕͈̺̠͙̺͗͒͂̍͆̈́̓̒̈́͒̀̀́́̃͛͂̀̀͛̈̋̅͊̈̊̈̆̾͋̎͑̏͝͝͠ͅͅͅf̴̡̧̢̨̡̛͚̯̭͈̩͖͍̘̼͇̥͍͓̼͍̲̩̤͕̪͍͈͉̮̤̹͖̪͍̥͐̋̂͐͛͊͛̀̽̆̍́̔̓̽́͌͒̇̑̍̿̕͘͘ͅͅa̶̧̡̡̛̘̺͙̰͎̻̩̰̞̼͕̻̭̟̹̘̥̹̻͔̜͙̽̄̄̍͋͋͆̓̽͒̈͋͗̊̔̉͌́̓̈̈́͆̋̅̓̽̈́̋̓̈̐̚͘͜͝͝c̶̡̧̧̡̛̛͎͍̘̭̞̼͉̦̤̤͓̮̗̖̦̗͕̭͕͙̟̜̼̝͙̳̝̓̐̀̍̐͒̈́̎͛̉͑͋̈̐̂͗̆͂͑͗̊͘̕͜͝͠t̵̡̡̜̟̳̮͙̱̼̪̟͚͙̫̗̗̠͈̪̞͎̗̲͖̜̼̖̯̠̰̍͊̏̾͛̇͗̆̂̽͆̽̐͆̾̄͐͆̓͆̔͛͊͂̈́̕͠ͅͅͅô̵̢̨͕͇̫̥̼̦͈̯͉̞̫̗̼̬̝͉̞̬̤̪̟͊̈́͌̇̾͠ͅͅr̶̡̡̲̜͈͇̬̱͔͔͙͓͉̺̤̳̳̹̦̪̺̺̄͜ỵ̵̧̧̛̛̳̟͎̞̥̱͍̬̦̱̤͍̗̞͎̰̞͇͎͓̈͊̏̒͛͋̅̽̾̎͑̓͋̀͌̍́̃̆͘͜͠ ̶̨̧̡̡̛̛͙̬̩̮͓̺̜̪̺̳̩̺͓͚͔̻̱͕̖̘̤̗͍̭̣̲̭͛͗̈́̉̿̏̉͐̽͐̅̈́͊͑̓͗̈́̌̄͑̈́̾͊͛̿͑̕͜͜͠͝͝m̶̧̡̧̨̡̢̝̮̯̣͙̼̝̠̺̦͈̦̥͕͇̣̰̣̲̘̙͍͖̪̠̣̼̜̟̺͌̀̓̎̆̀̈́̃͝u̵͖̜͖͍̮̱̬͚̘̇͑̎̈̀̍̆́͋̀̾̏̌̋́̄̉͛̐̌̓̌̈́̓̈́̓̐̈̏̑̕̕͝͠s̶̟̩̙̜͍̞͈̯͕̝̑̓̈́t̶̨̡̡̧̨̢̢̠͓̻̤͕̹͎̤̟̩͈͔̼̩̺̘̪̰͍̩͎̠̦͍̰̥͈̍͊̑́̍̾̂̽̈́͐͑͑̊̈̑̀̃̄̓͘͜͠ ̸̢̫͙̼̼̞̠͎̳̪͓̲͔͔̗̱͂̒̉͗͋̈́́͆̋̌͒́͒̋̈́̈́̆͑̅̈́̆͌̿̿̋̈́̉͌̂̍g̵̡̰̼͓͔̣̜͗̆̔́̂̋̍̈́̔͘͠r̵̢͍̩̲̯̠͇̠͙̲̺̮̙̗̣̺̦͈͍̃̅͂͋̅͜ͅơ̴͈͋́̀̽͋̆̋̋̏̓́̕͝͠w̸̡̢̛͕̖̗̰̻̮̫̺̗͔͓̤̺̹̘͖̠͕͓͔̭̙̜͔̯̝̮̩̤̩̞̏͆̅̅̇̈̿̈́̅̒̂̔͌̑͂̽̕̚͜ͅ.̵̞̜̳̪̯̰͚̟̗̼̀͊̾̈́̄̏̇̋̄͌͊̄̓̈́͂͋͌͛̔͌̒̕͘̚͜͝͠͝͝
.
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:19 pm
by ignatio
tjoener wrote: Fri Nov 08, 2024 12:07 pm
Wouldn't unicode normalization solve this problem (Thinking of NFKD)? This might need a link to libicu, but should solve almost all these problems.
+1, sounds like a clear-cut use case for ICU. There's a lot of nuance to handling accents and diacritics. E.g. Swedes would usually agree with considering "á" == "a" for searching, but code that simplifies "å" and "ä" to "a" is quite annoying because all three are totally distinct letters - it makes about as much sense as considering "i" and "e" the same.
ICU has text search algorithms that should address this - see e.g. this page. I'd go to great lengths to incorporate a 3rd party library like that (and ICU is basically the industry standard) to deal with this sort of stuff rather than roll my own.
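As a rough illustration of the nuance (not ICU, and not Factorio's implementation), the sketch below uses Python's standard `unicodedata` module to apply NFKD and drop combining marks, while leaving alone any letters a locale treats as distinct. The `PROTECTED` table and the locale codes are assumptions for this example; a real collation library derives such rules from locale data.

```python
import unicodedata

# Letters that a locale treats as separate letters rather than accented
# variants. This table is an assumption made up for the sketch.
PROTECTED = {
    "sv": "åäöÅÄÖ",
    "en": "",
}


def strip_accents(text: str, locale: str = "en") -> str:
    """Remove diacritics via NFKD, except on locale-protected letters."""
    keep = PROTECTED.get(locale, "")
    out = []
    # Compose first so protected letters appear as single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        # combining() returns 0 for base characters, non-zero for marks.
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)
```

Under these assumptions, `strip_accents("Malmö café", "sv")` keeps the "ö" but folds "é" to "e", whereas the `"en"` setting folds both.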
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:24 pm
by Drepple
I wanted to do this in Crowdin but too many people there upvoted the wrong answer for it to possibly get accepted: The English word "impact" refers to both a physical collision, as well as the effect of something (e.g. the impact of a new law). In Dutch, we also have the word "impact", however, it only has the second meaning of the English word; it can't be used to translate impact damage, as Factorio currently does. Even many native Dutch speakers get this wrong, which is why it was able to get approved on Crowdin. In any case, we don't have a literal translation for impact in this context, though we do have words for specific types of impacts. Given that impact damage in Factorio refers to cars or tanks running things over, the best translation would be "botsing" / "botsing schade", which means collision damage.
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:28 pm
by foxiest_engineer
The amount of care and detail that goes into each facet of the game brings a tear to my eye :')
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:29 pm
by arcosapphire
BattleFluffy wrote: Fri Nov 08, 2024 2:08 pm
I almost hate to say this, because I've got really used to the name.... but....
In English, we would generally say "Iron Rod", and not "Iron Stick".
That's been mentioned plenty of times before (including by me) and they've kept it as-is. So I assume they really like it this way.
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:46 pm
by bombcar
arcosapphire wrote: Fri Nov 08, 2024 2:29 pm
BattleFluffy wrote: Fri Nov 08, 2024 2:08 pm
I almost hate to say this, because I've got really used to the name.... but....
In English, we would generally say "Iron Rod", and not "Iron Stick".
That's been mentioned plenty of times before (including by me) and they've kept it as-is. So I assume they really like it this way.
We've seen similar things in Minecraft mods - if you have something that is the same across a bunch of materials, you often want the same word, even if it doesn't "really" apply. Wood stick, iron stick, wood plate, iron plate.
Often a trick is to have a "display name" and then hidden "search names" so searching "stick" brings up wood stick and iron rod.
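A small Python sketch of that display-name / search-names trick. The `Item` class and its fields are hypothetical and not taken from Factorio or any Minecraft mod.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    display_name: str
    # Hidden aliases that are searchable but never shown to the player.
    search_names: list[str] = field(default_factory=list)

    def matches(self, query: str) -> bool:
        q = query.casefold()
        names = [self.display_name, *self.search_names]
        return any(q in name.casefold() for name in names)


def search(items: list[Item], query: str) -> list[str]:
    """Return the display names of all items matching the query."""
    return [item.display_name for item in items if item.matches(query)]


ITEMS = [
    Item("Wood stick"),
    Item("Iron rod", search_names=["iron stick"]),
    Item("Iron plate"),
]
```

Here `search(ITEMS, "stick")` returns both "Wood stick" and "Iron rod", even though the latter never displays the word "stick".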
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 2:51 pm
by animexamera
For Japanese it would be even cooler if all inputs (Roman letters, hiragana and katakana) were interchangeable. I usually play without the IME on but still in Japanese, and I would love it if beruto gave ベルト or if singou gave 信号, etc.
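The hiragana/katakana half of this is the easy part, because the two blocks are laid out in parallel in Unicode (katakana U+30A1 to U+30F6 sit exactly 0x60 above the matching hiragana). The sketch below folds katakana to hiragana in Python; the function names are invented here. Roman-letter input would additionally need a romanisation table, and matching kanji such as 信号 would need reading data for each item name, neither of which is attempted.

```python
KATAKANA_START = 0x30A1  # small katakana a
KATAKANA_END = 0x30F6    # small katakana ke
KANA_OFFSET = 0x60       # distance between the katakana and hiragana blocks


def katakana_to_hiragana(text: str) -> str:
    """Fold katakana to hiragana; all other characters pass through."""
    out = []
    for ch in text:
        cp = ord(ch)
        if KATAKANA_START <= cp <= KATAKANA_END:
            out.append(chr(cp - KANA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def kana_contains(query: str, name: str) -> bool:
    """Substring match that treats hiragana and katakana as the same."""
    return katakana_to_hiragana(query) in katakana_to_hiragana(name)
```

So a query typed in hiragana would match an item whose name is written in katakana, and the other way round.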
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 3:24 pm
by untech
I like the determination to get localisation right. I have a small nitpick: in the blogpost, the rotating greeting looks wrong when it’s the Ukrainian Привiт; in the sense that Cyrillic letters look as if they are from a different typeface than Latin. It doesn’t affect the forums though!
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 3:38 pm
by NichtElias
Love the "Hello" at the beginning of the post switching through different languages. "Grüßli Müsli" for the German one is hilarious
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 3:39 pm
by BlueTemplar
But Factorio doesn't have sticks made of other materials than iron...
----
tjoener wrote: Fri Nov 08, 2024 12:07 pm
Wouldn't unicode normalization solve this problem (Thinking of NFKD)? This might need a link to libicu, but should solve almost all these problems.
I'm guessing that's what Hrusa referred to here :
Out-of-the-box solutions are usually bloated to accommodate edge cases for many scripts and languages we can't even render in Factorio (Hieroglyphs, Sumerian Cuneiform, Emoji...).
?
----
clang wrote: Fri Nov 08, 2024 12:20 pm
Not exactly a language error but when using Dvorak (and I assume other non-qwerty layouts) search is mapped to the same buttons as qwerty and doesn't follow the keyboard layout. All other common shortcuts follow the layout change. So for copy, I hit ctrl-c in qwerty and in Dvorak I hit what would be ctrl-i in qwerty. The same goes for cut and paste, undo and redo. But ctrl-f is unique because I hit the same physical keys regardless of layout. I can fix this by rebinding the controls so it's not really important. I'm just thrilled that Factorio supports Dvorak at all - hardly any games do and it is so nice!
Yeah, I guess fixing this comes with a lot of other issues...
https://factorio.com/blog/post/fff-259
viewtopic.php?f=11&t=65083
Speaking of, map not being on 'M' for AZERTY is still an 'issue'(?) in 2.0.7 :
viewtopic.php?f=71&t=79873&p=473209#p473209
(But seems barely worth complaining about, now that it's also Tab ?)
bonob wrote: Fri Nov 08, 2024 1:07 pm
Not a localization topic, but related to search features, and one might argue vaguely related :
I use search on the map mostly to look at where resources are. It would be convenient to be able to highlight all resource patches at the same time rather than to have to go through "ore", "stone", "coal", ... one at a time. Something like a hit to categories could fit the bill I guess.
Yeah, it would be nice if those could all be found by typing 'resource' or 'ore'... I guess what makes it trickier is that even the internal names for coal & stone aren't coal-ore & stone-ore...
chl wrote: Fri Nov 08, 2024 1:26 pm
Regarding the mod portal search, I always found it strange that it's so hard to find mods you're looking for by searching for the name. For example, if I search for "Space Exploration", the mod itself with that exact name is not even on the first page of hits.
[...]
Uh, it's literally the first result for me ??
FunMaker wrote: Fri Nov 08, 2024 1:39 pm
Well, it has nothing to do with language, but like the map search that was introduced in 2.0 - but I would like to have an additional map search that finds not the building producing the searched item but the entity in the world. <Might be better placed as a suggestion>
Meanwhile :
BeastFinder

Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 3:42 pm
by JaHultaj
In the Belarusian localisation the letter "ў" is always displayed in the upper case for some reason. It also looks as if it's bolded as you can see on the screenshot

[Attached screenshot: выява.png]
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 3:44 pm
by MaxAstro
chl wrote: Fri Nov 08, 2024 1:26 pm
Regarding the mod portal search, I always found it strange that it's so hard to find mods you're looking for by searching for the name. For example, if I search for "Space Exploration", the mod itself with that exact name is not even on the first page of hits.
Are you sure you don't just have the wrong Factorio version selected when searching?
If I set version to 1.1 and search for Space Exploration, it's the first hit. It just hasn't been updated to 2.0 yet, so it doesn't show up searching 2.0 (or in-game, which forces a search for the version you are running).
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 4:01 pm
by valneq
JaHultaj wrote: Fri Nov 08, 2024 3:42 pm
In the Belarusian localisation the letter "ў" is always displayed in the upper case for some reason. It also looks as if it's bolded as you can see on the screenshot
It looks like this character is rendered via a different font, because the default font of Factorio does not seem to contain it.
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 4:43 pm
by DeltaKilo
As a Russian speaker, first of all, thank you for fixing this.
Speaking of corner cases.
Russian letters Е Ё е ё are usually interchangeable.
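One way to sketch this in Python is an explicit ё-to-е fold applied to both the query and the item name. The names below are made up for the example. An explicit table is used rather than blanket accent stripping, because Unicode decomposition would also split й into и plus a combining breve, and those two letters are not interchangeable.

```python
# Fold only ё/Ё; other Cyrillic letters are left untouched.
YO_FOLD = str.maketrans({"\u0451": "\u0435", "\u0401": "\u0415"})


def fold_yo(text: str) -> str:
    """Replace ё with е (and Ё with Е)."""
    return text.translate(YO_FOLD)


def russian_contains(query: str, name: str) -> bool:
    """Case-insensitive substring match that treats е and ё as equal."""
    return fold_yo(query).casefold() in fold_yo(name).casefold()
```

With this, typing "зеленая" would find an item named "Зелёная схема", and typing "зелёная" would find one spelled with a plain "е".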
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 4:53 pm
by ignatio
BlueTemplar wrote: Fri Nov 08, 2024 3:39 pm
tjoener wrote: Fri Nov 08, 2024 12:07 pm
Wouldn't unicode normalization solve this problem (Thinking of NFKD)? This might need a link to libicu, but should solve almost all these problems.
I'm guessing that's what Hrusa referred to here :
Out-of-the-box solutions are usually bloated to accommodate edge cases for many scripts and languages we can't even render in Factorio (Hieroglyphs, Sumerian Cuneiform, Emoji...).
The problem with that is that Factorio users in various locales will need to report and get their problems fixed, which have likely already been fixed over a much longer time period and much larger user base with an off-the-shelf library like ICU. This is all the more valuable for localisation since users have different experiences there, so bugs are only encountered by a fraction of the user base. I'd easily go with some extra bloat for that benefit (and it's not like the game is tiny so it'd grow a lot percentage-wise).
American vs British (European)
Posted: Fri Nov 08, 2024 5:48 pm
by Mskvaer
Translation - Gas and Petrol. In the USofA, gas is the liquid you put into some cars. In English (European style), gas is an air-like substance that can ignite, as in a gas lamp or camping gas. Cars run on petrol (unless it is diesel or electricity ...)
It really confused me when I started Factorio why there was a gas that behaved like a liquid, until it struck me they probably did a US translation. But the icon looks like methane ... so ...
OK, enough of this, back to the Factory!
Re: Friday Facts #436 - Lost in Translation
Posted: Fri Nov 08, 2024 6:08 pm
by KillHour
Just throwing out there that I work for a company that specializes in document search, so I have a lot of experience in dealing with collation/tokenization/stemming. You definitely want to use a mature library instead of rolling your own because there are so many edge cases and gotchas.