[0.18.31] Searching production for Uranium shows things unrelated to Uranium

Bugs that are actually features.
MiniHerc
Fast Inserter
Posts: 171
Joined: Fri Jun 26, 2015 11:37 pm

Post by MiniHerc »

What did you do?
I searched the Production log for Uranium

What happened?
In addition to Uranium ore, U-235 and U-238, copper wire, rails and repair packs also were listed.

What did you expect to happen instead? It might be obvious to you, but do it anyway!
I expected only Uranium ore, U-235 and U-238 to be listed.

Image

Bilka
Factorio Staff
Posts: 3132
Joined: Sat Aug 13, 2016 9:20 am

Post by Bilka »

This is a result of having fuzzy search turned on; you can turn it off in the settings.
Image
I'm an admin over at https://wiki.factorio.com. Feel free to contact me if there's anything wrong (or right) with it.

MiniHerc
Fast Inserter
Posts: 171
Joined: Fri Jun 26, 2015 11:37 pm

Post by MiniHerc »

Bilka wrote:
Fri Jun 12, 2020 4:56 pm
This is a result of having fuzzy search turned on; you can turn it off in the settings.
Image
Thanks, but how the hell does fuzzy search get copper wire from a search for uranium?!?

boskid
Factorio Staff
Posts: 2248
Joined: Thu Dec 14, 2017 6:56 pm

Post by boskid »

Heh, I looked into why it happens:

Code:

  44.916 Info StringMatcher.cpp:71: can also be used to manually connect and disconnect electric poles and power switches with [font=default-semibold][color=#80cef0]left mouse button[/color][/font]. | uranium | true
  44.916 Info StringMatcher.cpp:71: copper cable | uranium | false
As you can see, it clearly matches :)
Most of the letters come from the description, but the last two, "UM", come from "default-semibold" in the font tag.
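
The behaviour in the log above is consistent with a plain in-order character match. What follows is a minimal sketch of that idea, not the actual StringMatcher.cpp code; the function name `is_subsequence` and the lowercasing step are assumptions made for illustration.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """Return True if the characters of pattern occur in text in order, with any gaps."""
    remaining = iter(text.lower())
    # "ch in remaining" advances the iterator until ch is found (or it runs out),
    # so each pattern character must appear after the previous match.
    return all(ch in remaining for ch in pattern.lower())


# Description string copied from the log line above.
description = (
    "can also be used to manually connect and disconnect electric poles "
    "and power switches with [font=default-semibold][color=#80cef0]"
    "left mouse button[/color][/font]."
)

print(is_subsequence("uranium", description))     # True
print(is_subsequence("uranium", "copper cable"))  # False
```

Both results agree with the `true` and `false` columns in the log.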

movax20h
Fast Inserter
Posts: 164
Joined: Fri Mar 08, 2019 7:07 pm

Post by movax20h »

boskid wrote:
Fri Jun 12, 2020 8:31 pm
Heh, I looked into why it happens:

Code:

  44.916 Info StringMatcher.cpp:71: can also be used to manually connect and disconnect electric poles and power switches with [font=default-semibold][color=#80cef0]left mouse button[/color][/font]. | uranium | true
  44.916 Info StringMatcher.cpp:71: copper cable | uranium | false
As you can see, it clearly matches :)
Most of the letters come from the description, but the last two, "UM", come from "default-semibold" in the font tag.
One would argue that matching shouldn't be affected by font or color tags. :)

Bug.
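
A sketch of what ignoring the tags could look like: strip the markup first, then match only against the visible text. The tag pattern below is an assumption (it covers `[name]`, `[name=value]` and `[/name]`); the game's real rich text grammar may differ, and `strip_tags` and `is_subsequence` are hypothetical helper names.

```python
import re

# Assumed tag shape: [name], [name=value] or [/name].
TAG = re.compile(r"\[/?[a-z-]+(?:=[^\]]*)?\]")


def strip_tags(text: str) -> str:
    """Remove rich text tags so that only the visible characters remain."""
    return TAG.sub("", text)


def is_subsequence(pattern: str, text: str) -> bool:
    """In-order character match with arbitrary gaps (same idea as the log suggests)."""
    remaining = iter(text.lower())
    return all(ch in remaining for ch in pattern.lower())


description = (
    "can also be used to manually connect and disconnect electric poles "
    "and power switches with [font=default-semibold][color=#80cef0]"
    "left mouse button[/color][/font]."
)

visible = strip_tags(description)
print(visible)
print(is_subsequence("uranium", description))  # True: the "um" comes from the font tag
print(is_subsequence("uranium", visible))      # False: no "m" follows the "u" of "mouse"
```

With the tags removed, this particular false positive disappears even if the matching rule itself stays the same.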

invisus
Filter Inserter
Posts: 284
Joined: Fri Sep 21, 2018 5:33 pm

Post by invisus »

movax20h wrote:
Wed Jun 17, 2020 7:16 pm
One would argue that matching shouldn't be affected by font or color tags. :)

Bug.
From a UX perspective, I agree.

One wouldn't reasonably expect a search in the production GUI to match on "hidden" tags, any more than they'd expect a "ctrl+f" search in a browser to match on HTML tags.

From the user standpoint, this seems like bad behavior indeed.

blahfasel2000
Inserter
Posts: 49
Joined: Sat Mar 28, 2020 2:10 pm

Post by blahfasel2000 »

movax20h wrote:
Wed Jun 17, 2020 7:16 pm
One would argue that matching shouldn't be affected by font or color tags. :)
Uhm, yeah, but that ignores the elephant in the room. Just looking for the characters in the order they are in the search word without regard to what's in between those characters is certainly a creative, but not a very useful implementation of "fuzzy search". Just look at how it actually found the "uranium" in the description string (assuming I understood boskid correctly):
can also be Used to manually connect and disconnect electRic poles ANd power swItches with (font=defaUlt-seMibold)(color=#80cef0)left mouse button(/color)(/font).
Fuzzy search is generally supposed to find matches that are similar to the search phrase, to find something even if the spelling isn't 100% right. I wouldn't call two strings similar just because one happens to have the characters from the other strewn around in random places that just so happen to be in the right order...
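
The highlighting above can be reproduced mechanically. This is a small sketch, assuming a greedy left-to-right match; `highlight_subsequence` is a hypothetical helper, not anything from the game.

```python
from typing import Optional


def highlight_subsequence(pattern: str, text: str) -> Optional[str]:
    """Uppercase the characters of text used by a greedy in-order match.

    Returns None if pattern is not a subsequence of text.
    """
    lowered = text.lower()
    out = list(text)
    pos = 0
    for ch in pattern.lower():
        pos = lowered.find(ch, pos)  # earliest occurrence at or after pos
        if pos < 0:
            return None
        out[pos] = out[pos].upper()
        pos += 1
    return "".join(out)


description = (
    "can also be used to manually connect and disconnect electric poles "
    "and power switches with [font=default-semibold][color=#80cef0]"
    "left mouse button[/color][/font]."
)

print(highlight_subsequence("uranium", description))
print(highlight_subsequence("uranium", "copper cable"))  # None
```

The first call prints the description with the same seven letters capitalised as in the post above, spread across almost the whole string.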

Impatient
Filter Inserter
Posts: 883
Joined: Sun Mar 20, 2016 2:51 am

Post by Impatient »

Makes me think that the search algo may be a regex with ".*" for the allowed characters in between. Then maybe ".{0,x}" could ease the problem.
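
A sketch of that regex idea, to compare unlimited gaps with bounded ones. Whether the game uses a regex at all is a guess; `gap_pattern` and `max_gap` are names invented for this example.

```python
import re
from typing import Optional


def gap_pattern(query: str, max_gap: Optional[int] = None) -> "re.Pattern[str]":
    """Build a regex that allows filler characters between the query letters.

    max_gap=None allows any number of characters (".*"); otherwise at most
    max_gap characters (".{0,x}") may sit between two consecutive letters.
    """
    gap = ".*?" if max_gap is None else ".{0,%d}?" % max_gap
    return re.compile(gap.join(re.escape(ch) for ch in query), re.IGNORECASE)


description = (
    "can also be used to manually connect and disconnect electric poles "
    "and power switches with [font=default-semibold][color=#80cef0]"
    "left mouse button[/color][/font]."
)

print(bool(gap_pattern("uranium").search(description)))       # True: unlimited gaps
print(bool(gap_pattern("uranium", 2).search(description)))    # False: gaps capped at 2
print(bool(gap_pattern("urnm", 2).search("Uranium ore")))     # True: abbreviation still works
```

Capping the gap keeps abbreviation-style queries working while rejecting matches whose letters are scattered across a long description.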

But this does not address the problem of character displacement in the search string, as in "uanruim". A proper fuzzy search algo like Levenshtein distance can handle this
( https://en.wikipedia.org/wiki/Levenshtein_distance ),
but may be too expensive, as it is O(strLength1 * strLength2).
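
For reference, a textbook sketch of Levenshtein distance using the usual two-row dynamic programming table. The nested loops are where the O(strLength1 * strLength2) cost comes from. This is a generic illustration, not code from the game.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]  # a[:i] versus the empty string: i deletions
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = cur
    return prev[-1]


print(levenshtein("uranium", "uanruim"))       # 3
print(levenshtein("uranium", "copper cable"))
```

A scrambled query such as "uanruim" stays within a few edits of "uranium", so a small distance threshold would accept it.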

blahfasel2000
Inserter
Posts: 49
Joined: Sat Mar 28, 2020 2:10 pm

Post by blahfasel2000 »

Levenshtein itself is only suitable for a fuzzy string comparison, not for a fuzzy substring search.

None of the on-line fuzzy search algorithms (on-line meaning that only the search pattern can be preprocessed; there is no prior indexing of the data to be searched) are really performance wonders. However, we aren't talking about something that has to search megabytes of data on every tick, 60 times a second; we are talking about occasionally searching a few hundred strings, maybe a few thousand in a heavily modded game, for a generally short pattern. The bitap algorithm as implemented by the agrep utility, for example, can search ~560,000 lines of log files (~54 MB of data in total) in less than 0.2 seconds on my machine. My PC has a Ryzen 5 3600X CPU, so it's no slouch, but with the much smaller relevant dataset in Factorio it should be plenty fast enough for an interactive search even on a potato.

movax20h
Fast Inserter
Posts: 164
Joined: Fri Mar 08, 2019 7:07 pm

Post by movax20h »

blahfasel2000 wrote:
Thu Jun 18, 2020 2:14 am
Levenshtein itself is only suitable for a fuzzy string comparison, not for a fuzzy substring search.

None of the on-line fuzzy search algorithms (on-line meaning that only the search pattern can be preprocessed; there is no prior indexing of the data to be searched) are really performance wonders. However, we aren't talking about something that has to search megabytes of data on every tick, 60 times a second; we are talking about occasionally searching a few hundred strings, maybe a few thousand in a heavily modded game, for a generally short pattern. The bitap algorithm as implemented by the agrep utility, for example, can search ~560,000 lines of log files (~54 MB of data in total) in less than 0.2 seconds on my machine. My PC has a Ryzen 5 3600X CPU, so it's no slouch, but with the much smaller relevant dataset in Factorio it should be plenty fast enough for an interactive search even on a potato.
There is no need to preprocess the pattern or prepare the data (index). There are fewer than 1000 elements to be searched, most of them shorter than 20 characters, and the pattern is also pretty short and only typed in by a human. Even the most naive implementation would be fast on a potato.

Impatient
Filter Inserter
Posts: 883
Joined: Sun Mar 20, 2016 2:51 am

Post by Impatient »

blahfasel2000 wrote:
Thu Jun 18, 2020 2:14 am
Levenshtein itself is only suitable for a fuzzy string comparison, not for a fuzzy substring search.
...
Can you elaborate on this? A Levenshtein distance can be calculated for any two strings. In what ways does a string comparison differ from a substring search in the context of Levenshtein? I am inclined to think that it makes no difference to a Levenshtein algo.
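
One way to see the difference, as a sketch: the same dynamic programming table can be used for both, but a substring search lets the match start and end anywhere in the text at no cost (the variant usually attributed to Sellers). The function name `edit_distance` and its `substring` flag are made up for this example.

```python
def edit_distance(pattern: str, text: str, substring: bool = False) -> int:
    """Edit distance between pattern and text.

    substring=False: compare pattern with the whole text (plain Levenshtein).
    substring=True:  compare pattern with the best-matching substring of text.
    """
    if substring:
        prev = [0] * (len(text) + 1)        # a match may start anywhere for free
    else:
        prev = list(range(len(text) + 1))   # skipped text characters cost 1 each
    for i, pc in enumerate(pattern, 1):
        cur = [i]
        for j, tc in enumerate(text, 1):
            cur.append(min(
                prev[j] + 1,               # pattern character missing from text
                cur[j - 1] + 1,            # extra character in text
                prev[j - 1] + (pc != tc),  # substitution, or free if equal
            ))
        prev = cur
    # A substring match may also end anywhere, so take the best column.
    return min(prev) if substring else prev[-1]


print(edit_distance("uranum", "uranium ore"))                  # 5
print(edit_distance("uranum", "uranium ore", substring=True))  # 1
```

Whole-string distance punishes "uranium ore" for the trailing " ore", while the substring variant only counts the one missing "i" in the query.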

netmand
Filter Inserter
Posts: 302
Joined: Wed Feb 22, 2017 1:20 am

Post by netmand »

All this talk of elephants and Levenshtein's potato is a bit kooky. Come on! Repair packs and Rails should totally be a part of the fuzzy search results for uranium!
