Let's see... Currently, Factorio has English, German, French, Italian, Korean, Spanish, Chinese, Russian, Japanese, Polish, Danish, Dutch, Finnish, Norwegian, Swedish, Hungarian, Czech, Romanian, Portuguese, and Ukrainian translations.
Japanese search is super complex: handling it properly would require a large library, and in fact a built-in Japanese dictionary! Most likely the best, or perhaps the only, way to achieve it is to work with the translators and have them add pronunciation info to each word (it's fine if some words lack it; those could fall back to the current search logic). I could understand why you're not doing that (personally, I'd do it anyway, as I consider i18n very important and wouldn't want people playing in other languages to have an inferior experience). I assume Chinese is similar to Japanese. English already works properly, and Korean has no letter case, so it should work fine as well.
This leaves the alphabets with diacritics and the Cyrillic alphabets. Unicode collation is pretty hard, so doing it properly would require a fairly big library.
However, Cyrillic in particular is super easy to handle. You could use std::locale if that works for you: it doesn't need any ICU plumbing, just a few wchar_t conversions. Not doing that is just lazy, in my opinion. You could even simply iterate over the Unicode character boundaries and check for the 37 code points of the Russian and Ukrainian capital letters; that wouldn't even require an allocation!
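To show how little code that is, here's a minimal sketch, assuming the strings are valid UTF-8 in a std::string. The function names (lower_cyrillic, fold_cyrillic_in_place) are made up for this example and aren't from Factorio or any library. It relies on every Russian and Ukrainian capital and its lowercase form both being two bytes in UTF-8, so the string can be rewritten in place without allocating:

```cpp
#include <cstddef>
#include <string>

// Maps a Russian/Ukrainian capital code point to its lowercase form.
// Everything else is returned unchanged. (Hypothetical helper.)
inline char32_t lower_cyrillic(char32_t cp) {
    if (cp >= 0x0410 && cp <= 0x042F) return cp + 0x20;  // А..Я -> а..я
    switch (cp) {
        case 0x0401:  // Ё -> ё
        case 0x0404:  // Є -> є
        case 0x0406:  // І -> і
        case 0x0407:  // Ї -> ї
            return cp + 0x50;
        case 0x0490:  // Ґ -> ґ
            return 0x0491;
        default:
            return cp;
    }
}

// Lowercases ASCII and Russian/Ukrainian letters in place.
// Assumes `s` is valid UTF-8; other characters are left untouched.
inline void fold_cyrillic_in_place(std::string& s) {
    for (std::size_t i = 0; i < s.size();) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) {  // single-byte ASCII
            if (b0 >= 'A' && b0 <= 'Z') s[i] = static_cast<char>(b0 + 32);
            ++i;
            continue;
        }
        if ((b0 & 0xE0) == 0xC0 && i + 1 < s.size()) {  // two-byte sequence
            const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
            if ((b1 & 0xC0) == 0x80) {
                char32_t cp = static_cast<char32_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F));
                cp = lower_cyrillic(cp);
                // Re-encode; the result is always two bytes as well.
                s[i] = static_cast<char>(0xC0 | (cp >> 6));
                s[i + 1] = static_cast<char>(0x80 | (cp & 0x3F));
                i += 2;
                continue;
            }
        }
        ++i;  // lead or continuation byte of a longer sequence: skip it
    }
}
```

Fold the search query and each item name once with this, and a plain std::string::find then gives a case-insensitive match for these letters.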
I'll even go as far as to say not having proper search is a deal-breaker for me, and is one of the main reasons I never play in my native language.
Which is why, when I saw this reply, I implemented a lightweight C++ Unicode collation library. It doesn't cover all of Unicode, but it will definitely work for Cyrillic and most diacritics (all diacritics currently used in Factorio, unless there's a bug somewhere).
It's made of two parts - a Python 3 script that generates the Unicode mapping, and a 250-line autogenerated C++ function that actually processes text according to the generated mapping. It uses std::string, but you can easily adapt it for any string type. I licensed it as 0BSD, so you can use it in Factorio without any licensing obligations (if you do end up using it, I'd be grateful if you credited me like you do with the MIT libs, but that isn't required, since the library is really small). It's sad that Wube doesn't consider this important - but I hope my implementation will make adding it easy enough to do despite it being low on the priority list.
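For illustration only, the general idea of a table-driven mapping can be sketched like this. This is not the library's actual code: the table is a tiny hand-written excerpt rather than generated output, and the names FoldEntry, kFoldTable and fold_code_point are made up for the example. The real generated function may be structured quite differently.

```cpp
#include <algorithm>
#include <iterator>

// One mapping entry: code point `from` is replaced by `to`.
struct FoldEntry {
    char32_t from;
    char32_t to;
};

// Hypothetical excerpt of a mapping table. In a generated version a
// script would emit these rows; they must stay sorted by `from`.
static const FoldEntry kFoldTable[] = {
    {0x00C4, 0x00E4},  // Ä -> ä
    {0x00C9, 0x00E9},  // É -> é
    {0x00D6, 0x00F6},  // Ö -> ö
    {0x0141, 0x0142},  // Ł -> ł
    {0x0150, 0x0151},  // Ő -> ő
    {0x0158, 0x0159},  // Ř -> ř
};

// Looks up `cp` with a binary search; unmapped code points pass through.
inline char32_t fold_code_point(char32_t cp) {
    const FoldEntry* first = std::begin(kFoldTable);
    const FoldEntry* last = std::end(kFoldTable);
    const FoldEntry* it = std::lower_bound(
        first, last, cp,
        [](const FoldEntry& e, char32_t value) { return e.from < value; });
    return (it != last && it->from == cp) ? it->to : cp;
}
```

Keeping the data in a generated table means that supporting another language is a matter of regenerating the rows, not touching the lookup code.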