Random Number on Selector Combinator

mmmPI · Post by **mmmPI** » Mon May 26, 2025 11:40 am

torne wrote: Sun May 25, 2025 1:24 pm Yes, this is what I meant, and it seems like you understand that part pretty well from your own research now.
%8 is fine though here, because 8 is a power of two: it's 2^3, and so divides evenly into our original 32-bit random value. But, yeah, if we want a value between 0 and 9, just using %10 is going to be slightly more likely to generate 0-5 and slightly less likely to generate 6-9.

Thanks but not so well x) , it's getting better hopefully

I had not realized fully that most % 2^n would work fine for positive integer as they would all divide perfectly the 'random' 32 bit number, or be itself, or higher but that's another thing. I re-read the part about rejection sampling, the whole article was too much for one time.

torne wrote: Sun May 25, 2025 1:24 pm You are correct that just testing how often each value occurs doesn't tell you that it is really random, but if the values *don't* occur equally often then it's definitely *not* a uniform random distribution - having them occur equally often is necessary but not sufficient. Correlations between the values in a supposedly-random sequence, as you say, is another important property, and this is where things get extremely complicated, because there's a pretty-much infinite number of possible ways that the values might be correlated with each other, and therefore also a pretty-much infinite number of tests we could try.

That clarify things somehow x). Because when i was doing research for the RNG, it was mentionned the period of repetition, with the >> << xor method, states of bits can and will repeat after some very long period of time, unless if the number are not choosen properly, the repetition can occur much faster than expected. So i knew there were "more complicated things", already. But if you say "pretty-much infinite numbers of ways", that make sense to me intuitively to put it in the "extremely complicated things". It clarify the pointlessness of the quest of searching for "the" test.

torne wrote: Sun May 25, 2025 1:24 pm Your example here suggests one simple test: we could take the *difference* between consecutive values in the sequence and see if the differences look uniformly distributed. The "x+1 => x" generator would fail this test very quickly, since the difference would always be 1 - we wouldn't need much data to conclude that there's a correlation. But, other generators might pass this test but still fail other more complex tests. Which tests are most useful is a whole complex area of math that's beyond my expertise - I'm just a computer scientist, not a mathematician

I think i understand the idea, i can imagine it's an area of expertise on its own, i know when i try to follow links in article that i quickly reach a point where i understand nothing x).

But for this particular example, if the number are randomly choosen, ( just a 'gambler' intuition ) , i don't think their *differences* would be uniformly distributed, i think it could be "on average" 50% smaller if you start with a number from the middle of the range and you pick a 2nd number. Or if you start with a number near a bound of the range and then a second number. Sounds like the expected distribution would be an inverse bell, with 50% as the range on the side and 25% of the range in the middle ?

I think the main idea i get is one need to compare the pseudo-RNG and a true random sequence, to compare their properties , their *difference* should have the same distribution. Which wouldn't be 1 1 1 1 x). But that it would be just 1 another test amongst "many".

torne wrote: Sun May 25, 2025 1:24 pm There's also a separate property here that's important for security/cryptography/other cases where you have an adversary, though, which is "given some numbers from the random sequence, how easy is it to predict future values in the sequence". The LCG discussed here is terrible at this: if we know the specific parameters being used (the multiplicand and addend) then all we need to know is any one value from the sequence, and then we can predict *all* future values perfectly. And, since the algorithm here is a common, standard one, and the parameters used are well-known ones, we can easily guess all of this: if we have two values from the sequence, we can very quickly try a bunch of different common PRNG algorithms/parameters and see if any of them do in fact turn the first value into the second value.

Good thing i goofed around with the parameter from the wikipedia page then, maybe it work less well for the quality of the randomness but my Linear shift back register has not a well known parameter x)

This is beyond my imagination to think of something that wouldn't let you know the future value of the sequence if you know the parameters. It's like "seeded RNG" in the case of the linear shift back register i attached, so if you know the seed and the parameter, it will always follow the same sequence, and from a number and the parameter it's also possible to do so.

I can see how a RNG made in factorio that would rely on the method eugenekay posted would be achieving such result that with just the first few number , it would remain a mystery which will be the next, but when it comes to thinking about how it works for the game to pick those numbers in the first place it's different than when the pseudo randomness comes only from combinator operations. It doesn't require any seed and as such for my music purposes i couldn't use it , because i wanted to be able to know the sequence in advance, that was the reverse of cryptography x).

I wanted to make sure it would be "random" but also always the same with the same seed, so when i changed the machine i could hear the difference based on my change and not based on a different random sequence when the machine was copy pasted. That's where i see the other method different, if used to choose receipe or train stop, it can and will choose a different "first one " everytime ! ( not sure it's 100% related with what you said though ) but i feel it's important in the context of actually using those in practice !

torne · Post by **torne** » Mon May 26, 2025 3:13 pm

mmmPI wrote: Mon May 26, 2025 11:40 am
torne wrote: Sun May 25, 2025 1:24 pm Your example here suggests one simple test: we could take the *difference* between consecutive values in the sequence and see if the differences look uniformly distributed. The "x+1 => x" generator would fail this test very quickly, since the difference would always be 1 - we wouldn't need much data to conclude that there's a correlation. But, other generators might pass this test but still fail other more complex tests. Which tests are most useful is a whole complex area of math that's beyond my expertise - I'm just a computer scientist, not a mathematician
I think i understand the idea, i can imagine it's an area of expertise on its own, i know when i try to follow links in article that i quickly reach a point where i understand nothing x).

But for this particular example, if the number are randomly choosen, ( just a 'gambler' intuition ) , i don't think their *differences* would be uniformly distributed, i think it could be "on average" 50% smaller if you start with a number from the middle of the range and you pick a 2nd number. Or if you start with a number near a bound of the range and then a second number. Sounds like the expected distribution would be an inverse bell, with 50% as the range on the side and 25% of the range in the middle ?

You are thinking of subtracting the smaller number from the larger number (always a positive result), which is what people would normally mean by "difference" in casual discussion, but in mathematics that is more specifically called the "absolute difference".

I wasn't specific enough here. In computing we can think of integers as *always* being modulo 2^32 (or 2^64, or whatever other size of integer we are using), and allow arithmetic to "wrap around", and this is what the LCG is relying on: the number it's multiplying by is very large and so is often going to give us a result larger than 2^32, but we assume that the arithmetic operation just gives us the bottom 32 bits of the result. So, if you start with a sequence of uniformly distributed random values between 0 and 2^32-1 (what we think of as an "unsigned 32-bit integer" in computing), then subtract each number in the sequence from the next number, always in that order, and let it "wrap around" modulo 2^32 (i.e. so that the result of 0 - 5 is 2^32 - 5) then the resulting sequence of differences will also be uniformly distributed between 0 and 2^32-1.

Factorio combinators use signed arithmetic which complicates explaining the behavior a little, but the results are ultimately the same: you still "wrap around" in the same way.

mmmPI wrote: Mon May 26, 2025 11:40 am Good thing i goofed around with the parameter from the wikipedia page then, maybe it work less well for the quality of the randomness but my Linear shift back register has not a well known parameter x)

This is beyond my imagination to think of something that wouldn't let you know the future value of the sequence if you know the parameters. It's like "seeded RNG" in the case of the linear shift back register i attached, so if you know the seed and the parameter, it will always follow the same sequence, and from a number and the parameter it's also possible to do so.

Any PRNG by definition always produces the same sequence given the same parameters and seed - that's what makes it pseudo-random and not actually random. But, not all PRNGs *expose all of their internal state* in every number they produce. The simple 32-bit LCG discussed in this thread has a 32-bit "current state" which is also used as the output value, and the initial seed is not special - as long as you know or can figure out what multiplicand and addend are being used, then you don't need to know the original seed to predict the future values in the sequence. You only need to know the current value in the sequence - the seed, the current internal state, and the current value, are effectively all the same thing.

For a more complex PRNG like the Mersenne Twister mentioned earlier, the internal state of the generator is going to be *larger* than the values that it produces: for the commonly used 32-bit MT19937 variant, the internal state of the generator is an array of 624 32-bit integers (and a counter keeping track of which of the 624 entries is the "current" one), but it only outputs one 32-bit integer each iteration. This means that to predict all the future numbers in the sequence you need to be able to reverse-engineer what the *entire* internal state is, which requires you to have seen *many* past numbers, not just one, and requires some quite tricky math. It also means that the sequence can take much longer to repeat: while the period of a 32-bit LCG can at most be 2^32, the period of MT19337 is 2^19337-1.

mmmPI wrote: Mon May 26, 2025 11:40 am I can see how a RNG made in factorio that would rely on the method eugenekay posted would be achieving such result that with just the first few number , it would remain a mystery which will be the next, but when it comes to thinking about how it works for the game to pick those numbers in the first place it's different than when the pseudo randomness comes only from combinator operations. It doesn't require any seed and as such for my music purposes i couldn't use it , because i wanted to be able to know the sequence in advance, that was the reverse of cryptography x).

Using the selector combinator's random selector function is just delegating the randomness to the game engine's *own* PRNG, with the game's own choice of seed. It's much, much harder to predict this sequence because the game is using the PRNG for lots of different things, not just that one selector combinator, and so even if you record all of the random values you get from the selector you aren't actually seeing *all* of the values in the sequence: many of them are being used for other random things happening in the game engine. (also the game's PRNG is likely something with a large internal state like MT19337, so is difficult to reverse-engineer even if you do see all the values).

mmmPI wrote: Mon May 26, 2025 11:40 am I wanted to make sure it would be "random" but also always the same with the same seed, so when i changed the machine i could hear the difference based on my change and not based on a different random sequence when the machine was copy pasted. That's where i see the other method different, if used to choose receipe or train stop, it can and will choose a different "first one " everytime ! ( not sure it's 100% related with what you said though ) but i feel it's important in the context of actually using those in practice !

Right - there are many cases where using a specific seeded PRNG that you control the parameters/algorithm for is important, because that's how you create a sequence that looks random but which can be "recalled" by using the same seed again. A simple LCG or LFSR, as long as you don't care if someone can easily reverse-engineer the sequence, and the parameters are good ones that give a nice long period (ideally the same length as the size of the state), works well for this as it's easy to implement, quick to run, and simple to re-seed with the same value.

But... you still need to be careful how you convert the output from the PRNG into the actual range of values that you want, to avoid introducing too much bias; as long as you know the modulo bias is a thing you can make your own judgement call on whether it is going to matter for your use case and for the size of modulus you need.

mmmPI · Post by **mmmPI** » Tue May 27, 2025 12:26 am

torne wrote: Mon May 26, 2025 3:13 pm You are thinking of subtracting the smaller number from the larger number (always a positive result), which is what people would normally mean by "difference" in casual discussion, but in mathematics that is more specifically called the "absolute difference".

I was indeed thinking like a casual person x) i'm no mathematician nor computer scientist, just enthusiast

Thanks for the clarifications, that make sense now.

torne wrote: Mon May 26, 2025 3:13 pm Any PRNG by definition always produces the same sequence given the same parameters and seed - that's what makes it pseudo-random and not actually random. But, not all PRNGs *expose all of their internal state* in every number they produce. The simple 32-bit LCG discussed in this thread has a 32-bit "current state" which is also used as the output value, and the initial seed is not special - as long as you know or can figure out what multiplicand and addend are being used, then you don't need to know the original seed to predict the future values in the sequence. You only need to know the current value in the sequence - the seed, the current internal state, and the current value, are effectively all the same thing.

Ow that make sense too ! in the LSBR the seed is in a constant combinator, and i had always wondered about the impact of letting it "on" or turning it "off" once the prng is launched. Now i understand that if it's kept "on" then the seed is always added to the result from what i remember, because it will be present on the circuit, and the seed + the latest generated number are used as input as if it was a seed. But if it's turned off, then it's only the "result" that is being used to generate the next number. I understand from those concepts that it makes a difference, the current value isn't necessarily the same as the internal state. My main wondering was wether or not it impacts the distribution, i am not sure i have build it properly, i don't remember which one i was supposed to implement to replicate the logic, i guess i can still test it, without the modulo bias, to see if there are still problems in it x).

torne wrote: Mon May 26, 2025 3:13 pm For a more complex PRNG like the Mersenne Twister mentioned earlier, the internal state of the generator is going to be *larger* than the values that it produces: for the commonly used 32-bit MT19937 variant, the internal state of the generator is an array of 624 32-bit integers (and a counter keeping track of which of the 624 entries is the "current" one), but it only outputs one 32-bit integer each iteration.

At first that felt less complex than it sound when thinking about it. Intuitively i would think it's similar to using several PRNG, some to decide which value from the others to show to the user, to confuse it a maximum. x) but then I was looking at the wikipedia page to try and see if there was a chance i could make one in game, https://en.wikipedia.org/wiki/Mersenne_Twister. and I could see the n=624 in the C code, but then for other values like "a" "b" "c" and Umask and Lmask, the values are in hexadecimal, and with UL at the end, so i guess the 32-bit MT19937 variant is only for unsigned integers ? It would require different one for factorio ? That doesn't sound less complex anymore x).

torne wrote: Mon May 26, 2025 3:13 pm Using the selector combinator's random selector function is just delegating the randomness to the game engine's *own* PRNG, with the game's own choice of seed.

That's useful to know it's there, for when you want to surprise yourself with your blueprint ( still in the context of music x) ) otherwise, even if it's "pseudo-random" it's something already heard if the seed isn't changed. It would be the closest "already in the game" regarding the suggestion and the distribution to try and replicate with a combinator made PRNG as a goal. Also what i'd used to test my machine that is supposed to test for equal distribution x).

torne wrote: Mon May 26, 2025 3:13 pm But... you still need to be careful how you convert the output from the PRNG into the actual range of values that you want, to avoid introducing too much bias; as long as you know the modulo bias is a thing you can make your own judgement call on whether it is going to matter for your use case and for the size of modulus you need.

That's definitely something i learned ! thanks !

Nidan · Post by **Nidan** » Tue May 27, 2025 1:56 am

mmmPI wrote: Tue May 27, 2025 12:26 am but then I was looking at the wikipedia page to try and see if there was a chance i could make one in game, https://en.wikipedia.org/wiki/Mersenne_Twister. and I could see the n=624 in the C code, but then for other values like "a" "b" "c" and Umask and Lmask, the values are in hexadecimal, and with UL at the end, so i guess the 32-bit MT19937 variant is only for unsigned integers ? It would require different one for factorio ? That doesn't sound less complex anymore x).

You have to keep track of 624 values, that's going to get complex either way

Regarding singed vs. unsigned, for most operations the arithmetic combinator supports you don't have to do anything special (That's why two's complement is nowadays the only relevant representation for negative integers.), only division, modulus and right shift(*) are different. Out of these, only right shift is used by MT, and it looks like it is only right shifting by constants. Within factorio you can correct the result by adding an and afterwards.

*) keywords: logic (unsigned) and arithmetic (signed) right shift

mmmPI · Post by **mmmPI** » Wed May 28, 2025 11:36 pm

Nidan wrote: Tue May 27, 2025 1:56 am You have to keep track of 624 values, that's going to get complex either way

You are probably right in a certain comon sense. But i am not gonna let myself be scared of a big number just because it's big

When thinking of it from my beginner perspective the complexity in the Mersenne twister doesn't really rely in the "624", it's not felt as an obstacle to me here's why :

You get 1024 tiles in a chunk of 32x32, but 624 value to keep track of could represent (624*2) 1248 tiles, if you use a decider for each one, and you can hardcode its adress, or 1872 tiles if you add an additionnal constant combinator to help, or 2496 tiles if you need 4 combinator tiles per value of n.

If you allow yourself that much, there's enough room to make it fit into 3 max 4 chunks x). Not sure that would be super "complex". More like expansive, but maybe there would be 95% of the combinators that have the same task of keeping those value like a big register of identical cells. ( let's not think about bit packing stuff just yet x) ).

I believe regarding the "doability" it doesn't rule it off, it seem to me like for the initialization you only need a single 32 bit seed, so there would be additionnal logic to populate those 624 combinators required, but no need to input a gigantic initial state by hand every time or to configure them manually 1 by 1. " in theory".

As far as i understand ( which is not too much ) the reason for "624", is that if you use 32 bits word, and you have 624 of them, you have a total of 19968 bits available which allow to implement the specific MT19937 variant of ( Mersenne Twister) because it uses the number 2^19937-1, which is a Mersenne Prime.

I have no idea why it works in the first place, so i would be trying to replicate the operations for the mersenne twister the same way as i did for the linear shift back register, i understand one is a more complicated version of the first one, as they look very similar in the operations ( the tempering at the end) but there are some differences ( the code the handle the 624 values ^^^). I wouldn't feel a sense of progress if i just "by luck" make one, without understanding more, just trying to implement operations i dont understand at this scale doesn't feel engaging atm.

Nidan wrote: Tue May 27, 2025 1:56 am Regarding singed vs. unsigned, for most operations the arithmetic combinator supports you don't have to do anything special (That's why two's complement is nowadays the only relevant representation for negative integers.), only division, modulus and right shift(*) are different. Out of these, only right shift is used by MT, and it looks like it is only right shifting by constants. Within factorio you can correct the result by adding an and afterwards.

*) keywords: logic (unsigned) and arithmetic (signed) right shift

I had to read that several time, and at some point i had the image of that LoL champion "Singed" perturbating my focus, but i can't blame it for my non-understanding. I had already forgotten what was a two's complement i'm sorry, i remember you and Tertius and couple others deployed great effort to teach me and during a period of time i knew. But i have very little occasion to practice, and when looking at the wikipage of the code i don't have enough familiarity to understand that it would work with signed or unsigned integers the same way if an "and" is added afterward. I imagine it means a single combinator added at a step to restore or remove the "sticky bit". That adds unecessary complexity, beyond it being shited by a constant or not it's not the worse, but still x).

To me the biggest complexity is "understanding what operations are done when doing a twister". But i don't need that much to make progress, the operations on the wikipage i wasn't sure they'd make sense with signed integers, i'm enclined to believe you when you say they do with an added step. But it doesn't really help me i feel or it doesn't go far enough, because i don't really want to (just) "make" the mersenne twister, i want to "understand" what's going on. That's what i would remember, for the 2-complements, to me it's a way to note the negative number in binary, can't remember exactly which one, i risk an off by one error ,but it's easy to refresh memory on that cuz i knew once.

The MT look like "mysterious way" to combine linear shift back register, with 624 different values to pick from, and i want to understand more the "mysterious ways", even if i'm going to forget, not only because i feel it's necessary in the first place to try and implement properly the operation in combinators, but also because i think it's "complex math" that would be good for me to know "at least once". Having a mental picture of it working even blurry is not the same as remembering the 3 hours setting up manually the 624 combinators for "whatever reason 624 is magic".

So far i think i only understand the idea of using the 2 masks to take some parts of some of the 624 values, either the upper part, or the lower part, combined into a new number. I think it's done to modifiy 1 of the 624 values, which is then given as the output, and then the whole array is shifted so this used number is far in the back in 624th position. I "think" the '397' magic constant, is the distance between 2 numbers in the 624-long array, when making a new output, it's the combined bits of 2 values out of the 624 that are 397 ranks aparts.

I know it's far beyond the scope of the suggestion, i don't know if it's gameplay help for how to make a MT in factorio, or general discussion at this point, i think it may be of some use to someone sometimes and i'd be willing to follow instruction written for layman at this point even though i would prefer more understanding. Can you provide with either one or the other or confirmation ? If i have a general understanding on the principle of it, i can try and make sense of the operation line by lane, i'm not sure what i'd be trying to do with all those values otherwise i feel.

I don't know the C langage and it's what the wikipage uses, so i'm not sure i understand which operations are supposed to be done. Mostly what is a loop and how many times it runs i think. The operation themselves ^ << & even ^= is 'fine' but the variables when it's "x" , "y" and "z" it's not a constant at the end in the tempering, and i'm not sure i understand what's before and i don't want to pile up too many conjecture. It's quite the new territory for me x)

torne · Post by **torne** » Thu May 29, 2025 9:32 pm

Personally I wouldn't particularly want to implement my own MT based on the wikipedia description even in a normal programming language as it's just a bunch of fiddly math stuff that I don't have any particularly solid grasp on, so my advice for doing it in combinators would be "don't", but I'm just some internet person, you do what you want with your time

A LFSR/LCG using a reasonable (i.e. chosen by someone who did understand the math) set of parameters from the internet is a decent PRNG for just having a seedable (and thus repeatable) random sequence for a game or simple simulation where you only need one sequence of numbers, aren't trying to prevent anyone from being able to reverse-engineer the sequence, and aren't going to need billions of numbers - they're simple and fast and the output is good enough.

As a computer science grad I would generally focus on knowing the cases where being more careful or doing more research *is* important:

- Anything to do with crypto/security. I just don't ever implement these things myself in the first place; it's not my specialty so it's wildly irresponsible to do it for any real software that someone else might use, and I don't find it interesting enough to do it just as a learning exercise.

- If your simulation/whatever is going to consume loads and loads of random numbers. If it's going to get anywhere close to the period of your PRNG then you probably need to think about how many numbers you really *are* going to need and make sure you pick something with a period much bigger than that, because any number of undesirable things might end up happening if the sequence repeats.

- If you're trying to generate multiple different sequences to use for different things (e.g. having one random number generator for each thread in a parallel process for efficiency, or things like procedural generation where you may want to be able to generate different chunks of the world independently of each other). Having multiple copies of the same PRNG initialized with different seeds is *not* always good for this, because even if the sequence generated by each PRNG is sufficiently random, there can still be unexpected correlations between the Nth value of each sequence (or other more complicated relationships). Some PRNGs can do this safely as long as you vary the correct parameters between instances, some can't, and if you can't find an authoritative answer then it probably can't. Incidentally if you *are* interested in things like procedural generation then I really like https://unity.com/blog/engine-platform/ ... om-numbers as a walkthrough of some of the problems that can come up using PRNGs for this and some suggestions on what you can do instead (though I definitely wouldn't want to implement any of those in combinators either!)

I hope this was an interesting discussion even if it drifted off the original topic a bit, but I'm probably out of useful things to add now

mmmPI · Post by **mmmPI** » Fri May 30, 2025 3:00 am

torne wrote: Thu May 29, 2025 9:32 pm Personally I wouldn't particularly want to implement my own MT based on the wikipedia description even in a normal programming language as it's just a bunch of fiddly math stuff that I don't have any particularly solid grasp on, so my advice for doing it in combinators would be "don't", but I'm just some internet person, you do what you want with your time

Your remark on the modulo bias was particularly on point, and that sound like a very reasonnable advice again x), However it happens that i already sometimes do "random stuff" to distract from the vacuity of existence ^^, i don't know much programming language and it already looked like a bunch of fiddly math stuff that i didn't have any particular grasp on when trying to make the LFSR but it worked ! So i can't guarantee i will follow the advice this time again maybe i will just fail and it will look like i didn't try x).

That other time i used one such LFSR was to animate critter sprite in javascript (random stuff), but then i seeded it with the clock of the computer, so i could just have used math.random, i enjoyed the extra detour nonetheless and found that made me able to make one in factorio later because i remembered roughly the principles i knew i could use it to make music, even though they both had the modulo bias, which i would have never known if not reading about the Mersenne Twister there , so maybe .... x)

torne wrote: Thu May 29, 2025 9:32 pm A LFSR/LCG using a reasonable (i.e. chosen by someone who did understand the math) set of parameters from the internet is a decent PRNG for just having a seedable (and thus repeatable) random sequence for a game or simple simulation where you only need one sequence of numbers, aren't trying to prevent anyone from being able to reverse-engineer the sequence, and aren't going to need billions of numbers - they're simple and fast and the output is good enough.

That's what i would use the wikipage for, choosing those parameters in a reasonnable way ! Then i added 1 or 2 to some parameters to see if things would break and how, and since it didn't result in terrible failure immediatly, nor that i could see later when using it, i just went for it x). I believe now after reading more about the recurence relation that it can cause a reduction of the period. ( and the modulo bias is another problem ). So that's a couple thing to keep in mind before updating or using the blueprints x).

torne wrote: Thu May 29, 2025 9:32 pm Having multiple copies of the same PRNG initialized with different seeds is *not* always good for this, because even if the sequence generated by each PRNG is sufficiently random, there can still be unexpected correlations between the Nth value of each sequence (or other more complicated relationships). Some PRNGs can do this safely as long as you vary the correct parameters between instances, some can't, and if you can't find an authoritative answer then it probably can't. Incidentally if you *are* interested in things like procedural generation then I really like https://unity.com/blog/engine-platform/ ... om-numbers as a walkthrough of some of the problems that can come up using PRNGs for this and some suggestions on what you can do instead (though I definitely wouldn't want to implement any of those in combinators either!)

I hope this was an interesting discussion even if it drifted off the original topic a bit, but I'm probably out of useful things to add now

Thanks for the complement of information and the article, that is very interesting, it helps making sense of why something like the mersenne twister is necessary, what it adds compared to just an array of duplicate prng, why it's necessary to use what chatgpt call "the twist before the big shuffle of the numbers on the ring" when i asked to explain like i'm a 12 years old. It less infantilizing too x) !

mmmPI · Post by **mmmPI** » Sat May 31, 2025 6:25 am

Nidan wrote: Tue May 27, 2025 1:56 am Regarding singed vs. unsigned, for most operations the arithmetic combinator supports you don't have to do anything special (That's why two's complement is nowadays the only relevant representation for negative integers.), only division, modulus and right shift(*) are different. Out of these, only right shift is used by MT, and it looks like it is only right shifting by constants. Within factorio you can correct the result by adding an and afterwards.

*) keywords: logic (unsigned) and arithmetic (signed) right shift

I gave it another try, not that long really, but to further gauge the difficulty, this time while in game and i have to say, i don't know how to handle just the very beginning because amongst the parameter required for implementation of MT19937 , from the wiki page it says : #define f 1812433253UL

It's used when wishing to create the initial 624 value from the seed, in a loop where i increment that look like this : seed = f * (seed ^ (seed >> (w-2))) + i

So while it is used in an operation, i have no idea what i'm supposed to do. The initial number 1812433253UL being around 100G it's 40 bit.

Let's pretend the operation is seed = 1812433253UL * seed, i have no idea , how to implement that with signed 32 bit integers. Or rather several ideas which i can't choose from and i think are equally likely to be wrong, as i could just truncate the number, or use 2 different signals to represent it, or use the large number in a constant, and let the game do the modulo operation.

Have i not understood what you proposed doing ? Or would you propose a different thing given that particular number ?

Nidan · Post by **Nidan** » Sat May 31, 2025 10:30 am

mmmPI wrote: Sat May 31, 2025 6:25 am
Nidan wrote: Tue May 27, 2025 1:56 am Regarding singed vs. unsigned, for most operations the arithmetic combinator supports you don't have to do anything special (That's why two's complement is nowadays the only relevant representation for negative integers.), only division, modulus and right shift(*) are different. Out of these, only right shift is used by MT, and it looks like it is only right shifting by constants. Within factorio you can correct the result by adding an and afterwards.

*) keywords: logic (unsigned) and arithmetic (signed) right shift
I gave it another try, not that long really, but to further gauge the difficulty, this time while in game and i have to say, i don't know how to handle just the very beginning because amongst the parameter required for implementation of MT19937 , from the wiki page it says : #define f 1812433253UL

It's used when wishing to create the initial 624 value from the seed, in a loop where i increment that look like this : seed = f * (seed ^ (seed >> (w-2))) + i

To handle the defines, I'd replace when everywhere and then simplify as much as possible. Using "seed = f * (seed ^ (seed >> (w-2))) + i" as an example:
seed = 1812433253UL * (seed ^ (seed >> (32-2))) + i
seed = 1812433253UL * (seed ^ (seed >> 30)) + i
Ok, that didn't simplify that much… but it is easily representable in factorio: 3 combinators (+ 4? for delays), 3 ticks
Now, the overall construct is a loop. Translating loops to factorio, or in general, from programming to hardware design, can be challenging. (You might recall that I nope'd out of one for my 1.x divider blueprint.) Two methods come to mind: "unrolling" the loop, i.e. repeating the circuitry once for each loop iteration; or implement something that allows you to store the values for state_array one value at a time (e.g. steal the RAM from here). There may be other methods, but I'm a computer scientist, not an electrical engineer, and especially not one specialized in logic circuit or FPGA design.

So while it is used in an operation, i have no idea what i'm supposed to do. The initial number 1812433253UL being around 100G it's 40 bit.

You miscounted, it's 1'812'433'253UL or about 1.8G. The true/mathematical result of the multiplication will very likely exceed 32bit, but cutting off the overflow / wrapping around is intentional here.

Have i not understood what you proposed doing ? Or would you propose a different thing given that particular number ?

I didn't notice any operations that aren't 32bit, so I gave them no thought. In general, you have to figure out how to split the operations into manageable pieces. Look for example at the unsigned long multipliers from my 1.x pairwise arithmetic thread.

mmmPI · Post by **mmmPI** » Sat May 31, 2025 11:57 am

Nidan wrote: Sat May 31, 2025 10:30 am seed = 1812433253UL * (seed ^ (seed >> 30)) + i
Ok, that didn't simplify that much… but it is easily representable in factorio: 3 combinators (+ 4? for delays), 3 ticks
Now, the overall construct is a loop. Translating loops to factorio, or in general, from programming to hardware design, can be challenging. (You might recall that I nope'd out of one for my 1.x divider blueprint.) Two methods come to mind: "unrolling" the loop, i.e. repeating the circuitry once for each loop iteration; or implement something that allows you to store the values for state_array one value at a time (e.g. steal the RAM from here). There may be other methods, but I'm a computer scientist, not an electrical engineer, and especially not one specialized in logic circuit or FPGA design.

Thanks for the help !

To me the idea was to have something like 624 memory cell, to keep track of the 624 value, and at first it's required to initialize them with a loop that would run once, from the seed, to create 624 almost similar copies using the operation mentionned. ( potentially manually turnin on-off something to reset and re-init).

Then it would be necessary to use 2 of those 624 value ( with rules on which), mask the first half, from one, and the second half from the other, (using values that look like 0000000011111111111 and 111111100000000 (written in hex in the code) and combine the 2 halves to make a 3 rd number.

Then there is a "tempering matrice" applied, to this 3rd number, this i recognized as the 3 operations similar to the LSBR
This part :
y = x ^ (x >> u);
y = y ^ ((y << s) & b);
y = y ^ ((y << t) & c);
z = y ^ (y >> l);
Which i felt """confident""" doing in a "loop". Because to turn the last "z" into the first "y" would be what i did already for the LSBR somehow. The MT return "z" as the random number for user.

And 4rth, it is necesssary to "re-order" the 624 values in the memory cells, before choosing 2 of those 624 values again. Although the code from that part comes before in the wikipage it seem more logical to put it at end when laying out the "logic" i think i understood from asking several times in different ways to different AIs to explain to me the code and the principles of the MT in general to make sure i got it right.

If that is "correct" i "think" i can make one in the game, or at least i have a strategy to cut this problem in smaller part and try, it sound complicated but i have done contraption that function similarly, storing number in combinator, doing operation on them, and putting them back in other memory cell for making music

. It ressemble the RAM system you linked which is most likely already done in a very efficient way, but it was made using pre 2.0 circuits, i will allow myself the new potentially easier ways to do things !

Nidan wrote: Sat May 31, 2025 10:30 am You miscounted, it's 1'812'433'253UL or about 1.8G. The true/mathematical result of the multiplication will very likely exceed 32bit, but cutting off the overflow / wrapping around is intentional here.

No ! I didn't ! My error was different actually, almost as bad x), I thought it was a number in hexadecimal , and i put it in a converter , because the list of the constant look like this :

#define n 624
#define m 397
#define w 32
#define r 31
#define UMASK (0xffffffffUL << r)
#define LMASK (0xffffffffUL >> (w-r))
#define a 0x9908b0dfUL
[...]
#define b 0x9d2c5680UL
#define c 0xefc60000UL
#define f 1812433253UL

There are already values that gives "more" than what you can represent with signed 32 bits when converted in binary , like 0x9d2c5680UL is 2'636'928'640 or 2.6G which i assumed could be represented as a negative number, the game already turns it to -1.6G something, thought to do the same for the "decimal form" of 1812433253UL ..... ahhh i'm no electrical engineer nor computer scientist, you can probably tell

.

Nidan wrote: Sat May 31, 2025 10:30 am I didn't notice any operations that aren't 32bit, so I gave them no thought. In general, you have to figure out how to split the operations into manageable pieces. Look for example at the unsigned long multipliers from my 1.x pairwise arithmetic thread.

That's pretty much where i stopped last time i was using hexadecimal ... I kinda remember this, it is inspiring

! Thanks for the help, now i feel kinda commited or that i have to let everyone know if i stumble accross another insurmountable obstacle x), maybe i'll just fail an humble down in silence but at least that grinded a fear gears in my mind already

Nidan · Post by **Nidan** » Sat May 31, 2025 1:08 pm

mmmPI wrote: Sat May 31, 2025 11:57 am Then there is a "tempering matrice" applied, to this 3rd number, this i recognized as the 3 operations similar to the LSBR
This part :
y = x ^ (x >> u);
y = y ^ ((y << s) & b);
y = y ^ ((y << t) & c);
z = y ^ (y >> l);
Which i felt """confident""" doing in a "loop". Because to turn the last "z" into the first "y" would be what i did already for the LSBR somehow. The MT return "z" as the random number for user.

That's just breaking up the formula into steps and removing repetition as y is used twice in each line. The final result is z ("return z;"), we don't really need y, is just a variable getting reused to store intermediate results. This is not a loop. When calling the whole function repeatedly you could view the whole thing as a loop, but "turn[ing] the last "z" into the first "y"" wouldn't be necessary, the part that's preserved across multiple calls is "state" (or "state_array" and "state_index").

Code: Select all

uint32_t aaa = x ^ (x >> u);
uint32_t bbb = aaa ^ ((aaa << s) & b);
uint32_t ccc = bbb ^ ((bbb << t) & c);
uint32_t z = ccc ^ (ccc >> l);

would work the same, or

Code: Select all

uint32_t z = ((x ^ (x >> u) ^ (((x ^ (x >> u)) << s) & b)) ^ (((x ^ (x >> u) ^ (((x ^ (x >> u)) << s) & b)) << t) & c)) ^ (((x ^ (x >> u) ^ (((x ^ (x >> u)) << s) & b)) ^ (((x ^ (x >> u) ^ (((x ^ (x >> u)) << s) & b)) << t) & c)) >> l);

but that's a bit unwieldy.

mmmPI · Post by **mmmPI** » Sat May 31, 2025 3:21 pm

Nidan wrote: Sat May 31, 2025 1:08 pm That's just breaking up the formula into steps and removing repetition as y is used twice in each line. The final result is z ("return z;"), we don't really need y, is just a variable getting reused to store intermediate results. This is not a loop. When calling the whole function repeatedly you could view the whole thing as a loop, but "turn[ing] the last "z" into the first "y"" wouldn't be necessary, the part that's preserved across multiple calls is "state" (or "state_array" and "state_index").

I see what you mean, how the part you mention is different than a "for loop" where you don't know how much time it will take before you get the intermediate result and you need it before running the rest of the algo. That's probably why i feel confident that part can be "done it in a loop" and by that i meant with no time delay regarding the execution of the MT, so that i would produce an output every tick. which i believed was what you meant by :

Nidan wrote: Sat May 31, 2025 10:30 am Now, the overall construct is a loop. Translating loops to factorio, or in general, from programming to hardware design, can be challenging

But now i'm reconsidering. Have i missed a difficulty ? What else would be a loop otherwise ?

Nidan wrote: Sat May 31, 2025 10:30 am Two methods come to mind: "unrolling" the loop, i.e. repeating the circuitry once for each loop iteration; or implement something that allows you to store the values for state_array one value at a time

To me it appears that using 624 or so different unique memory states combinators gets rid of the problem of the uncertainty in time of execution for "any" part of the MT algo in a way that "makes sure" the resulting contraption can produce an output "every tick", but this is not necessary to handle the "tempering matrice part" the step you detailed, i agree. To me this part is similar with the LSBR, the part that has been "generalized" and "Twisted" for the MT. That part when used repeatdly is a "simple" constitute a "loop" to me, but this part needs to happen inside another "loop" which is the MT.

Is this the "repeating-the-circuitry" you mean ? using 624 memory cells ? Or is there something else i've missed that would add delay, and potentially requiring another "solution" ? like another "for loop" in the code that would need even more duplication of circuitry ?

Factorio Forums

Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator

Re: Random Number on Selector Combinator