[1.1.42] Lamp color is wrong when the sum of red+green circuit wires overflows
[1.1.42] Lamp color is wrong when the sum of red+green circuit wires overflows
One constant combinator supplies 2 billion red. The other supplies 2 billion green and negative 2 billion blue. When passed through both red and green wires, the sum of those wires overflows at the input. This isn’t normally a concern, as demonstrated by letting an arithmetic combinator do the summing. However, the lamp itself behaves erratically when the overflow occurs at its input. How does overflowed/now-negative red become yellow? How does a lamp radiate black?
I do like the black lamp. It would be nice if you could set that color by normal means, and as the off state for improved contrast on displays that don’t need a combinator per pixel.
Edit again: confirmed in 1.1.42.
- Attachments
-
- factorio-current.log
- (5.59 KiB) Downloaded 147 times
Last edited by quale on Wed Oct 13, 2021 7:18 pm, edited 2 times in total.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
Hmm... when I import and place your blueprint, I get two white lamps (left) and two blue lamps (right), which seems to be the expected result: the highest priority of the colours that are positive, after summing the wires, determines the colour.
The only obvious difference is Factorio 1.1.39 (build 58937, mac, steam) (you) vs Factorio 1.1.39 (build 58937, win64, steam) (me)...
The only obvious difference is Factorio 1.1.39 (build 58937, mac, steam) (you) vs Factorio 1.1.39 (build 58937, win64, steam) (me)...
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
Then I smell undefined behavior. People forget that C and C++ don’t allow signed overflow among other things. It will often behave like the underlying hardware, until it comes to bite you. Maybe the Mac compiler took advantage in this case. A compiler flag like -fwrapv should stop that, if not a source code change to make it overflow in a defined way.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
That doesn’t matter. A compiler can do what it wants, and can have legitimate reasons to optimize. Examples include using wider types to store values, and eliminating conditions that are redundant under these assumptions. I can observe the latter on x86-64 in practice. It very much resembles this bug.
Code: Select all
bool redundant(int a, int b)
{
return a > 0 && b > 0 && a + b > 0;
}
redundant(int, int): # @redundant(int, int)
test edi, edi
setg cl
test esi, esi
setg al
and al, cl
ret
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
Sorry, I misread your previous post and missed the point. But AIUI, the optimisation in your example is valid because, given a > 0 and b > 0, "a + b > 0" is true and therefore can be optimised away in all cases where the behaviour is defined, ie where a + b does not overflow. "The usual arithmetic conversions" prohibit "using wider types to store values" (assuming you mean eg the result of a + b in your example being wider than int) but that is part of what makes the optimisation valid - it's important to specify when undefined behaviour occurs .quale wrote: ↑Tue Sep 14, 2021 4:13 pm That doesn’t matter. A compiler can do what it wants, and can have legitimate reasons to optimize. Examples include using wider types to store values, and eliminating conditions that are redundant under these assumptions. I can observe the latter on x86-64 in practice. It very much resembles this bug.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
I dug into the assembly because this was killing me. It was indeed as I suspected, in addition to being an unspeakable mess I’ll spare you. Distilled down to the bare minimum that should help you understand the bugged coloring:
An alpha value of -1 means the color is ignored, making it white.
Code: Select all
color_channels = {
{"red", {1, 0, 0, 1}},
{"green", {0, 1, 0, 1}},
{"blue", {0, 0, 1, 1}},
{"yellow", {1, 1, 0, 1}},
{"pink", {1, 0, 1, 1}},
{"cyan", {0, 1, 1, 1}},
{"white", {1, 1, 1, 1}}
};
result = {1, 1, 1, -1};
for (channel : color_channels) {
signal = red_network[channel.id] + green_network[channel.id];
// Using the flags of the add instruction, as if we have 33 result bits.
// Undefined behavior allows this to happen.
if (signal > 0) {
result.b = channel.color.b;
result.a = channel.color.a;
}
// Using the flags of a subsequent test instruction,
// based on the wrapped 32-bit result.
if (wrap(signal) > 0) {
result.r = channel.color.r;
result.g = channel.color.g;
break;
}
}
return result;
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
Given your "as if we have 33 result bits" comment, the flags state tested to effectively evaluate "signal > 0" is not obvious to me (at least). Care to elaborate, to satisfy my curiousity?quale wrote: ↑Wed Sep 15, 2021 7:20 amCode: Select all
... signal = red_network[channel.id] + green_network[channel.id]; // Using the flags of the add instruction, as if we have 33 result bits. // Undefined behavior allows this to happen. if (signal > 0) { ...
Not that I see reason to doubt there is a bug due to reliance on a particular outcome of undefined behaviour, as you suggested.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
It uses a JLE instruction, which jumps when ZF=1 or SF≠OF. Like all instructions named after inequalities, it uses a flag based on the carry bit. That’s very useful after a comparison (which is just a subtraction that doesn’t store its result). The optimization opportunity exists in the limited set of conditional instructions: there is no single instruction for > or ≤ that doesn’t use the carry bit. If you were subtracting, you could swap the operands and use SF for < or ≥ instead. But after an addition, all you can do is test the 32-bit result with an extra instruction, or check ZF and SF separately in two conditional instructions.
This is where the compiler decides it’s not having any of it, and invokes undefined behavior to save a whole instruction. Not that it helps when it goes split-brain on the two halves of the result, with the overhead of both. There’s more weird stuff accounting for way more than one instruction, while other sequences are conspicuously optimal. It must be hard to identify which optimization to apply at which time, and when to end. It’s honestly amazing that optimizers do a pretty good job in a reasonable amount of time.
This is where the compiler decides it’s not having any of it, and invokes undefined behavior to save a whole instruction. Not that it helps when it goes split-brain on the two halves of the result, with the overhead of both. There’s more weird stuff accounting for way more than one instruction, while other sequences are conspicuously optimal. It must be hard to identify which optimization to apply at which time, and when to end. It’s honestly amazing that optimizers do a pretty good job in a reasonable amount of time.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
An area where this can potentially make a huge difference is with stuff like loop iteration variables. A lot of existing code is written with a 32-bit int as the iteration variable, which is pretty straightforward to handle on a 32-bit target. But on a 64-bit target, if you can't evaluate the iteration condition to a known constant range, then you could face having to perform an extra truncation operation on the iteration variable after every increment to make sure it still fits in a 32-bit range. It makes the loop a lot harder to unroll and optimize efficiently. You can exploit the undefined behavior to ignore the truncation and generate efficient code for the normal case where the iteration variable is not expected to overflow. But in some bizarro cases where the programmer was expecting that to be defined and somehow relying on it, it can lead to surprises.
Re: [1.1.39] Lamp color is wrong when the sum of red+green circuit wires overflows
Anachrony wrote: ↑Mon Oct 25, 2021 10:38 pmAn area where this can potentially make a huge difference is with stuff like loop iteration variables. A lot of existing code is written with a 32-bit int as the iteration variable, which is pretty straightforward to handle on a 32-bit target. But on a 64-bit target, if you can't evaluate the iteration condition to a known constant range, then you could face having to perform an extra truncation operation on the iteration variable after every increment to make sure it still fits in a 32-bit range. It makes the loop a lot harder to unroll and optimize efficiently. You can exploit the undefined behavior to ignore the truncation and generate efficient code for the normal case where the iteration variable is not expected to overflow. But in some bizarro cases where the programmer was expecting that to be defined and somehow relying on it, it can lead to surprises.
Code: Select all
char c = 0;
while (c++ >= 0) { ... };
Code: Select all
1: b 1b
Re: [1.1.42] Lamp color is wrong when the sum of red+green circuit wires overflows
We made some changes, can anybody confirm if it is fixed in the latest 1.1 release?
Re: [1.1.42] Lamp color is wrong when the sum of red+green circuit wires overflows
It worked as expected on Windows with 1.1.39, and still does with 1.1.46. Need somebody to test on Mac OS.
(But from what quale described, this is the sort of bug you should know is fixed, otherwise it isn't, eg it may "randomly" resurface.)
Re: [1.1.42] Lamp color is wrong when the sum of red+green circuit wires overflows
I gave 1.1.46 a go on my Mac. It seems to use the full, pre-overflow result now. A cursory glance at the machine code suggests the same: addl followed by jg, without split-brain handling of the two result parts. It’s at least consistent with itself, but not with the usual circuit network behavior and the way it used to work (and still works according to SoShootMe) on other compilers. My blueprint is now a proper gaming build with RGB lights.
You can get the expected behavior by casting to unsigned or a wider signed type before the addition, and then casting back to 32-bit signed for the comparison. Conversion to unsigned and unsigned operations are defined to wrap by the language. Conversion to signed is implementation-defined in case of overflow: the implementation has to pick a definite result or raise a signal, which comes down to wrapping on contemporary systems. The undefinedness only happens during signed arithmetic, which is avoided here. Both Clang and GCC cut through it and generate efficient code. That would fix it without reducing optimization potential elsewhere, unlike boilerplate -fwrapv. I’m just worried that there is more code like this that’s broken by such optimizations, waiting to be discovered or to become a problem in the future.