[1.1.46]Non-blocking-save leaks memory
Very happy to be back on Linux playing this game after a dark cloud of Windows gaming. The amazing not-windows-only feature of Non-Blocking saves is now available to me! My factory has grown rather large, over 1k spm, with several mods including Swarmageddon. (Save file is 70MB)
So I imagine the game-state in memory is quite the block, and the simulation even freezes for maybe 1/4 second while it begins the save file. I assume it duplicates the memory pages, continues the game on one fork, and executes the save function on the other, now frozen, fork.
It seems a very large quantity of memory is leaked after this procedure.
To reproduce, all you have to do is play a very large factory with autosave on 5 minutes or less, for several hours. Open top and notice the bloat. It goes from 1 GB resident size to nearly 8 GB, eating a lot of virtual memory... I used to run this game on even bigger factories with NO virtual memory. Had to turn on a page file because of this. (And other programs leaking memory like mad fools. I suspect this bug might extend into the operating system's ability to free memory)
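For readers unfamiliar with the mechanism guessed at above (fork the process, let the child write the frozen state while the parent keeps simulating), here is a minimal sketch of the general fork-and-save technique on POSIX systems. It is only an illustration and not Factorio's actual implementation; `save_snapshot_nonblocking`, `demo` and the JSON save format are made up for this example.

```python
import json
import os
import tempfile


def save_snapshot_nonblocking(state, path):
    """Fork; the child writes `state` to `path` and exits. Returns the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: it sees a frozen copy-on-write view of the parent's memory.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(state, f)
            status = 0
        finally:
            # Exit immediately without running the parent's cleanup handlers.
            os._exit(status)
    return pid


def demo():
    # Hypothetical game state, standing in for the real simulation.
    state = {"tick": 100, "entities": [1, 2, 3]}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "save.json")
        pid = save_snapshot_nonblocking(state, path)
        # Parent: keeps "simulating" while the child is saving.
        state["tick"] += 1
        state["entities"].append(4)
        # Reap the child so it does not linger as a zombie.
        _, wait_status = os.waitpid(pid, 0)
        with open(path) as f:
            saved = json.load(f)
    exit_code = os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
    return state, saved, exit_code
```

The file always contains the state as it was at the moment of the fork, whatever the parent does afterwards. Because the child exits as soon as the file is written, everything it allocated is returned to the kernel at exit; the parent only pays for the pages it modifies while the child is still alive.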
Re: [1.1.46]Non-blocking-save leaks memory
See known issues.
Re: [1.1.46]Non-blocking-save leaks memory
gallomimia wrote: ↑Fri Nov 19, 2021 11:09 pm
> Very happy to be back on linux [...] Had to turn on a page file because of this. (And other programs leaking memory like mad fools. I suspect this bug might extend into the operating system's ability to free memory)
ugh it's just that you don't get how memory management works. you were running without a "page file"? on Linux, it's known as swap, and it's mandatory to have some so the kernel can work correctly.
source: Linux kernel dev for a decade.
Re: [1.1.46]Non-blocking-save leaks memory
ptx0 wrote: ↑Sat Nov 20, 2021 3:22 pm
> gallomimia wrote: ↑Fri Nov 19, 2021 11:09 pm
> > Very happy to be back on linux [...] Had to turn on a page file because of this. (And other programs leaking memory like mad fools. I suspect this bug might extend into the operating system's ability to free memory)
> ugh it's just that you don't get how memory management works. you were running without a "page file"? on Linux, it's known as swap, and it's mandatory to have some so the kernel can work correctly.
> source: Linux kernel dev for a decade.
The requirement to have twice the ram as swap is a total myth and linux works perfectly without any swap. Linux (and many other unixes) has been used on disk-less systems using netboot that have no place to put swap at all for decades. Using swap over the network is possible but actually causes more problems than it solves, as there are too many ways for it to deadlock. So I'm not sure what kernel dev you are talking about; they should know better.
PS: There are features that require swap, like suspend to swap, for obvious reasons. But nothing the kernel needs.
Re: [1.1.46]Non-blocking-save leaks memory
mrvn wrote: ↑Sat Nov 20, 2021 4:37 pm
> > source: Linux kernel dev for a decade.
> The requirement to have twice the ram as swap is a total myth and linux works perfectly without any swap. Linux (and many other unixes) has been used on disk-less systems using netboot that have no place to put swap at all for decades.
I'm the kernel developer. where'd I say you need twice the RAM? putting a bunch of words into my mouth. in fact, I'm just going to forever ignore you on this forum because of the amount of noise you contribute. thanks.
edit: to add more info for anyone who actually wants to know why running without swap "works", it's because you're not running mission-critical systems and you've taken stock of the downsides of not having swap (e.g. behaving poorly under memory pressure) and decided that you don't require support from anyone else, because you know better than the Best Practices. this is similar to any precaution anyone takes - like having verified backups or wearing a seatbelt.
> eating a lot of virtual memory
good thing it's only eating up virtual memory.
Re: [1.1.46]Non-blocking-save leaks memory
ptx0 wrote: ↑Sat Nov 20, 2021 4:48 pm
> mrvn wrote: ↑Sat Nov 20, 2021 4:37 pm
> > > source: Linux kernel dev for a decade.
> > The requirement to have twice the ram as swap is a total myth and linux works perfectly without any swap. Linux (and many other unixes) has been used on disk-less systems using netboot that have no place to put swap at all for decades.
> I'm the kernel developer. where'd I say you need twice the RAM? putting a bunch of words into my mouth. in fact, I'm just going to forever ignore you on this forum because of the amount of noise you contribute. thanks.
Sorry, didn't mean to imply that you said anything about twice the ram. It's just the common myth that is repeated whenever the discussion comes to "How much swap do I need?".
ptx0 wrote: ↑Sat Nov 20, 2021 4:48 pm
> edit: to add more info for anyone who actually wants to know why running without swap "works", it's because you're not running mission-critical systems and you've taken stock of the downsides of not having swap (e.g. behaving poorly under memory pressure) and decided that you don't require support from anyone else, because you know better than the Best Practices. this is similar to any precaution anyone takes - like having verified backups or wearing a seatbelt.
The better comparison would be driving around with a spare tire. It's good to have one. But driving without a spare tire does not make the car drive any worse, nor does it endanger your life.
Anyway, the presence or lack of swap in Linux has nothing to do with Factorio leaking memory. That's all on Factorio.
Re: [1.1.46]Non-blocking-save leaks memory
Does the issue go away if you use blocking save?
I tried reproducing the problem with vanilla factorio, and it doesn't happen. Non-blocking save at 1min intervals, game.speed=100. It constantly saves, but no memory is leaked.
If you want, you can share your save and we can help investigate.
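One way to put numbers on "no memory is leaked" is to log the game's resident set size after every autosave and look at the trend. The sketch below reads the memory fields the Linux kernel exposes in /proc/<pid>/status; the helper names (`parse_proc_status`, `read_memory_kb`, `growth_per_sample`) are made up for this example, and the status text used further down is invented sample data, not a measurement.

```python
def parse_proc_status(text):
    """Extract VmSize, VmRSS and VmSwap (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS", "VmSwap"):
            # Lines look like "VmRSS:\t 1048576 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields


def read_memory_kb(pid):
    """Read the current memory fields of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())


def growth_per_sample(samples):
    """Average change of VmRSS in kB between consecutive samples."""
    rss = [s["VmRSS"] for s in samples]
    if len(rss) < 2:
        return 0.0
    return (rss[-1] - rss[0]) / (len(rss) - 1)


# Invented sample data, only to show the expected input format.
SAMPLE_STATUS = "Name:\tfactorio\nVmSize:\t 9000000 kB\nVmRSS:\t 1048576 kB\nVmSwap:\t       0 kB\n"
```

Calling `read_memory_kb(pid)` once per autosave and feeding the list to `growth_per_sample` gives a rough kB-per-save figure. VmRSS is what top shows as resident size; VmSize is the virtual size, which can be large without any physical memory being used.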
Re: [1.1.46]Non-blocking-save leaks memory
Hmmm. I don't find anything related. Can you provide a direct pointer?
Author of: Factorio Blueprint Decoder
Re: [1.1.46]Non-blocking-save leaks memory
I tried running Factorio under valgrind. Even with a very small factory I only manage 10 UPS. I ran it with and without non-blocking saving, with a 1 minute autosave interval, doing 1 save each time. These are the summaries I got:
blocking saving
==28055== HEAP SUMMARY:
==28055== in use at exit: 6,909,950 bytes in 3,454 blocks
==28055== total heap usage: 4,592,592 allocs, 4,589,138 frees, 2,037,177,631 bytes allocated
==28055==
==28055== LEAK SUMMARY:
==28055== definitely lost: 186,009 bytes in 236 blocks
==28055== indirectly lost: 6,219,842 bytes in 617 blocks
==28055== possibly lost: 32,768 bytes in 1 blocks
==28055== still reachable: 471,331 bytes in 2,600 blocks
==28055== suppressed: 0 bytes in 0 blocks
==28055== Rerun with --leak-check=full to see details of leaked memory
==28055==
==28055== For counts of detected and suppressed errors, rerun with: -v
==28055== Use --track-origins=yes to see where uninitialised values come from
==28055== ERROR SUMMARY: 656400 errors from 19 contexts (suppressed: 2 from 2)
async saving
==1965== HEAP SUMMARY:
==1965== in use at exit: 392,750,652 bytes in 909,016 blocks
==1965== total heap usage: 4,182,156 allocs, 3,273,140 frees, 2,023,279,175 bytes allocated
==1965==
==1965== LEAK SUMMARY:
==1965== definitely lost: 306,241 bytes in 2,221 blocks
==1965== indirectly lost: 5,650,947 bytes in 799 blocks
==1965== possibly lost: 4,765,235 bytes in 13,971 blocks
==1965== still reachable: 382,028,229 bytes in 892,025 blocks
==1965== of which reachable via heuristic:
==1965== newarray : 690,976 bytes in 2,061 blocks
==1965== multipleinheritance: 79,368 bytes in 316 blocks
==1965== suppressed: 0 bytes in 0 blocks
==1965== Rerun with --leak-check=full to see details of leaked memory
==1965==
==1965== For counts of detected and suppressed errors, rerun with: -v
==1965== Use --track-origins=yes to see where uninitialised values come from
==1965== ERROR SUMMARY: 649112 errors from 20 contexts (suppressed: 2 from 2)
That's the forked process that does the saving and then exits. The kernel will have freed anything leaked in there.
==32243== HEAP SUMMARY:
==32243== in use at exit: 6,909,494 bytes in 3,450 blocks
==32243== total heap usage: 4,508,162 allocs, 4,504,712 frees, 1,929,692,502 bytes allocated
==32243==
==32243== LEAK SUMMARY:
==32243== definitely lost: 184,753 bytes in 235 blocks
==32243== indirectly lost: 6,158,803 bytes in 611 blocks
==32243== possibly lost: 95,063 bytes in 8 blocks
==32243== still reachable: 470,875 bytes in 2,596 blocks
==32243== suppressed: 0 bytes in 0 blocks
==32243== Rerun with --leak-check=full to see details of leaked memory
==32243==
==32243== For counts of detected and suppressed errors, rerun with: -v
==32243== Use --track-origins=yes to see where uninitialised values come from
==32243== ERROR SUMMARY: 188953 errors from 22 contexts (suppressed: 2 from 2)
The leak summaries for the two main processes are close enough to call them identical. So there is no big fat "memory leaks HERE" arrow. The "possibly lost" rises from 32,768 bytes to 95,063 bytes, so that might include the leak. One would have to run valgrind with --leak-check=full, a larger save game and probably a number of autosaves to find any pattern in that mess.
I don't expect to have 0 leaks in any sizeable application anymore, modern code quality plain sucks, but the above is a lot. There are also a lot of invalid memory accesses, most of them in the saving code it seems (the main process's error count drops by about 2/3 with non-blocking save, from 656,400 to 188,953). If Wube can reduce those 235 definitely lost blocks, then maybe it would become obvious where the leak in non-blocking saves is.
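Comparing the summaries by eye is error-prone, so here is a small sketch that parses the LEAK SUMMARY lines and prints the per-category difference in bytes. The two inputs are copied from the main-process summaries above (blocking run 28055 and async run 32243); `parse_leak_summary` and `diff_bytes` are names made up for this example, and the regex only assumes the line layout seen in these logs.

```python
import re

# Matches e.g. "==28055== definitely lost: 186,009 bytes in 236 blocks"
LEAK_LINE = re.compile(
    r"(definitely lost|indirectly lost|possibly lost|still reachable|suppressed):"
    r"\s*([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(log_text):
    """Return {category: (bytes, blocks)} for each LEAK SUMMARY line found."""
    summary = {}
    for kind, nbytes, nblocks in LEAK_LINE.findall(log_text):
        summary[kind] = (int(nbytes.replace(",", "")), int(nblocks.replace(",", "")))
    return summary


def diff_bytes(before, after):
    """Byte difference (after - before) for every category present in both runs."""
    return {k: after[k][0] - before[k][0] for k in before if k in after}


BLOCKING_MAIN = """\
==28055== definitely lost: 186,009 bytes in 236 blocks
==28055== indirectly lost: 6,219,842 bytes in 617 blocks
==28055== possibly lost: 32,768 bytes in 1 blocks
==28055== still reachable: 471,331 bytes in 2,600 blocks
==28055== suppressed: 0 bytes in 0 blocks
"""

ASYNC_MAIN = """\
==32243== definitely lost: 184,753 bytes in 235 blocks
==32243== indirectly lost: 6,158,803 bytes in 611 blocks
==32243== possibly lost: 95,063 bytes in 8 blocks
==32243== still reachable: 470,875 bytes in 2,596 blocks
==32243== suppressed: 0 bytes in 0 blocks
"""

if __name__ == "__main__":
    delta = diff_bytes(parse_leak_summary(BLOCKING_MAIN), parse_leak_summary(ASYNC_MAIN))
    for kind, change in delta.items():
        print(f"{kind}: {change:+,} bytes")
```

For these two runs the differences are -1,256 (definitely lost), -61,039 (indirectly lost), +62,295 (possibly lost) and -456 (still reachable). The rise in "possibly lost" is nearly matched by the drop in "indirectly lost", so it may just be the same blocks classified differently rather than new memory, though a single save is too little to tell.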
Re: [1.1.46]Non-blocking-save leaks memory
Known Issues Thread wrote:
> Various issues such as freezing, long saving times or high memory usage when using experimental "Non-blocking saving". Please disable this feature before reporting a bug related to saving.
As said, we can't reproduce. Maybe share a save.
Re: [1.1.46]Non-blocking-save leaks memory
I know of no memory leaks in the game anywhere so I'd be interested in the full valgrind output if you can produce it.
EDIT: that's not true... I know of one in the standard library locale logic that has been reported regularly but there's nothing we can do about it and it leaks something like 40 bytes for the lifetime of the process and it's windows-only so valgrind would not see it.
If you want to get ahold of me I'm almost always on Discord.
Re: [1.1.46]Non-blocking-save leaks memory
We have valgrind running nightly and I just checked the latest run. It looks like basically every single thing it thinks is "leaked" is simply wrong. An example: a std::string stored in a static variable. Another example: a heap-allocated object immediately put into a std::unique_ptr: "leaked" according to valgrind.
So I would be interested to see what it reports for you. It seems at least something is quite wrong with the version we are running thinking random stuff is leaked when I can say for 100% certain that they aren't being leaked.
Re: [1.1.46]Non-blocking-save leaks memory
Looks like you reimplement your own malloc and that accesses memory outside the allocated blocks. You probably need to define exceptions for valgrind for this if it's metadata for the memory handling.
The memory leaks seem to be in SDL (just a touch) and your PNG loader. Attached is the valgrind output from just starting Factorio and quitting right away. Not even loading a game at all.
PS: I tried running a save game as a benchmark for a bit and that gave me this:
==15128== HEAP SUMMARY:
==15128== in use at exit: 157,513 bytes in 32 blocks
==15128== total heap usage: 2,516,887 allocs, 2,516,855 frees, 536,868,140 bytes allocated
==15128==
==15128== LEAK SUMMARY:
==15128== definitely lost: 0 bytes in 0 blocks
==15128== indirectly lost: 0 bytes in 0 blocks
==15128== possibly lost: 0 bytes in 0 blocks
==15128== still reachable: 157,513 bytes in 32 blocks
==15128== suppressed: 0 bytes in 0 blocks
That's how I like my applications.
I can't reproduce the leaks on save the original post described. Might be my map is too small, has the wrong entities or it only happens with modded entities.
Attachment: valgrind.log (40.07 KiB)
Re: [1.1.46]Non-blocking-save leaks memory
But we didn't/don't. We just call "new" (or std::make_unique in most cases) and the standard library implementation gets used (malloc by default).
Re: [1.1.46]Non-blocking-save leaks memory
Ah... SDL isn't our code and the PNG loader for linux is not used on Windows. Windows uses GDI+.
I did find and fix 1 memory leak with SDL in the past, so it's not too surprising that there are others. It is C after all and there are no destructors, so it's all up to whoever wrote the code to make sure everything malloc-ed is free-d, and it seems that isn't the case.
Re: [1.1.46]Non-blocking-save leaks memory
If you didn't define your own malloc stuff then it might be real invalid reads and out of range memory accesses. Something to look into.
Re: [1.1.46]Non-blocking-save leaks memory
The PNG loader one I fixed in this latest release (I got it compiling on Windows and tested it + fixed it). It just didn't get mentioned in the changelog. It was not an ongoing leak; it just didn't release memory it allocated during startup, so the whole process would use more memory than required at runtime.