Re: [1.0.0] Non blocking save hangs

Posted: Fri Sep 25, 2020 6:17 pm
by sorahn
kovarex wrote:
Thu Sep 24, 2020 1:02 pm
We will fix it by removing the non-blocking save feature, it is just trouble that works only on linux and is not worth it.
That is super unfortunate to hear. I would wager that most of the dedicated servers running Factorio are on Linux, and as saves get bigger the process takes longer. When you add mods to the mix, you have the possibility of mods crashing the server, so you want to save frequently; but with blocking saves you spend as much time saving as you do playing.

So then if you start saving say once an hour instead, you risk a huge setback from a crash.

Certainly sad news if it gets pulled.

Is there anything we (the players) can do to help you guys to keep this feature?

Re: [1.0.0] Non blocking save hangs

Posted: Fri Oct 16, 2020 7:14 am
by ssilk
kovarex wrote:
Thu Sep 24, 2020 1:02 pm
We will fix it by removing the non-blocking save feature, it is just trouble that works only on linux and is not worth it.
Please don’t!

This feature is a relief for people who like to play with mega- and gigabases. In my current world, a game save takes nearly a minute. Totally unplayable without this.

It’s even a relief for those who play normal bases: the seconds of waiting for a save are an interruption in gameplay. How often have I lost a life because the game saved while I was in the middle of a biter nest?

It’s the other way around: Factorio needs this feature on Windows, too (it also works on Mac, not only Linux). :) It’s so important and has such high gameplay value!

Suggestions

viewtopic.php?f=6&t=84785 (Recommended)
viewtopic.php?f=6&t=61941
viewtopic.php?f=6&t=81156
viewtopic.php?f=6&t=56073

Re: [1.0.0] Non blocking save hangs

Posted: Fri Oct 16, 2020 2:30 pm
by Rseding91
ssilk wrote:
Fri Oct 16, 2020 7:14 am
It’s the other way around: Factorio needs this feature on Windows, too (it also works on Mac, not only Linux). :) It’s so important and has such high gameplay value!
It's not possible to implement on Windows. It's not a matter of "difficult" or "time consuming": it simply can not be done.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Fri Oct 16, 2020 8:19 pm
by sthalik
What about ZwCreateProcess?

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Fri Oct 16, 2020 8:25 pm
by Rseding91
sthalik wrote:
Fri Oct 16, 2020 8:19 pm
What about ZwCreateProcess?
Tried it; it doesn't work. Nothing works: the new process either just sits at 0% CPU, never executing code, or crashes immediately on trying to do anything.
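
For readers following the exchange, here is a minimal sketch of the fork-based snapshot technique being discussed. It assumes (this thread does not confirm it) that non-blocking save works by forking the game process on Linux/Mac, so that a child process writes out a frozen copy of the state while the parent keeps running. The names `save_in_background` and `demo`, and the JSON "save format", are purely illustrative and have nothing to do with Factorio's real code.

```python
import json
import os
import tempfile


def save_in_background(state, path):
    """Fork; the child writes a snapshot of `state` to `path` and exits.

    Returns the child's PID so the parent can reap it later.
    POSIX only: os.fork() does not exist on Windows.
    """
    pid = os.fork()
    if pid == 0:
        # Child: it has its own copy-on-write view of the parent's memory
        # as it was at the moment of the fork.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(state, f)
            status = 0
        finally:
            # Exit without ever returning into the parent's code path.
            os._exit(status)
    return pid


def demo():
    state = {"tick": 100}
    path = os.path.join(tempfile.mkdtemp(), "save.json")
    pid = save_in_background(state, path)
    # The parent keeps "simulating"; the child's snapshot is unaffected.
    state["tick"] = 101
    # Reap the child so it does not linger as a defunct (zombie) process.
    _, wait_status = os.waitpid(pid, 0)
    with open(path) as f:
        saved = json.load(f)
    return saved["tick"], state["tick"], os.WEXITSTATUS(wait_status)
```

The parent has to wait for the child at some point; a child that is never reaped shows up as a defunct process, and a child that never exits leaves the parent waiting indefinitely.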

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Mon Oct 19, 2020 4:57 pm
by Squelch
I can understand the Windows problem, but why should that dictate whether the feature is removed from 'nix OS's?

WSL (Windows Subsystem for Linux) and/or Docker allow us to run a headless server instance in the background as a workaround if not actually running on a Linux Xserver for example.

Please don't remove non-blocking saves?

PS. For the record, I have never encountered the stuck/defunct/zombie save process problem after many many hours. I do run mods, one major (Py suite) and a few smaller QoL. Should I ever run into the problem, I would be all over it to find the cause or a solid repro.

This problem does not seem that common at all, so removing non-blocking saves would be like throwing the proverbial baby out with the bathwater.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Mon Oct 19, 2020 7:52 pm
by Rseding91
Squelch wrote:
Mon Oct 19, 2020 4:57 pm
I can understand the Windows problem, but why should that dictate whether the feature is removed from 'nix OS's?

WSL (Windows Subsystem for Linux) and/or Docker allow us to run a headless server instance in the background as a workaround if not actually running on a Linux Xserver for example.

Please don't remove non-blocking saves?

PS. For the record, I have never encountered the stuck/defunct/zombie save process problem after many many hours. I do run mods, one major (Py suite) and a few smaller QoL. Should I ever run into the problem, I would be all over it to find the cause or a solid repro.

This problem does not seem that common at all, so removing non-blocking saves would be like throwing the proverbial baby out with the bathwater.
The problem is people reporting issues to us and taking developer time when there are no reproduction steps or ability for us to debug/fix the issues. It's just a time sink.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Mon Oct 19, 2020 8:45 pm
by Squelch
Rseding91 wrote:
Mon Oct 19, 2020 7:52 pm
The problem is people reporting issues to us and taking developer time when there are no reproduction steps or ability for us to debug/fix the issues. It's just a time sink.
Then please crowdsource the problem? The Factorio community, as a whole, are pretty competent at problem solving (it is the nature of the game after all).

The feature is clearly identified as "Experimental" and, as such, does not bring any guarantees or support. However, I would hazard a guess that there are many games out there that have the option enabled and have not encountered the same issues. This would suggest that something in the environment of the subset of systems that do encounter the problem could be identified by gathering more data. Do you have those metrics available to you, i.e. how many games have the option enabled, and how many crash reports are attributed to the feature?

I am more than happy to volunteer my time in attempting to identify and collate that information to come up with a reliable reproduction. Other areas of the game have already benefited from user investigation on behalf of, and for, the development team, so as a result, valuable development time investigating the problem can be spent elsewhere until such a time that a more complete picture is available.

Some current examples:
[Oxyd] [0.18.28] Stuck on waiting to save map - Directly pertinent to this issue.
Factorio flickers heavily all of the sudden

What I, and I hope some other users are asking, is to allow us to continue with this experimental feature for a while longer, and without the expectation of developer time spent on it until we can identify a solid reproduction or cause?

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Tue Oct 20, 2020 2:48 am
by ferromagus
Rseding91 wrote:
Mon Oct 19, 2020 7:52 pm
The problem is people reporting issues to us and taking developer time when there are no reproduction steps or ability for us to debug/fix the issues. It's just a time sink.
Now that I think about it, in my case it's probably just a collision between the game's auto-save mechanic and a systemd timer that periodically sends a /server-save command to the server (running in a screen session) right before taking a btrfs snapshot.

I also still have the coredump, which should help with debugging the problem and give insight into the state of the server when it was hanging. I would happily provide it, through Discord or email maybe, if desired. At least I was able to get the stack traces of the threads out of it with gdb. I don't want to be a burden to anyone; I just thought it might be insightful and useful feedback on an experimental feature, and thus offered to provide the coredump. The road to hell is paved with good intentions, I guess.
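
For anyone else wanting to pull thread stack traces out of a coredump, a minimal sketch follows. The binary and core paths are placeholders, not the actual paths from this report; `-batch`, `-ex` and `thread apply all bt` are standard gdb options and commands. The helper names are made up for illustration.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace for every thread."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, core]


def dump_backtraces(binary, core):
    """Run gdb non-interactively and return whatever it printed to stdout."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb can exit non-zero and still print useful traces
    )
    return result.stdout
```

Calling `dump_backtraces("path/to/factorio", "path/to/core")` (both paths hypothetical) returns text that can be pasted into a bug report instead of shipping the whole coredump.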

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Wed Oct 21, 2020 10:23 pm
by rafasc
Rseding91 wrote:
Mon Oct 19, 2020 7:52 pm
The problem is people reporting issues to us and taking developer time when there are no reproduction steps or ability for us to debug/fix the issues. It's just a time sink.
Isn't that true for all bugs?
I can understand if Wube as a company has decided that spending time on fixing this experimental, *nix-exclusive feature is not worth the cost; but please don't put the blame on your users.

You guys have gained an excellent reputation for caring about, and "sinking time" into, fixing esoteric bugs that the majority of people would never run into. The reasoning of "We are removing the experimental feature because users file bug reports about it" feels peculiar.

As for reproduction steps: start a multiplayer game using the Steam version with Steam Cloud and blueprint sync enabled.
My crashes went away once I disabled those features, and they come back when I re-enable them.

I can reproduce it, not deterministically, but I've never seen it take more than 15 save attempts to crash.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Sat Oct 24, 2020 7:27 am
by ssilk
Rseding91 wrote:
Fri Oct 16, 2020 8:25 pm
sthalik wrote:
Fri Oct 16, 2020 8:19 pm
What about ZwCreateProcess?
Tried it; it doesn't work. Nothing works: the new process either just sits at 0% CPU, never executing code, or crashes immediately on trying to do anything.
Reading between the lines, I sense that it’s scratching at your programmer’s honor. But nobody can be perfect in everything. 8-)

So I would go so far as to say: then Wube needs to hire a specialist. O.K., that is me going out on a limb. Sorry for that.

Because the point is: This feature is really a game-changer. When it is working. ;)

And if Wube is willing to invest in that implementation (and I think it could become very expensive), there are many things that could increase the chance of success:
- Explain the problem, for example in the FFF. It has at minimum a political and a technical aspect, and I would not mix them. Yes, there will be hundreds of posts, and everyone will know better how to implement this, but that’s part of the investment.
- As said, crowdsource the problem: make this option easier to turn on (but with a fat warning), and collect more logs when it is turned on. More errors mean a better chance of finding the problem. If the feature stays hidden, it will be used only by experienced people who are happy that it just works. ;)
- Actively ask for help (FFF).
- Actively search for people who have deep knowledge of this or have already developed it for other software.
- Search for examples where something like this is already working. For example, Reason, my beloved digital audio workstation, is able to do something similar (saving gigabytes of samples while working/playing with them). I’m sure there are many more examples.
- And there is surely more that can be done.

But even if Wube won’t invest in this: please don’t remove it. It’s much better to have it running with some bugs than not to have it at all. I’m normally saying the opposite, but this case is different!

Sorry for those who cannot use it, but it’s experimental.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Sat Oct 24, 2020 7:53 am
by ptx0
just ignore the bug and leave the async save feature intact, or hire one of the devs like myself who have submitted their resume and are competent and capable of fixing Linux issues.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Mon Nov 02, 2020 12:37 pm
by kovarex
Moving to pending, as there is no one able and willing to fix it at the moment.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Mon Nov 02, 2020 4:54 pm
by Squelch
kovarex wrote:
Mon Nov 02, 2020 12:37 pm
Moving to pending, as there is no one able and willing to fix it at the moment.
Thank you for giving it a stay of execution and not performing a coup de grâce.

I think identifying the issue properly would help immensely. I play on a native Linux client or server on a desktop machine, as well as on a Win10 client using a local server via WSL and Docker on my laptop, all with non-blocking saves enabled. I haven't encountered this problem at all after quite some time. This is all non-Steam and LAN only, however, which may be factors.

That said, there does seem to be a problem with some setups that is triggering these problems, so I am currently collating as many of the reports and suggested causes and trying to recreate the crashes as I'm able. So far nothing sticks out as a common denominator, but there must be one.

Re: [Oxyd] [Linux/Mac] non-blocking save crashes

Posted: Wed Nov 18, 2020 6:09 pm
by ptx0
kovarex wrote:
Mon Nov 02, 2020 12:37 pm
Moving to pending, as there is no one able and willing to fix it at the moment.
awaiting response from your PM.

Re: [1.0.0] Non blocking save hangs

Posted: Tue Nov 24, 2020 9:09 pm
by movax20h
kovarex wrote:
Thu Sep 24, 2020 1:02 pm
We will fix it by removing the non-blocking save feature, it is just trouble that works only on linux and is not worth it.
:(

Please no.

I love this feature, and it has worked fine for me for a very long time on my Linux machine. It really makes working with big bases and big saves far more pleasant. I autosave every 5 minutes, and a save takes about 20 seconds (I have a very fast machine), but on other people's machines it could take a minute. Non-blocking save really solves this issue. I save every 5 minutes not just to avoid losing progress, but also to keep snapshots of my Factorio maps (I have a script that automatically archives every autosave with a timestamp, so I have hundreds of autosaves now).

If anybody experiencing crashes with non-blocking save could share all the details (log, save, mods, and how often it happens), I can test it on my machine.
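
A minimal sketch of what such an archiving script might look like (this is not the poster's actual script). It assumes autosaves are zip files matching `_autosave*.zip` in the saves directory; the function name, the file pattern and the directory arguments are illustrative and should be adjusted to the real setup.

```python
import shutil
import time
from pathlib import Path


def archive_autosaves(saves_dir, archive_dir):
    """Copy each autosave into archive_dir under a timestamped name.

    The timestamp comes from the file's modification time, so running this
    repeatedly only copies autosaves that have changed since the last run.
    Returns the list of newly created archive paths.
    """
    saves_dir, archive_dir = Path(saves_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # Assumed naming pattern for autosave files; adjust if yours differ.
    for src in sorted(saves_dir.glob("_autosave*.zip")):
        mtime = time.localtime(src.stat().st_mtime)
        stamp = time.strftime("%Y%m%d-%H%M%S", mtime)
        dst = archive_dir / f"{src.stem}-{stamp}.zip"
        if not dst.exists():
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(dst)
    return copied
```

Run from cron or a systemd timer, this should be scheduled so that it does not copy a file while a save is still being written.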