
Allow one more rendering thread than available cores

Posted: Mon Oct 14, 2019 9:10 pm
by Jon8RFC
I tested in-game and it seemed to help, so I tried to reduce the variables by building a test in the map editor, and it still helped, increasing both FPS and UPS. Zoom all the way out in the attached save.

When setting core affinity to between 3 and 7 cores on Windows 10 with an i7700, each time that I set the rendering threads to be one more than that, there was a noticeable gain in FPS and UPS. Beyond one extra rendering thread, there wasn't a noticeable difference on my end.

Except when the physical/logical cores are already heavily utilized (such as allowing only two cores that are already maxed out), setting the rendering thread count to one more than the available cores (as set by the process affinity) improves FPS and UPS. This was reliably reproducible.

I can't test with more than 8 rendering threads on my i7700 since Factorio won't allow it even if I set it within the config. I also disabled hyperthreading in the BIOS, and setting core affinity to 3 and rendering threads to 4 yielded a performance increase there as well.

The map is set to run at a fast game speed so that UPS changes are more easily noticeable. Zoom all the way out in the attached save. It's worth a test, right?


EDIT...Info I should've added:
With hyperthreading enabled:
7 core affinity and 8 rendering threads noticeably outperformed 7 core affinity and 7 threads
8 core affinity and 8 rendering threads was marginally better than 7 core affinity and 8 threads

With hyperthreading disabled via BIOS:
3 core affinity and 4 rendering threads noticeably outperformed 3 core affinity and 3 rendering threads
4 core affinity and 4 rendering threads was marginally better than 3 core affinity and 4 rendering threads

And just out of curiosity (because some people may be on weak 2/4 core cpus), when the cores were heavily taxed:
2 core affinity and 2 rendering threads outperformed 2 core affinity and 3 rendering threads
2 core affinity and 4 rendering threads was worse than 3 rendering threads

When there's headroom on the cores in use, one extra rendering thread was beneficial to both FPS and UPS, but more than one extra rendering thread wasn't. Using Process Explorer (not good to run while truly comparing in-game FPS/UPS), I could see that total cpu usage did increase as rendering threads increased even though the total load was spread out, and FPS/UPS improved. The increase in cpu usage wasn't linear, and going beyond one additional rendering thread over the available cores didn't improve performance even though total cpu usage increased.
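To put the pattern above in one place, here is a rough sketch in Python. The function name, its parameters, and the idea of passing in a "saturated" flag are all made up for illustration; this is not how Factorio picks its thread count, just a summary of what was observed on this one machine.

```python
def suggested_render_threads(affinity_cores: int,
                             max_threads: int,
                             cores_saturated: bool) -> int:
    """Hypothetical helper summarizing the observations in this post.

    affinity_cores:  cores the process is allowed to run on
    max_threads:     highest rendering thread count the game accepts
                     (8 on the machine tested here; assumed input)
    cores_saturated: True if the allowed cores are already maxed out
    """
    if affinity_cores < 1 or max_threads < 1:
        raise ValueError("core and thread counts must be at least 1")
    if cores_saturated:
        # Heavily taxed cores: extra threads made things worse.
        return min(affinity_cores, max_threads)
    # Headroom available: exactly one extra thread helped, more did not.
    return min(affinity_cores + 1, max_threads)


if __name__ == "__main__":
    print(suggested_render_threads(7, 8, False))  # hyperthreading on
    print(suggested_render_threads(3, 4, False))  # hyperthreading off
    print(suggested_render_threads(2, 8, True))   # two maxed-out cores
```

The saturated/not-saturated decision is the hard part in practice and is left to the caller here.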

This would only be useful when there are lots of entities on the screen, as I originally noticed when someone accidentally dropped tens--or maybe hundreds--of thousands of items on the ground when downgrading many chests to wooden chests.

EDIT 2019-10-16:
Found an interesting discussion and this:
https://www.mcs.anl.gov/~huiweilu/pdf/ISC12-Lv.pdf

Which mentions the usefulness of Intel's memory management. No clue if AMD performs similarly, and maybe the Ryzen 3 has caught up since I often read how it's only as good as your RAM. The gist from what I skimmed is that "data-intensive" workloads, rather than raw computations, benefit from Intel's memory management (as of Nehalem, 2008) with more threads in use. Allowing more concurrent memory requests as well as simply "dividing the workload" is what's beneficial, I've read. So, the bump in UPS could be explained, but the fact that Factorio calls them "rendering threads" has me wondering if that's just a name and those threads aren't strictly related to rendering, but to data processing. I forget if it was the discussion, a different article in the discussion, or that whitepaper, but it was also specifically mentioned that if cores aren't fully taxed, then there's often a benefit to running more threads for data-intensive workloads, at least on Intel as of back then.

So that aligns with what I observed. I think it's worth a shot to let Factorio use more threads than OS-reported cores, especially after reading that article and how often I've heard Rseding91 say that it's primarily a memory-bound game rather than cpu-bound.

Re: Allow one more rendering thread than available cores

Posted: Wed Oct 16, 2019 4:09 am
by Honktown
It's worth mentioning that the point of hyperthreading is to use more processor time while certain code is waiting on memory or other things, so using more threads than cores is almost the same thing as leaving hyperthreading on if the OS is cycling through threads and the threads have many/big memory requests.

I would still say you often get a performance boost from setting threads to one more than the available processors, unless it's something very processing heavy. The context switching cost is low for one more thread, and the more threads you have, the more often one is waiting on something.
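A toy model of that reasoning, as a Python sketch. Everything in it is an assumption for illustration: the function name, the 80% "runnable" figure, and the simplification that context switching is free. It only shows why one extra thread can fill the gaps left by waiting threads while further threads add nothing once the cores are full.

```python
def busy_cores(cores: int, threads: int, runnable_fraction: float) -> float:
    """Estimate how many cores are kept busy on average.

    runnable_fraction is the assumed share of time a thread has work
    ready to run (the rest is spent waiting on memory or something else).
    Context switch cost is ignored, which is only reasonable for a small
    number of extra threads.
    """
    if cores < 1 or threads < 1:
        raise ValueError("cores and threads must be at least 1")
    if not 0.0 < runnable_fraction <= 1.0:
        raise ValueError("runnable_fraction must be in (0, 1]")
    # Demand is threads * runnable_fraction, but it cannot exceed the cores.
    return min(float(cores), threads * runnable_fraction)


if __name__ == "__main__":
    # Assumed example: 4 cores, threads runnable 80% of the time.
    for thread_count in (4, 5, 6):
        print(thread_count, busy_cores(4, thread_count, 0.8))
```

With these assumed numbers, going from 4 to 5 threads fills the idle time and going from 5 to 6 changes nothing, and a fully compute-bound workload (runnable fraction of 1.0) gains nothing from the extra thread at all.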

P.S. switching back and forth between non-hyperthreaded cores and hyperthreaded ones can be misleading. My first cpu was dual-core, and 4 threads would run at 80% speed per thread (25 seconds for each task) while using a single thread was 100% speed (20 seconds for a single task). Depending on where the bottleneck in a program is, you might see more performance from fewer threads. It's also task dependent. I sank my UPS to 36 once in a normal game when using a pollution mod (enemies were nesting in groups of like 200), and in the old wave defense it could sink to single digits :lol:
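The dual-core numbers above can be checked with a few lines of Python. The variable names are made up, and the sketch assumes the four 25-second tasks ran at the same time, which is how I read the description.

```python
# Figures quoted in the post above.
single_thread_task_seconds = 20.0  # one task alone, "100% speed"
four_thread_task_seconds = 25.0    # each task when 4 threads run together
threads = 4

# Per-thread speed relative to running alone.
per_thread_speed = single_thread_task_seconds / four_thread_task_seconds

# Time to finish 4 tasks one after another versus all at once (assumed).
sequential_seconds = threads * single_thread_task_seconds
concurrent_seconds = four_thread_task_seconds
overall_speedup = sequential_seconds / concurrent_seconds

print(f"per-thread speed: {per_thread_speed:.0%}")
print(f"4 tasks: {sequential_seconds:.0f} s sequential, "
      f"{concurrent_seconds:.0f} s concurrent, "
      f"{overall_speedup:.1f}x overall")
```

So each thread is slower, but total work finished per second still goes up under that assumption, which is why per-thread and whole-program numbers can point in different directions.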

Re: Allow one more rendering thread than available cores

Posted: Sat Oct 19, 2019 8:46 am
by ssilk
More threads is also a sign of slow memory.
So I would be very careful. The fact is that it runs faster on your computer. But there is no proof that this works for every configuration. :)