Potentially easy chest-behavior performance optimization

Post your ideas and suggestions how to improve the game.

Moderator: ickputzdirwech

dragontamer5788
Fast Inserter
Fast Inserter
Posts: 154
Joined: Fri Jul 15, 2016 1:44 am
Contact:

Potentially easy chest-behavior performance optimization

Post by dragontamer5788 »

TL;DR
A discussion on UPS on Reddit suggested that inserters -> chests are a performance bottleneck in Megabases.
ichaleynbin wrote:The only thing I'd mention is that if you're working on that scale, one of the bigger UPS penalties isn't inserters, so much as inserting into chests, because each chest insertion has to iterate all of the currently present stacks. If your chests are doing anything, then they're hurting your UPS.
What ?
Consider the problem where a chest has iron ore stacks of [10, 0, 0, 50, 0...]. (10 iron ore in the first stack, two empty stacks, 50 ore in the 4th stack, and the rest of the chest as empty]. It seems like the game currently walks all of the "stacks" every insertion.

As a suggestion, maybe the game should only recalculate chest-stacks under two conditions: when a "non-common" item is inserted into the chest (ie: copper plates are inserted into a chest normally filled with iron ore), or when the player opens a chest. In all other cases, it seems like two counts: "Common Item Inserted" and "Common Item Removed" are the only values needed to keep track of. This would reduce the number of "internal chest organization" calls and potentially optimize the game for Megabases a bit better.

For example: a chest with iron ore stacks of [10, 0, 0, 50, 0, ...] has 16 iron ore removed from it, then 100 iron ore added to it. This should result in an internal stack of [50, 50, 0, 44, 0, ...] afterwards. The other ordering (100 ore added then 16 removed) would be equivalent to 84 items added (resulting in an internal configuration of [50, 34, 0, 50]).

If behavior needs to be 100% kept with previous versions, you can potentially recalculate stacks whenever a "new" item gets inserted into the chest. For example, if 16 iron ore is removed, then 100 iron ore is added, but a copper-plate was added "somewhere in between", you'd either have [50, 1-copper, 50, 44] or [50, 50, 1-copper, 44], depending on the ordering. I'd expect that this difference wouldn't affect most people's bases (most "mixed chests" are either logistic chests or are interfaced with filter-inserters). So this edge case can arguably remain ill-defined for performance reasons. Otherwise, you can simply recalculate stacks whenever item insertions change, since I expect this case to be relatively rare (aside from logistic requester chests).
Why ?
At least one megabase player has started to avoid chests, due to perceived UPS problems. Further testing is probably necessary, but I feel like the devs of this wonderful game should be aware of the issue. It seems like an easy fix (although I know that in reality, nothing is ever easy in programming). Well, easy compared to belt-optimizations anyway. But I feel like letting yall know about it might be useful.

In short: I have a performance suggestion that might be relatively easy to accomplish. Thanks for hearing me out!
mrvn
Smart Inserter
Smart Inserter
Posts: 5952
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrvn »

I don't thing that would work as long as stacks of the same item in chests don't merge.

Consider a chest that has all stacks taken by 1 iron plate each. Now someone wants to insert a copper plate but no space is left. Now a second inserter adds 1 iron plate and a third removes 1 iron plate. Can the copper plate be inserted now or not? Without considering what each addition or removal does to each stack you don't know when stacks become free.

On the other hand the chest could keep a lookup table or cache for what item goes to or comes from which stack. Say the last 4 item types used with the chest. If the last wood was added to stack 17 then the next wood can go there too until the stack is full. Then you search the chest for the next stack of wood to fill. Same for emptying.
Tekky
Smart Inserter
Smart Inserter
Posts: 1040
Joined: Sun Jul 31, 2016 10:53 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by Tekky »

In my opinion, it may be worth thinking about getting rid of stacks in containers altogether. Instead, a container should only have one single "free space" value that can be read by the circuit network. This would make handling of containers containing different items by the circuit network a lot simpler, since it would be a lot easier to determine how much space is available for a specific item.

However, this would be a major change to the game, especially if that also meant that the player's inventory would no longer work with stacks. But if the game's interface is being completely redesigned anyway for 0.17, now may be the time to decide whether stacks should remain in the game.
User avatar
eradicator
Smart Inserter
Smart Inserter
Posts: 5211
Joined: Tue Jul 12, 2016 9:03 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by eradicator »

Tekky wrote:In my opinion, it may be worth thinking about getting rid of stacks in containers altogether. Instead, a container should only have one single "free space" value that can be read by the circuit network. This would make handling of containers containing different items by the circuit network a lot simpler, since it would be a lot easier to determine how much space is available for a specific item.

However, this would be a major change to the game, especially if that also meant that the player's inventory would no longer work with stacks. But if the game's interface is being completely redesigned anyway for 0.17, now may be the time to decide whether stacks should remain in the game.
O_o. Stacks are merely a concept to measure the "size" of an item. If you wanted to remove stacks you'd have to replace that measurement with some other measurement, like per-item-volume or per-item-weight. Obviously that would mean a complete rewrite of most of the engine. So "unlikely" to happen i'd say.
dragontamer5788
Fast Inserter
Fast Inserter
Posts: 154
Joined: Fri Jul 15, 2016 1:44 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by dragontamer5788 »

mrvn wrote:I don't thing that would work as long as stacks of the same item in chests don't merge.

Consider a chest that has all stacks taken by 1 iron plate each. Now someone wants to insert a copper plate but no space is left. Now a second inserter adds 1 iron plate and a third removes 1 iron plate. Can the copper plate be inserted now or not? Without considering what each addition or removal does to each stack you don't know when stacks become free.
On the contrary. Copper plate becomes blocked. Second inserter adds +1 to common_items_inserted, while the 3rd inserter -1 to common_items_inserted. Copper plate remains blocked.

This is equivalent to:

Copper plate becomes blocked. 2nd inserter adds +1 to the first stack. 3rd inserter -1 to the first stack. Copper plate remains blocked.
On the other hand the chest could keep a lookup table or cache for what item goes to or comes from which stack. Say the last 4 item types used with the chest. If the last wood was added to stack 17 then the next wood can go there too until the stack is full. Then you search the chest for the next stack of wood to fill. Same for emptying.
I'd assume that the chest logic is incredibly well optimized right now (considering that many people have achieved 1kspm or even 5kspm bases). If I were to assume the limitation, its that Factorio is primarily latency-bound in code: that is the CPU spends most of its time waiting for RAM to respond to small-requests of data.

Having an alternate, very common, code path look at only 2-variables per chest (common item inserted, common item removed), would likely fit inside of a single cache line. Which would improve cache locality, and maximize performance in a latency perspective. Which is why I suggest this particular methodology. I don't think that adding lookup tables would help, and chests likely fit within a 64-byte cacheline (or maybe at worst: a 128-byte cacheline pair which Intel L2 Cache is optimized to fetch from main memory). Although, if chests are 48-bytes (1-byte per stack and 48 stacks), maybe they're already fitting inside of cache-lines as they are.

2-variables per chest may require some advanced high-performance code (structure of arrays pattern) to see performance benefit. Which is why I wrote my post in a relatively ambiguous way. I'm really not "sure" if it will be easy to improve performance here. But the possibility exists, so its probably worth mentioning.
eradicator wrote:
Tekky wrote:In my opinion, it may be worth thinking about getting rid of stacks in containers altogether. Instead, a container should only have one single "free space" value that can be read by the circuit network. This would make handling of containers containing different items by the circuit network a lot simpler, since it would be a lot easier to determine how much space is available for a specific item.

However, this would be a major change to the game, especially if that also meant that the player's inventory would no longer work with stacks. But if the game's interface is being completely redesigned anyway for 0.17, now may be the time to decide whether stacks should remain in the game.
O_o. Stacks are merely a concept to measure the "size" of an item. If you wanted to remove stacks you'd have to replace that measurement with some other measurement, like per-item-volume or per-item-weight. Obviously that would mean a complete rewrite of most of the engine. So "unlikely" to happen i'd say.
I agree.

Still, a UPS-friendly "single-stack" chest however would solve the problem however. Single-stack chests (UPS-optimized) would be useful for ensuring chest-to-chest Stack inserter mechanics, simple circuits (ie: train balancers) among other things. And they would provide the opportunity for the programmers to really unleash huge UPS performance boosts on a specially designed item. Furthermore, it'd provide circuit players another item to toy with.

So from my perspective, maybe it'd be a worthy idea to consider a UPS-friendly "tiny chest", which fits in 1-byte and is processed by SIMD instructions by the game engine. Both GCC and Visual Studio output "autovectorization", so you just need to lay out the data correctly and decode some compiler text to get SIMD-acceleration these days.

Megabase planners don't like buffering anyway. Megabase players only use chests because its an efficiency hack on stack inserters. Stack-size 1 chests would keep the efficiency hack.
User avatar
eradicator
Smart Inserter
Smart Inserter
Posts: 5211
Joined: Tue Jul 12, 2016 9:03 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by eradicator »

dragontamer5788 wrote: Still, a UPS-friendly "single-stack" chest however would solve the problem however. Single-stack chests (UPS-optimized) would be useful for ensuring chest-to-chest Stack inserter mechanics, simple circuits (ie: train balancers) among other things. And they would provide the opportunity for the programmers to really unleash huge UPS performance boosts on a specially designed item. Furthermore, it'd provide circuit players another item to toy with.
From a game design perspective i don't like the idea of adding such an ultra-specialized item, we already have 3 types of chest of which realistically only two are used (nobody uses iron chests) and wooden chests are only used because in the early game you have no other options. Adding a special mod-only "ups-optimized-chest" doesn't really work either because there's too many people obsessed over staying "clean vanilla" (i seriously wish that particular obsession would just *poof* away as if it never ever existed anywhere).

But. What about optimizing inserter interaction with chests that are limited ("red bar") to only one stack? Inserters could in that case directly interact with the stack instead of dealing with the whole container. This would break their ability to pick items from the "red area" of chests, but there have been other compromises in favor of performance before. And i always thought it was a bit "wacky" that inserters can take items from the red area but not put any there, because i always saw the "red area" as a sort of "don't interact with this"-zone.

Hm. But now that's special-case optimization too and doesn't help with the most common scenario of train-unloading at all. :cry:.
User avatar
mrudat
Fast Inserter
Fast Inserter
Posts: 248
Joined: Fri Feb 16, 2018 5:21 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrudat »

How about something like the following.

A chest tracks the number of completely empty stacks.

It also has a list of:
  • item
  • quantity
  • used_stacks - equal to ceil(quantity/stack_size[item])), faster to store or derive?
If you try adding one more item than will fit in the number of stacks currently used, it claims an empty stack (if available) and keeps going.
If you remove enough items, it returns an empty stack to the counter.

If we only ever have one item type in a chest, we only ever update two, perhaps three integers per cycle.

We only display the items in stacks when a user opens a chest to see what's in it, but we do track the number of stacks actually occupied, so stack size still matters.

Edit: It should also mean that a 20000-stack chest won't make the game slow down either, as cost of an interaction with a chest would depend only on the number of distinct types of item in the chest, not on how many there are.
mrvn
Smart Inserter
Smart Inserter
Posts: 5952
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrvn »

dragontamer5788 wrote:
mrvn wrote:I don't thing that would work as long as stacks of the same item in chests don't merge.

Consider a chest that has all stacks taken by 1 iron plate each. Now someone wants to insert a copper plate but no space is left. Now a second inserter adds 1 iron plate and a third removes 1 iron plate. Can the copper plate be inserted now or not? Without considering what each addition or removal does to each stack you don't know when stacks become free.
On the contrary. Copper plate becomes blocked. Second inserter adds +1 to common_items_inserted, while the 3rd inserter -1 to common_items_inserted. Copper plate remains blocked.
Except what actually happens in the game iirc is that the inserted plate merges into the first stack while the removed plate is taken from the last stack. Thereby freeing a slot for copper plates.
mrvn
Smart Inserter
Smart Inserter
Posts: 5952
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrvn »

dragontamer5788 wrote:
On the other hand the chest could keep a lookup table or cache for what item goes to or comes from which stack. Say the last 4 item types used with the chest. If the last wood was added to stack 17 then the next wood can go there too until the stack is full. Then you search the chest for the next stack of wood to fill. Same for emptying.
I'd assume that the chest logic is incredibly well optimized right now (considering that many people have achieved 1kspm or even 5kspm bases). If I were to assume the limitation, its that Factorio is primarily latency-bound in code: that is the CPU spends most of its time waiting for RAM to respond to small-requests of data.

Having an alternate, very common, code path look at only 2-variables per chest (common item inserted, common item removed), would likely fit inside of a single cache line. Which would improve cache locality, and maximize performance in a latency perspective. Which is why I suggest this particular methodology. I don't think that adding lookup tables would help, and chests likely fit within a 64-byte cacheline (or maybe at worst: a 128-byte cacheline pair which Intel L2 Cache is optimized to fetch from main memory). Although, if chests are 48-bytes (1-byte per stack and 48 stacks), maybe they're already fitting inside of cache-lines as they are.
I think you grossly underestimate the amount of memory needed.

Stacks can be larger than 256 so a single byte per stack won't do. I assume a stack can also be larger than 65536 so that would be 4 bytes per stack. And each stack needs an item type too, another 4 bytes. You also have a v-table and RTTI pointer (16 byte), a map location (2x 4 byte), a size (2-4 byte). So a wooden chest with 16 stacks would be 16 + 8 + 4 + 16 * 8 = 156 bytes. Oh, and add the max hit points, health, last user, clip rectangle, ... You are nowhere near a cache line for the whole chest.

Similar your "common item inserted, common item removed" wouldn't be 2 variables (and why would you want 2 variables when all you need is the current count?). They would be an array of 2 variables, one pair per type. And you still have to search through them all to see if an item isn't in the chest.
User avatar
eradicator
Smart Inserter
Smart Inserter
Posts: 5211
Joined: Tue Jul 12, 2016 9:03 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by eradicator »

mrvn wrote: Stacks can be larger than 256 so a single byte per stack won't do. I assume a stack can also be larger than 65536 so that would be 4 bytes per stack.
Just for reference: Stack size is a long integer (2^31)-1. (Edit: I don't know what data type that is :P)
Last edited by eradicator on Thu Mar 22, 2018 4:54 pm, edited 1 time in total.
mrvn
Smart Inserter
Smart Inserter
Posts: 5952
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrvn »

eradicator wrote:
mrvn wrote: Stacks can be larger than 256 so a single byte per stack won't do. I assume a stack can also be larger than 65536 so that would be 4 bytes per stack.
Just for reference: Stack size is a long integer (2^31)-1.
Long int on amd64 linux would be 64bit while it's 32bit on i386 linux (now no longer supported by factorio). I'm not sure what amd64 windows has for long int but windows for alpha used 32bit long int. I highly doubt the factorio devs used long int for stack size. It's int, or hopefully (u)int32, making it clear what size is intended.

But it's good to have iot confirmed that stack size isn't limited to 65535 or even 32767.
dragontamer5788
Fast Inserter
Fast Inserter
Posts: 154
Joined: Fri Jul 15, 2016 1:44 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by dragontamer5788 »

mrvn wrote:
dragontamer5788 wrote:
On the other hand the chest could keep a lookup table or cache for what item goes to or comes from which stack. Say the last 4 item types used with the chest. If the last wood was added to stack 17 then the next wood can go there too until the stack is full. Then you search the chest for the next stack of wood to fill. Same for emptying.
I'd assume that the chest logic is incredibly well optimized right now (considering that many people have achieved 1kspm or even 5kspm bases). If I were to assume the limitation, its that Factorio is primarily latency-bound in code: that is the CPU spends most of its time waiting for RAM to respond to small-requests of data.

Having an alternate, very common, code path look at only 2-variables per chest (common item inserted, common item removed), would likely fit inside of a single cache line. Which would improve cache locality, and maximize performance in a latency perspective. Which is why I suggest this particular methodology. I don't think that adding lookup tables would help, and chests likely fit within a 64-byte cacheline (or maybe at worst: a 128-byte cacheline pair which Intel L2 Cache is optimized to fetch from main memory). Although, if chests are 48-bytes (1-byte per stack and 48 stacks), maybe they're already fitting inside of cache-lines as they are.
I think you grossly underestimate the amount of memory needed.

Stacks can be larger than 256 so a single byte per stack won't do. I assume a stack can also be larger than 65536 so that would be 4 bytes per stack. And each stack needs an item type too, another 4 bytes. You also have a v-table and RTTI pointer (16 byte), a map location (2x 4 byte), a size (2-4 byte). So a wooden chest with 16 stacks would be 16 + 8 + 4 + 16 * 8 = 156 bytes. Oh, and add the max hit points, health, last user, clip rectangle, ... You are nowhere near a cache line for the whole chest.
While the chest itself has HP, location, etc. etc. None of those memory variables are changing in most cases. As such, I'd expect the CPU to optimize the entire event into a single cache line (potentially: if the data is properly aligned and whatnot). The 8-byte pointer for V-Table and RTTI is probably needed, so that's a point there. But that sort of stuff cannot be optimized away easily (for those who want to know how to optimize it away... well... there's this pattern). So its effectively neutral.

If you don't access a memory location, then it doesn't get stored into the cache. If it doesn't get stored into the cache, the memory controller never even transfers the data to the CPU. Modern CPUs are grossly efficient and cut corners magically, even in cases like this where tons of variables take up memory. But yes, you're definitely right about V-table and RTTI.

With that said: most stack sizes in the game seemed to be 200 or less. If stack sizes can be larger than 1-byte, then that definitely means that easy memory optimizations can be had here. The less memory transferred between the memory controller / DDR4 and the CPU, the better. "One cache line" is the magic sauce here. Modern CPUs transfer 64-bytes at a time from RAM, so the ideal case is to get all of the logic to work off of a single memory transaction (aka: a singular, aligned 64-byte memory location).

If you have a ton of "rarely used data", that's fine. The cache will ignore that rarely used data (hp, last used player, etc. etc.). Work hard to minimize cache-line pulls from main memory: that's the "expensive" part. As it stands right now, chest-traversals likely require more than 64-bytes, which means multiple times the CPU would "talk" to RAM (and probably sit around waiting)
Similar your "common item inserted, common item removed" wouldn't be 2 variables (and why would you want 2 variables when all you need is the current count?). They would be an array of 2 variables, one pair per type.
No, you don't need an array. But you're right that a 3rd variable is needed: a variable that describes the current "common item". If your chest has been inserting iron ore over and over again... it needs to "fully recalculate" the stacks as soon as a copper-ore enters the picture.

My 2-variables used assumes that the CPU / memory system is being smart with caching. Yes, all the variables are being stored in memory, but this game seems quite efficient at that. The optimization that's available here is to reduce the number of memory-transfers between the cache system and the RAM. Now, these two variables should be within the same 64-byte cache line to get the best effect (and sure, put it on the same cache line as the V-Table / RTTI), but what I'm proposing here is a simple optimization to handle a common case, so that the game doesn't have to perform a costly array traversal each time an inserter operates on a chest.

I feel like 2-variables are the minimum needed to preserve current chest behavior. But if I'm wrong about this, maybe there's a way to do it with just 1-variable. But 2-variables is the best I personally can come up with.
And you still have to search through them all to see if an item isn't in the chest.
How many chests in your base handle more than one item? I'd bet less than 1% of them. Most of my chests are in train stations dedicated to a singular resource. That's why I think this is a great case for an optimization. If the "common code path" is grossly optimized to only use 2-variables, then you'll cut down on memory usage, especially compared to scanning through all of the stacks of a chest.

EDIT: Oh right, requester chests for assembly machines. I guess those are pretty common. But ignore requester chests :-). The provider chests on the "output" of assembly machines generally speaking deal with just one item though, so those would benefit for sure.
mrvn wrote:
dragontamer5788 wrote:
mrvn wrote:I don't thing that would work as long as stacks of the same item in chests don't merge.

Consider a chest that has all stacks taken by 1 iron plate each. Now someone wants to insert a copper plate but no space is left. Now a second inserter adds 1 iron plate and a third removes 1 iron plate. Can the copper plate be inserted now or not? Without considering what each addition or removal does to each stack you don't know when stacks become free.
On the contrary. Copper plate becomes blocked. Second inserter adds +1 to common_items_inserted, while the 3rd inserter -1 to common_items_inserted. Copper plate remains blocked.
Except what actually happens in the game iirc is that the inserted plate merges into the first stack while the removed plate is taken from the last stack. Thereby freeing a slot for copper plates.
I'll test this later. But my understanding is that inserters remove from the first stack available. Therefore: inserted plate merges into the first stack, then the removed plate is taken from the first stack. That's why I suggest two variables, because two variables are needed to preserve current behavior.
mrvn
Smart Inserter
Smart Inserter
Posts: 5952
Joined: Mon Sep 05, 2016 9:10 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by mrvn »

Ahh, so you want to optimize the case of a chest only actively handling one item. There might be other stuff in there but it isn't removed or inserted frequently. That should work well for train stations and such where you have the same item type inserted over and over.

But that is a special case of my cache table, with a size of 1 item. I would make it a bit bigger so it spans e.g. exactly one cache line. Then most access to the chest uses that cache line and a full search is uncommon even if you have 2, 3 or 4 items. For example a requester chest for an assembler building something. 4 Items will probably cover most things.
Rseding91
Factorio Staff
Factorio Staff
Posts: 14413
Joined: Wed Jun 11, 2014 5:23 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by Rseding91 »

The container entity is size is 352 bytes + the dynamically allocated content (the item stacks, logistic information and so on). An inventory is a vector of ItemStack (24 bytes each). After the item stack optimization (https://www.factorio.com/blog/post/fff-198) most item stacks have no dynamic alocation which makes iterating and checking against them incredibly fast.

Inserters inserting into or taking out of a chest virtually never shows up as being slow. The only time that has ever happened is when mods change an inventory size to something crazy (500 to 2000 slots) but no matter what you do at that scale it's going to be slow.

Chests are so well optimized that literally the only time I've seen them show up as a slow spot when profiling saves in my 3+ years of working on Factorio is when someone was using 2000 slot modded chests with 20+ inserters putting into and out of 50+ chests at once just to show it was slow.
At least one megabase player has started to avoid chests, due to perceived UPS problems.
Then that player has no idea what they're doing. Chests aren't slow - in fact if nothing is interacting with them they have zero CPU cost.
If you want to get ahold of me I'm almost always on Discord.
dragontamer5788
Fast Inserter
Fast Inserter
Posts: 154
Joined: Fri Jul 15, 2016 1:44 am
Contact:

Re: Potentially easy chest-behavior performance optimization

Post by dragontamer5788 »

Rseding91 wrote:Chests are so well optimized that literally the only time I've seen them show up as a slow spot when profiling saves in my 3+ years of working on Factorio is when someone was using 2000 slot modded chests with 20+ inserters putting into and out of 50+ chests at once just to show it was slow.
Well, there we have it folks! Hard data beats any theory. So thanks for taking the time to respond! If its not a concern based on your profiling, then its probably not a concern at all.

BTW: could you write a blogpost / friday factorio article about the current state of profiling? It'd be nice to know what is UPS friendly or not. There's a lot of conflicting reports out there (Bots vs Belts... unfortunately... has a lot of UPS concerns. Especially with the new belt-optimizations, I'm not sure if there have been any UPS studies). There have been so many UPS optimizations in 0.16 that its hard to keep track of all of them.
buggy123
Long Handed Inserter
Long Handed Inserter
Posts: 64
Joined: Sat May 13, 2017 11:53 pm
Contact:

[Optimization] Improvement for container insertion logic

Post by buggy123 »

tldr
The logic for inserting items into a storage container (chest, car, etc) is simple, but it's not the most performant. Experienced players deliberately limit the size of chests to improve UPS on megabases, and mods that add warehouses or other very large containers can have a extreme performance hit if used poorly.

I've given this a bit of thought, and I've come up with a small and easy change that should substantially improve performance.


As-is, inserting/removing items from a container iterates over every inventory slot (i'm assuming it's a array internally) until it finds the slot with the right criteria. This is ideal if we're assuming we have no idea of what the contents are or what we're putting in.

But in most cases:
  • Chests contain only one or a few item types.
  • They're filled consistently (all stacks are full except for the one near the 'end' of the container/array).
  • Only inserters or bots are adding/removing items on a regular basis.
This combines to lead to some simple and straightforward behavior: the first N slots are filled with full stacks of the item, and inserters only add to/remove from the last stack in the container. So store the last modified slot/index, and start iterating from there instead of the start of the container.

For inserting items, this works perfectly, and it goes from O(N) complexity to O(1).

For removing items, this is trickier because the iteration is in the opposite direction compared to insertion. In a 'ideal' chest as above, this doesn't matter, but it's possible for a chest to end up missing stacks near the 'start' so inserting and removing acts at completely different locations. I suppose you could just store a second index number for removing items.

Finally, filter inserters, bots, players, and other interactions that might 'skip' over stacks are tricky. Players don't play by the same 'rules' as inserters so they should probably reset the 'last modified' indexes whenever they touch a container at all (the performance hit would be insignificant anyway). Maybe filter inserters could store the last modified item type, but that might be adding too many checks and too much extra data to be performant.
User avatar
ssilk
Global Moderator
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Re: [Optimization] Improvement for container insertion logic

Post by ssilk »

This is a typical case, where adding an index would make things faster. In special cases. 8-)

A) don’t you think the last slot of a chest isn’t already cached? Why? :roll:

B) general usage of index would lead to a very big index compared to the slots if there are many different things in a chest. So the algorithm needs to lookup the right index first and then lookup the slot. Instead of looking for the first slot with this item.

C) it’s always a bad idea (not only for Factorio) to take a special case and improve around that.

D) Can you prove, that your algorithm is always — especially in the most used cases — faster than yet.

In other words: I don’t think it makes any sense, this is already highly optimized. But you have luck, because I have too less implementation knowledge to move it to won’t implement. ;)
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...
buggy123
Long Handed Inserter
Long Handed Inserter
Posts: 64
Joined: Sat May 13, 2017 11:53 pm
Contact:

Re: [Optimization] Improvement for container insertion logic

Post by buggy123 »

ssilk wrote: Fri May 13, 2022 5:36 am This is a typical case, where adding an index would make things faster. In special cases. 8-)

A) don’t you think the last slot of a chest isn’t already cached? Why? :roll:

B) general usage of index would lead to a very big index compared to the slots if there are many different things in a chest. So the algorithm needs to lookup the right index first and then lookup the slot. Instead of looking for the first slot with this item.

C) it’s always a bad idea (not only for Factorio) to take a special case and improve around that.

D) Can you prove, that your algorithm is always — especially in the most used cases — faster than yet.

In other words: I don’t think it makes any sense, this is already highly optimized. But you have luck, because I have too less implementation knowledge to move it to won’t implement. ;)
First, my apologies, I don't think I explained it very well.

For A, I don't know for sure that some sort of indexing isn't already in play. However the fact that insertion/removal logic behaves exactly as if it's iterating over every slot, and very large containers (warehouse mods, etc) are notorious for causing lag unless used sparingly, suggests it.
There have been threads discussing this and talking about adding indexing of some sort before, though I decided to make a new thread because i'm suggesting a specific indexing method.


For B and C and D, i'm not suggesting a general index for every item, i'm suggesting a minimalist index that should improve performance significantly in some usecases but barely affect performance in others. Assuming I'm understanding how inserters and such function, anyway. Let me provide some illustrations to better explain:


We have a chest.
image

When a inserter, robot, etc tries to add items to a chest, it iterates over every slot starting from the first slot.
image

For every slot it iterates over, it checks if there is a stack it can add the item onto. Meaning, if it's the same item type, if the stack isn't full, etc. If there is such a thing, it adds to the first one it finds.
image

And if it doesn't find any partial stacks to add to after iterating over the whole length of the chest...
image
...It goes back and adds a stack at the first free slot it found.
image


Removing items is much the same process, except the iteration starts from the last slot and goes backwards, and additionally checks if each slot meets any filter criteria (filter inserter, inserting into a assembler, etc).
image


Now, here's the interesting bit - if you're just adding or removing the same items from a chest, and you've already found the slot to do so with, there's no reason to iterate over the slots again - when adding items, the stacks can only be added in the direction that it iterates in, and likewise for removing items. So you can just remember where you last added or removed a item, and start from there.
image
If a stack is filled, it still puts the next stack the correct slot.
image

And in the worst case, if a stack is removed, it just has to iterate over all the contents to find the right spot again.
image
Which is the same worst case scenario as right now, when a chest is almost entirely full and it has to add to the stack on the end.
image

Now of course, this implementation would deviate from current behavior if we removed a stack in the middle somehow.
image
But conveniently, since inserters add items from the start onwards, and remove items from the end, inserters can't do that.

(note, removing items would have to have a check added for partial stacks just like adding items, in order to get consistent behavior for both halves of this coin. This shouldn't be a big deal.)

Now it's easy to see on it's own how this would be a great optimization - the worst case for this optimization (iterating the entire chest) here happens whenever a stack is filled or emptied in the chest, while the worst case right now happens every time a item is added to a almost full chest or removed from a almost empty chest. Since stacks take many item addition/removal operations to fill or empty,and this only requires adding a integer or two and a slight change to the iteration algorithm (to check for and start from a different index number than "0"), this is optimization is certainly better... in this limited scenario.

But, you're right. There's a scenario where it breaks down. Two scenarios, actually:
1: Things that don't play by the rules
2: Multiple item types


For #1, things that don't play by the 'rules' means that it is something that doesn't iterate from start on insertion and iterate from the end on removal. A example would be the player; a player can trivially remove a stack from the middle of a chest...
image
...and thus break this nice optimization instantly.
image
Thankfully, this is fairly easy to resolve: any time a player, or other 'rule breakers', touch a chest at all, just reset the 'last modified slot' index so that inserters fall back to default behavior. Since players will typically interact withchests around the factory much less frequently than inserters, since inserters outnumber players a thousand-to-one, this isn't a big deal. Also, it just means that the worst case is, again, about the same as the worst case right now.


And finally, the elephant in the room, #2. When chests have only one item type going in or out, this thing works fine. But slots are shared between all itemtypes, and these indexes are (as described) agnostic of item type. So if you have more than one item in a chest...
image
...and a filter inserter, or something that causes a inserter to filter (inserting into assembler, filtered container, etc), removes a item or stack in the middle...
image
...then there are a variety of ways for it to break.
image

Now, there are some ways you could try to solve that, each with upsides and downsides.
  • You could make any 'filtered' interaction with a chest at all reset the indexes just like player interaction, but that denies the optimization in almost all of the situations it would be used - all bot interactions, any time a inserter is taking from a chest and putting in to a assembler, etc.
  • You could store the type of item along with the 'last interacted' index, or store one index per item in the chest, etc. But these add complexity and additional memory reads and writes and operations to the process, all of which will diminish the efficacy of the optimization and increase the performance losses outside of the areas where it improves performance.
  • You could disable the optimization on any chest with more than one item type. A bit of a mixed bag, this would certainly help in many situations (e.g. a warehouse mod storing 1,000 stacks of iron ore), but not in others (requester chests with multiple items).
And more.


Someone at Wube, with actual access to the code, could probably do a better job of figuring out how they'd patch the holes in the optimization. But I think there's enough merit here to warrant investigating it, especially since it's really not that big of a change. Depending on the details, this change might only need to touch a couple dozen lines of code. Testing it could take longer than changing it. Compared to the massive efforts they've gone through to optimize bots and belts and so on, that's nothing.
User avatar
ssilk
Global Moderator
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm
Contact:

Re: [Optimization] Improvement for container insertion logic

Post by ssilk »

First of all, thank you for this great explanation. Most of it I did know, but you showed me some new aspects. How long did it take to make this post? The subject is interesting.

Second: when shrinked to only one item type, this looks like a real improvement.

But I can’t be sure, because in general wube thinks a lot of such optimizations. The good question is, is it really worth the extra hassle, or is it only useful for modders, that use only 6x6 warehouse chests.

So I linked this article in the hope some dev (Rseding91 :roll: ) will have a look.
viewtopic.php?p=568010#p568010
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...
buggy123
Long Handed Inserter
Long Handed Inserter
Posts: 64
Joined: Sat May 13, 2017 11:53 pm
Contact:

Re: [Optimization] Improvement for container insertion logic

Post by buggy123 »

ssilk wrote: Sat May 14, 2022 7:56 am First of all, thank you for this great explanation. Most of it I did know, but you showed me some new aspects. How long did it take to make this post? The subject is interesting.

Second: when shrinked to only one item type, this looks like a real improvement.

But I can’t be sure, because in general wube thinks a lot of such optimizations. The good question is, is it really worth the extra hassle, or is it only useful for modders, that use only 6x6 warehouse chests.

So I linked this article in the hope some dev (Rseding91 :roll: ) will have a look.
viewtopic.php?p=568010#p568010
It took me roughly a hour to put together; the images were actually pretty easy, I just loaded up a sandbox world, used the windows snipping tool for screenshots, and Gimp to add the lines. It was helpful too, since it allowed me to make sure I was getting the inserter logic right.

As for it's worth, it's hard to say. I suspect it won't actually be that much hassle; at it's simplest, iterating a list or array from a different starting index is something like 1 or 2 extra lines of code. Probably a fair bit more in practice, of course, but I'll bet its a lot less than what they had to do to do that big belt optimization back in 0.15.
As well, the devs have gone to great lengths to optimize the game and also make things easier for mods. To say for sure if it's worth it would require some benchmarking to see how much impact chests have on performance.

No matter what they decide, thank you for sending it to the devs! Hopefully they find it useful.
Post Reply

Return to “Ideas and Suggestions”