Friday Facts #264 - Texture streaming

Regular reports on Factorio development.
posila
Factorio Staff
Posts: 2755
Joined: Thu Jun 11, 2015 1:35 pm

Re: Friday Facts #264 - Texture streaming

Post by posila » Fri Oct 12, 2018 8:26 pm

eradicator wrote:
Fri Oct 12, 2018 6:51 pm
@posila:
Thanks for the post. You need to have more pride in your work and not feel like it's something to apologize for ;).
Btw, what happened to the idea of requiring modders to pre-process textures (from this thread)?
I didn't want to talk about it in the FFF so that I don't accidentally get people hyped ... My big goal is to make streaming from HDD work. For that to work, the virtual atlas needs to be saved on disk (there's no way to stream from PNG spritesheets efficiently), but given that the vast majority of people can fit all sprites in VRAM anyway, we probably won't distribute a pre-baked virtual atlas, and streaming will be optional. So user computers would build atlases anyway, as they do now. If we decide to distribute a pre-baked virtual atlas (or some metadata to generate a more efficiently packed atlas), the game would generate atlases for each mod, which would be appended to the virtual texture coordinate space of the main mega-atlas. So no additional work for modders in any scenario.
EDIT: When I wrote the first line, I thought about Xterminator's FFF discussion video ... I didn't think Klonan would leave it in :D
Oktokolo wrote:
Fri Oct 12, 2018 7:15 pm
Factorio also has a lot of shadows. If they are only polygons filled with a single RGBA color they could probably be stored as vectors and rendered as geometry somehow to make them occupy less VRAM.
That's an interesting idea. More on this topic in the future, though.
Oktokolo wrote:
Fri Oct 12, 2018 7:15 pm
I am no GPU coder, but would expect stuff like smoke (maybe also clouds) to be faster if rendered by a procedural shader in the GPU instead of layering tons of bitmaps above each other. That stuff probably is mathematically generated (instead of modelled or painted by hand) already. There are also almost always clusters of many smoke sources in a factory (steam power plants, smelter arrays...).
The smoke shader would probably be able to use a single list or bitmap of smoke source locations and state. It would render all the smoke in a single pass, so that it would cause only a single overdraw per scene pixel.

Maybe, power/circuit cables could also be drawn procedurally for the full scene at once. But they probably do not cause much overdraw anyway.
Procedural wires are in our backlog (not very high priority at the moment though). We also would like to try to do smoke procedurally, but there we might run into an issue of reopening work that was considered finished.
Oktokolo wrote:
Fri Oct 12, 2018 7:15 pm
If for some reason, VRAM use ever becomes an issue, you could give players the option to only use any 2nd, 3rd or 4th frame of each animation, so that only animation FPS suffers instead of having the entire scene FPS drop. That way, animations would lag but player movement would still be butter smooth. Optimally the player could also select groups of stuff to have the full or at least half the animation frames (so they would still get 30 FPS for them). One obvious exemption group would be belts another one inserters. But apart from that i don't know what groups would make sense.
Yeah, I would prefer "animation and variation quality" options to the current low and very-low sprite quality options. (Btw, I am not sure how many animations are 60 FPS, but I would say not many; there might be some 40 FPS ones, and most are 30 FPS or less.)
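
A rough sketch of the frame-skipping idea discussed above, assuming an animation is simply a list of equally sized frames. The helper names and the 32-frame, 128x128 example are hypothetical and are not taken from the game.

```python
def decimate_frames(frames, keep_every):
    """Keep every `keep_every`-th frame of an animation (hypothetical helper)."""
    if keep_every < 1:
        raise ValueError("keep_every must be >= 1")
    return frames[::keep_every]


def vram_bytes(frame_count, width, height, bytes_per_pixel=4):
    """Uncompressed RGBA size of an animation, ignoring atlas padding."""
    return frame_count * width * height * bytes_per_pixel


# Assumed example: a 32-frame animation of 128x128 RGBA sprites.
frames = list(range(32))
half = decimate_frames(frames, 2)
full_size = vram_bytes(len(frames), 128, 128)
half_size = vram_bytes(len(half), 128, 128)
```

To keep the animation running at the same speed, each remaining frame would have to stay on screen `keep_every` times as long, which is where the lower animation FPS comes from.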

Sander_Bouwhuis
Fast Inserter
Posts: 203
Joined: Mon Dec 07, 2015 10:45 pm

Re: Friday Facts #264 - Texture streaming

Post by Sander_Bouwhuis » Fri Oct 12, 2018 8:57 pm

Thanks for the VERY interesting article.

A quick question though: is VRAM the amount of GB you have on the GPU?
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?

Jap2.0
Smart Inserter
Posts: 1748
Joined: Tue Jun 20, 2017 12:02 am

Re: Friday Facts #264 - Texture streaming

Post by Jap2.0 » Fri Oct 12, 2018 9:21 pm

Sander_Bouwhuis wrote:
Fri Oct 12, 2018 8:57 pm
Thanks for the VERY interesting article.

A quick question though: is VRAM the amount of GB you have on the GPU?
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?
Yes. (although I'm also running the game fine with 2GB)
There are 10 types of people: those who get this joke and those who don't.

Koub
Global Moderator
Posts: 3876
Joined: Fri May 30, 2014 8:54 am

Re: Friday Facts #264 - Texture streaming

Post by Koub » Fri Oct 12, 2018 9:31 pm

Sander_Bouwhuis wrote:
Fri Oct 12, 2018 8:57 pm
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?
Image
Koub - Please consider English is not my native language.

mrudat
Inserter
Posts: 49
Joined: Fri Feb 16, 2018 5:21 am

Re: Friday Facts #264 - Texture streaming

Post by mrudat » Fri Oct 12, 2018 10:05 pm

Have you compared the performance of compositing sprites on-the-fly in the GPU, as opposed to pre-compositing a stack of sprites (probably also using the GPU, but as part of loading the sprites) and storing the results in VRAM?

In theory, for an electric-mining-drill pre-compositing could potentially reduce vram usage (given that there are two states, one being the base textures, the other made of the base textures plus more than two additional transparent textures), and would also reduce overdraw, as I believe it should be able to be reduced to either a single texture, or two textures, object/shadow (I believe that the shadow texture can be more heavily compressed?).

That said, I believe that it is possibly the single entity with the largest number of texture layers (or at least one of the largest I've come across).



Another thought on pre-compositing; would it be feasible to flag and store pre-composited static sprites between frames? eg. create a (set of?) screen/chunk-sized textures that contain non-animated sprites; you potentially draw twice as many layers, but, trees, for example, only draw individual trees when a chunk comes on-screen, and after that a chunk of nothing but trees would draw perhaps three textures; the ground texture, shadow layer, (entities that move, eg. the player) and tree layer.

As a guess, you could add a flag to the draw call to indicate whether this draw call is expected to change regularly between frames (i.e. it is animated), then take all of the draw calls for a given chunk/layer flagged as static and hash them to determine whether you need to redraw that layer or can reuse the one from the last frame (to account for destroying/building new entities).
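
The hashing idea above could look roughly like this. It is only a sketch: the `(sprite_id, x, y, is_static)` tuple layout and the class name are assumptions made for illustration, and a real renderer would hash whatever its draw orders actually contain.

```python
import hashlib


def static_layer_hash(draw_calls):
    """Hash the static draw calls of one chunk layer.

    `draw_calls` is an iterable of (sprite_id, x, y, is_static) tuples;
    this layout is an assumption made for the sketch. The hash depends on
    the order of the calls, so they must be generated in a stable order.
    """
    digest = hashlib.sha256()
    for sprite_id, x, y, is_static in draw_calls:
        if is_static:
            digest.update(f"{sprite_id}:{x}:{y};".encode())
    return digest.hexdigest()


class ChunkLayerCache:
    """Remembers the last hash per chunk so unchanged layers can be reused."""

    def __init__(self):
        self._hashes = {}

    def needs_redraw(self, chunk_pos, draw_calls):
        new_hash = static_layer_hash(draw_calls)
        if self._hashes.get(chunk_pos) == new_hash:
            return False  # reuse the texture rendered last frame
        self._hashes[chunk_pos] = new_hash
        return True  # first time seen, or something was built/destroyed
```

Animated draw calls are skipped by the hash, so a player walking through the chunk does not invalidate the cached static layer; building or removing a tree does.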



Edit: A third thought occurs to me; it would probably be a win for startup speed to cache the sprite atlas on disk. You'd need to determine if the sprite atlas would end up being built the same way as the last time, but in theory, you would be able to skip reading, decompressing and compositing the sprite atlas if nothing has changed, and merely stream the pre-composited atlas off disk. Apparently this is already an option, I just didn't notice it.

knightelite
Burner Inserter
Posts: 6
Joined: Fri Oct 05, 2018 3:49 pm

Re: Friday Facts #264 - Texture streaming

Post by knightelite » Sat Oct 13, 2018 2:51 am

I don't know anything about texture rendering, but I do work with MPEG video regularly for my job. For those unfamiliar with it, one of the key things that MPEG video compression does (as compared to just image compression algorithms) is to perform difference operations between frames and only render the differences, rather than the whole picture.

MPEG video does this by having three types of frames:

- I Frames: Complete picture, contains no difference data. This one is used as a reference for the other types.
- P Frames: Forward differenced picture relative to the previous I or P frame.
- B Frames: Forward and backward differenced picture, relative to both the preceding I or P frame, and the following P frame.

Video is then sent as what is called a Group of Pictures, or GOP, which looks something like this:

I-B-B-B-P-B-B-B-P-B-B-B-P

An MPEG decoder then functions as follows:

1. Draw I Frame, store in buffer.
2. Draw following P frame based on difference information from I frame, and store in buffer.
3. Draw intervening B frames based on info from both.
4. Draw next P frame, replace original I frame in its buffer.
5. Repeat.

For high motion scenes (like sports), this doesn't always work very well, but for low difference scenes it's incredibly effective (think scrolling credits). An I frame with all the credits is drawn once, and each subsequent frame is just "slide that frame upwards a bit, maybe add a bit of new stuff at the bottom" which results in very good compression. Similarly, a scene like a ball moving through a static scene also compresses very well, as the scene is largely static with only the ball presenting differences.
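
As a toy illustration of the differencing described above (not real MPEG, which works on motion-compensated blocks rather than single values), the sketch below stores one full I frame and then only the changed positions for each P frame. B frames are left out, and the frame representation (equal-length lists of numbers) is an assumption made for the example.

```python
def encode_gop(frames):
    """Encode frames as one I frame followed by P frames (no B frames)."""
    encoded = [("I", list(frames[0]))]
    previous = frames[0]
    for frame in frames[1:]:
        # Store only the positions whose value changed since the last frame.
        diff = {i: new for i, (old, new) in enumerate(zip(previous, frame)) if old != new}
        encoded.append(("P", diff))
        previous = frame
    return encoded


def decode_gop(encoded):
    """Rebuild every frame by applying each P diff to the previous picture."""
    decoded = []
    current = None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for index, value in payload.items():
                current[index] = value
        decoded.append(current)
    return decoded


# A "ball" (the value 1) moving through an otherwise static scene.
frames = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
encoded = encode_gop(frames)
```

For the mostly static scene in the example, each P frame holds only one or two entries instead of the whole picture.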

Now, I'm not saying you should replace all graphics in such a way as this, but you might be able to make use of it for some layers of the renderer if you aren't doing this already. Cloud shadows, for instance, seem like they might be a prime candidate. Render a larger than screen size set of cloud shadows once, and just slide it over the screen one pixel at a time for instance to handle the cloud motion. Similarly for full conveyor belts; take the rendered image of the entire belt and slide it over a pixel or two each frame, etc...

Or, more likely since I don't know much about rendering from a game engine none of this is very useful as I am out of my depth, but I had fun typing it up anyway :).

Paul17041993
Inserter
Posts: 32
Joined: Fri Nov 25, 2016 4:26 am

Re: Friday Facts #264 - Texture streaming

Post by Paul17041993 » Sat Oct 13, 2018 3:19 am

Anyway, a GTX 1060 renders a game scene like this in 1080p in 1ms. That's fast, but it means in 4K it would take 4ms, 10ms on integrated GPUs, and more than a single frame worth of time (16.66ms) on old, non-gaming GPUs. No wonder scenes heavy on smoke or trees can tank FPS, especially in 4K.
Just a note here that linear scaling almost never exists with GPUs; there are quite a lot of factors that control the throughput of certain operations. Vega64, for example, will have very close to the same copy-through performance from 4K down to 100x100px, as it's the scheduling and cache latency that affects it until you actually hit the bandwidth tipping point, at which point you'll see a sharp climb in latency.
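
For reference, the quoted 1ms to 4ms figure is what you get if cost is assumed to be strictly proportional to pixel count. The sketch below only reproduces that arithmetic; as the note above says, real GPUs rarely scale this cleanly, so treat the result as a rough upper-level estimate and not a measurement.

```python
def scaled_frame_time_ms(base_ms, base_res, target_res):
    """Scale a frame time by pixel count, assuming purely linear cost."""
    base_pixels = base_res[0] * base_res[1]
    target_pixels = target_res[0] * target_res[1]
    return base_ms * target_pixels / base_pixels


# Time available per frame at 60 FPS.
FRAME_BUDGET_MS = 1000 / 60

# 4K (3840x2160) has four times the pixels of 1080p (1920x1080).
time_4k_ms = scaled_frame_time_ms(1.0, (1920, 1080), (3840, 2160))
```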

Power levels are another thing as well. Seeing as the Vega got 1.5ms for, I assume, a simple buffer copy, it was likely still in power state 0 and running at only 80MHz core, 167MHz VRAM. So basically the Vega would have been acting like a simple integrated GPU in an old Pentium...

Maxwell, for example, also has a particular L2 cache flaw that you can hit with high memory throughput, which will also cause a tipping point before your VRAM limit if the buffer sizes allow.
Please be sure you've googled your question before asking me about code... :T

Philip017
Fast Inserter
Posts: 232
Joined: Thu Sep 01, 2016 11:21 pm

Re: Friday Facts #264 - Texture streaming

Post by Philip017 » Sat Oct 13, 2018 3:38 am

this has been a very interesting read, i have never experienced any tearing, but have had the jittering when fps/ups drop abruptly.

i use windowed maximized option and i do like to tab out and have that happen quickly, where full screen i might get better performance, but if i ever need to tab out, it really hits the game performance hard and takes much longer to tab out and back (all games).

i have no idea how this stuff works, so i apologize if i am not making any sense on this:

i feel steam, clouds and other "pretty" items when disabled in the options should not be rendered or even called in any way, because i have not seen any performance reason to have steam off, when i walk by my reactor/steam setups, on or off. perhaps this can be something that can be looked into. optional/if possible these could also be completely disabled automatically if fps suffers when their render call is done.

i remember doodads could at one time be turned down/off at one point, perhaps this could once again be an option? maybe that was a mod. but still would be nice to have when performance falls.

also entities such as steam engines/turbines, reactors, boilers, beacons, miners, assembly machines, furnaces could have a low(er) priority if fps is suffering and be non-animated or animate slower.

best wishes and thanks for making this great game.

muzzy
Fast Inserter
Posts: 186
Joined: Sat Nov 23, 2013 7:17 am

Re: Friday Facts #264 - Texture streaming

Post by muzzy » Sat Oct 13, 2018 6:45 am

Try to minimize the discarding by applying a simplified mesh around the sprites so that most of the empty areas don't get rasterized at all. After all, the discarded pixels still get shaded...
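
The simplest form of this is to shrink the quad to the bounding box of the non-transparent pixels; a tighter polygon would save more. The sketch below shows only the bounding-box case, with the alpha channel given as a list of rows, which is an assumed representation for the example.

```python
def opaque_bounds(alpha_rows):
    """Return (left, top, right, bottom) of pixels with alpha > 0, or None."""
    xs = [x for row in alpha_rows for x, a in enumerate(row) if a > 0]
    ys = [y for y, row in enumerate(alpha_rows) if any(a > 0 for a in row)]
    if not xs:
        return None  # fully transparent sprite, nothing to draw
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def trimmed_fraction(alpha_rows):
    """Fraction of the full quad's pixels that the trimmed quad still covers."""
    bounds = opaque_bounds(alpha_rows)
    if bounds is None:
        return 0.0
    left, top, right, bottom = bounds
    full_area = len(alpha_rows) * len(alpha_rows[0])
    return (right - left) * (bottom - top) / full_area


# Assumed 4x4 sprite with a small visible region in the middle.
sprite_alpha = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 255, 0],
    [0, 0, 0, 0],
]
```

In this example the trimmed quad covers a quarter of the original area, so three quarters of the fragments would never be rasterized.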

Dinaroozie
Manual Inserter
Posts: 2
Joined: Sat Oct 13, 2018 5:25 am

Re: Friday Facts #264 - Texture streaming

Post by Dinaroozie » Sat Oct 13, 2018 8:10 am

I've gotta say, mad props to you for worrying about stuff like getting your game performant on integrated graphics at 4k resolution. With the preponderance of ultrabooks with high pixel density displays and no discrete graphics, I can totally see it becoming a more common use case. I used to play Factorio on a Surface Pro 2 and it ran great, while other games with a tenth as much happening would be dropping frames all over the place. You guys rule.

You mentioned smoke and trees are killing weaker GPUs at high display resolutions because of fill rate. It's my first post here, so I hope I'm not out of line/breaking forum etiquette/whatever if I share an idea or two on that front?

With smoke, I'm assuming that the smoke sprites basically render over the top of the rest of the scene. If that's the case, have you considered rendering the smoke particles to an off-screen buffer at a reduced resolution, then stretching the resulting texture over the main scene? For instance, if your main resolution is 4k, you could render smoke to a 1080p buffer, thus reducing your overdraw-heavy smoke rendering by a factor of four. You'd still have to render it over the main buffer at full resolution, but it would only be at most one pixel-write per pixel at the full resolution, rather than 10+. Smoke is usually a pretty good candidate for this because it tends not to have lots of fine detail and so the loss of resolution isn't very noticeable. If it comes out looking too blurry, you could try blending a fine noise texture in when you do the final smoke pass. And in any case, slightly blurry smoke is probably a good tradeoff for framerates on older/crappier GPUs.
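
To put rough numbers on that, here is a back-of-the-envelope estimate. It assumes the worst case where smoke covers the whole view with a fixed number of overlapping layers; the function name and the figure of 10 layers are illustrative only.

```python
def smoke_pixel_writes(width, height, overdraw_layers, scale=1.0):
    """Estimate pixel writes for smoke covering the whole view.

    With scale < 1 the smoke is drawn into a smaller off-screen buffer and
    then composited once over the full-resolution scene.
    """
    buffer_pixels = int(width * scale) * int(height * scale)
    smoke_writes = buffer_pixels * overdraw_layers
    composite_writes = width * height if scale < 1.0 else 0
    return smoke_writes + composite_writes


# 4K view, 10 layers of smoke: drawn directly vs. via a 1080p buffer.
direct = smoke_pixel_writes(3840, 2160, 10)
half_res = smoke_pixel_writes(3840, 2160, 10, scale=0.5)
```

Under these assumptions the half-resolution path does about 29 million writes instead of about 83 million, since the full-resolution composite pass eats part of the factor-of-four saving.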

This approach totally wouldn't work for trees, because they would look crappy at reduced resolution and because they have to get overdrawn by other, non-tree objects that are in front of them.

I have a slightly weirder idea that I'm much less confident in, but I'm just going to throw it out there, to see if it's of any use. I've always wondered if this would be a viable technique for a rendering environment like Factorio, but I've never worked with one so it's never come up.

I'm guessing currently you draw all your sprites back to front, blending over the top, hence lots of overdraw. You kinda have to, because you mentioned having lots of semi-transparent pixels, but I'm assuming there are also quite a lot of fully opaque pixels in your sprites (hopefully this is especially true for trees and other things that get rendered a lot). So my crazier idea is, could you do two rendering passes, one for the opaque pixels and one for the transparent ones, and use the depth buffer to prevent overdraw in the opaque pass (and limit it in the transparent pass)? Like:

1. Enable depth test/write. Render everything from front to back. Discard any fragments that are less than 100% opaque in the fragment shader. This will render most of the pixels, while only writing to every pixel exactly once.

2. Disable depth write, but keep the depth buffer from the first pass and leave depth-testing on. Use a fragment shader that discards fragments that are 100% opaque - only draw the ones that have more than zero but less than full alpha (i.e. the ones you didn't render the first time). Now, draw all the sprites again, from back to front this time.

As I say, I haven't personally tested this approach, but if I'm getting my theory right it would give you the same visual result (although I don't know how lighting works in Factorio - that might be a complicating factor). It would cost more in terms of geometry processing, draw calls, GPU state changes, and probably a few other things, but you'd save a lot of fill rate (and fragment shader calls, although I'm guessing that's less of a big deal). Whether it would actually be a net win, I have no clue, but if Factorio is fill rate bound at higher resolutions it might be worth a shot.
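
A small software simulation of the two passes, to check the claim that the result matches plain back-to-front drawing. This is a sketch under simplifying assumptions: one row of grayscale pixels, each sprite given as a dict from pixel index to `(color, alpha)`, and the list order used as depth (index 0 is farthest). It counts framebuffer writes only and says nothing about the extra draw calls or state changes.

```python
def blend(dst, src, alpha):
    """Standard "over" blend of a source fragment onto the framebuffer."""
    return src * alpha + dst * (1.0 - alpha)


def painters(sprites, width, background=0.0):
    """Draw sprites back to front; every visible fragment costs a write."""
    frame = [background] * width
    writes = 0
    for sprite in sprites:  # index 0 is the farthest sprite
        for x, (color, alpha) in sprite.items():
            if alpha > 0.0:
                frame[x] = blend(frame[x], color, alpha)
                writes += 1
    return frame, writes


def two_pass(sprites, width, background=0.0):
    """Opaque pass front to back with depth test, then translucent pass."""
    frame = [background] * width
    depth = [-1] * width  # -1 means nothing opaque drawn; larger is closer
    writes = 0
    # Pass 1: front to back, depth test and depth write on, opaque only.
    for layer in range(len(sprites) - 1, -1, -1):
        for x, (color, alpha) in sprites[layer].items():
            if alpha == 1.0 and layer > depth[x]:
                frame[x] = color
                depth[x] = layer
                writes += 1
    # Pass 2: back to front, depth test on, depth write off, translucent only.
    for layer, sprite in enumerate(sprites):
        for x, (color, alpha) in sprite.items():
            if 0.0 < alpha < 1.0 and layer > depth[x]:
                frame[x] = blend(frame[x], color, alpha)
                writes += 1
    return frame, writes


# Assumed test scene: three overlapping sprites on a 4-pixel row.
scene = [
    {0: (0.2, 1.0), 1: (0.2, 1.0), 2: (0.2, 1.0), 3: (0.2, 1.0)},
    {0: (0.5, 1.0), 1: (0.5, 0.5), 2: (0.5, 1.0)},
    {1: (0.9, 1.0), 2: (0.9, 0.25), 3: (0.9, 0.0)},
]
painter_frame, painter_writes = painters(scene, 4)
two_pass_frame, two_pass_writes = two_pass(scene, 4)
```

In this scene both approaches produce the same pixels, and the two-pass version writes fewer of them because hidden opaque and translucent fragments fail the depth test.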

arrow in my gluteus
Inserter
Posts: 21
Joined: Mon Apr 24, 2017 1:52 pm

Re: Friday Facts #264 - Texture streaming

Post by arrow in my gluteus » Sat Oct 13, 2018 8:24 am

Koub wrote:
Fri Oct 12, 2018 9:31 pm
Sander_Bouwhuis wrote:
Fri Oct 12, 2018 8:57 pm
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?
Image
stargate!

Xerophyte
Manual Inserter
Posts: 4
Joined: Sat Dec 19, 2015 9:47 pm

Re: Friday Facts #264 - Texture streaming

Post by Xerophyte » Sat Oct 13, 2018 8:48 am

I imagine you thought of all these, but:

1. From the overdraw image it looks like you're drawing your sprites as quads, regardless of their shape. You could try fitting a more complex polygon to each frame of animation in order to have a tighter fitting mesh in order to not draw quite so many fully transparent pixels. Using some form of k-DOP shouldn't add too much complexity and should still be fairly simple to generate automatically on loading a mod asset.

2. Relatedly, it looks like you have a lot of overdraw for opaque pixels. A more complex approach is to split each sprite's mesh into an opaque and maybe-transparent part. Then you could draw the opaque submeshes first in (approximate) front to back and stencil or z-test to cull stuff behind. You'd only use painter's for the maybe transparent parts, and could still cull them if they're fully occluded. Programmatically finding good mesh splits for mod assets and the like seems like possibly a very large headache, though.

3. Smoke seems like a big culprit. Procedural was mentioned, but it seems like you additionally have the option of binning the smoke particles by position, passing the particle bins to the shader and drawing them all in a single global pass where the shader iterates over particles. That likely won't be an improvement as long as the smoke particle appearance itself is an image, though.

[Edit]: Think of it as basically doing lighting in a tiled forward or deferred renderer, but instead of binning lights per render tile you bin smoke particles. It might actually be worth it even with image-based particles if you can make the tiles align with your warps, but I think that's really going to vary from scene to scene.
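
The binning step from point 3 might look something like this on the CPU side. It is a sketch only: particles are assumed to be `(x, y, radius)` tuples in screen pixels, and each particle is added to every tile its bounding square touches, so a per-tile pass would only need to loop over that tile's list.

```python
from collections import defaultdict


def bin_particles(particles, tile_size):
    """Map each screen tile to the particles whose bounding square touches it.

    `particles` is a list of (x, y, radius) tuples in screen pixels; this
    layout is an assumption made for the sketch.
    """
    bins = defaultdict(list)
    for index, (x, y, radius) in enumerate(particles):
        first_tx = int((x - radius) // tile_size)
        last_tx = int((x + radius) // tile_size)
        first_ty = int((y - radius) // tile_size)
        last_ty = int((y + radius) // tile_size)
        for ty in range(first_ty, last_ty + 1):
            for tx in range(first_tx, last_tx + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


# Two assumed particles; the second one straddles a tile corner.
particles = [(10, 10, 4), (15, 15, 4)]
bins = bin_particles(particles, tile_size=16)
```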

eradicator
Smart Inserter
Posts: 2604
Joined: Tue Jul 12, 2016 9:03 am

Re: Friday Facts #264 - Texture streaming

Post by eradicator » Sat Oct 13, 2018 8:58 am

arrow in my gluteus wrote:
Sat Oct 13, 2018 8:24 am
Koub wrote:
Fri Oct 12, 2018 9:31 pm
Sander_Bouwhuis wrote:
Fri Oct 12, 2018 8:57 pm
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?
Image
stargate!
Image
Author of: Hand Crank Generator, Screenshot Hotkey 2.0
Mod support languages: 日本語, Deutsch, English

Zaflis
Fast Inserter
Posts: 191
Joined: Sun Apr 24, 2016 12:51 am

Re: Friday Facts #264 - Texture streaming

Post by Zaflis » Sat Oct 13, 2018 9:22 am

Oktokolo wrote:
Fri Oct 12, 2018 8:15 pm
sthalik wrote:
Fri Oct 12, 2018 7:35 pm
Oktokolo wrote:
Fri Oct 12, 2018 7:15 pm
If for some reason, VRAM use ever becomes an issue, you could give players the option to only use any 2nd, 3rd or 4th frame of each animation, so that only animation FPS suffers instead of having the entire scene FPS drop. That way, animations would lag but player movement would still be butter smooth.
It doesn't solve the suboptimal memory management problem. Worst-case scenario is that the memory management problem is pushed back to lower priority :(
It might indeed solve the memory management problem. After reducing the number of animation sprites, the remaining sprites could fit into the VRAM all at once. That would remove the need to constantly load and unload sprites. Without deallocation, there is no fragmentation. Without fragmentation they do not need to implement the defragmentation.
Better memory management comes with a runtime cost and is nontrivial to implement and test.
Dropping half or more of the animation sprites could be done once at loading the game and is easy to implement and test.
If i would have a VRAM-limited system, i would want more complicated memory management pushed back in favor of implementing something that is fast to implement and guaranteed to reduce the VRAM load dramatically...
You know, cars and tanks (and other vehicles from mods... I have many) have many sprites and their shadows. If the game really wants to optimize, it could track vehicle types that no player is currently driving, or that have not been used within a minute. No real reason to keep all the rotations of it loaded, just ones that are present in game world.

In a modded world it is even possible that no vanilla trains are placed in the world, if one uses only some modded high tier ones with different sprites.
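
One common way to track "recently used" sprites is a least-recently-used list with a fixed capacity. The sketch below is a generic illustration of that policy and makes no claim about how the game decides what stays in VRAM; the class name, sprite names, and capacity are invented for the example.

```python
from collections import OrderedDict


class SpriteResidency:
    """Tracks which sprites are loaded, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._resident = OrderedDict()

    def request(self, sprite_id):
        """Mark a sprite as used; return True if it had to be loaded."""
        if sprite_id in self._resident:
            self._resident.move_to_end(sprite_id)  # now the most recent
            return False
        self._resident[sprite_id] = True
        if len(self._resident) > self.capacity:
            self._resident.popitem(last=False)  # drop the oldest entry
        return True

    def resident(self):
        """Loaded sprites, ordered from least to most recently used."""
        return list(self._resident)
```

With a capacity of two, requesting `car-north`, `car-east`, `car-north` and then `tank-north` would evict `car-east`, the rotation that went unused the longest.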

posila
Factorio Staff
Posts: 2755
Joined: Thu Jun 11, 2015 1:35 pm

Re: Friday Facts #264 - Texture streaming

Post by posila » Sat Oct 13, 2018 9:35 am

Sander_Bouwhuis wrote:
Fri Oct 12, 2018 8:57 pm
A quick question though: is VRAM the amount of GB you have on the GPU?
I have a GTX-660 with 2GB and 32GB of system RAM.
Does that mean I have only 2GB of VRAM, and therefore belong to the bottom 25% of players?
More like bottom 40% of players who have dedicated graphics card.
mrudat wrote:
Fri Oct 12, 2018 10:05 pm
Have you compared the performance of compositing sprites on-the-fly in the GPU, as opposed to pre-compositing a stack of sprites (probably also using the GPU, but as part of loading the sprites) and storing the results in VRAM?

In theory, for an electric-mining-drill pre-compositing could potentially reduce vram usage (given that there are two states, one being the base textures, the other made of the base textures plus more than two additional transparent textures), and would also reduce overdraw, as I believe it should be able to be reduced to either a single texture, or two textures, object/shadow (I believe that the shadow texture can be more heavily compressed?).

That said, I believe that it is possibly the single entity with the largest number of texture layers (or at least one of the largest I've come across).
Not quite. If entities have multiple layers there is usually some reason for it - either some of the layers are modified dynamically at runtime, or something else needs to be drawn in between the layers (for example a robot flying out of a roboport), or so we can create permutations dynamically and save VRAM. We thought about precompositing item icons with the icon background for alt view, but it's a minor thing and it's always a tough decision whether we want to use more VRAM as a trade-off for less GPU or CPU work. But you are right, the electric mining drill might not need to have a second shadow overlay + fluid connector overlay; it could have been just another full mining drill animation with all the layers baked in, and it wouldn't use more VRAM (unless you have texture compression turned on, which makes shadows compressed). However, the electric mining drill sprites are not final, so I was not really thinking about how to optimize the current version.
Some time ago we tried to precompose items on belts - it was an improvement for the CPU in the prepare step, as fewer draw orders were generated.
mrudat wrote:
Fri Oct 12, 2018 10:05 pm
Another thought on pre-compositing; would it be feasible to flag and store pre-composited static sprites between frames? eg. create a (set of?) screen/chunk-sized textures that contain non-animated sprites; you potentially draw twice as many layers, but, trees, for example, only draw individual trees when a chunk comes on-screen, and after that a chunk of nothing but trees would draw perhaps three textures; the ground texture, shadow layer, (entities that move, eg. the player) and tree layer.
I am not sure how we would make sure that moving entities overlap with the tree layer properly; imho any movement on a chunk would prevent the chunk from being cached this way. Anyway, we are moving more in the direction of the world being less static.
knightelite wrote:
Sat Oct 13, 2018 2:51 am
Now, I'm not saying you should replace all graphics in such a way as this, but you might be able to make use of it for some layers of the renderer if you aren't doing this already. Cloud shadows, for instance, seem like they might be a prime candidate. Render a larger than screen size set of cloud shadows once, and just slide it over the screen one pixel at a time for instance to handle the cloud motion. Similarly for full conveyor belts; take the rendered image of the entire belt and slide it over a pixel or two each frame, etc...
It is very cheap to figure out how cloud shadows should be rendered, but the fact that the GPU has to go fetch a pixel from a texture and blend it with the background for every pixel of the game view is expensive. So this technique would not help cloud shadows at all. But ... going through terrain tiles and figuring out what sprite should be rendered (and what transitions between terrains should be rendered) is quite expensive, so we already do the caching you proposed for terrain. As the player moves, we shift the terrain rendered in the previous frame and render just the tiles that were not visible before. Thanks for the write-up on MPEG, though :)
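
The terrain caching described above can be modelled roughly as follows. This is a simplified sketch: the real thing shifts an already rendered texture, while here the cache is a dictionary of per-tile results, and `render_tile` stands in for the expensive work of choosing sprites and transitions. All names are hypothetical.

```python
def visible_tiles(left, top, width, height):
    """Set of tile coordinates covered by the current view."""
    return {(x, y) for x in range(left, left + width)
            for y in range(top, top + height)}


class TerrainCache:
    """Keeps tiles that stay on screen and renders only newly visible ones."""

    def __init__(self, render_tile):
        self._render_tile = render_tile  # expensive per-tile work
        self._tiles = {}

    def update(self, left, top, width, height):
        """Move the view; return how many tiles had to be rendered."""
        wanted = visible_tiles(left, top, width, height)
        kept = {pos: self._tiles[pos] for pos in wanted if pos in self._tiles}
        new_positions = wanted.difference(kept)
        for pos in new_positions:
            kept[pos] = self._render_tile(pos)
        self._tiles = kept  # tiles that scrolled off screen are dropped
        return len(new_positions)
```

Moving a 4x3 view one tile to the right renders only the new 3-tile column; the other nine tiles are carried over from the previous frame.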
Philip017 wrote:
Sat Oct 13, 2018 3:38 am
i remember doodads could at one time be turned down/off at one point, perhaps this could once again be an option? maybe that was a mod. but still would be nice to have when performance falls.
Smoke, doodads (decoratives), cloud shadows, can still be turned off in graphics options and are not rendered at all if you turn them off. It would be nice though, if the game always performed well with doodads turned on, as they are quite essential for the game visuals.

Unknow0059
Burner Inserter
Posts: 12
Joined: Tue Aug 08, 2017 7:37 pm

Re: Friday Facts #264 - Texture streaming

Post by Unknow0059 » Sat Oct 13, 2018 9:43 am

Why do the shadows have their own dedicated sprites?

Is that really less performance intensive than generating one in-game by using the already existing sprites?
Uh... yea

posila
Factorio Staff
Posts: 2755
Joined: Thu Jun 11, 2015 1:35 pm

Re: Friday Facts #264 - Texture streaming

Post by posila » Sat Oct 13, 2018 10:24 am

Paul17041993 wrote:
Sat Oct 13, 2018 3:19 am
Just a note here that linear scaling almost never exists with GPUs; there are quite a lot of factors that control the throughput of certain operations. Vega64, for example, will have very close to the same copy-through performance from 4K down to 100x100px, as it's the scheduling and cache latency that affects it until you actually hit the bandwidth tipping point, at which point you'll see a sharp climb in latency.

Power levels are another thing as well. Seeing as the Vega got 1.5ms for, I assume, a simple buffer copy, it was likely still in power state 0 and running at only 80MHz core, 167MHz VRAM. So basically the Vega would have been acting like a simple integrated GPU in an old Pentium...

Maxwell, for example, also has a particular L2 cache flaw that you can hit with high memory throughput, which will also cause a tipping point before your VRAM limit if the buffer sizes allow.
0.15ms, but still kind of slow for what one would expect from Vega on essentially a buffer copy, I suppose. Thanks for the info; do you have any recommendations for articles/books/other materials about this topic I should read?
muzzy wrote:
Sat Oct 13, 2018 6:45 am
Try to minimize the discarding by applying a simplified mesh around the sprites so that most of the empty areas don't get rasterized at all. After all, the discarded pixels still get shaded...
That's a good idea, thanks.
Dinaroozie wrote:
Sat Oct 13, 2018 8:10 am
With smoke, I'm assuming that the smoke sprites basically render over the top of the rest of the scene. If that's the case, have you considered rendering the smoke particles to an off-screen buffer at a reduced resolution, then stretching the resulting texture over the main scene? For instance, if your main resolution is 4k, you could render smoke to a 1080p buffer, thus reducing your overdraw-heavy smoke rendering by a factor of four. You'd still have to render it over the main buffer at full resolution, but it would only be at most one pixel-write per pixel at the full resolution, rather than 10+. Smoke is usually a pretty good candidate for this because it tends not to have lots of fine detail and so the loss of resolution isn't very noticeable. If it comes out looking too blurry, you could try blending a fine noise texture in when you do the final smoke pass. And in any case, slightly blurry smoke is probably a good tradeoff for framerates on older/crappier GPUs.
We do this for lights (that's what is modified by "Lights render resolution" in graphics settings), and I did try to do it for smoke too, but it didn't seem to improve performance compared to enabling mipmaps on smoke sprites, so that's what I did instead. But I didn't test it at 4K, so I was probably CPU bound at that point. I will implement and measure it again, as we can now time GPU execution too. We will also probably add an option, as an alternative to low and very-low sprite quality, to render the game view in a lower resolution, upscale it to screen resolution and then render the GUI in native screen resolution.
Dinaroozie wrote:
Sat Oct 13, 2018 8:10 am
I have a slightly weirder idea that I'm much less confident in, but I'm just going to throw it out there, to see if it's of any use. I've always wondered if this would be a viable technique for a rendering environment like Factorio, but I've never worked with one so it's never come up.
...
I don't know. The idea is not weird, I like it. But I think we don't have enough overdraw of opaque pixels to pay for the cost of rendering the scene twice.
Xerophyte wrote:
Sat Oct 13, 2018 8:48 am
3. Smoke seems like a big culprit. Procedural was mentioned, but it seems like you additionally have the option of binning the smoke particles by position, passing the particle bins to the shader and drawing them all in a single global pass where the shader iterates over particles. That likely won't be an improvement as long as the smoke particle appearance itself is an image, though.

[Edit]: Think of it as basically doing lighting in a tiled forward or deferred renderer, but instead of binning lights per render tile you bin smoke particles. It might actually be worth it even with image-based particles if you can make the tiles align with your warps, but I think that's really going to vary from scene to scene.
I read about a similar idea in GPU Pro 5; it was called Tiled Deferred Blending. It is definitely something I want to try out. I also want to try combining the tree trunk and leaves into the drawing of a single polygon. We have separate sprites for leaves as they are tinted dynamically (and change tint as the tree consumes pollution). Thanks for your suggestions.

Since I am not occupied by writing Friday Facts anymore, I had time to think about the overdraw visualizations we used in the FFF, and found a bug that caused smokes to draw themselves fully opaque up to 120 ticks after the end of their lifetime. Fixing that reduced the number of smoke sprites drawn by steam turbines by 15 - 20%.
Zaflis wrote:
Sat Oct 13, 2018 9:22 am
No real reason to keep all the rotations of it loaded, just ones that are present in game world.
That's the main purpose of texture streaming :)

eradicator
Smart Inserter
Posts: 2604
Joined: Tue Jul 12, 2016 9:03 am

Re: Friday Facts #264 - Texture streaming

Post by eradicator » Sat Oct 13, 2018 10:40 am

Unknow0059 wrote:
Sat Oct 13, 2018 9:43 am
Why do the shadows have their own dedicated sprites?

Is that really less performance intensive than generating one in-game by using the already existing sprites?
Game objects are modelled as 3D objects and are then projected ('rendered') onto a 2D plane ('sprite'). The shadow has to be calculated from the 3D object. So you can't "generate" a shadow live, because the sprite lacks the 3D information required. Imagine trying to guess the height of a perfect cuboid when you only see one side of it.
Author of: Hand Crank Generator, Screenshot Hotkey 2.0
Mod support languages: 日本語, Deutsch, English

luc
Fast Inserter
Posts: 115
Joined: Sun Jul 17, 2016 9:53 pm

Re: Friday Facts #264 - Texture streaming

Post by luc » Sat Oct 13, 2018 10:49 am

Trees are always an issue in large maps: the game already spends a lot of time on entity updates, so very little time is left for drawing graphics.

Smoke can be turned off, trees cannot. If you could find some solution for trees, I think that would solve 90% of the cases where my framerate drops below 30 fps (that's where I roughly draw the line between playable and unplayable). For example if there were 5 sprites for forests with different tree densities (and maybe 2 or 3 different types of trees), that would be fine by me. When I zoom in and there are only a few trees visible, individual trees can be drawn, but most of the time I really don't care where each individual tree is.

eradicator
Smart Inserter
Posts: 2604
Joined: Tue Jul 12, 2016 9:03 am

Re: Friday Facts #264 - Texture streaming

Post by eradicator » Sat Oct 13, 2018 11:48 am

luc wrote:
Sat Oct 13, 2018 10:49 am
Smoke can be turned off, trees cannot.
If they're a major problem for you, you can disable them in the map generator settings or remove them via command. But ofc that means they just stop existing. So the whole map becomes a grass plain, which might be rather boring to look at, and you don't have any source of wood. (Just as a quick workaround.)
Author of: Hand Crank Generator, Screenshot Hotkey 2.0
Mod support languages: 日本語, Deutsch, English
