Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 2:20 pm
by Klonan

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 2:28 pm
by dee-
That was great :)

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 2:32 pm
by Nova
"Sorry" for technical articles? You don't have to be, many readers like them. :)

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 2:59 pm
by ownlyme
i thought you'd use a new graphics engine for 0.17 to make the game smoother...
something like having the gpu detached from cpu calculations to make running/driving around much smoother..

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:04 pm
by meganothing
Mmmh, Vega 64 50% slower than GTX1060 in your render test even though it should be exactly the other way round. Do you know why?

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:06 pm
by gamah
Without pretending to know exactly how SDL and GPUs work... why do frames that can't be rendered in time get dropped?

Shouldn't the game be writing (ideally, only the changed pixels) to some sort of frame buffer that is constantly updated as the screen is polled? My understanding of screen tearing is that it is the result of frames that take too long to draw. Drivers have built-in ways to handle capping frame rates to reduce tearing (v-sync), why hard code it?

Someone more knowledgeable please correct and enlighten me...

woo first post

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:13 pm
by Durabys
Klonan wrote: Fri Oct 12, 2018 2:20 pm https://www.factorio.com/blog/post/fff-264
posila wrote:No wonder, scenes heavy on smoke or trees can tank FPS, especially in 4K. Maybe we should do something about that...
FUCK YES MR. OBVIOUS YOU SHOULD HAVE DONE SOMETHING ABOUT THAT!!!
We have been screaming at you for the last two years to do something about that.

Sorry. Had to get it out. Sorry again. :oops:

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:17 pm
by ledow
ownlyme wrote: Fri Oct 12, 2018 2:59 pm i thought you'd use a new graphics engine for 0.17 to make the game smoother...
something like having the gpu detached from cpu calculations to make running/driving around much smoother..
I don't think you understand how it works.

The GPU is independent. But it has to draw what the CPU has told it is there.

The graphics engine change is, I believe, Allegro -> SDL. That's a backend change that isn't going to do anything noticeable by default. It's a programming change, not an instant performance enhancement. Precisely because behind both, you're actually using OpenGL to draw the game (or else your performance would be atrocious!) and the same OpenGL commands are happening whether under Allegro or SDL.

The performance changes come from knowing what the game is doing, what's unnecessary, what the hardware can do better, what the bottlenecks are, and squeezing what you have to send from CPU to GPU each frame and how many times you have to do that. Which is what this entire article is about - optimising the OpenGL and unnecessary stuff that you're asking the GL driver to do.

To be honest, sprite atlases / mega textures, and optimising texture streaming like this are just par-for-the-course. It's a nice article, interesting to the techy gamer, but any DirectX/OpenGL programmer has been using these techniques for years. If you want "impressive" articles about mega-optimisation of these paths, there's a lovely one on the programming behind GTA V which has something ridiculous like hundreds of texture etc. layers per frame, all perfectly optimised and information crammed into them and used for all kinds of rendering tricks.
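
For anyone who hasn't met a sprite atlas: the idea is to pack many small sprites into one large texture, so they can be drawn without switching textures. Below is a minimal sketch of a naive "shelf" packer. The function names, the sprite sizes and the packing strategy are all illustrative assumptions, not how Factorio builds its atlases.

```python
def pack_shelves(sprites, atlas_width):
    """Place (name, w, h) sprites left to right in rows ("shelves").

    Returns ({name: (x, y, w, h)}, atlas_height). Hypothetical helper.
    """
    placements = {}
    x = y = shelf_height = 0
    # Tallest first, so each shelf wastes less vertical space.
    for name, w, h in sorted(sprites, key=lambda s: -s[2]):
        if w > atlas_width:
            raise ValueError(f"{name} is wider than the atlas")
        if x + w > atlas_width:      # shelf is full: start a new one below
            y += shelf_height
            x = shelf_height = 0
        placements[name] = (x, y, w, h)
        x += w
        shelf_height = max(shelf_height, h)
    return placements, y + shelf_height


def uv_rect(rect, atlas_w, atlas_h):
    """Convert a pixel rect to normalised texture coordinates (u0, v0, u1, v1)."""
    x, y, w, h = rect
    return (x / atlas_w, y / atlas_h, (x + w) / atlas_w, (y + h) / atlas_h)


# Made-up sprite sizes, purely for illustration.
sprites = [("belt", 64, 64), ("inserter", 64, 96), ("tree", 128, 160), ("smoke", 32, 32)]
placements, height = pack_shelves(sprites, atlas_width=256)
belt_uv = uv_rect(placements["belt"], 256, height)
```

Each sprite is then drawn by sampling its own sub-rectangle of the one big texture, which is why the draw calls can be batched.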

Factorio just seems to use the same techniques as I have found myself by flicking through OpenGL tutorials to make a basic 2D isometric game in SDL that uses OpenGL to speed up the graphics bit. I'll save you the 14 versions of that game "graphics engine" that used everything from direct bitmap calls, to immediate-mode OpenGL, to display lists, to vertex buffers, etc. as the technology became available and as I found out about them and rewrote the drawing bits.

Pretty much, Factorio pushes almost everything it can to the GPU. It may push a bit too much, in fact, and it's about learning the balance between "let the GPU do everything" and "the GPU is doing everything and it can't keep up". It's always a very delicate balance.

I'm surprised that there's quite so much over-drawing, but more surprised that the steam graphics are still as slow as they are. Presumably it's using some kind of fog? I wouldn't be surprised if you could speed up things a lot with "basic smoke" being just a transparent animated texture. I know my laptop slows when I'm in a smoky area - whether that's near loads of steam engines, or inside a forest fire.

Such optimisation is hard though - I only have 1GB of VRAM. I wouldn't want them to over-optimise towards massive-VRAM machines, it would literally kill playability of the game for me. Equally, I don't want to hold back those people who want to play in Extreme mode with 8K resolution and make it look worse for them "for compatibility" with my machine.

There are tools which can intercept OpenGL calls and display them in a separate program while your game is running, for debugging, etc. and show you where you're filling up memory and hitting performance limits and statistics on what kind of rendering you're causing the graphics card to do. I know I used them and was surprised how easy it was to push stuff to even a high-end graphics card and bring it to a grinding halt.

Don't forget that even Chrome uses OpenGL etc. nowadays, not to mention anything playing video, so having it open even in the background means it's fighting the game for resources all the time. People have some unrealistic expectations of the game, just because it "looks" a bit like an old retro game if you zoom in.

Fact is, Factorio quite stresses a gaming machine as it is. The only optimisations you can do are going to be largely clever-tricks to make the game do things that look the same but under the hood are vastly different to what you expect (like the belt management - instead of having every square of belt be one entity, treat them as "runs" of belts between corners/splitters/etc., thus reducing the number of calculations you have to do - that's one very difficult to maintain, easy-to-break and hard-to-write piece of code!).
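
The "runs of belts" idea can be shown with a toy model. The sketch below is a heavy simplification under stated assumptions (one straight line of tiles, made-up tile names, a hypothetical `belt_runs` helper) and is not Factorio's actual code. It only shows how many per-tile entities collapse into a few runs.

```python
from itertools import groupby


def belt_runs(tiles):
    """Collapse consecutive plain belt tiles into (start_index, length) runs.

    `tiles` is a list of strings along one belt line; anything other than
    "belt" (corner, splitter, ...) ends the current run. Hypothetical model.
    """
    runs = []
    index = 0
    for is_belt, group in groupby(tiles, key=lambda t: t == "belt"):
        length = len(list(group))
        if is_belt:
            runs.append((index, length))
        index += length
    return runs


# 10 belt tiles separated by a splitter and a corner.
line = ["belt"] * 5 + ["splitter"] + ["belt"] * 3 + ["corner"] + ["belt"] * 2
runs = belt_runs(line)
```

Here ten belt tiles become three runs, so per-tick bookkeeping would touch three objects rather than ten.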

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:29 pm
by ledow
gamah wrote: Fri Oct 12, 2018 3:06 pm Without pretending to know exactly how SDL and GPUs work... why do frames that can't be rendered in time get dropped?

Shouldn't the game be writing (ideally, only the changed pixels) to some sort of frame buffer that is constantly updated as the screen is polled? My understanding of screen tearing is that it is the result of frames that take too long to draw. Drivers have built-in ways to handle capping frame rates to reduce tearing (v-sync), why hard code it?

Someone more knowledgeable please correct and enlighten me...

woo first post
At 60fps you have 16.67ms to do everything to get the next frame ready. If you miss that deadline, you miss the vsync, and your frame never shows (or shows late, which means the next frame is a frame behind, which builds up and "60 frames" actually takes "90 frames" to draw). Your FPS plummets and the game runs slow.

VSync is literally saying "THE NEXT FRAME IS GOING ON THE SCREEN NOW". You give it the graphics card data before then, or it doesn't get pushed to the screen.

With VSync off, you are able to do what you like in whatever time-frame. But quite literally, you're drawing direct to the screen in between "blinks" and it results in tearing (where the top of the screen got updated in time, but the bottom is actually now the NEXT frame, so you end up with 2 or even 3, 4 frames of data... the top quarter of your screen is still showing frame 1, the next quarter frame 2, etc.). This looks ugly, real ugly, and manifests as lines that scroll down your screen as you move.

The fix is no vsync, but double-buffering. You don't draw to the screen AT ALL. You draw into RAM. Only when you have the entire frame drawn do you then copy the whole thing to the screen. This doesn't need vsync. It doesn't tear. But if you took more than 16.66ms to draw that frame and copy it from RAM to GPU (which can be slow!), then it will show up as an FPS drop again.

Literally, 60 times a second the card is saying "GIVE ME DATA". If you get it there on time, it draws it and the user never knows. If you don't, it continues to show old data. So your screen still refreshes at 60Hz, but you're only getting 50 FPS out of it, and so on.

Basically, you've got a deadline. No matter what tricks you use, if you don't meet it, FPS or the actual display of your game suffers such that people notice and complain. If you meet it, the graphics feel smooth.

And 16.66ms is NOT an awful lot of time. The article itself shows, it can take 2ms just to put a part of a frame into GPU memory. Not including what the OS is doing, what your game is calculating, getting those textures into RAM beforehand, dealing with sound and input and networking and everything else. Do your graphics naively and you can drop to single-digit FPS immediately. Overload the graphics card and you can do the same. Hold up the CPU too long and it will never get the data out in time.

What your graphics driver does is shove data to the card, and tell you when the next frame is going to start drawing on the screen. That's *IT*. VSync or not doesn't change the 16.66ms time limit to make the graphics look smooth.

EDIT: In fact, GPU VSync options normally do nothing more than MAKE YOU WAIT if you completed your frame before 16.66ms. Literally... I will spin and do nothing because it's not time to draw yet. That's why turning VSync off makes things work faster, but results in more tearing, and more CPU usage - because the CPU is "not allowed" to do anything more until the time is right and the frame is drawn. (Not quite true on modern multi-core/processor systems, but vsync basically just hangs up the drawing CPU until it's actually time to draw again. P.S. if you have VSync on, and give it a frame at 16.8ms, it will wait until the next frame before it does anything... which will be at 33.33 ms... an entire frame later, because you missed the call by 0.1 of a ms).
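
The arithmetic above can be checked in a few lines. This sketch assumes a fixed 60 Hz display and a deliberately simple model in which a finished frame just waits for the next refresh. Real drivers and compositors queue frames in more complicated ways, and the function names are made up for illustration.

```python
import math

REFRESH_HZ = 60
FRAME_BUDGET_MS = 1000 / REFRESH_HZ   # about 16.67 ms


def display_time_ms(render_ms):
    """Time at which a frame that took `render_ms` reaches the screen with vsync on."""
    refreshes = max(1, math.ceil(render_ms / FRAME_BUDGET_MS))
    return refreshes * FRAME_BUDGET_MS


def effective_fps(render_ms):
    """Frame rate if every frame takes `render_ms` and must wait for a refresh."""
    return 1000 / display_time_ms(render_ms)


late = display_time_ms(16.8)      # just misses the first refresh
on_time = display_time_ms(10.0)   # finishes early, waits for the first refresh
```

Under this model a frame that takes 16.8 ms is shown at about 33.33 ms, so a game that always needs 16.8 ms per frame would run at 30 FPS rather than 60.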

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:42 pm
by SuperSandro2000
Sorry, not sorry.

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 3:43 pm
by gamah
ledow wrote: Fri Oct 12, 2018 3:29 pm
gamah wrote: Fri Oct 12, 2018 3:06 pm Without pretending to know exactly how SDL and GPUs work... why do frames that can't be rendered in time get dropped?

Shouldn't the game be writing (ideally, only the changed pixels) to some sort of frame buffer that is constantly updated as the screen is polled? My understanding of screen tearing is that it is the result of frames that take too long to draw. Drivers have built-in ways to handle capping frame rates to reduce tearing (v-sync), why hard code it?

Someone more knowledgeable please correct and enlighten me...

woo first post
At 60fps you have 16.67ms to do everything to get the next frame ready. If you miss that deadline, you miss the vsync, and your frame never shows (or shows late, which means the next frame is a frame behind, which builds up and "60 frames" actually takes "90 frames" to draw). Your FPS plummets and the game runs slow.

VSync is literally saying "THE NEXT FRAME IS GOING ON THE SCREEN NOW". You give it the graphics card data before then, or it doesn't get pushed to the screen.

With VSync off, you are able to do what you like in whatever time-frame. But quite literally, you're drawing direct to the screen in between "blinks" and it results in tearing (where the top of the screen got updated in time, but the bottom is actually now the NEXT frame, so you end up with 2 or even 3, 4 frames of data... the top quarter of your screen is still showing frame 1, the next quarter frame 2, etc.). This looks ugly, real ugly, and manifests as lines that scroll down your screen as you move.

The fix is no vsync, but double-buffering. You don't draw to the screen AT ALL. You draw into RAM. Only when you have the entire frame drawn do you then copy the whole thing to the screen. This doesn't need vsync. It doesn't tear. But if you took more than 16.66ms to draw that frame and copy it from RAM to GPU (which can be slow!), then it will show up as an FPS drop again.

Literally, 60 times a second the card is saying "GIVE ME DATA". If you get it there on time, it draws it and the user never knows. If you don't, it continues to show old data. So your screen still refreshes at 60Hz, but you're only getting 50 FPS out of it, and so on.

Basically, you've got a deadline. No matter what tricks you use, if you don't meet it, FPS or the actual display of your game suffers such that people notice and complain. If you meet it, the graphics feel smooth.

And 16.66ms is NOT an awful lot of time. The article itself shows, it can take 2ms just to put a part of a frame into GPU memory. Not including what the OS is doing, what your game is calculating, getting those textures into RAM beforehand, dealing with sound and input and networking and everything else. Do your graphics naively and you can drop to single-digit FPS immediately. Overload the graphics card and you can do the same. Hold up the CPU too long and it will never get the data out in time.

What your graphics driver does is shove data to the card, and tell you when the next frame is going to start drawing on the screen. That's *IT*. VSync or not doesn't change the 16.66ms time limit to make the graphics look smooth.

EDIT: In fact, GPU VSync options normally do nothing more than MAKE YOU WAIT if you completed your frame before 16.66ms. Literally... I will spin and do nothing because it's not time to draw yet. That's why turning VSync off makes things work faster, but results in more tearing, and more CPU usage - because the CPU is "not allowed" to do anything more until the time is right and the frame is drawn. (Not quite true on modern multi-core/processor systems, but vsync basically just hangs up the drawing CPU until it's actually time to draw again. P.S. if you have VSync on, and give it a frame at 16.8ms, it will wait until the next frame before it does anything... which will be at 33.33 ms... an entire frame later, because you missed the call by 0.1 of a ms).
16.6ms is eons in computer-speak, about 50 million CPU cycles at 3 GHz... Instead of dropping a frame that can't be drawn, wouldn't it be more visually pleasing to have a subroutine hold on to whatever the current/last frame data was, and constantly feed that at 60 fps whether or not you can update the whole thing? So what if the bottom 200 lines in the next screen draw are from the previous frame... at least the first 880 can be updated with the current frame instead of abandoning all of that work entirely...

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 4:08 pm
by posila
meganothing wrote: Fri Oct 12, 2018 3:04 pm Mmmh, Vega 64 50% slower than GTX1060 in your render test even though it should be exactly the other way round. Do you know why?
I didn't think anything of it, since what I measured was essentially a glorified mem-copy and the result was in the ballpark of what I expected. Maybe I made a mistake. I shall investigate.

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 4:10 pm
by dzaima
gamah wrote: Fri Oct 12, 2018 3:43 pm 16.6ms is eons in computer-speak, about 50 million CPU cycles at 3 GHz... Instead of dropping a frame that can't be drawn, wouldn't it be more visually pleasing to have a subroutine hold on to whatever the current/last frame data was, and constantly feed that at 60 fps whether or not you can update the whole thing? So what if the bottom 200 lines in the next screen draw are from the previous frame... at least the first 880 can be updated with the current frame instead of abandoning all of that work entirely...
"wouldn't it be more visually pleasing to have a subroutine hold on to whatever the current/last frame data was" that's exactly what turning off vsync is for. Those 200 lines being shifted to wherever you were in the previous frame are very ugly, and that's what vsync fixes.
Also, 50M CPU cycles is not that much. Divide that by the pixel amount in a 1080×1920 screen, and you're left with 24 cycles per pixel for everything for it. (though we have GPUs for per-pixel parallel processing, so CPUs shouldn't be involved.)
I'd assume that the work isn't abandoned with vsync on, that it's finished and pushed for next frame, otherwise you wouldn't get a frame drawn, ever, if every frame consistently takes more than 16.6ms.
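
The figures in this exchange are easy to reproduce. A quick check, assuming a single 3 GHz core, a 60 Hz refresh and a 1920×1080 screen, and ignoring that the GPU does the per-pixel work in parallel:

```python
CPU_HZ = 3_000_000_000          # 3 GHz, as in the quoted post
REFRESH_HZ = 60
WIDTH, HEIGHT = 1920, 1080

cycles_per_frame = CPU_HZ // REFRESH_HZ          # 50_000_000
pixels = WIDTH * HEIGHT                          # 2_073_600
cycles_per_pixel = cycles_per_frame / pixels     # roughly 24.1

print(f"{cycles_per_frame:,} cycles per frame, "
      f"{cycles_per_pixel:.1f} cycles per pixel")
```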

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 4:15 pm
by OutOfNicks
Superb! <3 techie facts. Never be sorry for them.

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 4:33 pm
by Koub
Tomik wrote: Fri Oct 12, 2018 3:13 pm
Klonan wrote: Fri Oct 12, 2018 2:20 pm https://www.factorio.com/blog/post/fff-264
posila wrote:No wonder, scenes heavy on smoke or trees can tank FPS, especially in 4K. Maybe we should do something about that...
FUCK YES MR. OBVIOUS YOU SHOULD HAVE DONE SOMETHING ABOUT THAT!!!
We have been screaming at you for the last two years to do something about that.

Sorry. Had to get it out. Sorry again. :oops:
No you hadn't :
"Yes please".
That way, you'd have looked like a polite interested community member, instead of a spoiled brat (and you wouldn't have had to apologize for being one) :mrgreen:

BTW, every time you post about the graphics engine, I realize how little I understand about that in particular. Luckily enough, I did understand the first sentence
Hello, it is me, posila, with another technical article.
:lol:

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 4:46 pm
by posila
Tomik wrote: Fri Oct 12, 2018 3:13 pm
posila wrote:No wonder, scenes heavy on smoke or trees can tank FPS, especially in 4K. Maybe we should do something about that...
FUCK YES MR. OBVIOUS YOU SHOULD HAVE DONE SOMETHING ABOUT THAT!!!
We have been screaming at you for the last two years to do something about that.

Sorry. Had to get it out. Sorry again. :oops:
I am sorry my tease caused you such a negative emotional response. As I said
posila wrote:Our assumption was that the problems were caused by the game wanting to use more video memory than available (the game is not the only application that wants to use video memory) and the graphics driver has to spend a lot of time to optimize accesses to the textures.
we assumed the problems came from using too much video memory, so we were adding options to change how sprites are loaded and grouped, so people can find which configuration works best for them. When I changed something it always broke performance for some people, so that's why we added these changes as options.

Anyway, overdraw is part of the problem, ... other problems (also besides overcommitting of video memory) are how we allocated our render targets, non-optimal use of texture filtering, ... material for some future Friday Facts.

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 5:01 pm
by gamah
dzaima wrote: Fri Oct 12, 2018 4:10 pm
gamah wrote: Fri Oct 12, 2018 3:43 pm 16.6ms is eons in computer-speak, about 50 million CPU cycles at 3 GHz... Instead of dropping a frame that can't be drawn, wouldn't it be more visually pleasing to have a subroutine hold on to whatever the current/last frame data was, and constantly feed that at 60 fps whether or not you can update the whole thing? So what if the bottom 200 lines in the next screen draw are from the previous frame... at least the first 880 can be updated with the current frame instead of abandoning all of that work entirely...
"wouldn't it be more visually pleasing to have a subroutine hold on to whatever the current/last frame data was" that's exactly what turning off vsync is for. Those 200 lines being shifted to wherever you were in the previous frame are very ugly, and that's what vsync fixes.
Also, 50M CPU cycles is not that much. Divide that by the pixel amount in a 1080×1920 screen, and you're left with 24 cycles per pixel for everything for it. (though we have GPUs for per-pixel parallel processing, so CPUs shouldn't be involved.)
I'd assume that the work isn't abandoned with vsync on, that it's finished and pushed for next frame, otherwise you wouldn't get a frame drawn, ever, if every frame consistently takes more than 16.6ms.
But Factorio appears to just drop frames instead of allowing tearing to happen, even with vsync off...

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 5:06 pm
by PunPun
ledow wrote: Fri Oct 12, 2018 3:29 pm The fix is no vsync, but double-buffering. You don't draw to the screen AT ALL. You draw into RAM. Only when you have the entire frame drawn do you then copy the whole thing to the screen. This doesn't need vsync. It doesn't tear. But if you took more than 16.66ms to draw that frame and copy it from RAM to GPU (which can be slow!), then it will show up as an FPS drop again.
Umm. Absolutely no. You never ever draw to RAM unless you want sub-1fps. Double-buffering is very much done by telling the GPU to draw into a framebuffer that is very much in VRAM on the GPU.
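
A toy model may help separate tearing from double-buffering. The sketch below uses plain Python lists as stand-ins for framebuffers; on real hardware both buffers live in video memory and the swap is done by the driver, and every name here is illustrative only.

```python
def scan_out(buffer):
    """What the display shows: a copy of the buffer it is pointed at."""
    return list(buffer)


ROWS = 4

# Single buffer: the display reads while the game is halfway through drawing.
screen = ["frame1"] * ROWS
for row in range(ROWS // 2):          # only the top half is redrawn so far
    screen[row] = "frame2"
torn = scan_out(screen)               # mixes rows from two frames

# Double buffer: draw into the back buffer, then swap once it is complete.
front = ["frame1"] * ROWS
back = ["frame1"] * ROWS
for row in range(ROWS // 2):
    back[row] = "frame2"
mid_draw = scan_out(front)            # display still sees the old, whole frame
for row in range(ROWS // 2, ROWS):
    back[row] = "frame2"
front, back = back, front             # the "swap": just exchange the roles
after_swap = scan_out(front)
```

In this model the display never sees a half-drawn frame. If the swap itself lands in the middle of a scan-out the image can still tear, which is the case vsync addresses.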

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 5:13 pm
by posila
gamah wrote: Fri Oct 12, 2018 5:01 pm But Factorio appears to just drop frames instead of allowing tearing to happen, even with vsync off...
That's because of borderless fullscreen: the rendered image is not presented to the screen directly, but is sent to the desktop compositing manager, which then presents it with v-sync anyway. With DirectX 11 we can set flags to allow tearing (on Windows 10 only, though) even when rendering in a window or borderless fullscreen. We also plan to add an option for exclusive fullscreen.

Re: Friday Facts #264 - Texture streaming

Posted: Fri Oct 12, 2018 5:17 pm
by Twisted_Code
I'm an IT student, but I'm still getting lost in the explanation somehow. Could someone try to break it down for me?