s3tc compressing atlases takes up to half a minute

Bugs that are actually features.
sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

s3tc compressing atlases takes up to half a minute

Post by sthalik »

Compressing atlases takes a very long time during startup. The scenario here is "texture-compression=true" in the config file.

   3.826 Created atlas bitmap 4096x1896 [compressed, mask]
   3.843 Created atlas bitmap 4096x4092 [no-crop, compressed, trilinear-filtering, mask, icon, light]
   3.849 Created atlas bitmap 4096x1532 [compressed, mask, alpha-mask]
  22.003 Sprites loaded
  22.003 Convert atlas 4096x4096 to: compressed
[more of the same]
  50.706 Convert atlas 4096x4092 to: trilinear-filtering compressed
  51.175 Convert atlas 4096x1532 to: alpha-maskcompressed
  51.317 Verbose AtlasSystem.cpp:739: Atlas memory size: 759.03MB
  51.317 Verbose AtlasSystem.cpp:740: Size of sprites outside of atlas: 0.00MB
  58.275 Custom inputs active: 3
  58.309 Factorio initialised

Note: despite such a low VRAM usage figure, my framerate remains low. Low VRAM mode is off in that log.
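
To put a number on the delay, the timestamps in the excerpt can simply be subtracted. A minimal Python sketch, assuming each log line is "seconds since start, then message" as shown above; the function name is made up for illustration and the excerpt is copied from the log in this post:

```python
import re

LOG_EXCERPT = """\
  22.003 Sprites loaded
  22.003 Convert atlas 4096x4096 to: compressed
  50.706 Convert atlas 4096x4092 to: trilinear-filtering compressed
  51.175 Convert atlas 4096x1532 to: alpha-maskcompressed
  51.317 Verbose AtlasSystem.cpp:739: Atlas memory size: 759.03MB
"""

def conversion_seconds(log_text):
    """Time between the first 'Convert atlas' line and the
    'Atlas memory size' line, or None if either is missing."""
    start = end = None
    for line in log_text.splitlines():
        match = re.match(r"\s*(\d+\.\d+)\s+(.*)", line)
        if not match:
            continue  # not a timestamped line
        stamp, message = float(match.group(1)), match.group(2)
        if start is None and message.startswith("Convert atlas"):
            start = stamp
        if "Atlas memory size" in message:
            end = stamp
    if start is None or end is None:
        return None
    return round(end - start, 3)

print(conversion_seconds(LOG_EXCERPT))
```

For this excerpt the conversion phase comes out at roughly 29 seconds, which is where the "half a minute" in the title comes from.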

I can see solutions here:

- drawing loaded sprites to a compressed S3TC texture as a rendertarget (?)
- translating the S3TC compression algorithm as a shader, then drawing to a generic buffer object (?)
- in case there's already an S3TC rendertarget present, maybe the pipeline's being stalled by reading from the BO too soon?
- caching each mod's sprites as a finished S3TC buffer
- compression takes place on only one CPU thread, make it take place on all cores as a quick fix
Last edited by sthalik on Sun May 06, 2018 8:14 pm, edited 1 time in total.

Loewchen
Global Moderator
Posts: 8321
Joined: Wed Jan 07, 2015 5:53 pm

Re: s3tc compressing atlases takes up to half a minute

Post by Loewchen »

Post the complete log please.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

First off, note the usage of Alien Biomes to exacerbate the problem, and the config option "texture-compression=true" for VRAM usage's sake. The actual "low VRAM" texture streaming mode is disabled. The rest of the options are pretty sane. Also:

- video-memory-usage=medium
- graphics-quality=high

Purely for testing various combinations.

Given the roughly 750 MB of VRAM accounted for, I may be onto something with a 2GB VRAM GPU.
Attachments
factorio-current.log
(18.38 KiB) Downloaded 91 times

orzelek
Smart Inserter
Posts: 3911
Joined: Fri Apr 03, 2015 10:20 am

Re: s3tc compressing atlases takes up to half a minute

Post by orzelek »

Do you have Alien Biomes only, or also the hi-res terrain add-on?
I'm pretty sure that with 2GB VRAM you will end up memory swapping on the GPU a lot. That would explain why compression takes so long.

It's highly possible that the actual textures are bigger than 2GB and compression causes them to be reloaded from RAM.

posila
Factorio Staff
Posts: 5202
Joined: Thu Jun 11, 2015 1:35 pm

Re: s3tc compressing atlases takes up to half a minute

Post by posila »

Hello, I don't consider this a bug because texture-compression=true is not available from the game options; it's just a remnant from the time I considered compressing all sprites, but the rest of the team didn't like the compression artifacts, which are quite visible on animated sprites.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

posila wrote:Hello, I don't consider this a bug because texture-compression=true is not available from the game options; it's just a remnant from the time I considered compressing all sprites, but the rest of the team didn't like the compression artifacts, which are quite visible on animated sprites.
For 2GB VRAM GPUs it's actually possible to fit neatly into a little over 1 GB. That is, with Evil Biomes HD and high sprite quality. I'm not saying it's a good option in general, but for low-end card owners it does wonders: 60 FPS where normally it's 35-ish. I'll read up on artifacts and how visible they are on 4K displays with higher pixel density.

I also verified with apitrace that:
- you transfer RGB_888 to S3TC_DXTN5 using an FBO, which is correct
- you make only enough textures when loading into the S3TC atlas, which is also correct

Thus there's no bug on your end. I'll come back to this issue if 4K pixel density turns out to hide the artifacts.
orzelek wrote:Do you have Alien Biomes only, or also the hi-res terrain add-on?
I'm pretty sure that with 2GB VRAM you will end up memory swapping on the GPU a lot. That would explain why compression takes so long.

It's highly possible that the actual textures are bigger than 2GB and compression causes them to be reloaded from RAM.
They're 3GB on "auto" and 1.1GB on force-compress-all-atlasses. Try generating an Alien Biomes HD map with biome density of 10 and see if you get a slowdown on max unzoom.

orzelek
Smart Inserter
Posts: 3911
Joined: Fri Apr 03, 2015 10:20 am

Re: s3tc compressing atlases takes up to half a minute

Post by orzelek »

sthalik wrote:
orzelek wrote:Do you have Alien Biomes only, or also the hi-res terrain add-on?
I'm pretty sure that with 2GB VRAM you will end up memory swapping on the GPU a lot. That would explain why compression takes so long.

It's highly possible that the actual textures are bigger than 2GB and compression causes them to be reloaded from RAM.
They're 3GB on "auto" and 1.1GB on force-compress-all-atlasses. Try generating an Alien Biomes HD map with biome density of 10 and see if you get a slowdown on max unzoom.
One of the reasons I stayed away from Alien Biomes is that with the high-res version it stutters without compression on an 8GB GPU. With compression enabled, Factorio reports 3.3GB in texture atlas size and it doesn't stutter, but it has other issues, like very slow map generation.
This mod is pretty and adds a lot of terrain variety, but it pushes the game engine to the limit.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

posila wrote:Hello, I don't consider this a bug because texture-compression=true is not available from the game options; it's just a remnant from the time I considered compressing all sprites, but the rest of the team didn't like the compression artifacts, which are quite visible on animated sprites.
On a 4K 43" display I haven't seen any of the typical S3TC artifacts. Do you (or others) have a case where there's a visible quality loss? I converted the pixel art to DXT5 and compared pixel values. There were only very few stray pixels with over 6 total of R+B+G difference. Also there's the matter of hires compressed art vs uncompressed normal-quality art. Could you consider adding more atlas categories to the "compressible" list, or add them only if hires tiles are used? Please consider reevaluating S3TC usage amount based on these observations. I'll send a pixel difference histogram in another message.
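
For anyone who wants to repeat the comparison, here is a minimal Python sketch of the kind of per-pixel check described above. It assumes "R+B+G difference" means the sum of absolute per-channel differences between the original pixel and the decoded DXT5 pixel; the function names and the four sample pixels are made up for illustration:

```python
from collections import Counter

def diff_histogram(original, decoded):
    """Histogram of per-pixel total channel difference |dR| + |dG| + |dB|.

    Both arguments are equal-length sequences of (r, g, b) tuples.
    """
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    histogram = Counter()
    for (r0, g0, b0), (r1, g1, b1) in zip(original, decoded):
        histogram[abs(r0 - r1) + abs(g0 - g1) + abs(b0 - b1)] += 1
    return histogram

def fraction_over(histogram, threshold=6):
    """Share of pixels whose total difference exceeds the threshold."""
    total = sum(histogram.values())
    over = sum(n for diff, n in histogram.items() if diff > threshold)
    return over / total if total else 0.0

# Tiny made-up example: four pixels, one of them clearly off.
original = [(10, 10, 10), (200, 100, 50), (0, 0, 0), (255, 255, 255)]
decoded = [(10, 11, 10), (198, 100, 52), (0, 0, 0), (245, 250, 255)]
hist = diff_histogram(original, decoded)
print(sorted(hist.items()))
print(fraction_over(hist))
```

On real sprites the two pixel lists would come from the source PNG and from the decompressed texture; loading them is left out here.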

On this 2GB VRAM card I'm able to play vanilla with hires pixel art, max atlas size, and no low VRAM mode, provided the "true" option is used. Fully-compressed atlas size is 750 MB (hires Alien Biomes is 1.1GB, for testing). Also, the random stutter is gone; before, it happened even with normal tiles.

The original issue in this thread is invalid. I verified with apitrace that you do a hardware-assisted upload with glTex(Sub)Image to a DXT surface, and use memory efficiently for things loaded before creating the full-on textures. This is a "by-the-book" approach.

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

posila, can you allow caching compressed textures? There's a caching option but it uploads pre-compressed ones. glCompressedTexImage2D / glCompressedTexSubImage2D does the trick for S3TC.
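
To make the request concrete, here is a rough Python sketch of the caching idea. It is not how Factorio's cache works; every name in it is hypothetical, and the "encoder" is a stand-in that only shrinks the data so the example runs without a GPU:

```python
import hashlib
import tempfile
from pathlib import Path

def cache_key(source_bytes, fmt):
    """Key changes whenever the source pixels or the target format change."""
    digest = hashlib.sha256()
    digest.update(fmt.encode("utf-8"))
    digest.update(b"\0")
    digest.update(source_bytes)
    return digest.hexdigest()

def load_or_compress(source_bytes, fmt, cache_dir, compress):
    """Return (blob, was_cached). `compress` is whatever slow encoder is in use."""
    path = Path(cache_dir) / (cache_key(source_bytes, fmt) + ".bin")
    if path.exists():
        return path.read_bytes(), True   # cache hit: skip the slow step
    blob = compress(source_bytes)
    path.write_bytes(blob)
    return blob, False

# Stand-in encoder for demonstration only; it is not S3TC.
def fake_compress(data):
    return data[::4]

with tempfile.TemporaryDirectory() as cache_dir:
    pixels = bytes(range(64))
    first, hit1 = load_or_compress(pixels, "DXT5", cache_dir, fake_compress)
    second, hit2 = load_or_compress(pixels, "DXT5", cache_dir, fake_compress)
    print(hit1, hit2, first == second)
```

On a cache hit the stored blob would go straight to the compressed upload call, so the conversion step only runs when a mod's sprites actually change.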

posila
Factorio Staff
Posts: 5202
Joined: Thu Jun 11, 2015 1:35 pm

Re: s3tc compressing atlases takes up to half a minute

Post by posila »

Hi, I don't plan to make any more changes in 0.16. We wrote the new rendering framework to work like that from the get-go, but there is still a lot of work ahead of us and I don't even know if static atlases will still be a thing when 0.17 is released.

As for DXT artifacts - see for example this: http://joostdevblog.blogspot.cz/2015/11 ... 3-dxt.html
We don't have cartoony graphics as Awesomenauts does, so a single compressed frame doesn't have as noticeable artifacts (unless you are looking for the 4x4 grid of compressed pixels, in which case you'll find it), but we have a lot of small details, and anything that animates has different artifacts in each frame, which creates quite noticeable noise (see for example the character just standing if you compress everything). Also, compression of tightly packed atlases with mipmaps (like tiles) creates artifacts in lower mip levels as neighboring tiles start to share compression blocks.
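
The block-sharing effect in lower mip levels can be illustrated with a little arithmetic. The Python sketch below assumes 64-pixel tiles packed edge to edge with no padding, starting at a block boundary; the tile size and the function name are assumptions for illustration, not values taken from the game:

```python
BLOCK = 4  # S3TC compresses independent 4x4 pixel blocks

def tiles_per_block_edge(tile_px, mip_level):
    """How many tiles fit along one edge of a 4x4 block at this mip level,
    assuming tiles are packed edge to edge with no padding."""
    size = max(tile_px >> mip_level, 1)  # tile edge length at this mip level
    return max(BLOCK // size, 1)

# Columns: mip level, tile edge in pixels, tiles along one block edge.
for level in range(6):
    print(level, max(64 >> level, 1), tiles_per_block_edge(64, level))
```

As long as the last column is 1, each block belongs to a single tile. Once it reaches 2, four different tiles are encoded in one block and have to share its two endpoint colours, which is where the bleeding comes from.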

There is potential for huge VRAM savings in shadows: use a single-channel texture and compress it using BC4, which would make it 1/8th of the current uncompressed size. We might also make an option to use lower-res sprites for shadows. And last but not least, a lot of shadow sprites could be cropped more because they also contain parts that are always occluded by the machine that is casting the shadow.
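
The 1/8th figure follows from the block sizes of the formats. A small Python sketch of the arithmetic, assuming the uncompressed shadows are stored as 4-byte RGBA8 pixels (an assumption that is consistent with the 1/8th claim) and looking at a single mip level only:

```python
# Bytes per 4x4 block for the block-compressed formats.
BLOCK_BYTES = {"BC1/DXT1": 8, "BC3/DXT5": 16, "BC4": 8}
RGBA8_BYTES_PER_PIXEL = 4

def texture_bytes(width, height, fmt):
    """Size of a single mip level; width and height are assumed multiples of 4."""
    if fmt == "RGBA8":
        return width * height * RGBA8_BYTES_PER_PIXEL
    blocks = (width // 4) * (height // 4)
    return blocks * BLOCK_BYTES[fmt]

w = h = 4096
for fmt in ("RGBA8", "BC3/DXT5", "BC4"):
    print(fmt, texture_bytes(w, h, fmt) // (1024 * 1024), "MiB")
```

For a 4096x4096 atlas that is 64 MiB uncompressed, 16 MiB as DXT5 and 8 MiB as BC4.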

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

Even if the artifacts in cartoon-sprite games are visible on 1080p displays, they are negligible or invisible on my 2160p 43" screen [1]. This display has higher pixel density, almost retina. The 4K displays are getting more common. Some vendors even go for future 8K for graphics artists etc. If it was possible I'd invite you to see 4K videos at my place :)

Only a few images result in bad S3TC artifacts. The imagemagick difference metric skyrockets for a few directories and is minimal for the rest. The metrics could be better but they're heavily correlated with perceptual quality anyway.

As a purely theoretical non-suggestion, what do you think of pre-compressed ASTC textures? Sadly these can't be done in real time, but have variable block size. ETC2 is even worse in the respect that it uses twice the space of S3TC despite being a newer invention.

[1] Pixels per inch only increases from 80 to 100 in 4K but it's enough to make a 1080p image look blurry and awful on the same display.
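
For reference, pixel density is just the diagonal resolution divided by the diagonal size. A quick Python check; the 43-inch 2160p figure is the screen mentioned above, while the 27-inch 1080p panel is only an assumed comparison point, picked because it lands near 80 PPI:

```python
import math

def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density from resolution and diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in

print(round(pixels_per_inch(3840, 2160, 43), 1))  # 43" 2160p screen
print(round(pixels_per_inch(1920, 1080, 27), 1))  # assumed 27" 1080p panel
```

That gives roughly 102 PPI for the 43-inch 4K screen and roughly 82 PPI for the assumed 1080p panel.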

Hexicube
Fast Inserter
Posts: 204
Joined: Wed Feb 24, 2016 9:50 pm

Re: s3tc compressing atlases takes up to half a minute

Post by Hexicube »

posila wrote:We might also make an option to use lower-res sprites for shadows. And last but not least, a lot of shadow sprites could be cropped more because they also contain parts that are always occluded by the machine that is casting the shadow.
Honestly, for shadows, you could probably get away with something like 4-bit greyscale.
sthalik wrote:Pixels per inch only increases from 80 to 100 in 4K but it's enough to make a 1080p image look blurry and awful on the same display.
What's the actual 4K resolution you have? There seem to be different ones, and only 3840x2160 will look reasonable with 1080p, as it is exactly double the dimensions. This is also likely due to the fact that the screen will use bilinear scaling regardless of context and will likely not have a way to change it.

If you want, I can churn up some simple graphics to demonstrate what the issue actually is and why it looks blurry and gross instead of crisp like it should be for what is otherwise an ideal upscale scenario involving integer multiplication.

If your screen has the option, fiddle with the upscale or interpolation setting. There should be one setting that is basically "don't blend pixels" (might be called "none") and will look massively better when viewing 1080p with it as it will basically represent each 1080p pixel with a 2x2 pixel block on 4K (if you have a 3840x2160 native resolution).
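
As a small illustration of the "don't blend pixels" mode, here is a Python sketch of integer nearest-neighbor upscaling, where every source pixel becomes a 2x2 block. It works on plain nested lists and is only meant to show the idea, not how a monitor or GPU scaler is implemented:

```python
def upscale_nearest(image, factor):
    """Repeat every pixel `factor` times horizontally and vertically.

    `image` is a list of rows; each row is a list of pixel values.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    result = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        result.extend(list(wide_row) for _ in range(factor))
    return result

small = [[1, 2],
         [3, 4]]
for row in upscale_nearest(small, 2):
    print(row)
```

No new pixel values are invented, so edges stay crisp. Bilinear scaling would instead mix neighboring values, which is the blur being described.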

sthalik
Long Handed Inserter
Posts: 56
Joined: Tue May 01, 2018 9:32 am

Re: s3tc compressing atlases takes up to half a minute

Post by sthalik »

Hexicube wrote:
sthalik wrote:Pixels per inch only increases from 80 to 100 in 4K but it's enough to make a 1080p image look blurry and awful on the same display.
What's the actual 4K resolution you have? There seem to be different ones, and only 3840x2160 will look reasonable with 1080p, as it is exactly double the dimensions. This is also likely due to the fact that the screen will use bilinear scaling regardless of context and will likely not have a way to change it.

If you want, I can churn up some simple graphics to demonstrate what the issue actually is and why it looks blurry and gross instead of crisp like it should be for what is otherwise an ideal upscale scenario involving integer multiplication.

If your screen has the option, fiddle with the upscale or interpolation setting. There should be one setting that is basically "don't blend pixels" (might be called "none") and will look massively better when viewing 1080p with it as it will basically represent each 1080p pixel with a 2x2 pixel block on 4K (if you have a 3840x2160 native resolution).
Do you know of d3d or gl wrappers that do resizing with a different interpolator while keeping everything else intact? I have plenty of sprite-based RPGs that look too small in the native resolution, while looking totally awful in any lower. Though 1440p is bearable for sprites. I'm thinking of force-resized windows or dxwnd and then reshade. dxwnd never worked right on my end though.

Nearest-neighbor interpolation might turn out to be good, but there's also the potential to go for bilinear, bicubic (ringing...) or something more modern and complex[1] -- and applying LumaSharpen and AdaptiveSharpen with tuned coefficients only after the scaling is done.

Neither my Radeon, nor the screen itself has any options like this. Sad panda.

[1] With the madVR presenter I can watch 1080p movies with decent quality on 2160p. It applies luma and chroma postprocessing but only after resizing the image with custom interpolators that give much sharper and artifact-free images than bicubic/Catmull-Rom/Mitchell-Netravali.
