How to diagnose desktop crash (Mint)?

Posted: Thu Apr 10, 2025 2:13 am
by NineNine
A few times now, when I'm looking at something in the game where there's a lot going on (my biggest platform on its way to the shattered planet), my entire OS has completely crashed. It looks like my desktop (Cinnamon) dies, because I get a login box (but I can't actually log in). There are no Factorio logs showing what caused it. I have never experienced this sort of crash in any other application. How do I start to diagnose this? I'm running the most up-to-date version of Mint, using the most recent 5.15.* kernel.

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 10, 2025 4:31 am
by eugenekay
This sounds like the classic “Out of Memory” condition: either system memory is exhausted, or you have hit a per-user / cgroups limit, leading to your entire desktop session being terminated.

System logs are the best place to start (/var/log/syslog), as well as the output from the ‘dmesg’ command.
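
As a rough sketch of that first pass (an illustration only, not a tool from this thread): the Python below filters log text for two phrases the kernel commonly prints when the OOM killer fires, "invoked oom-killer" and "Out of memory". The function name find_oom_lines and the sample lines are made up for the example, and the exact wording of real messages varies between kernel versions.

```python
import re

# Phrases the kernel commonly logs when the OOM killer runs (wording can vary by version).
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory", re.IGNORECASE)


def find_oom_lines(log_text):
    """Return the lines of log_text that look like OOM-killer activity."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


# Made-up sample lines, only to show the filter working.
sample = (
    "kernel: [100.0] Xorg invoked oom-killer: gfp_mask=0x0, order=0\n"
    "kernel: [100.1] Out of memory: Killed process 4242 (example)\n"
    "kernel: [100.2] usb 1-1: new high-speed USB device\n"
)
for line in find_oom_lines(sample):
    print(line)
```

To run it against the real thing, pass in the contents of /var/log/syslog or the captured output of dmesg; reading either may need elevated privileges. An empty result does not rule out other causes, it only means these two phrases were not logged.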

Good Luck!

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 3:40 am
by NineNine
Thanks for the suggestion! I found this in /var/log/syslog. My guess is that I need to get a new graphics card (or at least re-seat my existing one). I have to wonder if my biggest platform made it melt a little bit...?

Apr 16 16:41:04 Moby kernel: [33151.327338] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Apr 16 16:41:04 Moby kernel: [33156.201055] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1688785, emitted seq=1688787
Apr 16 16:41:04 Moby kernel: [33156.201193] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Factorio 2 Spac pid 20127 thread Factorio 2:cs0 pid 20140
Apr 16 16:41:04 Moby kernel: [33156.201288] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Apr 16 16:41:04 Moby kernel: [33156.456498] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Apr 16 16:41:04 Moby kernel: [33156.456557] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Apr 16 16:41:05 Moby kernel: [33156.657723] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Apr 16 16:41:05 Moby kernel: [33156.657779] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Apr 16 16:41:05 Moby kernel: [33156.858661] [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
Apr 16 16:41:05 Moby kernel: [33156.884298] [drm] free PSP TMR buffer
Apr 16 16:41:05 Moby kernel: [33156.929222] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0c4600 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929232] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0c5600 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929240] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0d0600 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929247] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0d1700 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929253] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0e4700 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929260] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0e5700 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929266] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dee0e5e00 flags=0x0030]
Apr 16 16:41:05 Moby kernel: [33156.929272] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dc4a9fa00 flags=0x0010]
Apr 16 16:41:05 Moby kernel: [33156.929279] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dc4a96300 flags=0x0010]
Apr 16 16:41:05 Moby kernel: [33156.929285] amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xf7dc4a97000 flags=0x0010]
Apr 16 16:41:05 Moby kernel: [33156.929341] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Apr 16 16:41:05 Moby kernel: [33156.929344] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Apr 16 16:41:05 Moby kernel: [33156.929398] amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Apr 16 16:41:05 Moby kernel: [33157.439707] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
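
To make an excerpt like the one above easier to read at a glance, here is a hedged sketch that counts a few event types and pulls out the process the driver named. The marker strings are copied from this log; the function name summarize_amdgpu_log and the category labels are invented for the example, and other driver versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns taken from the log excerpt above; the labels on the left are invented for this sketch.
EVENT_PATTERNS = {
    "ring_timeout": re.compile(r"ring \S+ timeout"),
    "reset_begin": re.compile(r"GPU reset begin"),
    "io_page_fault": re.compile(r"IO_PAGE_FAULT"),
    "reset_succeeded": re.compile(r"GPU reset succeeded"),
}
PROCESS_PATTERN = re.compile(r"Process information: process (.+?) pid (\d+) thread")


def summarize_amdgpu_log(log_text):
    """Return (event counts, (process name, pid) or None) for amdgpu kernel messages."""
    counts = Counter()
    process = None
    for line in log_text.splitlines():
        # Count every category whose pattern appears on this line.
        for label, pattern in EVENT_PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
        # Remember only the first process the driver reports.
        match = PROCESS_PATTERN.search(line)
        if match and process is None:
            process = (match.group(1), int(match.group(2)))
    return counts, process
```

Fed the lines above, this picks out "Factorio 2 Spac" (pid 20127) as the process the driver named, alongside the timeout, page-fault, and reset counts. It only summarizes what was logged; it cannot tell you whether the root cause is the card or the driver.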

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 5:38 am
by eugenekay
Never seen those errors before! It looks like a GPU driver issue/crash; whether this is due to a Hardware issue is something you would need to determine. It does not look like a PCIE disconnect/reconnect - you would probably experience a total system power-reset - so I don’t think re-seating the card itself would help.

I would suggest a temperature check (and dusting of all fans/filters) and a GPU driver / Linux kernel update. You can also try a 3D GPU benchmark / stress-test to see if it’s reproducible that way.
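
For the temperature check, one option on an amdgpu system is to read the hwmon sensor files the driver exposes under sysfs. A minimal sketch, assuming the usual layout /sys/class/drm/card*/device/hwmon/hwmon*/temp*_input with values in millidegrees Celsius; the function name read_gpu_temps is made up, and the exact paths can differ between systems.

```python
from pathlib import Path


def read_gpu_temps(sysfs_root="/sys/class/drm"):
    """Return {sensor file path: degrees Celsius} for each temp*_input file found."""
    temps = {}
    pattern = "card*/device/hwmon/hwmon*/temp*_input"
    for sensor in sorted(Path(sysfs_root).glob(pattern)):
        try:
            # hwmon reports temperatures in millidegrees Celsius.
            temps[str(sensor)] = int(sensor.read_text().strip()) / 1000.0
        except (OSError, ValueError):
            # Skip sensors that cannot be read or do not hold a plain integer.
            continue
    return temps
```

Calling read_gpu_temps() every second or so while the benchmark runs gives a rough picture of whether the card is heating up before it hangs. An empty result just means no sensor files were found at the assumed location.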

Good Luck!

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 6:17 am
by pioruns
A 5.15 kernel doesn't sound like the most recent Linux Mint. Please post the output of the following command:

inxi -Fm
Install the inxi package if the command throws an error.

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 12:41 pm
by NineNine
pioruns wrote: Thu Apr 17, 2025 6:17 am 5.15 kernel doesn't sound like most recent Linux Mint. Please show result of the following command:

inxi -Fm
Install inxi package if it throws error.
There's an active 5.15.x kernel track and a 6.8.x kernel track, and I use the more conservative one. The 5.15.x kernel is supported through April 2027, at least.

Here are the results from inxi -Fm:

System:
Host: Moby Kernel: 5.15.0-136-generic x86_64 bits: 64
Desktop: Cinnamon 6.0.4 Distro: Linux Mint 21.3 Virginia
Machine:
Type: Desktop Mobo: Micro-Star model: MAG B650 TOMAHAWK WIFI (MS-7D75)
v: 1.0 serial: <superuser required> UEFI: American Megatrends LLC. v: 1.L0
date: 12/11/2024
Memory:
RAM: total: 61.9 GiB used: 2.88 GiB (4.7%)
RAM Report:
permissions: Unable to run dmidecode. Root privileges required.
CPU:
Info: 8-core model: AMD Ryzen 7 7800X3D bits: 64 type: MT MCP cache:
L2: 8 MiB
Speed (MHz): avg: 2968 min/max: 3000/5049 cores: 1: 2879 2: 2879 3: 2876
4: 3599 5: 2877 6: 2880 7: 2881 8: 2880 9: 2880 10: 2909 11: 2878 12: 3597
13: 2879 14: 2876 15: 2851 16: 2877
Graphics:
Device-1: AMD driver: amdgpu v: kernel
Device-2: AMD driver: N/A
Display: x11 server: X.Org v: 1.21.1.4 driver: X: loaded: amdgpu,ati
unloaded: fbdev,modesetting,radeon,vesa gpu: amdgpu resolution: 2560x1440
OpenGL: renderer: AMD Radeon RX 6650 XT (navi23 LLVM 15.0.7 DRM 3.42
5.15.0-136-generic)
v: 4.6 Mesa 23.2.1-1ubuntu3.1~22.04.3
Audio:
Device-1: AMD Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
driver: snd_hda_intel
Device-2: AMD driver: snd_hda_intel
Device-3: AMD Family 17h HD Audio driver: snd_hda_intel
Device-4: Micro Star USB Audio type: USB
driver: hid-generic,snd-usb-audio,usbhid
Sound Server-1: ALSA v: k5.15.0-136-generic running: yes
Sound Server-2: PulseAudio v: 15.99.1 running: yes
Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
Device-1: MEDIATEK driver: mt7921e
IF: wlp13s0 state: down mac: e8:fb:1c:b3:25:e9
Device-2: Realtek RTL8125 2.5GbE driver: r8169
IF: enp14s0 state: up speed: 1000 Mbps duplex: full
mac: 04:7c:16:51:22:ab
Device-3: MEDIATEK driver: mt7921e
IF: wlp15s0 state: down mac: f0:a6:54:4f:5c:5b
Bluetooth:
Device-1: MediaTek Wireless_Device type: USB driver: btusb
Report: hciconfig ID: hci0 rfk-id: 0 state: down
bt-service: enabled,running rfk-block: hardware: no software: yes
address: F0:A6:54:4F:5C:5C
Drives:
Local Storage: total: 931.51 GiB used: 553.64 GiB (59.4%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 PRO 1TB
size: 931.51 GiB
Partition:
ID-1: / size: 915.32 GiB used: 553.64 GiB (60.5%) fs: ext4
dev: /dev/nvme0n1p2
ID-2: /boot/efi size: 511 MiB used: 6.1 MiB (1.2%) fs: vfat
dev: /dev/nvme0n1p1
Swap:
ID-1: swap-1 type: file size: 2 GiB used: 0 KiB (0.0%) file: /swapfile
Sensors:
System Temperatures: cpu: N/A mobo: N/A gpu: amdgpu temp: 50.0 C
Fan Speeds (RPM): N/A gpu: amdgpu fan: 0
Info:
Processes: 391 Uptime: 5m Shell: Bash inxi: 3.3.13
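
Since the kernel version is the detail in question here, a small sketch of pulling it out of a report like this one and comparing it against another release. The function name kernel_version is hypothetical, and the regular expression assumes the "Kernel: X.Y..." field format shown above.

```python
import re


def kernel_version(report):
    """Return (major, minor) from the 'Kernel:' field of an inxi report, or None."""
    match = re.search(r"Kernel:\s*(\d+)\.(\d+)", report)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The System line from the report above.
line = "Host: Moby Kernel: 5.15.0-136-generic x86_64 bits: 64"
version = kernel_version(line)
# Tuples compare element by element, so (5, 15) sorts before (6, 8).
print(version, version < (6, 8))
```

Comparing integer tuples rather than strings matters because, as text, "5.15" sorts before "5.4" even though it is the newer release.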

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 12:45 pm
by NineNine
eugenekay wrote: Thu Apr 17, 2025 5:38 am Never seen those errors before! It looks like a GPU driver issue/crash; whether this is due to a Hardware issue is something you would need to determine. It does not look like a PCIE disconnect/reconnect - you would probably experience a total system power-reset - so I don’t think re-seating the card itself would help.

I would suggest a temperature-check (and dusting of all fans/filters), GPU Driver/Linux kernel update. You can also try a 3D GPU benchmark / stress-test to see if it’s reproducible that way.

Good Luck!
Thanks for the suggestions. I'm vacuuming out this computer today to see if that helps. I don't really make any system-wide changes that would break a graphics driver, and I stay up to date with all of the standard software updates, so I'm leaning towards some sort of hardware failure (as opposed to a driver issue). The errors (to me at least) read like they're waiting for the graphics card and the graphics card stops responding. The "page fault" errors to me suggest a problem with the on-board memory.

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 3:03 pm
by eugenekay
NineNine wrote: Thu Apr 17, 2025 12:45 pm Thanks for the suggestions. I'm vacuuming out this computer today to see if that helps.
Vacuum cleaners generate an ENORMOUS amount of static electricity, which can be discharged directly to your computer’s sensitive parts via the plastic hose. Not a great idea when you already have hardware suspicions.

I recommend “canned air”, or an air compressor regulated to <40psi pressure. Or a soft bristle broom brush.

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 3:54 pm
by pioruns
Agreed with eugenekay: do not use a vacuum cleaner. Use compressed air instead. I use a small handheld air blower for these things.

The GPU core hanging and resetting suggests this is a hardware problem, as you suspect.

Regarding Linux Mint: the AMD graphics driver and the kernel on Linux Mint 21.3 Virginia (Long Term Support) are several years old. A newer Linux kernel and AMD driver may work better and fix this problem, or at least let you recover when the GPU core crashes. If possible, install a newer version of Mint (on another drive, or make sure you have a backup). You can test it for a couple of days and then decide whether to stay on the newer version or revert to the old one.

I use a Radeon 6xxx XT series card too (a Radeon 6800 XT, to be exact), and I am on the newer 6.1 LTS kernel (Debian). I have had no problems with it since I installed this Debian version 2 years ago.

Also, I just checked the MSI support page, and there seems to be a new BIOS version for your motherboard that provides better performance and RAM stability. You may want to install that.

Re: How to diagnose desktop crash (Mint)?

Posted: Thu Apr 17, 2025 8:52 pm
by NineNine
I've been running this software + hardware combo for a few years without any problems, so I would be surprised if there were a software issue, especially after finding those errors that seem to point to a hardware problem. Luckily, I had an extra video card lying around (Radeon RX 6600), so I slapped that in, and I'll see what happens.

I'm going to try to double the size of my biggest platforms and throw them at the shattered planet to test...

Re: How to diagnose desktop crash (Mint)?

Posted: Sat Apr 26, 2025 8:01 pm
by NineNine
I haven't seen this problem happen since I swapped out my graphics card 9 days ago. I'm going to chalk this particular problem up to hardware. Thanks for the help!

And for the record, I will still argue that a vacuum is better... You get the same amount of static electricity as with blown air, but instead of blowing the dust into new places in the device, you're removing it altogether.

...anybody want a good deal on a slightly used Radeon RX 6650 XT?

Re: How to diagnose desktop crash (Mint)?

Posted: Sat Apr 26, 2025 9:49 pm
by pioruns
NineNine wrote: Sat Apr 26, 2025 8:01 pm I haven't seen this problem happen since I swapped out my graphics card 9 days ago. I'm going to chalk this particular problem up to hardware. Thanks for the help!
Very glad to hear!
NineNine wrote: Sat Apr 26, 2025 8:01 pm And for the record, I will still argue that a vacuum is better... You have the same amount of static electricity as blown air, but instead of blowing the dust into new places in the device, you're removing it, altogether.
For the record, that's complete nonsense, my friend. A quick internet search would tell you why you are very wrong. Check out this 15-year-old article for starters; there's no point explaining again something that has already been said a million times.

https://www.howtogeek.com/57870/ask-how ... -keyboard/

If in doubt, check your search engine to confirm.

[Moderator note: Vacuum discussion continued in 128422]