I assembled my desktop PC about 2 years ago. It’s fairly beefy (AMD Ryzen 9 3950X 16-core processor, 128 GB RAM, NVIDIA RTX 3080 Ti) and runs Debian stable.

Once in a while (not that often, roughly every 2 weeks), seemingly at random and not especially under heavy load, the system crashes and freezes, unresponsive even to the Linux magic SysRq keys. I have never managed to find the cause. One interesting fact is that when it happens, it also seems to “freeze my network”, i.e., other (Ethernet) devices on my local network lose connectivity. They’re all connected to the same router, but not through this crashing PC. Connectivity comes back as soon as I force a shutdown of the crashing PC.

What can cause this and how could I fix these freezes?

1 point

I have no idea, but it seems like an interesting problem. Good luck finding a solution. (Just commenting to get notified if someone has a solution.)

5 points

Your network card is hammering the network with packets and not taking a break. It doesn’t give the rest of the network a chance to talk.
If this were a Windows machine, I would start by reinstalling the network driver, but I don’t know Linux well enough to say.
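
One known way a hung machine can stall a whole switch segment is Ethernet flow control: a NIC whose receive buffers are no longer being drained may keep sending pause frames. Whether that is what happens here is an assumption; nothing in the thread confirms it. As a rough sketch, the helper below (the name `pause_state` and the `IFACE` placeholder are made up for illustration) summarizes the output of `ethtool -a`, so the flow-control setting can be checked while the PC is healthy:

```shell
#!/bin/sh
# Read the output of `ethtool -a <iface>` on stdin and print the RX/TX
# pause (flow control) settings as "rx=<state> tx=<state>".
pause_state() {
    awk '$1 == "RX:" { rx = $2 }
         $1 == "TX:" { tx = $2 }
         END { printf "rx=%s tx=%s\n", rx, tx }'
}

# Usage on the affected PC (IFACE is a placeholder; `ip link` lists real names):
#   ethtool -a "$IFACE" | pause_state
# To switch flow control off until the next reboot (needs root):
#   ethtool -A "$IFACE" rx off tx off
```

If the freezes continue but the rest of the network stays up with flow control off, that would point toward this mechanism; if nothing changes, the cause is probably elsewhere.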

1 point

Oh, this would explain why it kills the connectivity of all Ethernet-connected devices. The Ethernet interface is the one on the mobo. Drivers are included in the Linux kernel AFAIK. The problem persisted across 2 Debian versions, so I am not sure reinstalling drivers would do anything here. But thanks for the plausible explanation of the network issue!

1 point

You won’t have much luck doing anything to the driver itself, but you could try a custom kernel. There are two advantages to that: it would be more recent than whatever kernel Debian is using, and it has an optimized networking stack, which speeds up packet processing and improves the congestion handling algorithm. I’d recommend the Xanmod kernel for this: https://xanmod.org/

Alternatively, if we suspect your network card is the culprit, then the solution could be as simple as buying a new card and disabling the built-in one.

1 point

I like my Debian vanilla, but thanks for the suggestion. Trying another network card would be interesting. I don’t really suspect the network card, though, since I have no idea whether the network blockage is a consequence or a cause here.

1 point

Anything interesting in your logs?

1 point

Uninstall NetworkManager (I don’t know how on Debian) and reinstall it (better to get a .deb first).

Then sudo systemctl enable NetworkManager.service

Reboot and hope for the best.
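
On Debian, the steps above could be sketched as follows. This is only an illustration: it assumes the package is named network-manager (the name used elsewhere in this thread), and the helper names `run` and `reinstall_network_manager` are made up. It uses `apt-get install --reinstall`, which replaces the installed files without removing the package first, so the machine is never left without it.

```shell
#!/bin/sh
# Reinstall NetworkManager in place on Debian, keeping a local copy of the
# .deb first. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reinstall_network_manager() {
    # Save the .deb in the current directory as an offline fallback.
    run apt-get download network-manager
    # Replace the installed files without removing the package first.
    run sudo apt-get install --reinstall network-manager
    # Make sure the service starts at boot.
    run sudo systemctl enable NetworkManager.service
}
```

Calling `reinstall_network_manager` with `DRY_RUN=1` set prints the three commands for review. A reinstall normally leaves existing configuration under /etc untouched, so it does not reset settings.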

1 point

This has been happening for 2 years, with the previous debian version too, so I doubt this would do anything?

1 point

Have you been updating or reinstalling?

Because if it’s update on top of update, the problem could come from that. In that case, maybe reinstall?

1 point

Updating. I’m willing to try your solution but I am a little bit worried about not being able to reinstall anything after I sudo apt remove network-manager. Why would a package reinstallation help? Wouldn’t resetting the config files be more efficient btw?

EDIT: It’s not update on top of update; there was just bullseye (first testing, then stable), and then I recently moved to bookworm. But the problem has been there from the start. It’s not too painful because it’s rare, but it still bugs me.

1 point

If it’s been happening for multiple years and OS versions, maybe your network card is dead or dying? Buy a new network card and see if that helps?

2 points

Everything is 2 years old, so this would mean the mobo (well, the onboard Ethernet chip) was malfunctioning from the start. Maybe!

I might try disabling it and using the onboard Wi-Fi chip temporarily instead, just to see whether I get another freeze. The issue is, I’ve never understood what triggers it, and it’s quite rare (less than once a week), so it’s really annoying to debug…

1 point

Check your system logs, such as dmesg and journalctl, right after the next freeze (if it’s still occurring). You could filter the journal to show only high-priority messages (emerg to err) from the previous boot that were logged in the last 5 minutes, so run it soon after rebooting:

journalctl --boot=-1 --since="5 min ago" --priority=0..3
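
Since "5 min ago" is measured from the moment the command runs, it can come back empty if the reboot took a while. A variant that takes the tail of the previous boot regardless of timing might look like the sketch below. The function names and the `JOURNALCTL` override are made up for illustration. `--boot=-1` only works when the journal is kept on disk; `journalctl --list-boots` shows which boots are available.

```shell
#!/bin/sh
# Print the last N (default 50) entries of priority "err" or worse from the
# previous boot. JOURNALCTL can be overridden, e.g. with "echo" for a dry run.
prev_boot_errors() {
    lines="${1:-50}"
    "${JOURNALCTL:-journalctl}" --boot=-1 --priority=0..3 --lines="$lines" --no-pager
}

# Same idea for kernel messages only, i.e. the previous boot's dmesg.
# Plain `dmesg` after a reboot only covers the current boot.
prev_boot_kernel() {
    lines="${1:-50}"
    "${JOURNALCTL:-journalctl}" --dmesg --boot=-1 --lines="$lines" --no-pager
}
```

For example, `prev_boot_kernel 100` prints the last 100 kernel messages recorded before the forced shutdown. With a hard freeze the final messages may never reach the disk, so an uneventful tail does not rule out a kernel or hardware fault.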

3 points

It happened yesterday, and here are the latest log lines before the freeze:

Sep 14 23:30:30 licorne NetworkManager[1291]:   [1694727030.1207] device (wlp4s0): set-hw-addr: set MAC address to CA:D0:86:5F:F9:85 (scanning)
Sep 14 23:30:30 licorne NetworkManager[1291]:   [1694727030.1478] device (wlp4s0): supplicant interface state: inactive -> disconnected
Sep 14 23:30:30 licorne NetworkManager[1291]:   [1694727030.1478] device (p2p-dev-wlp4s0): supplicant management interface state: inactive -> disconnected
Sep 14 23:30:30 licorne NetworkManager[1291]:   [1694727030.1530] device (wlp4s0): supplicant interface state: disconnected -> inactive
Sep 14 23:30:30 licorne NetworkManager[1291]:   [1694727030.1530] device (p2p-dev-wlp4s0): supplicant management interface state: disconnected -> inactive
Sep 14 23:30:58 licorne syncthing[3169286]: [VY2L4] INFO: Established secure connection to REDACTED1 at [::]:22000-192.168.0.14:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256/LAN-P20
Sep 14 23:30:58 licorne syncthing[3169286]: [VY2L4] INFO: Device REDACTED1 client is "syncthing v1.23.4" named "REDACTED2.lan" at [::]:22000-192.168.0.14:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256/LAN-P20
Sep 14 23:31:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:31:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:31:11 licorne syncthing[3169286]: [VY2L4] INFO: Established secure connection to REDACTED1 at 192.168.0.98:22000-192.168.0.14:22000/tcp-client/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P10
Sep 14 23:31:11 licorne syncthing[3169286]: [VY2L4] INFO: Replacing old connection [::]:22000-192.168.0.14:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256/LAN-P20 with 192.168.0.98:22000-192.168.0.14:22000/tcp-client/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P10 for REDACTED1
Sep 14 23:31:11 licorne syncthing[3169286]: [VY2L4] INFO: Connection to REDACTED1 at [::]:22000-192.168.0.14:22000/quic-client/TLS1.3-TLS_CHACHA20_POLY1305_SHA256/LAN-P20 closed: replacing connection
Sep 14 23:31:11 licorne syncthing[3169286]: [VY2L4] INFO: Device REDACTED1 client is "syncthing v1.23.4" named "REDACTED2.lan" at 192.168.0.98:22000-192.168.0.14:22000/tcp-client/TLS1.3-TLS_AES_128_GCM_SHA256/LAN-P10
Sep 14 23:32:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:32:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:33:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:33:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:33:28 licorne systemd[1]: Started anacron.service - Run anacron jobs.
Sep 14 23:33:28 licorne anacron[4171587]: Anacron 2.3 started on 2023-09-14
Sep 14 23:33:28 licorne anacron[4171587]: Normal exit (0 jobs run)
Sep 14 23:33:28 licorne systemd[1]: anacron.service: Deactivated successfully.
Sep 14 23:34:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:34:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:35:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:35:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:36:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:36:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:37:04 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:37:04 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:37:25 licorne NetworkManager[1291]:   [1694727445.1045] device (wlp4s0): set-hw-addr: set MAC address to EE:65:E2:6E:73:D1 (scanning)
Sep 14 23:38:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
Sep 14 23:38:03 licorne rtkit-daemon[1541]: Supervising 4 threads of 4 processes of 1 users.
