CPU: 3700X

Motherboard: Aorus B550 Elite

RAM: 8GBx4 Corsair Vengeance LPX 3200

GPU: PowerColor 5700XT

PSU: Cooler Master MWE 1050 V2

Built in 2020.

Since last month, my PC has been rebooting randomly and giving a 'Machine Check Exception' error, similar to these:

https://old.reddit.com/r/AMDHelp/comments/190mkn0/5950x_whea_error_18_machine_check_exception/

https://old.reddit.com/r/AMDHelp/comments/qia2e7/whea_18_critical_error_computer_goes_black_restart/

https://old.reddit.com/r/buildapc/comments/150m14n/pc_randomly_restarts_whealogger_id_18/

And for the last 3 days the system hasn't booted at all. When I power on the computer, all fans start spinning but the keyboard and mouse LEDs don't light up. Pressing CTRL+ALT+DEL doesn't reboot the system, and neither does holding the power button for a few seconds.

I suspect that the motherboard has gone kaput and isn't completing or even starting the boot process, which is why the keyboard and mouse aren't getting any signal or power from the motherboard and why the restart and power-down functionality isn't working.

Before the system stopped booting, I was trying to solve the machine check exception error by updating the BIOS, updating chipset drivers, changing BIOS settings, etc. But now I'm thinking none of it could've helped because the board itself was deteriorating.

Also during that time, I would randomly get display glitches (pic below) which could only be solved by restarting the machine, so I suspected the GPU might be causing the problems.

Sometimes it would show a chessboard-like pattern. I guess this was also because of some issue with the mobo-GPU connection?

Anyway, before changing the board, is there anything else I can try? Changing it is a pain so I'm trying to avoid that. 😂

Remove the video card, use onboard video, and use a single stick of RAM. If it's still wonky, try a different power supply.

There's no onboard video for this setup.

A second idea if you cannot source another GPU: change PCIe slots. I believe your video card is going bad or, at a minimum, overheating, but consider moving it to the other x16 slot on your mobo, as it IS technically possible the slot is having issues. It also doesn't hurt to spray your case out with canned air to get the heat-capturing dust out.

I tried without the GPU and the problem persists. I'll probably take the whole cabinet to my friend's place when he's available and try replacing components one by one.
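
For anyone following along, the "replace components one by one" plan can be written down as a small Python sketch. Everything here is hypothetical: `isolate_by_swapping`, the part labels, and the `boots` callback (which in real life is just you watching whether the machine gets through POST) are illustration only. The idea is to change exactly one part per trial so that a successful boot can be pinned on that part.

```python
from typing import Callable, Dict, List


def isolate_by_swapping(
    installed: Dict[str, str],
    spares: Dict[str, str],
    boots: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Swap in one known-good spare at a time.

    Returns the slots whose swap alone made the system boot.
    """
    suspects = []
    for slot, spare in spares.items():
        trial = dict(installed)  # copy, so only one part differs per trial
        trial[slot] = spare
        if boots(trial):
            suspects.append(slot)
    return suspects


if __name__ == "__main__":
    installed = {"cpu": "3700X", "ram": "own RAM", "psu": "own PSU"}
    spares = {"cpu": "borrowed CPU", "ram": "borrowed RAM", "psu": "borrowed PSU"}

    # Stand-in for a human observation: pretend the installed CPU is the bad part.
    def simulated_boot(config: Dict[str, str]) -> bool:
        return config["cpu"] != "3700X"

    print(isolate_by_swapping(installed, spares, simulated_boot))
```

If no single swap makes it boot, the list comes back empty, which would hint at a part with no spare on hand (the board, in this example) or at more than one fault.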

For whatever it's worth, my PowerColor Red Devil 5700 died in a similar way to this several years ago. Got to a point where it failed to output any display signal at all.

PowerColor had an insane return rate for their Navi 10 GPUs, at least in Europe.

Are you getting WHEA logs here? Do they implicate a specific component?

> For whatever it's worth, my PowerColor Red Devil 5700 died in a similar way to this several years ago. Got to a point where it failed to output any display signal at all.

That's what I thought too but, as I've mentioned, the keyboard and mouse are not getting any power from the motherboard.

If it were a GPU issue, those would have active lights, with only the monitor not getting any signal from the GPU.

> Are you getting WHEA logs here? Do they implicate a specific component?

When the system still used to turn on properly, I'd get these in Windows' Event Viewer:

Reported by component: Processor Core

Error Source: Machine Check Exception

Error Type: Cache Hierarchy Error

Processor APIC ID: This kept changing

Provider Name: Microsoft-Windows-WHEA-Logger

Event ID: 18

When I started looking up information about this error I found out it could be caused by literally anything: a faulty CPU, board, memory, or PSU. Sometimes setting the CPU voltage to around 1.30-1.35 V helped. In one case the guy replaced his custom power cables with the default ones and that solved the issue.
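
Since the APIC ID kept changing, it may help to tally the events per ID once the machine boots again. Below is a rough Python sketch that assumes the events have been copied out of Event Viewer as plain "Key: Value" text like the fields above. The function names and the sample APIC ID values are made up for illustration. As a loose heuristic only: errors piling up on one ID might point at a single core, while errors spread across many IDs might point at something shared (power, memory, board).

```python
from collections import Counter
from typing import Dict, Iterable


def parse_event(text: str) -> Dict[str, str]:
    """Turn 'Key: Value' lines into a dict, skipping lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def tally_apic_ids(events: Iterable[str]) -> Counter:
    """Count Event ID 18 records per 'Processor APIC ID' value."""
    counts = Counter()
    for text in events:
        fields = parse_event(text)
        if fields.get("Event ID") == "18":
            counts[fields.get("Processor APIC ID", "unknown")] += 1
    return counts


# Template built from the fields quoted above; the APIC ID is filled in per event.
SAMPLE = """Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: {apic}
Provider Name: Microsoft-Windows-WHEA-Logger
Event ID: 18"""

if __name__ == "__main__":
    # Made-up APIC IDs, purely to exercise the tally.
    events = [SAMPLE.format(apic=n) for n in (0, 4, 4, 11)]
    for apic_id, count in tally_apic_ids(events).most_common():
        print(f"APIC ID {apic_id}: {count} event(s)")
```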

Oh, cache hierarchy errors were decently common with Matisse too. I think this one may be your CPU; my colleague hit this the other day with their ol' 3700X. I don't suppose you could RMA that?

That's going to be my last resort.

I agree with Brkdncr, this sounds like a video card issue.

I've had this problem multiple times before. The combination of display glitches and the fans spinning but no numlock or keyboard functionality simply points to the video card first.

In short, during POST of the BIOS it attempts an init to the display, fails and then stops attempting the boot sequence. It, the video card, is just as important during init as the motherboard registers, RAM and CPU all starting.

So, start with video: see if it works without the 5700XT, using the onboard video or some other cheap PCIe card. If that fails too, then it's most likely the mobo as assumed. This just doesn't sound like a mobo issue though.

There's no onboard video for this setup* but I'll try to boot without the GPU.

*On AM4, only Ryzen CPUs with a G suffix have integrated graphics. Plus, my CPU is a Zen 2 desktop part (Matisse), which doesn't have integrated graphics at all.

> In short, during POST of the BIOS it attempts an init to the display, fails and then stops attempting the boot sequence. It, the video card, is just as important during init as the motherboard registers, RAM and CPU all starting.

I wasn't aware of this. I assumed the display check would be last, considering there can be systems without a display.

Let me check if I get NUM LOCK and CAPS LOCK LEDs after removing GPU.

While I call it init, many will assume I am referring to the boot init, but I am actually referring to the BIOS initialization (init).

That said, most BIOS inits go in this order:

  • Power Detection check. If enough power, proceed.

  • CPU program link. The CPU calls the BIOS to basically wake up and run the BIOS program.

  • RAM Detection check. If RAM is present, the BIOS will use about 64k to load from ROM to RAM (called the BIOS reserve area) that then does the next steps.

  • Hardware Detection check. Identifiers of the hardware are detected, enumerated, configured and initialized.

  • Boot Sequence is initialized whereby the BIOS does a handoff to the bootloader.

It's during the hardware detection phase when the display is initialized by the GPU and you often see it displaying the BIOS version, then counting RAM. If the GPU is working BUT the display out isn't, it'll actually continue to boot 100% of the time (it doesn't care). If the GPU hardware itself doesn't respond correctly to the BIOS request, however, it sends the hardware detection of the BIOS into a loop or shuts the system down, never getting to the final step: boot sequence.
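
To make the order above concrete, here is a toy Python model of it. This is only a sketch of the sequence as described in this comment, not how any real firmware is written; `Machine`, `run_post`, and the stage labels are all invented for illustration. Note that `display_out_ok` is deliberately never checked, mirroring the claim that a working GPU with a dead display output still boots.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    power_ok: bool = True
    cpu_ok: bool = True
    ram_present: bool = True
    gpu_responds: bool = True    # the card answers the BIOS init request
    display_out_ok: bool = True  # a monitor actually shows a picture


def run_post(m: Machine) -> str:
    """Walk the stages in the order listed above and report where it stops."""
    checks = [
        ("power check", m.power_ok),
        ("CPU starts BIOS", m.cpu_ok),
        ("RAM detection", m.ram_present),
        ("hardware detection", m.gpu_responds),
    ]
    for stage, passed in checks:
        if not passed:
            return f"halted at {stage}"
    # display_out_ok is intentionally ignored: per the description above,
    # a dead display output alone does not stop the handoff.
    return "boot handoff"


if __name__ == "__main__":
    print(run_post(Machine()))
    print(run_post(Machine(display_out_ok=False)))
    print(run_post(Machine(gpu_responds=False)))
```

In this model a card that doesn't respond stops everything at hardware detection, which is the scenario being suggested for the 5700XT.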

Depending on the BIOS type, it may or may not show numlock. I've also seen it act differently on UEFI-enabled systems than when it's set to classic BIOS. So, it just depends.

Regardless, see if you can source another PCIe GPU for testing this. It only takes a minute and, tbh, it doesn't hurt to have a cheap used PCIe video card in your PC tools for such things.

Good luck!

You can't see it in the pic but the white glitch columns aren't continuous. Those are white lines interspersed with what's on display.

I had a 3700X CPU that would lock the system up randomly. I was on Linux with an Nvidia GPU so the symptoms don't match up. I think it's worth borrowing a friend's CPU, or buying a used one locally, to test. If you buy a used CPU locally for $50, you should be able to sell it for $50 after your test.
