Posted to Tom's Hardware as well, just trying to pool as much experience as possible.
Hey Guys. Been having a really difficult time narrowing down a problem with my rig as I can’t seem to find any consistency and hoping someone might have some ideas. This is a pretty long post, so please bear with me.
Here is my build as it currently stands:
Motherboard: Gigabyte Z370 Aorus Gaming 7, BIOS F7
CPU: Intel 8700K, not overclocked
RAM: 16GB Corsair Vengeance DDR4 3000MHz (2x8GB), currently at JEDEC default 2133MHz
GPU: 8GB Nvidia GTX 1070 Founders Edition, not overclocked, outputting to 3 of 4 displays
2GB Nvidia GTX 760 secondary, used for PhysX and outputting to 1 of 4 displays
Sound: Sound Blaster Recon3D PCIe
Storage: 1 Samsung 970 EVO NVMe SSD
5 mechanical HDDs used as storage and mechanical redundancy
I rebuilt and overhauled the system a year ago, upgrading to the Gaming 7 board, Intel 8700k, 16gb (2x8gb) of Corsair Vengeance @ 2666mhz set to XMP, and a 500gb Samsung SSD connected via Sata. The CPU was never overclocked, the RAM was only OC’d per the XMP, and the GTX 1070 was overclocked +125 core/+400 mem. The system ran fine for 10 months with no issues or problems.
In Feb I decided to upgrade my storage to a Samsung 970 EVO Nvme. This would free a Sata slot for another HDD to give me some mechanical backup to my data. I also applied the new BIOS F13 to make sure I had the best compatibility for the Nvme.
During the fresh install of Windows I got a BSOD “Page Fault in Non Paged Area, reason Win32kbase.sys” during a restart while removing some of the bloatware for my printer (HP deskjet 1660) after installing the driver. I figured no big deal, it never BSODs, so I moved on, ran an SFC and CHKDSK just to be safe, and imaged the install when done.
Over the next 4 months the system would randomly (1-3 of every 20 starts) BSOD during windows sign-in with the same Page Fault in Non Paged Area pointing to Win32kbase.sys (80%) or Win32kfull.sys (20%). The dump file was never created; it would just sit @ 0% even though my settings for the dump file were set correctly. If the system made it past sign-in without crashing, it would operate normally with no issues. I could browse the web on Chrome, watch movies, do hours of heavy gaming with 100% load on my 1070 and high CPU/MEM usage and never get a BSOD. I tried removing the GPU overclock in MSI Afterburner and ran several SFCs, CHKDSKs, and DISM (all returned no errors or corruption), but kept getting the BSOD.
At this point I figured I had some sort of corruption in the windows install even after passing SFC and CHKDSK, so decided to do a fresh install, and this is where the problems mounted. Any “()” below is what I was thinking, and the BSOD is always page fault in non paged area.
During the first reinstall got the BSOD after a restart doing a windows update for .net and my sound card. (Maybe my sound card is bad?)
Reinstalled windows again to see if it would duplicate and got a BSOD after restart installing the first driver which was the chipset (Ok, maybe not my sound card?).
Reinstalling windows again and got a BSOD formatting the SSD partitions in windows setup. (Is there something for the Nvme I am forgetting?).
Took the tower apart and cleaned all contacts. Made sure I had good connections and moved the memory to the other DIMM slots. Restarted and installed windows. Made it through installing all the drivers. After reading that this BSOD usually means a problem with memory I ran a windows memory diagnostic, and it came back good with no errors. Didn’t have a lot of free time to extensively test the ram, so I picked up a new kit at Best Buy. And with the only change to the system being the Nvme drive I figured I must be missing something as well. Did some reading and found out there is a driver provided by Samsung for the drive.
Installed the new RAM, set the XMP and reinstalled windows again. After the chipset, installed the Samsung Nvme driver. Got a BSOD two restarts later after installing the intel RST driver.
Did some more reading and extracted the driver to install during windows setup, loaded setup, formatted the drive, did a clean using DISKPART, then installed the Nvme driver. Continued and installed windows. Got a BSOD several restarts later. At this point I began to think there might be something wrong with the Nvme (even though SMART checked good), the MB slot, or possibly even my power supply (as it was 13yrs old). At least this time it created the dump file, which pointed to ntoskrnl 0x50.
Purchased a new power supply (850 watt EVGA G3), and a new 970 EVO. Installed the new power supply and 970 EVO, but this time changed to the 3rd M.2 slot. I had been using the 2nd, as the 1st one knocks out two of my Sata ports. Also restored all BIOS settings to default, left the RAM at non XMP 2133mhz, and disconnected EVERYTHING except both graphics cards, sound card, nvme, keyboard and mouse.
Reinstalled windows and got BSOD “memory management” on setup finalization of “getting devices ready”. (Ok, Maybe it really is my sound card as it’s the only attached device?)
The installer crash forced me to reinstall windows. This time it made it all the way through drivers and windows updates, but I got a BSOD after the restart from installing the MB apps for RGB Fusion (to control the LEDs) and SIV (to set up fan profiles).
Reinstalled windows without the MB apps, but got a BSOD on restart while installing the printer driver, rather than while uninstalling the unnecessary printer bloatware as before. (Can’t be the printer software as other crashes were prior to the printer being installed, or connected?).
At this point I was trying to reverse any other changes made from when it was stable. So I reverted back to the BIOS prior to my Nvme which is F7.
While doing more reading on the BSOD crashes, especially 0x50 ntoskrnl crashes being related to overclocking, I decided to look at the voltages in BIOS for the memory. The VCCIO voltage @ default “auto” non-xmp 2133mhz was @ 0.946v, and the System agent @ 1.05v. I was reading that the two voltages should be 0.05v apart. So I bumped the VCCIO voltage to 1.0v. Since then I have restarted at least 15 times, run prime95 (custom test using 15/16gb ram), and did a windows memory diagnostic, all came back clean.
So at this point I am not sure if it’s the BIOS being reverted back, me bumping up the VCCIO voltage to 1.0, or just a lull between BSODs. I have sat at the computer for 15 mins continually restarting and not getting a crash. The only constant to my issues is that a BSOD will only happen while signing into windows. Windows always boots to sign-in with no issues, and once past the sign-in screen works just fine. Anyone have any advice as to what to do next if I do get another BSOD? If I don’t get another one I would like to get the memory back to XMP @ 3000mhz, but if the instability was from voltage what should I be looking for after setting the XMP or manual setup of timings?
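One rough way to judge whether a clean streak means anything is to ask how likely it would be by pure luck if the fault were still there. The sketch below assumes every boot crashes independently with the same fixed probability, which is a simplification, and uses the 1-3 crashes per 20 starts mentioned earlier. The function name is just illustrative.

```python
def clean_streak_probability(crash_rate: float, boots: int) -> float:
    """Chance of `boots` clean starts in a row if each start crashes
    independently with probability `crash_rate` (an assumed model)."""
    if not 0.0 <= crash_rate <= 1.0:
        raise ValueError("crash_rate must be between 0 and 1")
    if boots < 0:
        raise ValueError("boots must not be negative")
    return (1.0 - crash_rate) ** boots


# Crash rates taken from the "1-3 of every 20 starts" estimate in the post.
for crashes_per_20 in (1, 2, 3):
    rate = crashes_per_20 / 20
    p = clean_streak_probability(rate, 15)
    print(f"{crashes_per_20}/20 crash rate: {p:.0%} chance of 15 clean restarts")
```

Under that assumption, 15 clean restarts would still happen by chance roughly 9% to 46% of the time, so the streak on its own can't separate a real fix from a lull.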
Thanks for reading such a long post, and thank you to anyone who offers to help.
-
The first thing to do is to download and run memtest86 ( https://www.memtest86.com/download.htm -- grab the free version). You'll need to burn it onto a USB drive and boot it, since it runs as its own OS. Run a full pass or two, and see if it shows any errors. If not, then you can try running at the XMP profile; if you see errors there, then obviously the RAM won't tolerate those conditions, and if you want, you can try to find settings that will work.
If all of that comes back clean, you can then reboot Windows and run some kind of stress tests. I'm not familiar enough with Windows to know what you might do for that. I think prime95 targets the CPU rather than memory or I/O.
-
Have you tested without the 2GB Nvidia GTX 760 secondary GPU installed to see if anything changes? If not, try that... especially if you are using Windoze OS X. NVIDIA may be doing their notorious GPU genocide crap with Kepler. They stopped caring about Kepler performance and stability when they released Maxwell, and we're way past them caring after Pascal and Turing.
Even though you are not overclocking the memory, the errors you are experiencing sound like a memory problem to me. It could also be that the voltage for the CPU, VCCIO and/or memory needs to be increased. As a general rule, undervolting anything other than the CPU core is not advised. In most cases, leaving the VCCIO, SA and other voltages on "auto" is best unless you are pushing a hefty overclock, in which case you will need to look at increasing them. Some DDR4 RAM sticks run more stable stock or with XMP profiles using 1.350V versus the 1.200V default.
Try what @rlk suggested with memtest86 and see if it errors out. If it does, try setting the RAM voltage to 1.350V and re-run the tests to see if it still errors out. You could have a bad stick of RAM, in which case no changes in settings are going to fix it.
-
StormJumper Notebook Virtuoso
What you really ought to do is a fresh install of the Windows O/S with only the motherboard drivers, and let that run to see if any problem happens. Then you will know where to start looking before installing other hardware and software. If you're not doing this, any help on here will do nothing.
-
Sorry this has taken me over 8 months to post an update but I have been very busy with family and work. Below is a recap of my issue and what was done to resolve and repair my desktop computer.
I initially started getting a BSOD 0x50 Page Fault in Non Paged Area during boot at the windows sign-in screen, shortly after adding an NVME SSD and removing my SATA SSD. The BSODs were never consistent and would happen 1 of every 10 to 20 boots.
A dump file was never being created no matter how many times I checked the settings within windows. And on the rare times that it did, it gave no useful information other than the 0x50 code, Win32kbase.sys or Win32kfull.sys.
If windows made it past the sign-in screen the system would run with no issues. It could be put under heavy gaming loads with >90% GPU usage and >50% CPU usage for hours with no problems. Windows update could run as well and apply updates during reboots, but the system would crash as soon as I reached the sign-in screen.
These were my attempted fixes:
1) Run Defrag/trim and error checking
2) Run SFC /SCANNOW
3) Run DISM tool
4) Reinstall Windows from scratch (got a BSOD once during windows setup while trying to format the NVME)
5) Install Samsung provided NVME driver after windows install, and during windows setup adding driver before installing windows.
6) Restored RAM from 2666mhz XMP profile to default 2133mhz.
7) Replaced RAM with a 3000mhz kit and applied XMP. Even bumped the VCCIO and VCCSA voltages to ensure the RAM was stable. Got a crash with those settings, reverted to default 2133mhz and still got a crash.
8) Replaced 11yr old 850 Power Supply with new 850 Power Supply.
9) Replaced the GTX 1070 Founders Edition GPU.
10) Removed the second GPU, the GTX 760 that I was using for PhysX.
11) Changed slot of NVME on motherboard.
12) Replaced NVME with similar model.
13) Tried installing windows with only bare minimum connected equipment, still got crashes.
14) Ran Intel Processor diagnostic tool. Ran with no errors, on occasion would get a fail when using RAM XMP profile but was never continuous.
15) Ran Memtest86 several times with clean runs every time.
At this point I was at a complete loss and knew the only thing left to replace would be my motherboard and CPU. The one item I didn't replace was my dedicated Soundblaster sound card as I felt it couldn't possibly cause a memory crash.
In the end I decided to replace both my motherboard and CPU at the same time, as I couldn’t stand the thought of seeing one more BSOD.
I can happily say after 6 months the system has been repaired and no longer BSOD'S. I stopped worrying once the computer made it through 40 consecutive cold boots (done over a few weeks of use) and has never crashed.
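For anyone wondering how many clean boots are enough before relaxing, here is a back-of-the-envelope sketch. It assumes the old fault would have crashed each boot independently with a fixed probability, which is a simplifying assumption and not something I measured, and `boots_needed` is just an illustrative name.

```python
import math


def boots_needed(crash_rate: float, confidence: float = 0.95) -> int:
    """Consecutive clean boots after which a fault that crashes each boot
    independently with probability `crash_rate` would have shown itself
    with probability of at least `confidence` (an assumed model)."""
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - crash_rate) ** n <= 1 - confidence for the smallest whole n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))


# Crash rates from the "1 of every 10 to 20 boots" estimate in this recap.
for rate in (1 / 20, 1 / 10):
    print(f"crash rate {rate:.2f}: {boots_needed(rate)} clean boots for 95% confidence")
```

With a crash on 1 in 10 to 1 in 20 boots, that works out to roughly 29 to 59 clean boots, so 40 cold boots is convincing at the higher crash rate and only suggestive at the lower one. The months of crash-free use since then add to that.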
My belief is the fault may have been with the L1, L2 or L3 cache in the CPU, as it did fault in the Intel Processor Diagnostic a few times when the RAM was overclocked. Unfortunately I can't be sure, as I replaced the board and CPU at the same time. I hope this helps someone who encounters the same problem I did, and spares them some of the stress and aggravation in getting it sorted out.
-
StormJumper Notebook Virtuoso
At this point you're going to have to take it to a shop to diagnose where the problem is coming from. Trying to diagnose it online is not going to work here. A computer repair shop has to see it running at defaults to see what is happening. Any time the problem gets this deep, it's time for a computer shop to diagnose it. What I would say is update the BIOS/UEFI to the latest, set everything to default with no RAM overclock, use only one card (the main one) installed with the latest Nvidia driver, and see what that does. If this still causes problems, then you've got board issues. Also, which Windows O/S are you using? If it's crashing during the install, which isn't pushing the system, then you've got a serious problem going on there.
-
StormJumper, thank you for your help and input. I may have misspoken, but I meant to say in my previous post that my rig is fixed. All the original equipment has been reinstalled and I have not had a bluescreen since Sept of last year. The system runs flawlessly. The only new components are the MB and the CPU. I posted my recap in the hope that it saves someone else the frustration.
Help with complicated BSOD
Discussion in 'Desktop Hardware' started by Drew92983, Jul 17, 2019.