Recovery of HP ProBook failing NVMe

The scenario and some observations

Patient: HP ProBook 445 G7 notebook, running Windows 11 (previously upgraded from Windows 10)
Symptom: The laptop was unable to boot Windows and showed an “unmountable boot volume” blue screen / stop code
Date: November 2023
Typical Windows startup recovery attempts did not resolve the issue - it was stuck in a recovery → blue screen loop.
HP diagnostics failed on the SMART / drive tests. smartctl reported that the "NVM subsystem reliability has been degraded" and the SMART self-assessment test result was: FAILED.
I removed the Western Digital drive from the laptop and used an NVMe enclosure to see if the drive was detected on another system, hoping the filesystem was mountable.
First time I’ve worked on an HP for a while. I was impressed with the diagnostics experience and the QR code that calls up a tailored support page with warranty coverage details.
Drive SKU: SDAPNUW-512G-1022
💡 This drive didn’t have a hard life. One can see from the SMART info that it was an early failure and had not even reached ~3% of its marketed endurance of up to 400 TBW (terabytes written). Without knowing the manufacturer's MTTF (mean time to failure) algorithm, it is difficult to calculate the MTTF for the drive, which is published at 5 million hours.
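As a rough cross-check of the endurance figure, the NVMe SMART counter "Data Units Written" can be converted to terabytes written and compared with the rated TBW. The Python sketch below assumes the NVMe convention that one data unit equals 1000 × 512 bytes. The counter value in the example is hypothetical and was not read from this drive.

```python
# One NVMe SMART "data unit" is reported in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512 * 1000


def terabytes_written(data_units_written: int) -> float:
    """Convert the SMART 'Data Units Written' counter to decimal terabytes."""
    return data_units_written * BYTES_PER_DATA_UNIT / 1e12


def endurance_used(data_units_written: int, rated_tbw: float) -> float:
    """Fraction of the rated endurance (in TBW) that has been consumed."""
    return terabytes_written(data_units_written) / rated_tbw


if __name__ == "__main__":
    units = 23_437_500  # hypothetical counter value, for illustration only
    print(f"{terabytes_written(units):.1f} TB written")
    print(f"{endurance_used(units, rated_tbw=400):.1%} of rated endurance")
```

For a drive rated at 400 TBW, 3% corresponds to roughly 12 TB of host writes.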
Icy Box enclosure SKU: IB-1817M-C31
💡 Note: I instinctively enabled the hardware write-protect switch on the enclosure, hoping to prevent further degradation of the drive and allow a backup to be made. When I later tried the drive with the write-protect switch disabled, I quickly realised that this made the drive unstable: it disconnected shortly after the operating system tried to mount it, presumably because the OS was trying to update/write to the NTFS tables. My hypothesis is that the drive's failure mode at that time was fortunately write-sensitive; reads seemed OK apart from a dozen or so unreadable/corrupt inodes/files.

1) BiTlOcKeR ?! 😲

At first glance, it looked like the drive was BitLocker “protected”, but fortunately it was not, it was only BitLocker "prepared" but not fully protected. AFAIK, this state means that the data is encrypted, but because nothing has been set to "protect" the encryption, the drive behaves just like an unencrypted drive (perhaps equivalent to a blank password). Some useful commands to check this:
# get bitlocker drive status
manage-bde -status g:

# see what is "protecting" the drive encryption
manage-bde -protectors -get g:
where g: is the drive in question.
💡 Had BitLocker been fully enabled and protected, I would have needed at least one piece of the protection info in order to unlock the drive.

2) Backup of personal data

With the write-protect switch on the enclosure enabled, it was time to try taking a backup. The logic here was that if nothing else worked except the backup, at least some of the personal data would be recovered. The aim of the backup was not to create a fully operational set of user profiles, rather to focus on making a backup copy of the personal data.
The read-only mount of the Windows volume meant that I couldn't change the ownership and permissions of the folder hierarchy even if I wanted to. So to get access to the user profiles, I needed to access the drive as the SYSTEM user. With PsExec64 -s -i cmd I spawned a cmd prompt running as the SYSTEM user, and from there I could spawn other processes as the SYSTEM user to examine and back up the folder structure. SYSTEM user privileges were adequate for c:\Users.
A disadvantage of using the SYSTEM user was that it was difficult to give this user access to network shares requiring authentication - which would have been an ideal destination for the backups. I had to resort to using alternative locally attached storage.
💡 Note that performing a full c:\ folder copy requires elevation to the special TrustedInstaller mode. TrustedInstaller runs under nt authority\SYSTEM but does something special so that TrustedInstaller privileges are granted. Perhaps it adds something special to the env? TrustedInstaller mode is basically like god mode for reading any part of the c:\ folder hierarchy. To elevate to TrustedInstaller I used NirSoft’s AdvancedRun.
The :\Users backup process was successful aside from a dozen or so corrupt/unreadable inodes/files.

3) Useful 7-zip stuff

I used 7-Zip with some exclusions to back up the Windows volume's :\Users hierarchy - this approach skips archiving cache and other cruft.
cd /d "G:\Users"

"C:\Program Files\7-Zip\7z.exe" a -ssw -mmt=4 -ttar -snl -spf2 -xr@"E:\!backups\xxxx-laptops\hp probook\NVMe backup via enclosure\user-profile-excludes-recursive.txt" -x@"E:\!backups\xxxx-laptops\hp probook\NVMe backup via enclosure\user-profile-excludes.txt" "E:\!backups\xxxx-laptops\hp probook\NVMe backup via enclosure\user-profiles.tar" -- .

# the contents of user-profile-excludes-recursive.txt
"*cache*"
"*.tmp"
"*.dmp"
"Temporary Internet Files"
"crashdump\"
"crashdumps\"
"*crashpad*"
"Temp\"

# the contents of user-profile-excludes.txt
"*\AppData\Local\Audacity\crashreports\"
"*\AppData\Local\ElevatedDiagnostics\"
"*\AppData\Local\FACEIT\"
"*\AppData\Local\Programs\signal-desktop\"
"*\AppData\Local\ProtonVPN\DiagnosticLogs\"
"*\AppData\Local\Comms\"
"*\AppData\Local\Discord\"
"*\AppData\Local\Packages\Microsoft.Windows.Search_cw5n1h2txyewy\"
"*\AppData\Local\Packages\Microsoft.XboxGamingOverlay_8wekyb3d8bbwe\"
"*\AppData\Local\Packages\MicrosoftWindows.Client.CBS_cw5n1h2txyewy\"
"*\AppData\Local\Microsoft\Windows\Notifications\"
"*\AppData\Local\Microsoft\Windows\IEDownloadHistory\"
"*\AppData\Local\Microsoft\Windows\INetCookies\"
"*\AppData\Local\Microsoft\Windows\INetCache\"
"*\AppData\Local\Microsoft\Windows\SettingSync\"
"*\AppData\Local\Microsoft\Windows\History\"
"*\AppData\Local\Microsoft\Windows Sidebar\"
"*\AppData\Local\Microsoft\WindowsApps\"
"*\AppData\Local\LGHUB\"
"*\AppData\Local\slack\"
"*\AppData\Local\signal-desktop-updater\"
"*\AppData\Local\fman\"
"*\AppData\Local\speech\"
"*\AppData\Local\Diagnostics\"
"*\AppData\Local\ProtonVPN\Logs\"
"*\AppData\Local\ProtonVPN\Updates\"
"*\AppData\Local\Microsoft\Windows\UsrClass.dat*"
"*\AppData\Local\Packages\*\Settings\settings.dat*"
"*\AppData\Local\Application Data"
"*\AppData\Local\History"
"*\AppData\Roaming\FACEIT\"
"*\AppData\Roaming\discord\"
"*\AppData\Roaming\lghub\"
"*\AppData\Roaming\G HUB\"
"*\AppData\Roaming\Microsoft\Windows\Recent\"
"*\AppData\Roaming\Mozilla\Firefox\Crash Reports\"
"*\AppData\Roaming\Mozilla\Firefox\Pending Pings\"
"*\AppData\Roaming\TS3Client\logs\"
"*\AppData\Roaming\GoPro\GoPro Webcam\Logs\"
"*\AppData\Roaming\obs-studio\logs\"
"*\AppData\Roaming\obs-studio\crashes\"
"*\AppData\Roaming\obs-studio\profiler_data\"
"*\AppData\Roaming\obs-studio\updates\"
"*\AppData\Roaming\Mozilla\Firefox\Desktop Background.bmp"
"*\OneDrive\Erste Schritte mit*"
".\Default User"
".\All Users"
".\Default"

Command line switches

-snl stores symbolic links as links, which avoids a number of symlink challenges in the windows user profiles folder hierarchy.
-spf2 changes the path handling behaviour of 7-Zip. By default, 7-Zip uses relative paths. Note that this changes not only the behaviour of specifying what to include in an archive, but also the inclusion and exclusion path specifications. -spf2 tells 7-Zip to use full paths WITHOUT drive letters. See the manual for more details. One useful facet of this feature is that relative paths are not converted to fully qualified paths, so it is possible to specify both relative and fully qualified paths.
CAUTION -spf* switches also change the behaviour of archive extraction!
💡 Note: From trial and error: when matching a symlink path, you CANNOT use the \ suffix to match symlinked folders (as they are not folders).
-x excludes filenames. General observation: AFAIK, in a simple single-hierarchy archive, the archive root is determined by the path being archived (the source hierarchy); the archive root might also be inherited from pwd. This is relevant for -i include and -x exclude patterns. The include and exclude patterns must be relative to the archive root for pattern matching to work as expected. I have had very limited success using the .\ anchored prefix for pattern matching, and the same with -spf* switches and absolute/fully qualified include and exclude patterns.
-x is a subset of the -i functionality. It does not support changing wildcard and mark type behaviour. AFAIK, -x supports ONLY wildcards and it is not possible to disable this, even with the -spd switch. I have had very limited success in using .\ anchored or absolute paths with the -x option, either directly or with the @listfile style (see the end of user-profile-excludes.txt). Relative wildcards work fine, but you have to be aware of this limitation, keep the archive root in mind and make sure patterns are relative to the archive root.
# PowerShell to list symlinks in a hierarchy
Get-ChildItem -Path "g:\Users" -Force |
Where-Object { $_.LinkType -ne $null -or $_.Attributes -match "ReparsePoint" } |
ft FullName,Length,Attributes,Linktype,Target

# cite: https://superuser.com/a/1652788/59966
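For anyone who prefers to script this outside PowerShell, here is a rough Python sketch of the same idea. It walks the whole hierarchy and reports anything that is a symlink or, on Windows, carries the reparse point attribute (0x400). The function name find_links is my own. Note that OneDrive placeholders are also reparse points, so they would show up in the result too.

```python
import os

FILE_ATTRIBUTE_REPARSE_POINT = 0x400  # Windows attribute bit for links and junctions


def find_links(root):
    """Yield paths under root that are symlinks or Windows reparse points."""
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # st_file_attributes only exists on Windows; fall back to 0 elsewhere.
            attrs = getattr(os.lstat(path), "st_file_attributes", 0)
            if os.path.islink(path) or attrs & FILE_ATTRIBUTE_REPARSE_POINT:
                yield path


if __name__ == "__main__":
    for link in find_links(r"g:\Users"):
        print(link)
```

Folders that deny access are silently skipped by os.walk, so run it with the same SYSTEM privileges described earlier if a complete list is needed.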

Recursive vs. non-recursive exclusion patterns

Two exclude specification files are used in this job: user-profile-excludes-recursive.txt and user-profile-excludes.txt.
This separation is intended to allow fine-grained control of exclusions by keeping relative non-recursive patterns apart from recursive patterns. This mitigates the possibility of patterns matching too much and erroneously excluding a path.
Relative non-recursive matches a path exactly, relative to the archive root.
Recursive patterns will traverse the hierarchy and try to match a pattern in any part of the path. Good for general wildcards and patterns where the relative path is unknown or not relevant. Use carefully to avoid inadvertently excluding a path.
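To make the difference concrete, here is a simplified Python model of the two matching styles. It is only an approximation built on fnmatch and is not 7-Zip's actual matcher. For example, `*` here can span several folder levels. The function names and example paths are made up for illustration.

```python
from fnmatch import fnmatchcase


def _norm(pattern):
    # Drop the trailing "\" folder marker and compare case-insensitively.
    return pattern.rstrip("\\").lower()


def excluded_recursive(rel_path, pattern):
    """Recursive style: the pattern may match any single component of the path."""
    parts = rel_path.lower().split("\\")
    return any(fnmatchcase(part, _norm(pattern)) for part in parts)


def excluded_anchored(rel_path, pattern):
    """Non-recursive style: the pattern must match from the archive root down."""
    parts = rel_path.lower().split("\\")
    # Test each leading prefix so that an excluded folder also excludes its contents.
    prefixes = ("\\".join(parts[:i]) for i in range(1, len(parts) + 1))
    return any(fnmatchcase(prefix, _norm(pattern)) for prefix in prefixes)


if __name__ == "__main__":
    discord = "*\\AppData\\Local\\Discord\\"
    for path in (
        r"alice\AppData\Local\Discord\app.log",
        r"alice\Documents\Discord\export.txt",
        r"alice\Documents\cached-report.txt",
    ):
        print(path, excluded_anchored(path, discord), excluded_recursive(path, "*cache*"))
```

The last example path shows the over-matching risk: a recursive `*cache*` pattern would also exclude a personal document that happens to have "cache" in its name.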

4) OneDrive challenges

Someone had set up OneDrive for certain folders for the main user of the laptop. This can lead to a scenario where one or more items are not 'local' and are only stored in the “cloud” ☁.
I'd faced this in the past and knew that it was possible to get a list of the hierarchy and attributes of each item, determine from that which items were only in the cloud, and then skip those cloud-only items.
This problem effectively caused 7-Zip to abort the archive creation job, preventing the backup.
I remembered that the 4199968 attribute meant cloud only, and fortunately there was only one file with this problem. Ironically, it was a standard file provided by Microsoft called "Erste Schritte mit OneDrive.pdf" (Getting Started with OneDrive.pdf) and, even more ironically, the name contained a non-breaking space (hex 0xA0), which made it difficult to write a file name exclusion for the file, so I decided to use a wildcard to work around the problem.
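A quick way to spot this kind of hidden character is to scan the name for whitespace that is not a plain space. The Python sketch below assumes the 0xA0 character sits right after "mit", which is a guess based on the wildcard I ended up using. The helper name odd_whitespace is my own.

```python
from fnmatch import fnmatchcase

# Assumed position of the non-breaking space (U+00A0) in the file name.
name = "Erste Schritte mit\u00a0OneDrive.pdf"


def odd_whitespace(text):
    """Return (index, code point) for whitespace characters other than a plain space."""
    return [(i, hex(ord(ch))) for i, ch in enumerate(text) if ch.isspace() and ch != " "]


if __name__ == "__main__":
    print(odd_whitespace(name))
    # A pattern typed with an ordinary space does not match the real name...
    print(fnmatchcase(name, "Erste Schritte mit OneDrive.pdf"))
    # ...but a wildcard placed before the odd character does.
    print(fnmatchcase(name, "Erste Schritte mit*"))
```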
💡 For this situation on this laptop as of writing 2023-12-02 the 4199968 attribute was valid, but it is possible that other/newer attributes supersede this one (introduced with newer versions of OneDrive?). Your mileage may vary - do your own research! For example, the cited article uses different attributes.
# PowerShell to create a tab separated file with a folder hierarchy listing including Attributes
Get-ChildItem -Force -Recurse -Path . | Select-Object FullName,Attributes,Mode,Extension | Export-Csv -Delimiter "`t" -Path "E:\!backups\xxxx-laptops\hp probook\NVMe backup via enclosure\onedrive.tsv" -NoTypeInformation -Encoding UTF8
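The numeric attribute value is a bit field, so it can be decoded into the individual Windows file attribute flags. The Python sketch below does that and then filters a listing like the one exported above for cloud-only items. Two assumptions of mine: a set recall flag means "cloud only", and the Attributes column holds a plain number for such files. The function names are also my own.

```python
import csv

# Windows file attribute bits (subset relevant here).
FILE_ATTRIBUTE_FLAGS = {
    0x1: "ReadOnly",
    0x2: "Hidden",
    0x4: "System",
    0x10: "Directory",
    0x20: "Archive",
    0x200: "SparseFile",
    0x400: "ReparsePoint",
    0x1000: "Offline",
    0x40000: "RecallOnOpen",
    0x80000: "Pinned",
    0x100000: "Unpinned",
    0x400000: "RecallOnDataAccess",
}
RECALL_MASK = 0x40000 | 0x400000  # assumption: either recall flag means cloud only


def decode_attributes(value):
    """Return the names of all known flags set in a numeric attribute value."""
    return [name for bit, name in sorted(FILE_ATTRIBUTE_FLAGS.items()) if value & bit]


def is_cloud_only(value):
    return bool(value & RECALL_MASK)


def cloud_only_paths(tsv_file):
    """Return FullName for every row whose numeric Attributes has a recall flag set."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [
        row["FullName"]
        for row in reader
        if row["Attributes"].strip().isdigit() and is_cloud_only(int(row["Attributes"]))
    ]


if __name__ == "__main__":
    print(decode_attributes(4199968))
    # → ['Archive', 'SparseFile', 'ReparsePoint', 'Offline', 'RecallOnDataAccess']
```

To run it against the exported file, open the file with encoding="utf-8-sig" so a leading byte order mark (if present) does not end up in the first column name, then pass the file object to cloud_only_paths.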

5) Recovery of Windows and restoration of the user data

After the :\Users folder hierarchy backup was complete from the faulted NVMe, it was time to get the laptop working again. So, I researched a suitable NVMe replacement... “A few moments later” I had picked the Crucial P5 Plus 1TB drive for ~50EUR.
My first attempt was to create an image/clone of the faulty NVMe (src) but attempts with dd stopped reading around ~3% of the drive, and the drive disconnected and/or went into a defunct state. The same issues occurred with other drive imaging tools that I tried.
It wasn’t just a case of having to ignore read errors, because the drive's failure mode in this scenario was to disconnect and become defunct. So I couldn’t just run an unstoppable copy.
This failure mode meant that in order to recover Windows, a filesystem-level copy would be required.
Challenges associated with filesystem level copy include:
NTFS permissions of key hierarchies like :\Program Files and %windir%
File ownership - must use TrustedInstaller elevation
How to get the system to boot again
How to resolve Windows 11 issues: system file corruption, missing or incorrect NTFS permissions, broken start menu and store apps.

Gandalf’s Windows 10PE to the rescue

So, I used Gandalf PE (Windows 10PE x64 Redstone 7 - Spring 2022 Edition) on a bootable USB thumb drive to obtain a Windows preinstallation environment. Windows PE is a type of “Live CD” operating system which does not require installation; it just boots “live” and stores its required files in RAM. For those familiar with Hiren’s Boot CD, there is also a community-developed PE version of Hiren’s available.
In short, Windows PE provides a familiar, cut-down version of the Windows GUI to perform installation, troubleshooting and recovery tasks.
On my bootable USB thumb drive I had a few additional programs including:
Beyond Compare
NirSoft’s AdvancedRun
AOMEI Partition Assistant 10.2.1

Copying the src partition table and partitions

With the new NVMe installed in the laptop (dst) and the defective NVMe in the enclosure (src) with write-protection enabled, I used AOMEI Partition Assistant 10.2.1 to copy the partition table and all partitions byte-for-byte from the src, except the main windows partition (which, as mentioned, kept failing). The partition copy included the boot/EFI partitions and the necessary partition flags. During the partition copy process I made the necessary adjustments to expand the windows partition to use the full size of the dst NVMe. The dst windows partition was formatted with NTFS and ready to receive files from src.
With the partitions ready, I then used NirSoft’s AdvancedRun to launch Beyond Compare running as nt authority\SYSTEM with TrustedInstaller elevation. I made a few exclusions (similar to those listed in the 7-Zip section; it also doesn’t make sense to copy things like pagefile.sys) and then started a mirror job from the src to the dst drive. It's important to adjust the Beyond Compare session defaults before starting the job so that Beyond Compare considers system and hidden files and also copies NTFS permissions. During the job there was one error regarding NTFS permissions that could not be set. I assume this was for a single object in the hierarchy (unfortunately there weren't further details to check). This mirror job naturally included :\Users, which means the personal data was also copied/restored.
At the end of the process there should now be “in theory” a bootable windows 11 system including personal data from the src drive. “In theory” the dst drive should be a near clone of the src drive and act and behave in the same way as it did with the src drive.

Booting Windows 11 after recovery

Glossary:
SFC = System File Checker
BCD = Boot Configuration Data
MBR = Master Boot Record
BOOTREC = Bootrec.exe (Boot Recovery) tool
BCDBOOT = tool to configure boot files
BCDEDIT = tool for managing BCD stores, which describe boot applications and boot application settings
The first boot failed with some error, so I booted the machine from a Windows 11 installer USB thumb drive and used the recovery tools (command line) to perform the following steps:
1) I used diskpart to inspect the disks, partitions and volumes so I could see how Windows had detected the disks
# For example
diskpart
list disk
select disk X
list part
list vol
2) mount the efi partition (which is likely not mounted)
mountvol s: /s
3) configure the boot files specifying the windows and system partition sources
BCDBOOT c:\windows /s s:
4) Repair the boot MBR on the system partition
BOOTREC /FIXMBR
5) Attempt to write a new boot sector to the system partition
BOOTREC /FIXBOOT
FAILED with an access denied error; it was unclear if this was partially successful
6) Scans all disks for installations of windows and also displays the entries that are currently NOT in the BCD store.
BOOTREC /ScanOS
7) make a backup of the BCD
BCDEDIT /export c:\bcdbackup
8) Rebuild the BCD store
BOOTREC /rebuildbcd
9) exit the command prompt, shutdown and boot the laptop again
10) Now presented with a boot menu with the recovered windows installation, choose this, laptop boots to windows login prompt
11) login
💡 I tried various combinations and subsets of these commands without success. The above sequence documents what was successful in getting the recovered Windows install to boot.

Repairing Windows 11 after recovery

After logging in to the primary user account, the usual applications like desktop, taskbar and explorer were running. The user's files were present as expected.
However, the start menu, settings, notifications, connectivity menu and all store-like apps were broken. So I went through a process of trying to detect and repair Windows installation issues using sfc and DISM. Both reported issues: sfc could not repair the detected issues, and DISM reported there were issues that could be repaired, BUT the /Cleanup-Image /RestoreHealth options were not able to perform repairs. The main error from DISM was not being able to find a valid source, but attempts to provide a valid source failed.
I also noted that newly created user profiles were broken: adding a new user and logging in worked, but the new user setup process failed and hung on a blank screen.
My suspicion of the root cause was two things:
Corruption in the windows install from the failing src drive
Incorrect NTFS permissions in the %windir% and other folder hierarchies.
A log of the failed repair attempts:
# check disk - was OK - it was a newly created and formatted partition after all
chkdsk

# attempt to repair windows with System File Checker - failed citing repair was not possible
# I've read that from Windows 8 onwards one should run the DISM commands first.
# Regardless, after DISM commands are successful, run sfc again to ensure everything is OK
sfc /scannow

# check for issues
DISM /Online /Cleanup-Image /ScanHealth
DISM /Online /Cleanup-Image /CheckHealth

# attempt repair with online (windows update) source - failed
DISM /Online /Cleanup-Image /RestoreHealth

# attempt to repair with a local source (by-pass windows update issues)
# 1. get installed windows version info
DISM /Online /Get-CurrentEdition

# 2. mount windows 11 ISO - to be used as DISM source
# 3. get version info from a mounted ISO
DISM /Get-WimInfo /WimFile:e:\sources\install.wim