Patient: HP ProBook 445 G7 Notebook/Laptop | Running Windows 11 previously upgraded from Windows 10
Symptom: The laptop was unable to boot Windows, throwing an “unmountable boot volume” blue screen / stop code.
Typical windows startup recovery attempts did not resolve the issue - it was stuck in a recovery → blue screen loop.
HP diagnostics failed on the SMART / Drive tests.
smartctl reported that the "NVM subsystem reliability has been degraded" and that the SMART self-assessment test result was: FAILED.
I removed the Western Digital drive from the laptop and used an NVMe enclosure to see if the drive was detected on another system, hoping the filesystem was mountable.
First time I’ve worked on an HP for a while. I was impressed with the diagnostics experience and QR code to call up a tailored support page with warranty coverage details.
💡 This drive didn’t have a hard life; one can see from the SMART info that it was an early failure and had not even reached ~3% (about 12 TB) of its marketed endurance of up to 400 TBW (Terabytes Written). Without knowing the manufacturer's MTTF algorithm (Mean time to first failure [
💡 Note that I instinctively enabled the hardware write-protect switch on the enclosure, hoping to prevent further degradation of the drive and allow a backup to be made. When I later tried the drive with the write-protect switch disabled, I quickly realised that this made the drive unstable: it disconnected shortly after the operating system tried to mount it, presumably because the OS was trying to update/write to the NTFS tables. My hypothesis is that the drive's failure mode at that time was, fortunately, write sensitive - reads seemed OK apart from a dozen or so unreadable/corrupt inodes/files.
1) BiTlOcKeR ?! 😲
At first glance, it looked like the drive was BitLocker “protected”, but fortunately it was not, it was only BitLocker "prepared" but not fully protected. AFAIK, this state means that the data is encrypted, but because nothing has been set to "protect" the encryption, the drive behaves just like an unencrypted drive (perhaps equivalent to a blank password). Some useful commands to check this:
# get bitlocker drive status
manage-bde -status g:
# see what is "protecting" the drive encryption
manage-bde -protectors -get g:
where g: is the drive in question. For more details see:
💡 Had BitLocker been fully enabled and protected, I would have needed at least one piece of the protection info in order to unlock the drive.
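As a rough aid for reading the status report, here is a minimal Python sketch that picks out the two fields that matter for the "prepared but not protected" state. The sample text is typed out by hand to mirror the general layout of `manage-bde -status` output; it is an assumption for illustration and was not captured from the drive in this post. The function names are hypothetical too.

```python
# Illustrative sample only: mirrors the general "Name: value" layout of
# `manage-bde -status` output. NOT captured from the drive in this post.
SAMPLE_STATUS = """\
Volume G: [Data]
[Data Volume]

    Size:                 476.31 GB
    BitLocker Version:    2.0
    Conversion Status:    Fully Encrypted
    Percentage Encrypted: 100.0%
    Encryption Method:    XTS-AES 128
    Protection Status:    Protection Off
    Lock Status:          Unlocked
    Identification Field: Unknown
    Key Protectors:       None Found
"""


def parse_status(text):
    """Collect the 'Name: value' lines of a status report into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


def is_prepared_but_unprotected(fields):
    """True when the data is encrypted but no protector guards the key."""
    encrypted = "Encrypted" in fields.get("Conversion Status", "")
    unprotected = fields.get("Protection Status") == "Protection Off"
    return encrypted and unprotected


if __name__ == "__main__":
    status = parse_status(SAMPLE_STATUS)
    print("Conversion:", status.get("Conversion Status"))
    print("Protection:", status.get("Protection Status"))
    print("Prepared but unprotected:", is_prepared_but_unprotected(status))
```

On a real system one would paste in the actual output of the command above in place of the sample; the field labels may differ on localised Windows installs.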
2) Backup of personal data
With the write-protect switch on the enclosure enabled, it was time to try taking a backup. The logic here was that if nothing else worked except the backup, at least some of the personal data would be recovered. The aim of the backup was not to create a fully operational set of user profiles, rather to focus on making a backup copy of the personal data.
The read-only mount of the Windows volume meant that I couldn't change the ownership and permissions of the folder hierarchy even if I wanted to. So to get access to the user profiles, I needed to access the drive as the SYSTEM user. With PsExec64 -s -i cmd I spawned a cmd prompt running as the SYSTEM user, and from there I could spawn other processes as the SYSTEM user and examine and back up the folder structure. SYSTEM user privileges were adequate for c:\Users.
A disadvantage of using the SYSTEM user was that it was difficult to give this user access to network shares requiring authentication - which would have been an ideal destination for the backups. I had to resort to using alternative locally attached storage.
💡 Note that performing a full c:\ folder copy requires elevation to the special TrustedInstaller mode. TrustedInstaller runs under nt authority\SYSTEM but does something special so that TrustedInstaller privileges are granted. Perhaps it adds something special to the env? TrustedInstaller mode is basically like god mode for reading any part of the c:\ folder hierarchy. To elevate to TrustedInstaller I used NirSoft’s AdvancedRun [
-snl stores symbolic links as links, which avoids a number of symlink challenges in the windows user profiles folder hierarchy.
-spf2 changes the path handling behaviour of 7-Zip. By default, 7-Zip uses relative paths. Note that this changes not only the behaviour of specifying what to include in an archive, but also the inclusion and exclusion path specifications. -spf2 tells 7-Zip to use full paths WITHOUT drive letters. See the manual for more details. One useful facet of this feature is that relative paths are not converted to fully-qualified paths, so it is possible to specify both relative and fully-qualified paths.
⚠ CAUTION: -spf* switches also change the behaviour of archive extraction!
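To picture the path form that "full paths WITHOUT drive letters" refers to, here is a small Python sketch. It only illustrates the shape of the stored path; it is not how 7-Zip implements -spf2, and the helper name and the example user "alice" are made up for illustration.

```python
from pathlib import PureWindowsPath


def strip_drive(path):
    """Drop the drive letter and leading backslash from an absolute Windows path.

    Relative paths are returned unchanged, mirroring the observation that
    relative paths are not converted to fully-qualified ones.
    """
    p = PureWindowsPath(path)
    if not p.is_absolute():
        return str(p)
    return str(p.relative_to(p.anchor))


if __name__ == "__main__":
    # Hypothetical example paths, not taken from the laptop in this post.
    print(strip_drive(r"G:\Users\alice\Documents\notes.txt"))
    print(strip_drive(r"Documents\notes.txt"))
```

PureWindowsPath is used so the sketch behaves the same on any operating system.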
💡 Note: From trial and error: when matching a symlink path, you CANNOT use the \ suffix to match symlinked folders (as they are not folders).
-x exclude filenames
General observation: AFAIK, in a simple single-hierarchy archive, the archive root is determined by the path being archived (the source hierarchy); the archive root might also be inherited from pwd. This is relevant for -i include and -x exclude patterns. The include and exclude patterns must be relative to the archive root for pattern matching to work as expected. I have had very limited success using the .\ anchored prefix for pattern matching, and the same with -spf* switches and absolute/fully-qualified include and exclude patterns.
-x is a subset of the -i functionality. It does not support changing wildcard and mark type behaviour. AFAIK, -x supports ONLY wildcards and it is not possible to disable this, even with the -spd switch. I have had very limited success in using .\ anchored or absolute paths with the -x option, either directly or with the @listfile style (see the end of user-profile-excludes.txt). Relative wildcards work fine, but you have to be aware of this limitation, keep the archive root in mind and make sure patterns are relative to the archive root.
One can see two exclude specification files being used in this job.
user-profile-excludes-recursive.txt
AND
user-profile-excludes.txt
This separation is intended to allow fine-grained control of exclusions by keeping relative non-recursive pattern matching apart from recursive pattern matching. This mitigates the possibility of patterns matching too much and erroneously excluding a path.
Relative non-recursive matches a path exactly, relative to the archive root.
Recursive patterns will traverse the hierarchy and try to match a pattern in any part of the path. Good for general wildcards and patterns where the relative path is unknown or not relevant. Use carefully to avoid inadvertently excluding a path.
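The difference between the two kinds of exclusion can be sketched with a simplified model in Python. This is not 7-Zip's actual matcher, only an approximation of the behaviour described above; the function names and example paths (relative to an archive root of the Users folder, with a made-up user "alice") are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def matches_anchored(rel_path, pattern):
    """Non-recursive: the pattern must match from the archive root,
    component by component. Matching a folder also covers its contents."""
    path_parts = rel_path.lower().split("\\")
    pat_parts = pattern.lower().split("\\")
    if len(pat_parts) > len(path_parts):
        return False
    return all(fnmatchcase(part, pat) for part, pat in zip(path_parts, pat_parts))


def matches_recursive(rel_path, pattern):
    """Recursive: the pattern may start matching at any depth of the path."""
    parts = rel_path.split("\\")
    return any(
        matches_anchored("\\".join(parts[i:]), pattern) for i in range(len(parts))
    )


if __name__ == "__main__":
    temp_file = r"alice\AppData\Local\Temp\x.tmp"
    document = r"alice\Documents\Temp\report.docx"

    # The anchored pattern hits only the intended Temp folder.
    print(matches_anchored(temp_file, r"alice\AppData\Local\Temp"))
    print(matches_anchored(document, r"alice\AppData\Local\Temp"))

    # The recursive pattern "Temp" also catches the Documents\Temp folder.
    print(matches_recursive(temp_file, "Temp"))
    print(matches_recursive(document, "Temp"))
```

The last call shows the over-matching risk: a recursive "Temp" exclusion would silently drop a personal document folder that happens to share the name.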
4) OneDrive challenges
Someone had set up OneDrive for certain folders for the main user of the laptop. This can lead to a scenario where one or more items are not 'local' and are only stored in the “cloud” ☁.
I'd faced this in the past and knew that it was possible to get a list of the hierarchy and attributes of each item, determine from that which items were only in the cloud, and then skip those cloud-only items.
This problem effectively caused 7-Zip to abort the archive creation job, preventing the backup.
I remembered that the 4199968 attribute meant cloud only, and fortunately there was only one file with this problem. Ironically, it was a standard file provided by Microsoft called "Erste Schritte mit OneDrive.pdf" (Getting Started with OneDrive.pdf) and, even more ironically, the name contained a non-breaking space (HEX 0xa0) which made it difficult to write a file name exclusion for the file, so I decided to use a wildcard to work around the problem.
💡 For this situation on this laptop as of writing 2023-12-02 the 4199968 attribute was valid, but it is possible that other/newer attributes supersede this one (introduced with newer versions of OneDrive?). Your mileage may vary - do your own research! For example the cited article uses different attributes.
# PowerShell to create a tab separated file with a folder hierarchy listing including Attributes
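The number 4199968 is a bitmask, so it can help to break it into its individual Win32 file attribute flags. The Python sketch below does that with a hand-copied subset of the flag values from the Windows SDK. Treating the RECALL_ON_DATA_ACCESS bit as the cloud-only marker is my assumption, not something confirmed by OneDrive documentation, and the function names are hypothetical.

```python
# Subset of Win32 FILE_ATTRIBUTE_* flag values (copied by hand from the
# Windows SDK definitions); only the flags relevant here are listed.
FILE_ATTRIBUTES = {
    0x00000020: "ARCHIVE",
    0x00000200: "SPARSE_FILE",
    0x00000400: "REPARSE_POINT",
    0x00001000: "OFFLINE",
    0x00040000: "RECALL_ON_OPEN",
    0x00080000: "PINNED",
    0x00100000: "UNPINNED",
    0x00400000: "RECALL_ON_DATA_ACCESS",
}

RECALL_ON_DATA_ACCESS = 0x00400000


def decode_attributes(value):
    """Return (flag names set in value, leftover bits this table does not name)."""
    names = [name for bit, name in sorted(FILE_ATTRIBUTES.items()) if value & bit]
    known = sum(bit for bit in FILE_ATTRIBUTES if value & bit)
    return names, value & ~known


def looks_cloud_only(value):
    """Assumption: content must be recalled from the cloud when this bit is set."""
    return bool(value & RECALL_ON_DATA_ACCESS)


if __name__ == "__main__":
    names, leftover = decode_attributes(4199968)
    print(hex(4199968), names, "leftover:", hex(leftover))
    print("cloud only?", looks_cloud_only(4199968))
```

Testing a single bit rather than comparing against the exact number may be more tolerant of variations, for example a cloud-only file that lacks the ARCHIVE flag, but verify against your own listing first.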
5) Recovery of Windows and restoration of the user data
After the :\Users folder hierarchy backup was complete from the faulted NVMe, it was time to get the laptop working again. So, I researched a suitable NVMe replacement... “A few moments later” I had picked the Crucial P5 Plus 1TB drive for ~50EUR.
My first attempt was to create an image/clone of the faulty NVMe (src) but attempts with dd stopped reading around ~3% of the drive, and the drive disconnected and/or went into a defunct state. The same issues occurred with other drive imaging tools that I tried.
It wasn’t just a case of having to ignore read errors, because the drive's failure mode in this scenario was to disconnect and become defunct. So I couldn’t just run an unstoppable copy.
This failure mode meant that in order to recover windows, a filesystem level copy would be required.
Challenges associated with filesystem level copy include:
NTFS permissions of key hierarchies like :\Program Files and %windir%
File ownership - must use TrustedInstaller elevation
How to get the system to boot again
How to resolve windows 11 issues, system file corruption, missing or incorrect NTFS permissions, broken start menu and store apps.
Gandalf’s Windows 10PE to the rescue
So, I used Gandalf PE (Windows 10PE x64 Redstone 7 - Spring 2022 Edition), an operating system which does not require installation; it just boots “live” and stores its required files in RAM. For those familiar with Hiren’s Boot CD, there is also a community-developed PE version of Hiren’s available.
With the new NVMe installed in the laptop (dst) and the defective NVMe in the enclosure (src) with write-protection enabled, I used AOMEI Partition Assistant 10.2.1 to copy the partition table and all partitions byte-for-byte from the src, except the main windows partition (which, as mentioned, kept failing). The partition copy included the boot/EFI partitions and the necessary partition flags.
During the partition copy process I made the necessary adjustments to expand the windows partition to use the full size of the dst NVMe. The dst windows partition was formatted with NTFS and ready to receive files from src.
With the partitions ready, I then used NirSoft’s AdvancedRun to launch Beyond Compare running as nt authority\SYSTEM with TrustedInstaller elevation. I made a few exclusions (similar to those listed in the 7-Zip section, and it doesn’t make sense to copy things like pagefile.sys) and then started a mirror job from the src to dst drive. It's important to adjust the Beyond Compare session defaults before starting the job so Beyond Compare considers system and hidden files and also copies NTFS permissions.
During the job there was one error about NTFS permissions that could not be set. I assume this was for a single object in the hierarchy (unfortunately there weren’t further details to check).
This mirror job naturally included :\Users which means the personal data was also copied/restored.
At the end of the process there should now be “in theory” a bootable windows 11 system including personal data from the src drive. “In theory” the dst drive should be a near clone of the src drive and act and behave in the same way as it did with the src drive.
Booting Windows 11 after recovery
Glossary:
SFC = System File Checker
BCD = Boot Configuration Data
MBR = Master Boot Record
BOOTREC = Bootrec.exe (Boot Recovery) tool
The first boot failed with some error, so I booted the machine from a Windows 11 installer USB thumb drive and used the recovery tools (command line) to perform the following steps:
1) I used diskpart to inspect the disks, partitions and volumes so I could see how Windows had detected the disks
# For example
diskpart
list disk
select disk X
list part
list vol
2) mount the efi partition (which is likely not mounted)
mountvol s: /s
3) configure the boot files specifying the windows and system partition sources
BCDBOOT c:\windows /s s:
4) Repair the boot MBR on the system partition
BOOTREC /FIXMBR
5) Attempt to write a new boot sector to the system partition
BOOTREC /FIXBOOT
FAILED with an access denied error; it was unclear if this was partially successful
6) Scans all disks for installations of windows and also displays the entries that are currently NOT in the BCD store.
BOOTREC /ScanOS
7) make a backup of the BCD
BCDEDIT /export c:\bcdbackup
8) Rebuild the BCD store
BOOTREC /rebuildbcd
9) exit the command prompt, shutdown and boot the laptop again
10) Now presented with a boot menu with the recovered windows installation, choose this, laptop boots to windows login prompt
11) login
💡 I tried various combinations and subsets of these commands without success. The above sequence documents what was successful in getting the recovered Windows install to boot.
Repairing Windows 11 after recovery
After logging in to the primary user account the usual applications like desktop, taskbar and explorer were running. The user's files were present as expected.
However the start menu, settings, notifications, connectivity menu and all store-like apps were broken. So I went through a process of trying to detect and repair Windows installation issues using sfc and DISM. Both reported issues: sfc could not repair the detected issues, and DISM reported there were issues that could be repaired BUT the /Cleanup-Image /RestoreHealth options were not able to perform repairs. The main error from DISM was not being able to find a valid source, but attempts to provide a valid source failed.
I also noted that newly created user profiles were broken: adding a new user and logging in worked, but the new user setup process failed and hung on a blank screen.
My suspicion of the root cause was two things:
Corruption in the windows install from the failing src drive
Incorrect NTFS permissions in the %windir% and other folder hierarchies.
A log of the failed repair attempts:
# check disk - was OK - it was a newly created and formatted partition after all
chkdsk
# attempt to repair windows with System File Checker - failed citing repair was not possible
# I've read that from Windows 8 onwards one should run the DISM commands first.
Mount the windows 11 ISO - ideally the ISO version should match the installed windows version
Launch the setup.exe from the ISO
Don’t click Next - instead click the link under the paragraph to change how setup will download updates
On the next screen choose “not now” and “next”
setup checks the existing installation
agree to the licence
setup does some more checks
⚠ on the next screen choose to keep personal data and applications
setup will check and prep the in-place upgrade/repair
eventually the windows 11 installation will open in full-screen, and will replace the normal windows GUI
the installation is performed
at least one reboot will be required
💡 The in-place install/repair will place the old windows install in a folder: :\windows.old. If everything is OK with the upgrade you can use the “Disk Clean-up” utility: "%windir%\system32\cleanmgr.exe" to remove the old windows installation. This is easier than trying to delete the folder manually, which requires nt authority\SYSTEM and TrustedInstaller elevation.
After the in-place upgrade/repair was completed and I had logged in with the primary user, all the issues appeared to be resolved. Now it's time to perform final integrity checks - with an elevated command prompt:
# attempt to repair windows with System File Checker - success- open issues were repaired
sfc /scannow
# check for windows install issues - success- no issues detected
DISM /Online /Cleanup-Image /ScanHealth
6) Windows won’t update after the repair
After the repair windows update was throwing a 0x80248007 error code. A quick bit of googling discovered a guide to resolve this issue [