Case of crashing wbengine and system state backup…

While going through patch reports, I noticed that two Windows Server 2008 R2 SP1 servers had missed two patch cycles. It soon turned out that system state backup was not running on these servers. No backup, so no patching.

So I started with the system state backup.

A simple command, wbadmin start systemstatebackup -backuptarget:c:, gave the following error:

The Windows Backup engine could not be contacted. Retry the operation.
The RPC server is unavailable.

A quick look at Event Viewer revealed more: wbengine.exe was crashing in the ntdll.dll module.

Log Name:      Application
Source:        Application Error
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Description:
Faulting application name: wbengine.exe, version: 6.1.7601.17514, time stamp: 0x4ce79951
Faulting module name: ntdll.dll, version: 6.1.7601.17725, time stamp: 0x4ec4aa8e
Exception code: 0xc0000374
Fault offset: 0x00000000000c40f2

We found that the VSS writers were stable and did not report any errors.

%windir%\logs\windowsserverbackup did not reveal any logs

With no leads, we decided to treat this as a faulting-application and crashing-DLL scenario and update both files to the latest versions.

A quick search on support.microsoft.com turned up several fixes matching our scenario:

http://support.microsoft.com/kb/2182466 “2155347997 (0x8078001D)” error code when you perform a system state backup operation in Windows 7 or in Windows Server 2008 R2

http://support.microsoft.com/kb/2512352 Windows Server Backup utility does not back up some newly created files in Windows 7 or in Windows Server 2008 R2

http://support.microsoft.com/kb/2545627 A multithreaded application might crash in Windows 7 or in Windows Server 2008 R2

Since several other applications were also crashing in ntdll.dll, KB 2545627 was a perfect fit for our servers, and KB 2512352 was selected because it was the latest of the fixes.

After the updates, the other applications stopped failing in ntdll.dll, but it made no difference to the primary issue of the failing backup. We saw the same event 1000, this time with higher version numbers for both files.

Log Name:      Application
Source:        Application Error
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Description:
Faulting application name: wbengine.exe, version: 6.1.7601.21667, time stamp: 0x4d65d41c
Faulting module name: ntdll.dll, version: 6.1.7601.21861, time stamp: 0x4ec4a6c2
Exception code: 0xc0000374
Fault offset: 0x0000000000c4192
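
To confirm that the updates really replaced both files, the version fields of the two events can be compared. Below is a minimal Python sketch that does this, assuming the event Description text has been copied out as a string. The function name `faulting_versions` and the regular expression are illustrative only and not part of any Windows tool.

```python
import re

# Matches the "Faulting application/module name: X, version: a.b.c.d" fields
# as they appear in the Description of Application Error event 1000.
FIELD = re.compile(r"Faulting (application|module) name: (\S+), version: ([\d.]+)")

def faulting_versions(description: str) -> dict:
    """Map each faulting file name to its version as a tuple of ints."""
    return {
        name: tuple(int(part) for part in version.split("."))
        for _kind, name, version in FIELD.findall(description)
    }

# Description fields from the event before the updates.
before = faulting_versions(
    "Faulting application name: wbengine.exe, version: 6.1.7601.17514, time stamp: 0x4ce79951\n"
    "Faulting module name: ntdll.dll, version: 6.1.7601.17725, time stamp: 0x4ec4aa8e"
)
# Description fields from the event after the updates.
after = faulting_versions(
    "Faulting application name: wbengine.exe, version: 6.1.7601.21667, time stamp: 0x4d65d41c\n"
    "Faulting module name: ntdll.dll, version: 6.1.7601.21861, time stamp: 0x4ec4a6c2"
)

for name in before:
    # Tuples compare element by element, so this is a proper version comparison.
    print(name, "updated" if before[name] < after[name] else "unchanged")
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating "6.1.7601.9999" as newer than "6.1.7601.17514".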

Looking again at %windir%\logs\windowsserverbackup revealed a Wbadmin.etl file which I had missed earlier.

I used tracerpt to analyze the .etl file, but it did not reveal any information about the crash.

In the meantime, the onsite team also ran sfc /scannow and reinstalled the Windows Backup module, but it made no difference.

Some more searching on the TechNet forums pointed to “Manage Engine Asset Explorer Agent” as a possible cause. We had this agent installed on the server.

A quick check showed that the agent's install date coincided with the date of the last successful backup, which supported this.

We uninstalled “Manage Engine Asset Explorer Agent” and were relieved to see that wbengine was no longer crashing 🙂

However, this time the backup failed at enumeration of files:

Summary of backup:
------------------
Backup of system state failed [date time]

Log of files successfully backed up
‘C:\Windows\Logs\WindowsServerBackup\SystemStateBackup date time.log’

Log of files for which backup failed
‘C:\Windows\Logs\WindowsServerBackup\SystemStateBackup_Error date time.log’

I found the following event, but it did not help much:

Event ID: 519
Description: The backup operation that started at “Time” has failed to back up volume(s) . Please review the event details for a solution, and then rerun the backup operation once the issue is resolved.

Interestingly, running the system state backup from the GUI via the backup module revealed a more detailed error:

Event ID: 517
Description: The backup operation that started at “Time” has failed with following error code ‘2155347997’. Please review the event details for a solution, and then rerun the backup operation once the issue is resolved.

KB http://support.microsoft.com/kb/2182466, “"2155347997 (0x8078001D)" error code when you perform a system state backup operation in Windows 7 or in Windows Server 2008 R2”, refers to the exact same issue. However, we had already installed a higher version of wbengine, so this article was no longer applicable to us.
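
Event 517 reports the error code in decimal, while the KB title gives it in hexadecimal, so it helps to convert before searching. A small Python sketch of that conversion follows; the helper name `to_hresult_hex` is made up for illustration.

```python
def to_hresult_hex(code: int) -> str:
    """Format a 32-bit error code as 0x-prefixed hex.

    Masking with 0xFFFFFFFF also handles tools that print the same
    code as a negative signed 32-bit integer.
    """
    return f"0x{code & 0xFFFFFFFF:08X}"

print(to_hresult_hex(2155347997))   # 0x8078001D
print(to_hresult_hex(-2139619299))  # same code shown as signed: 0x8078001D
```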

I also found an interesting article, http://networkadminkb.com/KB/a467/how-to-fix-windows-2008-r2-system-state-backup-fails.aspx, which refers to the same issue for an OS virtualized on VMware ESX.

As the backup was failing while enumerating files, I followed http://blogs.technet.com/b/askcore/archive/2010/06/18/reasons-why-the-error-enumeration-of-the-files-failed-may-occur-during-system-state-backup.aspx, “Reasons why the error Enumeration of the files failed may occur during System State backup”.

Checking every image path for a correct value is a pain in itself, and it is further complicated by multiple valid syntaxes. Thanks to Tom Acker for providing a nice and easy way to find invalid image paths with the GetInvalidImagePath script.

Running this script revealed multiple image paths containing spaces, which needed to be enclosed in quotes, and a few more keys with an incorrectly added forward slash “/” in the image path.

Once the image paths were cleaned up, system state backup worked like a charm 🙂

With a valid backup available, these servers are now good to receive the long-awaited patches missed in the previous and current cycles.
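
The two problems found here can be illustrated with a short Python sketch. This is not the GetInvalidImagePath script and does not read the registry. It only classifies a single image path string, and it assumes the executable portion of an unquoted path ends at the first .exe, .sys, or .dll extension. The name `check_image_path` and the sample paths are hypothetical.

```python
import re

# Assumption: in an unquoted value, the executable portion ends at the first
# .exe/.sys/.dll that is followed by whitespace or the end of the string.
_EXEC_END = re.compile(r"\.(exe|sys|dll)(?=\s|$)", re.IGNORECASE)

def check_image_path(image_path: str) -> list:
    """Return a list of problems found in one image path value (empty if none)."""
    problems = []
    path = image_path.strip()
    if path.startswith('"'):
        # Quoted form: the executable is everything up to the closing quote.
        end = path.find('"', 1)
        if end == -1:
            return ["unterminated quote"]
        executable = path[1:end]
    else:
        match = _EXEC_END.search(path)
        executable = path[:match.end()] if match else path
        if " " in executable:
            problems.append("unquoted path contains a space")
    # Only the executable portion is checked, so switches such as
    # "/service" after the executable are not flagged.
    if "/" in executable:
        problems.append("forward slash in executable path")
    return problems

for sample in (
    r'C:\Program Files\Agent\agent.exe -run',
    r'"C:\Program Files\Agent\agent.exe" -run',
    r'C:\Windows/system32\svc.exe',
    r'%SystemRoot%\system32\svchost.exe -k netsvcs',
):
    print(sample, "->", check_image_path(sample) or "ok")
```

A check like this only narrows down candidates. Each flagged value still needs to be reviewed by hand before it is changed in the registry.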
