This article discusses the basics of troubleshooting failed system services, including verifying an error message and tracking down information in the event logs.
If you would like to read the next part of this article series, please go to Troubleshooting Windows Server 2008 R2 Service Startup Issues (Part 2).
Troubleshooting a service failure can sometimes be a frustrating experience. Thankfully, there are some techniques that you can use to get to the cause of the problem and get your server up and running relatively quickly. In this article, I want to discuss various techniques that you can use to troubleshoot service failures.
Before I Begin
Before I get started, I just want to quickly mention that all of the screen shots presented in this article series are based on Windows Server 2008 R2. Even so, most of these techniques will work on other versions of Windows as well. The exact steps may not always match up perfectly from one operating system to another, but the basic concepts are relevant across the board.
Verify the Failure
Even though it sounds silly, the very first thing that you should do when you see an error message citing a service failure is to verify that the error is accurate. I have seen several real-world examples of buggy applications that report service failures when the service is actually running. Likewise, it is very common to see an error message when Windows is booted indicating that one or more services have failed to start. This message is often erroneous.
To verify a service failure, you need to open the Service Control Manager by selecting the Services command from the Administrative Tools menu. The Service Control Manager lists every service that is installed on the machine, as well as each service’s current state. You can see what the Service Control Manager looks like in Figure A.
Figure A: The Services console displays all of the system services.
If the error message that you have received relates to a specific service then you can simply locate the service within the Service Control Manager (services are arranged alphabetically) and check to see whether or not the service is started. If on the other hand you have received a generic error message stating that one or more services failed to start then you need to look to find out whether or not the services that should be running really are.
As you look at the figure above, you might notice that not all of the services are running. This is normal and has to do with the service’s startup type. Windows offers four different startup types for services (some of the older versions of Windows only use three startup types). These include:
Automatic – Services with a startup type of Automatic should start automatically when Windows is booted.
Automatic (Delayed Start) – Automatic services that are configured with the delayed start wait until all of the other automatic services have started before they begin initializing. Even at that, automatic services that use a delayed start use a low priority thread to ensure that the server remains responsive while the services are starting.
Manual – Services that are configured to start manually do not start unless they are instructed to do so either by you, by the operating system, or by an application.
Disabled – If a service is disabled it will not start even if you attempt to manually start the service. Some services are disabled for security reasons, but there are also documented instances of malware disabling system services in order to prevent them from running. If you need to start a disabled service, you can do so by changing the startup type to either Manual or Automatic (or Automatic Delayed Start) and then starting the service.
If you are trying to determine whether or not the necessary services are running, then simply scroll through the list of services and make sure that every service that has a startup type of Automatic or Automatic (Delayed Start) is running. If a service is configured to run automatically but is not started, then that service is likely the cause of the error.
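Scrolling through the list works, but the same check can be scripted. What follows is only a minimal sketch, not an official tool. It assumes that PowerShell is available and that the WMI class Win32_Service reports a StartMode of "Auto" for automatic services and a State of "Running" for started ones. The function names and the sample data are my own and are purely illustrative.

```python
import csv
import io
import subprocess

# PowerShell pipeline that dumps every service as CSV (Windows only).
PS_COMMAND = (
    "Get-WmiObject Win32_Service | "
    "Select-Object Name, StartMode, State | "
    "ConvertTo-Csv -NoTypeInformation"
)


def stopped_automatic_services(csv_text):
    """Return the names of automatic services that are not running."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Name"]
        for row in rows
        if row["StartMode"] == "Auto" and row["State"] != "Running"
    ]


def query_services_csv():
    """Run the PowerShell query and return its CSV output (Windows only)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Illustrative sample data, not output captured from a real server.
SAMPLE = '''"Name","StartMode","State"
"Spooler","Auto","Running"
"MpsSvc","Auto","Stopped"
"Browser","Manual","Stopped"
'''
```

On a Windows machine you would pass the result of query_services_csv() to stopped_automatic_services(). Treat the resulting list as a set of candidates rather than a verdict, since some automatic services stop themselves once their work is done.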
Manually Start the Service
If you notice that a service that should be running is not running, then the first thing that you should do is to attempt to manually start the service. To do so, just right-click on the service and choose the Start command from the resulting shortcut menu. Oftentimes, the service will start without any problems.
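If you prefer the command line, the sc.exe utility can make the same start request. The sketch below assumes that sc.exe reports the underlying Win32 error code as its exit code, which is worth confirming on your own system. The error numbers are standard Win32 codes, but the hint text and the function names are my own.

```python
import subprocess

# Win32 error codes that commonly accompany a failed service start.
START_ERROR_HINTS = {
    5: "Access denied: run the command from an elevated prompt.",
    1053: "The service did not respond to the start request in time.",
    1056: "The service is already running.",
    1058: "The service is disabled: change its startup type first.",
    1060: "No service with that name is installed.",
    1068: "A dependency service failed to start.",
    1069: "Logon failure: check the service account and its password.",
}


def explain_start_error(code):
    """Translate an exit code from a start attempt into a troubleshooting hint."""
    if code == 0:
        return "The start request was accepted."
    return START_ERROR_HINTS.get(
        code, "Unrecognized error code %d; check the event logs." % code
    )


def start_service(name):
    """Ask Windows to start a service by its short name (Windows only)."""
    result = subprocess.run(
        ["sc.exe", "start", name], capture_output=True, text=True
    )
    return explain_start_error(result.returncode)
```

For example, start_service("Spooler") would attempt to start the Print Spooler and return a one-line hint. The short service name is shown on the General tab of the service's properties sheet.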
Check the Event Log
So what do you do if you attempt to manually start a system service, but it does not start? The first thing that I recommend doing in such situations is to check the Event Viewer. In most cases when a service fails to start, one or more event log entries will be created. These log entries can be invaluable in helping you to determine the root cause of the problem.
The location in which the event log entry is created really depends on the type of service that you are having trouble with. There are three main event logs that could potentially contain information about the service that you’re having trouble with. These include:
- The System Log – The System Log contains events related to the Windows operating system. If you are having trouble starting a service related to the Windows operating system, then the System Log is the best place to look for information.
- The Applications and Services Logs – Newer versions of Windows include a set of logs known as the Applications and Services Logs. These logs are application specific. In other words, if you are looking for log entries related to a certain application, then this is the first place that you should look. The Applications and Services Logs container contains dedicated logs for things like Internet Explorer, Microsoft Office, and Windows PowerShell.
- The Application Log – Most applications do not create dedicated logs beneath the Applications and Services Logs container. Instead, application-related logging information is usually written to the Application log.
Even though the event logs can be a valuable resource for troubleshooting a service that fails to start, it can sometimes be tough to find the information that you are looking for. After all, there are typically thousands of event log entries scattered across a dozen or more logs. If you have trouble locating information related to the service that you are having trouble with, then I recommend using the Event Viewer’s Find feature (which is located in the Actions pane). The Find feature works like a search engine and allows you to search for text related to the problem that you are having, as shown in Figure B.
Figure B: You can search the event logs for specific text.
When you find a log entry related to your problem, just double-click on the entry to view it. Sometimes the log entry will tell you exactly what the problem is. For example, the log entry shown in Figure C indicates that the service was disabled. This problem is easy enough to fix by re-enabling the service. Sometimes, however, the solution is not quite so clear-cut. In these situations it is sometimes useful to make note of the event ID number so that you can look it up on the Internet if necessary. Oftentimes, Microsoft provides TechNet articles with comprehensive solutions for specific event IDs.
Figure C: Sometimes event log entries will tell you exactly why a service failed to start.
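Once you know which event IDs you care about, you can also pull matching entries from the command line with the wevtutil utility. The sketch below only builds the query and the command. It assumes that the entries were written to the System log by the Service Control Manager source, which commonly logs IDs such as 7000 (a service failed to start) and 7001 (a dependency failed to start). The function names are my own.

```python
def build_event_query(event_ids, provider="Service Control Manager"):
    """Build an XPath filter that matches the given event IDs from one source."""
    id_filter = " or ".join("EventID=%d" % event_id for event_id in event_ids)
    return "*[System[Provider[@Name='%s'] and (%s)]]" % (provider, id_filter)


def build_wevtutil_command(event_ids, log="System", count=20):
    """Build a wevtutil command that lists the newest matching events as text."""
    return [
        "wevtutil",
        "qe",                                   # query events
        log,
        "/q:" + build_event_query(event_ids),   # XPath filter
        "/c:%d" % count,                        # maximum number of events
        "/rd:true",                             # newest entries first
        "/f:text",                              # plain text output
    ]
```

On the server itself, the list returned by build_wevtutil_command([7000, 7001]) could be handed to subprocess.run() from an elevated prompt.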
Now that I have talked about the basics of troubleshooting a stubborn service, I want to move on to some of the more intermediate and advanced troubleshooting techniques. I will discuss these techniques in Part 2.
This article discusses five more methods that you can use to diagnose and repair service startup issues.
In my first article in this series I talked about some really basic techniques for troubleshooting problems with services that refuse to start. In this article, I want to conclude the series by talking about five more things that you can do to get a stubborn service to start.
Check the Dependency Services
Sometimes a service may fail to start due to a problem with a dependency. Services can sometimes form a hierarchical architecture in which other services must be running in order for a service to start. Granted, not all services have dependencies associated with them, but dependency services are common enough that they certainly warrant a look if you are having trouble starting a service.
In the old days it was really tough to track down problems with dependency services, but most of the newer versions of Windows make it easy. To check service dependencies, open the Service Control Manager, right-click on the service that you are having trouble starting, and select the Properties command from the resulting shortcut menu. When you do, Windows will display the service’s properties sheet.
As you can see in Figure A, this properties sheet contains a Dependencies tab. The Dependencies tab is divided into two sections. The top portion lists the services that must be running in order for the service that you have selected to start. The bottom portion of the tab lists services that cannot be started until the selected service is running. In this particular screen capture you can see that the Windows Firewall service cannot start unless the Base Filtering Engine and the Windows Firewall Authorization Driver have started. You can also see that there are no services that directly depend on the Windows Firewall service.
Figure A: Sometimes the failure of a dependency service may prevent a service from starting.
One thing that is important to keep in mind as you troubleshoot service dependencies is that sometimes the dependencies can form a multilevel hierarchy. If you look back at the figure above, you will notice that there is a plus sign to the left of the listings for the Base Filtering Engine service and the Windows Firewall Authorization Driver service. If you click on these icons then Windows will list any other dependencies that exist within the service hierarchy. As you can see in Figure B, there are multiple dependencies for the Base Filtering Engine service, but no additional dependencies for the Windows Firewall Authorization Driver service.
Figure B: Services can have several levels of dependencies.
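Walking a multilevel hierarchy by hand gets tedious, so here is a minimal sketch of the same idea in code. It assumes that you have already collected each service's direct dependencies into a dictionary, for example by reading the Dependencies tab. The dependency map and the set of running services below are simplified, hypothetical sample data and not a complete picture of a real server.

```python
def unmet_dependencies(service, depends_on, running):
    """Return every direct or indirect dependency that is not running.

    Deeper dependencies are listed first, because they have to be
    started before the services that rely on them.
    """
    unmet = []
    seen = set()

    def visit(name):
        for dep in depends_on.get(name, []):
            if dep in seen:
                continue
            seen.add(dep)
            visit(dep)  # check the dependency's own dependencies first
            if dep not in running:
                unmet.append(dep)

    visit(service)
    return unmet


# Hypothetical, simplified dependency map for illustration only.
DEPENDS_ON = {
    "Windows Firewall": [
        "Base Filtering Engine",
        "Windows Firewall Authorization Driver",
    ],
    "Base Filtering Engine": ["Remote Procedure Call (RPC)"],
    "Remote Procedure Call (RPC)": [],
    "Windows Firewall Authorization Driver": [],
}

# Hypothetical set of services that are currently started.
RUNNING = {"Remote Procedure Call (RPC)", "Windows Firewall Authorization Driver"}
```

With this sample data, unmet_dependencies("Windows Firewall", DEPENDS_ON, RUNNING) reports the Base Filtering Engine as the only dependency that still needs to be started.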
Check for Authentication Failures
Services can also fail to start as a result of authentication failures. Most services do not run under the context of the user that is currently logged in. If they did then services would be unable to run in the background while no one is logged in. Likewise, services often require special permissions that are beyond those assigned to standard user accounts. As such, every service is linked to an account that provides the necessary permissions for the service to run.
You can see which account is linked to a service by opening the Service Control Manager, right-clicking on the service that you are having trouble with, and choosing the Properties command from the resulting shortcut menu. When you do, Windows will display the properties sheet for the service. You can see which account is in use by going to the Log On tab, shown in Figure C.
Figure C: The Log On tab allows you to specify the account used by the service.
As you can see in the figure, Windows gives you the option of running the service using the Local System account or a specific account. In this particular case, an account called Local Service is being used. In case you are wondering, the Local System account is a very high level account that is used only when the service in question needs to act as a part of the operating system. In contrast, the Local Service account has rights that are more similar to those of a standard user. On occasion you might also see a service configured to use the Network Service account. The Network Service account uses the credentials associated with the machine’s computer account.
Normally, if a service is configured to use the Local System, Local Service, or Network Service account, then you won’t have to worry about managing the credentials for that service. Windows takes care of this automatically on your behalf (assuming that nothing is broken within the operating system). What can be a problem, however, is that some services run under the context of either a local user account or a domain user account. When such service accounts are used, passwords can and sometimes do expire.
When a service account password expires, the problem might not be noticed immediately. However, the next time that the machine is rebooted the service which has been assigned an expired password will fail to start. You can fix the problem by going to the service’s Log On tab and manually specifying the new password.
Keep in mind that a service can fail to authenticate even if the password is correct if the machine in question is unable to communicate with the domain controller on which the service account resides.
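Because only services that run under ordinary user accounts are exposed to password expiration, it can help to list them ahead of time. The sketch below assumes that you have gathered each service's logon account as a string, such as the StartName property of the WMI class Win32_Service. The set of built-in account spellings is my own best guess at the common variants, and the service and account names in the usage note are hypothetical.

```python
# Common spellings of the built-in accounts that Windows manages itself.
BUILT_IN_ACCOUNTS = {
    "localsystem",
    "nt authority\\system",
    "nt authority\\localservice",
    "nt authority\\local service",
    "nt authority\\networkservice",
    "nt authority\\network service",
}


def uses_custom_account(start_name):
    """Return True if the logon account is a regular local or domain user."""
    if not start_name:
        return False  # no logon account recorded for this entry
    return start_name.strip().lower() not in BUILT_IN_ACCOUNTS


def services_with_custom_accounts(services):
    """Filter (service name, logon account) pairs down to user-account services."""
    return [name for name, start_name in services if uses_custom_account(start_name)]
```

For example, given the pairs ("Spooler", "LocalSystem") and ("ExampleAppSvc", "CONTOSO\\svc-app"), only ExampleAppSvc would be returned, which makes it the one to check when a password changes or expires.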
Malware Infections
Certain types of malware infestations can cause system services to fail to start. For example, some antivirus products run as system services. If a virus wants to avoid detection then it may check for the existence of such a service, shut the service down, and then damage the system in a way that prevents the service from being started in the future.
Although antivirus related services are by far the most common target, they are certainly not the only type of service that can be attacked by a virus. Viruses can attack virtually any system service. For example, I once saw a virus that attacked the Windows Firewall Service.
Hard Disk Corruption
If you are having trouble getting a service to start, then another thing that I recommend doing is checking the system for hard disk corruption. I once ran into a situation in which a system seemed to be perfectly healthy aside from the inability of one particular service to start. No matter what I tried, I just could not get this service running. Out of desperation I ran CHKDSK. Upon doing so, I discovered that the system volume was corrupt and that several operating system files had been damaged.
Unfortunately, CHKDSK was unable to fix the problem. I was, however, able to make a list of the files that had been damaged and then copy those files from another system that was running the same version of Windows (and the same set of patches).
Time Sync Issues
If all else fails, check the system clock and make sure that the time matches the time that is displayed on your domain controllers. If a service uses the Kerberos protocol for authentication, then the authentication process can fail if the computer’s clock falls out of sync with the clocks on your domain controllers. In order for Kerberos to function properly, clocks cannot be out of sync by more than five minutes by default.
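The five-minute rule is easy to express in code. The sketch below assumes that you have already obtained the local time and a domain controller's time as datetime values in the same time zone. How you collect them is up to you, and the names used here are my own.

```python
from datetime import datetime, timedelta

# Default Kerberos tolerance for clock differences.
MAX_KERBEROS_SKEW = timedelta(minutes=5)


def clock_skew(local_time, dc_time):
    """Return the absolute difference between the two clocks."""
    return abs(local_time - dc_time)


def within_kerberos_tolerance(local_time, dc_time, max_skew=MAX_KERBEROS_SKEW):
    """Return True if the clocks are close enough for Kerberos to succeed."""
    return clock_skew(local_time, dc_time) <= max_skew
```

If the function returns False, resynchronize the clock before you spend time on any other authentication troubleshooting.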
As you can see, there are any number of potential causes for service failures. Fortunately, it is usually relatively easy to get a failed service running again using the steps that I have described in this article series. If you have trouble getting a service running, don’t forget that the Event Viewer may contain valuable clues as to the nature of the problem.
If you would like to read the first part in this article series please go to Troubleshooting Windows Server 2008 R2 Service Startup Issues (Part 1).