Synopsis
We recently had an issue with one of our Active Directory domain controllers:
the server's clock jumped back to an earlier date. After a couple of hours, the
server returned to the correct time and resumed following the domain time
hierarchy. While the jump lasted, the server's time was out of sync with the
forest PDC. The issue first appeared on a single server and later occurred on
multiple domain controllers.
Time drift on a domain controller can cause problems for several business
services. A few are listed below for reference.
- Authentication and authorization
- Domain controller replication will break
- Group Managed Service Accounts must be reconfigured
Investigation
We started by analyzing how the time jump was being initiated on the domain
controllers, and by what. We checked the following factors:
- Is there a network connection issue between the PDC and the domain controller that left the two logically disconnected from each other? We suspected network glitches, but another domain controller in the same site and the same subnet showed no time drift from the PDC. Hence, the network or firewall was not the cause.
- Once the server returned to the correct time, we verified whether it was taking time from the PDC or from the local CMOS clock. Fortunately, it was taking time from the PDC, not from the local CMOS clock.
- The first server on which we observed the issue was a domain controller installed on a physical HP server. The HP product team has documented a list of products affected by a related bug and provided a solution to follow.
The link below will help you check whether your physical server falls into
this category.
https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-c04557232
We checked and found that our product version is not impacted, and we observed
the issue on virtual domain controllers too. Hence, a hardware bug was not the
cause in our case.
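The drift comparison against the PDC described in the checks above could be scripted rather than done by hand. The following is a minimal Python sketch, assuming the offsets are sampled with `w32tm /stripchart /computer:<host> /samples:<n> /dataonly`, which prints one `time, offset` line per sample. The function names and the host name in the usage comment are hypothetical, and the parser simply ignores any line that does not end in a signed offset.

```python
import re
import subprocess

# Matches the signed offset at the end of a /dataonly sample line,
# for example "12:00:02, +00.0012301s".
OFFSET_RE = re.compile(r",\s*([+-]\d+(?:\.\d+)?)s\s*$")


def parse_offsets(stripchart_output):
    """Return the offsets (in seconds) found in the stripchart output."""
    offsets = []
    for line in stripchart_output.splitlines():
        match = OFFSET_RE.search(line)
        if match:
            offsets.append(float(match.group(1)))
    return offsets


def max_drift(offsets):
    """Largest absolute offset, or None when no sample was collected."""
    return max((abs(o) for o in offsets), default=None)


def sample_offsets(reference_host, samples=5):
    """Run w32tm against reference_host (Windows only) and parse the result."""
    result = subprocess.run(
        ["w32tm", "/stripchart", f"/computer:{reference_host}",
         f"/samples:{samples}", "/dataonly"],
        capture_output=True, text=True, check=True,
    )
    return parse_offsets(result.stdout)


# Usage on a domain controller (host name is a placeholder):
#   drift = max_drift(sample_offsets("pdc.example.local"))
#   if drift is not None and drift > 1.0:   # 1 second is an arbitrary threshold
#       print(f"Clock differs from the PDC by {drift:.3f} s")
```

Keeping the parsing separate from the `w32tm` call means the same function can be run over output saved from several domain controllers, which makes it easier to compare servers in the same site and subnet.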
We finally exhausted basic troubleshooting and opened a support case with
Microsoft. The Microsoft team captured the w32time debug logs and found that
Secure Time Seeding (STS) was causing the issue.
To collect debug logs, run the command below from an elevated (Administrator)
command prompt.
Command to enable w32time debug logs:
w32tm /debug /enable /file:%SystemRoot%\temp\W32Time.log /size:10485760 /entries:0-1003
(The time service must be restarted for the logs to be collected.)
Commands to stop and start the time service:
net stop w32time - to stop the time service
net start w32time - to start the time service
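The collection steps above could be wrapped in a small Python script so that the same sequence runs identically on each domain controller. This is only a sketch: the function names are hypothetical, the log path is passed in as a full path rather than expanded from %SystemRoot%, and `w32tm /debug /disable` is included to switch logging off again once enough data has been captured.

```python
import subprocess

# Turns w32time debug logging off again after the capture window.
DEBUG_DISABLE = ["w32tm", "/debug", "/disable"]


def debug_enable_command(log_path, max_bytes=10485760, entries="0-1003"):
    """Build the w32tm command that enables debug logging to log_path."""
    return [
        "w32tm", "/debug", "/enable",
        f"/file:{log_path}",
        f"/size:{max_bytes}",
        f"/entries:{entries}",
    ]


def collection_commands(log_path):
    """Commands in the order used above: enable logging, then restart the service."""
    return [
        debug_enable_command(log_path),
        ["net", "stop", "w32time"],
        ["net", "start", "w32time"],
    ]


def run_all(commands):
    """Run each command in turn from an elevated session; stop on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Usage on a domain controller (elevated prompt):
#   run_all(collection_commands(r"C:\Windows\temp\W32Time.log"))
#   ... wait for the time jump to recur, then:
#   run_all([DEBUG_DISABLE])
```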
Solution
The issue was intermittent and we could not reproduce it on demand, so it took
some time to capture the logs. Eventually we had enough logs to prove that
Secure Time Seeding caused it, and Microsoft recommended turning off the STS
feature.
Before implementing the change, we recorded the current registry value on
each domain controller (our domain controllers run Windows Server 2019)
and found that the feature was turned on on all of them.
Registry value
Registry Key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config
Value Name: UtilizeSslTimeData
Value Type: REG_DWORD
Value: 1 = Enabled (default), 0 = Disabled
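Recording the current value on each domain controller could be scripted. The following is a minimal Python sketch that uses the standard-library `winreg` module, so it only runs on Windows. The function names and the host names in the usage comment are hypothetical, and reading a remote machine this way assumes the Remote Registry service is reachable on it.

```python
CONFIG_KEY = r"SYSTEM\CurrentControlSet\Services\W32Time\Config"
VALUE_NAME = "UtilizeSslTimeData"


def describe_sts(value):
    """Translate the raw DWORD into a readable state."""
    if value is None:
        return "not set"
    if value == 1:
        return "enabled"
    if value == 0:
        return "disabled"
    return f"unexpected value {value}"


def read_sts_value(host=None):
    """Read UtilizeSslTimeData from host (None means the local machine).

    host uses the form r"\\servername". Returns None when the value is absent.
    """
    import winreg  # Windows-only module, imported here so the helper above stays portable

    with winreg.ConnectRegistry(host, winreg.HKEY_LOCAL_MACHINE) as hive:
        with winreg.OpenKey(hive, CONFIG_KEY) as key:
            try:
                value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            except FileNotFoundError:
                return None
            return value


# Usage (host names are placeholders):
#   for dc in [r"\\dc01", r"\\dc02"]:
#       print(dc, describe_sts(read_sts_value(dc)))
```

Running the same check again after the change and the reboot gives a simple before-and-after record for each server.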
We rolled out this registry change through the Group Policy setting below,
pushing it to all domain controllers.
Group Policy setting and the corresponding registry value to disable
STS (reboot required):
Setting:
- Global Configuration Settings (Computer Configuration\Administrative Templates\System\Windows Time Service)
Sub Setting:
- UtilizeSslTimeData
Explain Text
- This parameter controls whether W32time will use time data computed
from SSL traffic on the machine as an additional input for correcting the
local clock.
ADMX File:
- W32Time.admx file.
Reboot Requirements
- Reboot required.
Note: Changing the registry value requires a reboot, so plan your
implementation accordingly.
Reference Notes:
The reference article below explains more about how the STS feature causes this
issue and why Microsoft enabled it by default.
My favorite articles are here.