Introduction to FRS
File Replication Service (FRS) is a Microsoft Windows Server service for distributing shared files and Group Policy Objects. It replaced the (Windows NT) LAN Manager Replication service,[1] and has been partially replaced by Distributed File System Replication. It is also known as NTFRS after the name of the executable file that runs the service.
File Replication Service is a multithreaded replication engine. Multithreaded means that several threads can run at the same time to handle multiple tasks. This allows FRS to replicate different files between different computers simultaneously.
When the File Replication Service (FRS) detects a change to a file, such as the creation of a new file or the modification of an existing file, it replicates the file to the other servers in the group. To deal with conflicts (when two copies of a file are edited at the same time on different servers), the service keeps the copy with the latest date and time.
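
The "latest date and time wins" rule can be pictured with a small Python sketch. It is only an illustration: the FileVersion class and resolve_conflict function are hypothetical names, and real FRS may apply additional rules that this sketch ignores.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileVersion:
    """One server's copy of a replicated file (hypothetical model)."""
    server: str
    modified: datetime


def resolve_conflict(versions):
    """Return the copy with the latest modification time (last writer wins)."""
    if not versions:
        raise ValueError("at least one version is required")
    return max(versions, key=lambda v: v.modified)


# Two servers edited the same file five seconds apart.
copies = [
    FileVersion("DC1", datetime(2024, 1, 10, 9, 30, 0)),
    FileVersion("DC2", datetime(2024, 1, 10, 9, 30, 5)),
]
winner = resolve_conflict(copies)
```

Here the copy from DC2 is kept because it carries the later timestamp, and the earlier edit on DC1 is overwritten.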
One of the main uses of FRS is for the SYSVOL directory share. The SYSVOL directory share is particularly important in a Microsoft network because it is used to distribute files supporting Group Policy and scripts to client computers on the network. Since Group Policies and scripts are run each time a user logs on to the system, reliability is important. Having multiple copies of the SYSVOL directory increases resilience and spreads the workload for this essential service.
FRS does not guarantee
the order in which files arrive. Files begin replication in sequential order
based on when the files are closed, but file size and link speed determine the
order of completion. Because FRS replicates only whole files, the entire file
is replicated even if you change only a single byte in the file.
Replicating SYSVOL
Although
Active Directory replication and File Replication service are separate mechanisms,
they are conceptually similar.
When you add, remove, or modify the contents of the
Sysvol folder on a domain controller, those changes are replicated to the
Sysvol folders on all other domain controllers in the domain.
FRS uses the same connection objects as the Active Directory directory service when it replicates SYSVOL content. Therefore, it uses the same schedule as Active Directory for intersite replication. However, unlike Active Directory, replicated content between sites is not compressed.
What happens in a Journal Wrap?
FRS has an internal database that contains all the files and folders it is replicating, each of which has a globally unique identifier (GUID). The database also contains a pointer to the last NTFS disk operation (in the USN Journal/NTFS Journal) that the FRS service processed.
If a user changes a file or folder on a disk, the following happens:
The operation is picked up by NTFS and an entry is made in the NTFS Journal.
FRS monitors the NTFS Journal for changes and notes that a change has been made to that file.
FRS keeps a record of the last NTFS Journal event that it processed and checks whether it has already processed this one.
If it hasn’t processed it already, it looks at whether it is a file that it should replicate.
If it should be replicated, the file goes into the normal process of staging, replicating, etc.
FRS updates the entry in its database about the last NTFS Journal event that it has processed so it won’t consider it again.
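
The steps above can be sketched as a small loop in Python. This is a simplified model rather than FRS's actual implementation: JournalEntry, FrsState, process_journal and the replicated root path are all hypothetical, chosen only to mirror the description.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    usn: int   # sequence number of the NTFS Journal record
    path: str  # file or folder that changed


@dataclass
class FrsState:
    last_processed_usn: int = 0
    # Lower-cased roots of replicated trees (assumed example path).
    replicated_roots: tuple = ("c:\\windows\\sysvol\\",)
    staged: list = field(default_factory=list)


def process_journal(state, journal):
    """Walk the journal in order and stage changes FRS should replicate."""
    for entry in sorted(journal, key=lambda e: e.usn):
        if entry.usn <= state.last_processed_usn:
            continue  # already processed earlier
        if entry.path.lower().startswith(state.replicated_roots):
            state.staged.append(entry.path)  # goes on to staging/replication
        # Remember this event so it is not considered again.
        state.last_processed_usn = entry.usn
    return state
```

Entries at or below the stored pointer are skipped, entries outside the replicated tree advance the pointer without being staged, and everything else is queued for replication.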
Suppose the replicated files have changed but the DCs have not communicated with each other, because a replication partner was shut down for a long time, FRS was not running, or there was a communication failure in the network. When communication is re-established, FRS still knows the last NTFS Journal entry that it processed, and it compares this with the current NTFS Journal the next time it starts.
The next time the FRS service starts, it compares its last processed NTFS operation with the current NTFS Journal and finds that it has missed NTFS operations on the disk. This is when FRS reports that it has reached a journal wrap state: the NTFS Journal has wrapped around, and FRS no longer knows the current state of things on the disk.
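
To make the idea of a "wrap" concrete, here is a toy model in Python. It assumes the journal is a fixed-capacity log that silently drops its oldest records; UsnJournal and is_journal_wrapped are hypothetical names, and the real USN journal is sized in bytes rather than in records.

```python
from collections import deque


class UsnJournal:
    """Fixed-capacity change log; the oldest records are dropped when full."""

    def __init__(self, capacity):
        self._records = deque(maxlen=capacity)
        self._next_usn = 1

    def record_change(self, path):
        self._records.append((self._next_usn, path))
        self._next_usn += 1

    def oldest_usn(self):
        # With no records yet, the next USN to be issued is the oldest one.
        return self._records[0][0] if self._records else self._next_usn


def is_journal_wrapped(journal, last_processed_usn):
    """True when records after last_processed_usn have already been dropped."""
    return last_processed_usn + 1 < journal.oldest_usn()
```

For example, with a capacity of three records and FRS stopped after processing record 2, three further changes leave records 3 to 5 in the journal and nothing is lost; a fourth change pushes record 3 out, and the check reports a wrap.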
How to troubleshoot journal wrap errors on Sysvol and DFS replica sets
The USN journal is a log of fixed size
that records all changes that occur on NTFS 5.0-formatted partitions. NTFRS
monitors the NTFS USN journal file for closed files in FRS replicated
directories as long as FRS is running.
Journal wrap errors occur if enough changes are made while FRS is turned off that the last USN change that FRS recorded during shutdown no longer exists in the USN journal at startup. The risk is that changes to files and folders for FRS replicated trees may have occurred while the service was turned off, and no record of the change exists in the USN journal. To guard against data inconsistency, FRS asserts into a journal wrap state.
Administrators may stop the FRS service for long periods of time to perform maintenance on FRS replica set members without realizing the potential impact. Error conditions may also cause the FRS service to shut down, which can lead to a journal wrap error. In very large replica sets, replica members may also encounter a journal wrap error during an authoritative restore (BURFLAGS=D4).
To recover, the affected replica member
must be reinitialized with a nonauthoritative restore (BURFLAGS=D2) where it
will synchronize files from an existing inbound partner. This re-initialization
can be time-consuming for large replica sets.
The non-authoritative restore process must
be invoked manually. To do this, you must set BURFLAGS=D2 in the Windows NT
registry.
By default, versions of the Ntfrs.exe file from Windows 2000 Service Pack 3 (SP3) and from the Windows 2000 SP3 hotfix do not perform an automatic non-authoritative restore when journal wrap errors are detected. SP3 versions of NTFRS may be configured to function like SP2 when the "Enable journal wrap automatic restore" registry entry is set to 1 in the following registry subkey:
HKLM\System\CurrentControlSet\Services\NtFrs\Parameters
Important: Microsoft does not recommend that you use this registry setting, and it should not be used in versions of Windows after Windows 2000 Service Pack 3. The recommended method for performing a non-authoritative restore on FRS members of DFS or SYSVOL replica sets is to use the FRS BurFlags registry value.
The following are appropriate options to reduce journal wrap errors:
Put the FRS-replicated content on less busy volumes.
Keep the FRS service running.
Avoid making changes to FRS-replicated content while the service is turned off.
Increase the USN journal size.
FRS is a service that must always be
running on Windows domain controllers and members of FRS-replicated DFS sets.
Lab for journal wrap error
In my test lab I have two domain controllers in a Windows Server 2003 environment. To produce a journal wrap error, I made the registry change below. A domain controller with a smaller journal reaches the journal wrap state sooner, so I changed the value to 8 MB.
HKLM\System\CurrentControlSet\Services\NtFrs\Parameters\"Ntfs Journal size in MB" (REG_DWORD)
In Windows 2000 Service Pack 2, valid settings range from 8 to 128 megabytes (MB), and the default is 32 MB. In Windows 2000 Service Pack 3, valid settings range from 4 to 10,000 MB, and the default is 512 MB. These settings apply to all volumes that host an FRS replica tree.
As a guideline, Microsoft suggests that you configure 128 MB of journal for every 100,000 files that are managed by replication on that volume.
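
As a quick worked example of this guideline, the Python sketch below turns a file count into a suggested journal size. The function name is hypothetical, and rounding up to the next full block of 100,000 files is an assumption; the guideline itself does not say how to round.

```python
def suggested_journal_size_mb(file_count):
    """Apply the 128 MB per 100,000 replicated files guideline.

    Assumption: partial blocks of 100,000 files are rounded up, and at
    least one block is always counted.
    """
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    blocks = max(1, (file_count + 99_999) // 100_000)
    return blocks * 128


# A volume with 250,000 replicated files rounds up to three blocks.
size_mb = suggested_journal_size_mb(250_000)
```

With 250,000 files this gives 3 x 128 = 384 MB, which would then have to fit inside the valid range for the installed service pack.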
You have to stop and then restart the NTFRS service for an increase to the USN journal size to take effect. However, to decrease the USN journal size, you must reformat all volumes that contain FRS-replicated content.
Increasing the USN journal size increases the number of changes that the journal can hold before it "wraps," which reduces the possibility of a USN journal wrap. The journal size is changed through the "Ntfs Journal size in MB" registry value described above.
Once the desired registry settings are configured, restart the domain controller to ensure that the new value takes effect.
Current SYSVOL setup
I created batch (.bat) files that create files in a loop. Before running them, I stopped the NTFRS service so that the domain controller would miss the corresponding NTFS journal entries in its database. Then I started the batch files and left them running for a couple of minutes.
With the batch files running, the journal wrap state occurs once the volume of changes overruns the journal. After a couple of minutes, the domain controller logged the error and went into the journal wrap state.
Once a domain controller is in the journal wrap state, updated files are not propagated to the other domain controllers. The domain controller below missed the update that we created with the batch file.
Here is where the problem comes in: one user would say that the Group Policy is working fine, whereas for other users, who are authenticated by the problematic domain controller, the policy is not applied.
Using the BurFlags registry key to reinitialize File Replication Service replica sets
What is the best way to perform an authoritative or non-authoritative restore of the Sysvol in the domain?
Basically, the difference is that D4 is set on the source DC (the good DC), and D2 is set on the bad DC, which tells it to pull from the source DC (D4). The D2 option on the bad DC does two things:
It copies the current contents of the Sysvol folder into a folder called "Pre-existing." That folder holds exactly what its name says: your current data. If you have to revert, you can use the data in this folder.
It then replicates (copies) good data from a good DC (D4).
In another example, if
you set Burflags to D4 on a single domain controller and set Burflags to D2 on
all other domain controllers in that domain, you can rebuild the SYSVOL from
that specific D4 DC (the source DC).
There are two places where BurFlags can be configured. Which one is best?
Replica set specific re-initialization:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\GUID
Or
Global re-initialization:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
If you are using DFS replica sets that hold a large amount of healthy data, go for the “Replica set specific re-initialization”. If you set the global BurFlags, FRS will re-initialize all replica sets that the member holds, including the DFS ones. If they hold a large amount of data, that might take some time.
In short, if you are using DFS, use the key below:
HKLM\System\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\GUID
To find the
GUID of SYSVOL, look for the “Replica Set Name” named “Domain System Volume
(SYSVOL SHARE)” under the subkey “HKLM\..\..\Replica Sets”:
In my case I don’t have DFS, so I used global re-initialization.
After you
have set the BurFlags key to D2, you have to restart the NTFRS service on the
affected DC.
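
For repeatability, the registry change and the service restart can be scripted. The Python sketch below only builds the command lines as strings and runs nothing. It assumes the global BurFlags key named above, the standard reg and net commands, and a stop/set/start ordering; burflags_commands and RESTORE_MODES are hypothetical names. Review the output before running anything on a real DC.

```python
# Global BurFlags key, as given in the text above.
GLOBAL_BURFLAGS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters"
    r"\Backup/Restore\Process at Startup"
)

# D2 = non-authoritative restore, D4 = authoritative restore (hex values).
RESTORE_MODES = {"nonauthoritative": 0xD2, "authoritative": 0xD4}


def burflags_commands(mode, key=GLOBAL_BURFLAGS_KEY):
    """Return the commands to stop NTFRS, set BurFlags and start NTFRS."""
    value = RESTORE_MODES[mode]  # KeyError for an unknown mode
    return [
        "net stop ntfrs",
        # The value is written in decimal: 0xD2 == 210, 0xD4 == 212.
        f'reg add "{key}" /v BurFlags /t REG_DWORD /d {value} /f',
        "net start ntfrs",
    ]


for command in burflags_commands("nonauthoritative"):
    print(command)
```

Passing a replica-set-specific key as the key argument would produce the equivalent commands for a single replica set.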
Overview of what happens:
The BurFlags value is set to 0.
Event ID 13565 is logged. Non-authoritative restore has started.
The local FRS database is rebuilt.
The “bad DC” will compare all files (file ID and MD5 sum) it has in the Pre-existing folder with the files from an upstream partner.
If a match is found, it will copy the file from the Pre-Existing folder to the original location. If they don’t match, it will pull the file from the upstream partner.
Event ID 13553 is logged.
FRS notifies the Netlogon service (SysvolReady registry value = 1) that SYSVOL is ready and can be shared.
The Netlogon service will share SYSVOL and Netlogon.
Event ID 13516 is logged (finished).
When you have verified that SYSVOL is shared and in sync, you can delete
the content in the Pre-Existing folder to free up space.
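
The comparison step in the overview above can be sketched in Python as follows. This is an illustration only: file names stand in for FRS file IDs, the upstream partner is modelled as a dictionary of MD5 sums, and plan_restore and the action labels are hypothetical.

```python
import hashlib


def md5_of(data):
    """Hex MD5 digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def plan_restore(pre_existing, upstream):
    """Decide per file whether the local copy can be reused.

    pre_existing: name -> bytes held in the Pre-existing folder
    upstream:     name -> MD5 hex digest reported by the upstream partner
    """
    plan = {}
    for name, upstream_md5 in upstream.items():
        local = pre_existing.get(name)
        if local is not None and md5_of(local) == upstream_md5:
            plan[name] = "reuse-local"         # copy back from Pre-existing
        else:
            plan[name] = "fetch-from-partner"  # pull over the network
    return plan
```

Reusing matching local copies is what keeps a non-authoritative restore from transferring every file again, which matters most on large replica sets.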
An authoritative restore is the last resort for re-initializing SYSVOL. Normally this method is not preferred unless the situation is critical.
1) Normally, for an authoritative restore you stop the NTFRS service on all DCs.
2) Set BurFlags to D4 on a DC with a known good Sysvol (or restore the Sysvol data from backup at this point and then set BurFlags to D4), then start NTFRS on this server. You may want to rename the old folders with .old extensions prior to restoring good data.
3) Clean up the folders on all the remaining servers (Policies, Scripts, etc.) by renaming them with .old extensions.
4) Set BurFlags to D2 on all remaining servers and start NTFRS.
5) Wait for FRS to replicate.
6) Clean up the .old folders if things look good.
FRS Protocol
Service | UDP | TCP
LDAP | 389 |
RPC | Dynamic |
By default, FRS replication over remote procedure calls
(RPCs) occurs dynamically over an available port by using RPC Endpoint Mapper (also
known as Remote Procedure Call Server Service or RPCSS) on port 135; the
process is the same for Active Directory replication.
FRS Tools
The following tools are associated with
File Replication service (FRS).
FRSDiag.exe: FRS Diagnostics
Ntfrsutl.exe: File Replication Utility
Sonar.exe: Sonar
Topchk.cmd: DFS and SYSVOL Replication Topology Analysis Tool
Ultrasound.exe: Ultrasound
Replmon
FRS to DFS-R Migration
Why Migrate?
Access-based enumeration: Access-based enumeration allows users to see only the files and folders on a file server that they have permission to access. This feature is not enabled by default for namespaces (though it is enabled by default on newly created shared folders in Windows Server 2008), and is only supported in a DFS namespace when the namespace is a standalone namespace hosted on a computer running Windows Server 2008, or a domain-based namespace that uses the Windows Server 2008 mode.
Improved command-line tools: DFS Namespaces in Windows Server 2008 includes an updated version of the Dfsutil command and the new Dfsdiag command, which you can use to diagnose namespace issues.
Content Freshness: DFS Replication in Windows Server 2008 has a new feature called Content Freshness, which prevents a server that was offline for a long time from overwriting fresh data with stale (out-of-date) data when it comes back online.
Propagation report: DFS Management in Windows Server 2008 includes a new type of diagnostic report called a propagation report. This report displays the replication progress for the test file created during a propagation test.
Replicate now: DFS Management now includes the ability to force replication to occur immediately, temporarily ignoring the replication schedule.
SYSVOL replication using DFS Replication: DFS Replication replaces the File Replication Service (FRS) as the replication engine for replicating the AD DS SYSVOL folder in domains that use the Windows Server 2008 domain functional level.
To determine whether DFSR or FRS is being used on a domain controller that is running Windows Server 2008, check the value of the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\DFSR\Parameters\SysVols\Migrating Sysvols\LocalState registry subkey. If this registry subkey exists and its value is set to 3 (ELIMINATED), DFSR is being used. If the subkey does not exist, or if it has a different value, FRS is being used.
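
The decision rule above is small enough to write down directly. In the Python sketch below, the value is passed in by the caller (None when the registry entry does not exist), so nothing is read from a live registry; sysvol_replication_engine is a hypothetical name.

```python
ELIMINATED = 3  # LocalState value that indicates DFSR is in use


def sysvol_replication_engine(local_state):
    """Return "DFSR" or "FRS" for a given LocalState value.

    local_state is the integer read from the registry, or None when the
    LocalState entry does not exist.
    """
    return "DFSR" if local_state == ELIMINATED else "FRS"


engine = sysvol_replication_engine(None)
```

A missing entry and any value other than 3 both resolve to FRS, matching the rule in the text.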
Migration to DFS-R therefore consists of four stages or states:
0 (start). The default state of a domain controller. Only FRS is used to replicate SYSVOL.
1 (prepared). A copy of SYSVOL is created in a folder called SYSVOL_DFSR and is added to a replication set. DFS-R begins to replicate the contents of the SYSVOL_DFSR folders on all domain controllers. However, FRS continues to replicate the original SYSVOL folders, and clients continue to use SYSVOL.
2 (redirected). The SYSVOL share is redirected to SYSVOL_DFSR for client use. SYSVOL is still replicated by FRS for failback.
3 (eliminated). Replication of the old SYSVOL folder by FRS is stopped. The original SYSVOL folder is not deleted. Therefore, if you want to remove it entirely, you must do so manually.
You move the DCs through these stages or states by using the dfsrmig command. You will use three options with dfsrmig.exe:
setglobalstate state
The setglobalstate option
configures the current global DFSR migration state, which applies to all domain
controllers. The state is specified by the state parameter, which is 0–3. Each
domain controller will be notified of the new DFSR migration state and will
migrate to that state automatically.
getglobalstate
The getglobalstate option
reports the current global DFSR migration state.
getmigrationstate
The getmigrationstate
option reports the current migration state of each domain controller. Because
it might take time for domain controllers to be notified of the new global DFSR
migration state, and because it might take even more time for a domain
controller to make the changes required by that state, domain controllers will
not be synchronized with the global state instantly. The getmigrationstate
option enables you to monitor the progress of domain controllers toward the
current global DFSR migration state.
If there is a problem moving from one state to the next higher state, you can revert to previous states by using the setglobalstate option. However, after you have used the setglobalstate option to specify state 3 (eliminated), you cannot revert to the earlier states.
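
The revert rule can be captured in a short Python check. This is a sketch under stated assumptions: it conservatively allows only single-step advances, which is an assumption of the sketch rather than a documented restriction, and can_set_global_state is a hypothetical name.

```python
STABLE_STATES = {0: "start", 1: "prepared", 2: "redirected", 3: "eliminated"}


def can_set_global_state(current, target):
    """Check whether moving the global state from current to target is allowed."""
    for state in (current, target):
        if state not in STABLE_STATES:
            raise ValueError(f"not a stable migration state: {state}")
    if current == 3:
        # Once eliminated, there is no way back to an earlier state.
        return target == 3
    # Reverting to an earlier state is fine; advancing is limited to one
    # step at a time (assumption of this sketch).
    return target <= current + 1
```

For instance, reverting from state 2 to state 1 passes the check, while any attempt to leave state 3 is rejected.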
To migrate SYSVOL replication from FRS to DFS-R, perform the following steps:
1. Open the Active Directory Domains and Trusts snap-in.
2. Right-click the domain and choose Raise Domain Functional Level.
3. If the Current domain functional level box does not indicate Windows Server 2008, select Windows Server 2008 or Windows Server 2008 R2 from the Select an available domain functional level list.
4. Click Raise. Click OK twice in response to the dialog boxes that appear.
5. Log on to a domain controller and open a command prompt.
6. Type dfsrmig /setglobalstate 1.
7. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the Prepared global state. Repeat this step until the state has been attained by all domain controllers. This can take 15 minutes to an hour or longer.
8. Type dfsrmig /setglobalstate 2.
9. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the Redirected global state. Repeat this step until the state has been attained by all domain controllers. This can take 15 minutes to an hour or longer.
10. Type dfsrmig /setglobalstate 3.
After you begin migration from state 2 (redirected) to state 3 (eliminated), any changes made to the SYSVOL folder will have to be replicated manually to the SYSVOL_DFSR folder.
11. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the Eliminated global state. Repeat this step until the state has been attained by all domain controllers. This can take 15 minutes to an hour or longer.
12. For more information about the dfsrmig.exe command, type dfsrmig.exe /?.
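
The "repeat this step until all domain controllers have attained the state" part of steps 7, 9 and 11 is a polling loop. The Python sketch below shows one way to structure it. It does not parse real dfsrmig output: get_states is a hypothetical callable, supplied by the caller, that returns a mapping of DC name to its current migration state number.

```python
import time


def wait_for_state(get_states, target, poll_seconds=60, max_polls=120,
                   sleep=time.sleep):
    """Poll until every domain controller reports the target state.

    Returns True when all DCs have reached target, or False if max_polls
    checks pass without that happening.
    """
    for _ in range(max_polls):
        states = get_states()
        if states and all(state == target for state in states.values()):
            return True
        sleep(poll_seconds)  # migration can take 15 minutes to an hour or longer
    return False
```

The sleep function is injectable so the loop can be exercised without real delays; in practice the defaults give roughly two hours of polling before giving up.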
Migration States
Stable States: There are four migration states which are
defined as ‘Stable Migration States’ as alluded to above. During the process of
migration, the administrator uses the migration tool (dfsrmig.exe) to set a
migration directive in Active Directory. This directive essentially sets a
domain wide migration state (also called global migration state) in Active
Directory. This global migration state can be any one of the four stable
migration states shown in the table below.
Stable Migration States
0 ‘START’ state
1 ‘PREPARED’ state
2 ‘REDIRECTED’ state
3 ‘ELIMINATED’ state
Transition States: During migration, each domain controller takes appropriate actions locally so that it can attain the migration stable state which has been selected for the domain by the administrator. This operation causes the domain controller to cycle through intermediate states called ‘Transition States’. These transition states are shown in the table below.
Transition States
4 ‘PREPARING’ state
5 ‘WAITING FOR INITIAL SYNC’ state
6 ‘REDIRECTING’ state
7 ‘ELIMINATING’ state
8 ‘UNDO REDIRECTING’ state
9 ‘UNDO PREPARING’ state
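
When scripting around these numbers, it can help to give them names. The Python enumeration below simply mirrors the two tables above; the MigrationState class and the is_stable helper are hypothetical names for illustration.

```python
from enum import IntEnum


class MigrationState(IntEnum):
    # Stable states
    START = 0
    PREPARED = 1
    REDIRECTED = 2
    ELIMINATED = 3
    # Transition states
    PREPARING = 4
    WAITING_FOR_INITIAL_SYNC = 5
    REDIRECTING = 6
    ELIMINATING = 7
    UNDO_REDIRECTING = 8
    UNDO_PREPARING = 9


def is_stable(state):
    """True for the four stable states (0 to 3), False for transition states."""
    return MigrationState(state) <= MigrationState.ELIMINATED
```

A domain controller reporting state 5, for example, is still waiting for its initial sync and has not yet reached the stable PREPARED state.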