Friday 30 October 2015

File Replication Service (FRS)

Introduction to FRS
File Replication Service (FRS) is a Microsoft Windows Server service for distributing shared files and Group Policy Objects. It replaced the (Windows NT) LAN Manager Replication service,[1] and has been partially replaced by Distributed File System Replication. It is also known as NTFRS after the name of the executable file that runs the service.
File Replication Service is a multithreaded replication engine. Multithreaded means that several threads can run at the same time to handle multiple tasks. This allows FRS to replicate different files between different computers simultaneously.
When FRS detects a change to a file, such as the creation of a new file or the modification of an existing file, it replicates the file to the other servers in the group. To deal with conflicts (when two copies of a file are edited at the same time on different servers), the service keeps the copy with the latest date and time.
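The "latest date and time wins" rule can be illustrated with a small Python sketch. This is only a model of the idea, not FRS code: the FileVersion type and the server names are made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileVersion:
    server: str          # hypothetical server name, for illustration only
    modified: datetime   # last-write time of this copy of the file


def resolve_conflict(versions):
    """Return the copy with the latest modification time (last writer wins)."""
    return max(versions, key=lambda v: v.modified)


copies = [
    FileVersion("DC1", datetime(2015, 10, 30, 9, 15, 0)),
    FileVersion("DC2", datetime(2015, 10, 30, 9, 15, 42)),
]
winner = resolve_conflict(copies)
print(winner.server)  # the copy edited last, here the one on DC2
```

The practical consequence is that the earlier edit is silently discarded, so two administrators editing the same script on different servers should not expect a merge.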
One of the main uses of FRS is for the SYSVOL directory share. The SYSVOL directory share is particularly important in a Microsoft network as it is used to distribute files supporting Group Policy and scripts to client computers on the network. Since Group Policies and scripts are run each time a user logs on to the system, reliability is important. Having multiple copies of the SYSVOL directory increases the resilience and spreads the workload for this essential service.
FRS does not guarantee the order in which files arrive. Files begin replication in sequential order based on when the files are closed, but file size and link speed determine the order of completion. Because FRS replicates only whole files, the entire file is replicated even if you change only a single byte in the file.

Replicating SYSVOL

Although Active Directory replication and File Replication service are separate mechanisms, they are conceptually similar.
When you add, remove, or modify the contents of the Sysvol folder on a domain controller, those changes are replicated to the Sysvol folders on all other domain controllers in the domain.
FRS uses the same connection objects as the Active Directory directory service when it replicates SYSVOL content. Therefore, it uses the same schedule as Active Directory for intersite replication. However, unlike Active Directory, replicated content between sites is not compressed.
What happens in a Journal Wrap?
FRS has an internal database that contains all the files and folders it is replicating, and each of these has a globally unique identifier (GUID). The database also contains a pointer to the last NTFS disk operation (in the USN Journal/NTFS Journal) that the FRS service processed.
If a user changes a file or folder on a disk, the following happens:
1. The operation is picked up by NTFS and an entry is made in the NTFS Journal.
2. FRS monitors the NTFS Journal for changes and notes that a change has been made to that file.
3. FRS keeps a record of the last NTFS Journal event that it processed and checks whether it has already processed this event.
4. If it hasn’t processed it already, it looks at whether it is a file that it should replicate.
5. If it should be replicated, the file goes into the normal process of staging, replicating, etc.
6. FRS updates the entry in its database about the last NTFS Journal event that it has processed so it won’t consider it again.
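The steps above can be modelled in a few lines of Python. This is a simplified sketch and not how NTFRS is implemented: JournalEntry, FrsState and the sample paths are invented for the illustration, and "staging" is reduced to appending the path to a list.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    usn: int     # update sequence number assigned by NTFS
    path: str    # file that the change applies to


@dataclass
class FrsState:
    replicated_roots: tuple          # folders that FRS replicates
    last_processed_usn: int = 0      # pointer kept in the FRS database
    staged: list = field(default_factory=list)

    def should_replicate(self, path):
        return any(path.startswith(root) for root in self.replicated_roots)

    def process(self, journal):
        for entry in journal:
            if entry.usn <= self.last_processed_usn:
                continue                      # already handled earlier
            if self.should_replicate(entry.path):
                self.staged.append(entry.path)  # would go to staging/replication
            self.last_processed_usn = entry.usn  # never consider it again


journal = [
    JournalEntry(1, r"C:\Windows\SYSVOL\domain\scripts\logon.bat"),
    JournalEntry(2, r"C:\Temp\notes.txt"),
    JournalEntry(3, r"C:\Windows\SYSVOL\domain\Policies\gpt.ini"),
]
frs = FrsState(replicated_roots=(r"C:\Windows\SYSVOL",))
frs.process(journal)
frs.process(journal)  # a second pass changes nothing: all entries are processed
print(frs.staged, frs.last_processed_usn)
```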

Suppose the replicated files have changed while the domain controllers were unable to communicate with each other, because a replication partner was shut down for a long time, FRS was not running, or the network failed. When communication is re-established, FRS still knows the last NTFS Journal entry that it processed, and it compares this with the current NTFS Journal the next time it starts.

The next time the FRS service starts, it compares its last processed NTFS operation with the current NTFS Journal and sees that it has missed NTFS operations on the disk. This is when FRS reports that it has reached a journal wrap state: the NTFS Journal has wrapped around, and FRS no longer knows the current state of things on the disk.



How to troubleshoot journal wrap errors on Sysvol and DFS replica sets
The USN journal is a log of fixed size that records all changes that occur on NTFS 5.0-formatted partitions. NTFRS monitors the NTFS USN journal file for closed files in FRS replicated directories as long as FRS is running.

Journal wrap errors occur if so many changes occur while FRS is turned off that the last USN change that FRS recorded during shutdown no longer exists in the USN journal during startup. The risk is that changes to files and folders for FRS replicated trees may have occurred while the service was turned off, and no record of the change exists in the USN journal. To guard against data inconsistency, FRS asserts into a journal wrap state.
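To make the wrap condition concrete, here is a toy model in Python. It assumes the journal behaves like a fixed-size ring that drops its oldest record when full; the UsnJournal class and the capacity of 5 records are invented for the illustration and say nothing about the real on-disk format.

```python
from collections import deque


class UsnJournal:
    """Toy fixed-size journal: the oldest records fall off when it is full."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_usn = 1

    def record(self, path):
        self.entries.append((self.next_usn, path))
        self.next_usn += 1

    def oldest_usn(self):
        return self.entries[0][0] if self.entries else None


def is_journal_wrap(journal, last_processed_usn):
    """True if the record FRS processed last is no longer in the journal."""
    oldest = journal.oldest_usn()
    return oldest is not None and oldest > last_processed_usn


journal = UsnJournal(capacity=5)
for i in range(3):
    journal.record(f"file{i}.txt")
last_processed = 3                 # FRS is stopped after processing USN 3

for i in range(2):                 # a few changes while FRS is stopped
    journal.record(f"small{i}.txt")
few_changes = is_journal_wrap(journal, last_processed)

for i in range(10):                # many changes while FRS is stopped
    journal.record(f"busy{i}.txt")
many_changes = is_journal_wrap(journal, last_processed)

print(few_changes, many_changes)
```

In this model a larger capacity or fewer changes while the service is stopped both keep the last processed record inside the journal, which is exactly what the recommendations later in this post aim for.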

To perform maintenance on FRS replica set members, administrators may stop the FRS service for long periods of time. In this case, administrators may not realize the potential impact. Also, error conditions may cause the FRS service to shut down, and this causes a journal wrap error. In very large replica sets, replica members may encounter the following error during an authoritative restore (BURFLAGS=D4):

To recover, the affected replica member must be reinitialized with a nonauthoritative restore (BURFLAGS=D2) where it will synchronize files from an existing inbound partner. This re-initialization can be time-consuming for large replica sets.
The non-authoritative restore process must be invoked manually. To do this, you must set BURFLAGS=D2 in the Windows NT registry.

By default, versions of the Ntfrs.exe file from Windows 2000 Service Pack 3 (SP3) and from Windows 2000 SP3 hotfixes do not perform an automatic non-authoritative restore when journal wrap errors are detected. SP3 versions of NTFRS may be configured to function like SP2 when the "Enable journal wrap automatic restore" registry entry is set to 1 in the following registry subkey:
HKLM\System\CurrentControlSet\Services\Ntfrs\Parameters
Important: Microsoft does not recommend that you use this registry setting, and it should not be used in versions of Windows after Windows 2000 Service Pack 3. The recommended method for performing a non-authoritative restore on FRS members of DFS or SYSVOL replica sets is to use the FRS BurFlags registry value.

The following are appropriate options to reduce journal wrap errors:
Put the FRS-replicated content on less busy volumes.
Keep the FRS service running.
Avoid making changes to FRS-replicated content while the service is turned off.
Increase the USN journal size.
FRS is a service that must always be running on Windows domain controllers and members of FRS-replicated DFS sets.



LAB for Journal Wrap Error.

In my test lab I have two domain controllers in a Windows Server 2003 environment. To produce a journal wrap error, I made the registry change below. A domain controller with a smaller journal reaches the journal wrap state sooner, so I set the value to 8 MB.



HKLM\System\CurrentControlSet\Services\NTFRS\Parameters\"Ntfs Journal size in MB" (REG_DWORD)

Valid settings range from 8 to 128 megabytes (MB). The default is 32 MB.
In Windows 2000 Service Pack 2, valid settings range between 8 and 128 MB, and the default is 32 MB. In Windows 2000 Service Pack 3, valid settings range between 4 and 10,000 MB, and the default is 512 MB. These settings apply to all volumes that host an FRS replica tree. 
As a guideline, Microsoft suggests that you configure 128 MB of journal for every 100,000 files that are managed by replication on that volume.
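The sizing guideline is easy to turn into a quick calculation. The Python helper below is only a sketch: rounding up to whole 128 MB steps and never returning less than one step are my own assumptions, not part of the guideline.

```python
import math


def recommended_journal_mb(file_count, mb_per_step=128, files_per_step=100_000):
    """Apply the guideline of 128 MB of journal per 100,000 replicated files.

    Assumption: partial steps are rounded up, and the result is never
    smaller than one full step.
    """
    steps = max(1, math.ceil(file_count / files_per_step))
    return steps * mb_per_step


for files in (40_000, 100_000, 250_000):
    print(files, recommended_journal_mb(files))
```

For example, a volume with 250,000 replicated files comes out at three steps, or 384 MB.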
You have to stop and then restart the NTFRS service for the increases to the USN journal size to occur. However, to decrease the USN journal size, you must reformat all volumes that contain FRS-replicated content.
If you increase the USN journal size, and therefore you increase the number of changes that the journal can hold before the journal "wraps," this reduces the possibility that the USN journal wrap will occur. The USN journal size is changed through the "Ntfs Journal size in MB" registry value shown above.



Once the desired registry settings are configured, restart the domain controller to ensure that the new value takes effect.
Current SYSVOL setup


I created batch (.bat) files that create files in a loop. Before running the batch files, I stopped the NTFRS service so that FRS on this domain controller would miss the resulting NTFS journal entries.
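The batch files themselves are not reproduced in this post. As a stand-in, the Python sketch below does the same kind of thing: it creates many small files in a folder so that the journal fills with change records. The function name, file prefix and file count are arbitrary choices, and the example writes to a temporary folder rather than to SYSVOL.

```python
import tempfile
from pathlib import Path


def churn_files(target_dir, count, prefix="wraptest"):
    """Create `count` small files in target_dir and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(count):
        path = target / f"{prefix}_{i:05d}.txt"
        path.write_text(f"test file {i}\n")
        created.append(path)
    return created


# Demonstration against a throwaway folder; in the lab the target would be
# a folder inside the replicated tree on a test domain controller.
with tempfile.TemporaryDirectory() as scratch:
    files = churn_files(scratch, 100)
    print(len(files), "files created")
```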

Now start the batch files and leave them running for a couple of minutes.

With the batch files running, the NTFS journal fills up until it wraps.
After a couple of minutes, the domain controller logged the error and went into the journal wrap error state.


Once a domain controller is in the journal wrap state, updated files are not propagated to the other domain controller.

The domain controller below missed the update that we created by using the batch file.
This is where the problem arises: one user reports that a given Group Policy is working fine, whereas the policy is not applied for users who are authenticated by the problematic domain controller.
Using the BurFlags registry key to reinitialize File Replication Service replica sets
What is the best way to perform an authoritative or non-authoritative restore of Sysvol in the domain?
Basically, the difference is that D4 marks the source DC (the good DC), while D2 goes on the bad DC and tells it to pull from the source DC (D4). The D2 option on the bad DC does two things:
It copies the current contents of the Sysvol folder into a folder called "Pre-existing." That folder is exactly what it says it is: your current data. This way, if you have to revert back to it, you can use the data in this folder.
It then replicates (copies) good data from a good DC (D4).

In another example, if you set Burflags to D4 on a single domain controller and set Burflags to D2 on all other domain controllers in that domain, you can rebuild the SYSVOL from that specific D4 DC (the source DC).  
There are two places where BurFlags can be configured. Which one is the best?

Replica set specific re-initialization:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\GUID
Or
Global re-initialization:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
If you’re using DFS replica sets that hold a large amount of healthy data, go for the “Replica set specific re-initialization”. If you set the global BurFlags, FRS will re-initialize all replica sets, including the DFS namespace the member holds. If they hold a large amount of data, that might take some time.
In short, if you are using DFS, use the key below:
HKLM\System\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\GUID
To find the GUID of SYSVOL, look for the “Replica Set Name” named “Domain System Volume (SYSVOL SHARE)” under the subkey “HKLM\..\..\Replica Sets”:
In my case I don’t have DFS, so I performed the global re-initialization.
After you have set the BurFlags key to D2, you have to restart the NTFRS service on the affected DC.
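As a rough illustration of the global re-initialization, the Python sketch below only builds the command lines; it does not run anything. It assumes the standard reg.exe and net.exe tools and the service name ntfrs, and it writes the hexadecimal values D2 and D4 as decimal numbers for reg.exe. Treat it as a sketch to adapt and test in a lab, not as a supported procedure.

```python
# Global BurFlags location, as given earlier in this post.
BURFLAGS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters"
    r"\Backup/Restore\Process at Startup"
)

# D2 = non-authoritative restore, D4 = authoritative restore.
BURFLAGS = {"non-authoritative": 0xD2, "authoritative": 0xD4}


def burflags_commands(mode):
    """Return the commands to set BurFlags and then restart NTFRS."""
    value = BURFLAGS[mode]
    return [
        ["reg", "add", BURFLAGS_KEY, "/v", "BurFlags",
         "/t", "REG_DWORD", "/d", str(value), "/f"],
        ["net", "stop", "ntfrs"],
        ["net", "start", "ntfrs"],
    ]


for command in burflags_commands("non-authoritative"):
    print(" ".join(f'"{part}"' if " " in part else part for part in command))
```

Building the commands as lists keeps the key path, which contains spaces and a forward slash, in one piece if the lists are later passed to something like subprocess.run.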
Overview of what happens:
1. The BurFlags value is reset to 0.
2. Event ID 13565 is logged: the non-authoritative restore has started.
3. The contents of SYSVOL are moved to the Pre-existing folder.
4. The local FRS database is rebuilt.
5. The “bad DC” compares all files (file ID and MD5 sum) it has in the Pre-existing folder with the files from an upstream partner. If a match is found, it copies the file from the Pre-existing folder to the original location. If they don’t match, it pulls the file from the upstream partner.
6. Event ID 13553 is logged.
7. FRS notifies the Netlogon service (SysvolReady registry value = 1) that SYSVOL is ready and can be shared.
8. The Netlogon service shares SYSVOL and Netlogon.
9. Event ID 13516 is logged (finished).
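The comparison against the Pre-existing folder in the overview above can be sketched in Python as follows. This is a simplification: it matches files by name and MD5 sum only and ignores the file ID, and the function names and sample data are invented for the example.

```python
import hashlib


def md5_of(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def plan_restore(pre_existing, upstream):
    """Decide, per file, whether the local copy can be reused.

    pre_existing: file name -> bytes found in the Pre-existing folder
    upstream:     file name -> MD5 hex digest reported by the upstream partner
    """
    plan = {}
    for name, upstream_md5 in upstream.items():
        local = pre_existing.get(name)
        if local is not None and md5_of(local) == upstream_md5:
            plan[name] = "reuse-local"      # move back from Pre-existing
        else:
            plan[name] = "fetch-upstream"   # pull over the network
    return plan


pre_existing = {"logon.bat": b"net use H: \\\\fs1\\home\n", "gpt.ini": b"Version=3\n"}
upstream = {
    "logon.bat": md5_of(b"net use H: \\\\fs1\\home\n"),  # identical content
    "gpt.ini": md5_of(b"Version=4\n"),                   # changed upstream
    "new.adm": md5_of(b"CLASS MACHINE\n"),               # missing locally
}
print(plan_restore(pre_existing, upstream))
```

The point of the comparison is that unchanged files never cross the network, which is why a D2 restore on a mostly healthy SYSVOL finishes much faster than a full copy would.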

When you have verified that SYSVOL is shared and in sync, you can delete the content in the Pre-Existing folder to free up space.
An authoritative restore is a last-resort way to re-initialize SYSVOL. Normally we don’t prefer this method unless the situation is critical.
1) Normally for an authoritative restore you stop the NTFRS service on all DCs.
2) Set BurFlags to D4 on a known good Sysvol (or at this time restore Sysvol data from backup, then set BurFlags to D4), then start NTFRS on this server. You may want to rename the old folders with .old extensions prior to restoring good data.
3) Clean up the folders on all the remaining servers (Policies, Scripts, etc.) by renaming them with .old extensions.
4) Set BurFlags to D2 on all remaining servers and start NTFRS.
5) Wait for FRS to replicate.
6) Clean up the .old stuff if things look good.
FRS Protocol
Service    UDP    TCP
LDAP       -      389
RPC        -      Dynamic
By default, FRS replication over remote procedure calls (RPCs) occurs dynamically over an available port by using RPC Endpoint Mapper (also known as Remote Procedure Call Server Service or RPCSS) on port 135; the process is the same for Active Directory replication.

FRS Tools

The following tools are associated with File Replication service (FRS).

FRSDiag.exe: FRS Diagnostics
Ntfrsutl.exe: File Replication Utility
Sonar.exe: Sonar
Topchk.cmd: DFS and SYSVOL Replication Topology Analysis Tool
Ultrasound.exe: Ultrasound
Replmon
FRS to DFS-R Migration
Why Migrate?
Access-based enumeration: Access-based enumeration allows users to see only files and folders on a file server to which they have permission to access. This feature is not enabled by default for namespaces (though it is enabled by default on newly-created shared folders in Windows Server 2008), and is only supported in a DFS namespace when the namespace is a standalone namespace hosted on a computer running Windows Server 2008, or a domain-based namespace by using the Windows Server 2008 mode.
Improved command-line tools: DFS Namespaces in Windows Server 2008 includes an updated version of the Dfsutil command and the new Dfsdiag command, which you can use to diagnose namespace issues.
Content Freshness: DFS Replication in Windows Server 2008 has a new feature called Content Freshness, which prevents a server that was offline for a long time from over-writing fresh data when it comes back online with stale (out-of-date) data.
Propagation report: DFS Management in Windows Server 2008 includes a new type of diagnostic report called a propagation report. This report displays the replication progress for the test file created during a propagation test.
Replicate now:  DFS Management now includes the ability to force replication to occur immediately, temporarily ignoring the replication schedule.
SYSVOL replication using DFS Replication: DFS Replication replaces the File Replication Service (FRS) as the replication engine for replicating the AD DS SYSVOL folder in domains that use the Windows Server 2008 domain functional level.
To determine whether DFSR or FRS is being used
To determine whether DFSR or FRS is being used on a domain controller that is running Windows Server 2008, check the value of the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\DFSR\Parameters\SysVols\Migrating Sysvols\LocalState registry subkey. If this registry subkey exists and its value is set to 3 (ELIMINATED), DFSR is being used. If the subkey does not exist, or if it has a different value, FRS is being used.
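The check described above is easy to script. The Python sketch below separates the decision (a pure function) from reading the registry. It assumes that LocalState can be read as a DWORD value named LocalState under the "Migrating Sysvols" key; the helper uses Python's winreg module, which exists only on Windows, and returns None everywhere else.

```python
from typing import Optional

KEY_PATH = r"System\CurrentControlSet\Services\DFSR\Parameters\SysVols\Migrating Sysvols"


def sysvol_replication_engine(local_state: Optional[int]) -> str:
    """Apply the rule: LocalState == 3 (ELIMINATED) means DFSR, otherwise FRS."""
    return "DFSR" if local_state == 3 else "FRS"


def read_local_state() -> Optional[int]:
    """Read LocalState from the registry, or return None if it is unavailable."""
    try:
        import winreg  # only available on Windows
    except ImportError:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, "LocalState")
            return value
    except OSError:
        return None  # key or value does not exist


for state in (None, 0, 2, 3):
    print(state, sysvol_replication_engine(state))
```

On a domain controller the two pieces would be combined as sysvol_replication_engine(read_local_state()).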
Migration to DFS-R therefore consists of four stages or states:
0 (start). The default state of a domain controller. Only FRS is used to replicate SYSVOL.
1 (prepared). A copy of SYSVOL is created in a folder called SYSVOL_DFSR and is added to a replication set. DFS-R begins to replicate the contents of the SYSVOL_DFSR folders on all domain controllers. However, FRS continues to replicate the original SYSVOL folders and clients continue to use SYSVOL.
2 (redirected). The SYSVOL share is redirected to SYSVOL_DFSR for client use. SYSVOL is still replicated by FRS for failback.
3 (eliminated). Replication of the old SYSVOL folder by FRS is stopped. The original SYSVOL folder is not deleted. Therefore, if you want to remove it entirely, you must do so manually.
You move the DCs through these stages or states by using the dfsrmig command.
You will use three options with dfsrmig.exe:
setglobalstate state
The setglobalstate option configures the current global DFSR migration state, which applies to all domain controllers. The state is specified by the state parameter, which is 0–3. Each domain controller will be notified of the new DFSR migration state and will migrate to that state automatically.
getglobalstate
The getglobalstate option reports the current global DFSR migration state.
getmigrationstate
The getmigrationstate option reports the current migration state of each domain controller. Because it might take time for domain controllers to be notified of the new global DFSR migration state, and because it might take even more time for a domain controller to make the changes required by that state, domain controllers will not be synchronized with the global state instantly. The getmigrationstate option enables you to monitor the progress of domain controllers toward the current global DFSR migration state.
If there is a problem moving from one state to the next higher state, you can revert to previous states by using the setglobalstate option. However, after you have used the setglobalstate option to specify state 3 (eliminated), you cannot revert to the earlier states.
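The rollback rule can be written down as a tiny check. The Python sketch below models only what is stated here: any of the four states can be requested until state 3 (eliminated) has been set, after which there is no way back. The function name is invented, and the real tool may apply further checks that this sketch does not know about.

```python
STATE_NAMES = {0: "start", 1: "prepared", 2: "redirected", 3: "eliminated"}


def can_set_global_state(current: int, target: int) -> bool:
    """Return True if moving the global state from current to target is allowed."""
    if current not in STATE_NAMES or target not in STATE_NAMES:
        raise ValueError("global state must be 0, 1, 2 or 3")
    if current == 3:
        return target == 3   # eliminated is final: no rollback
    return True              # moving forward or reverting is allowed


print(can_set_global_state(2, 1))  # revert from redirected to prepared
print(can_set_global_state(3, 2))  # attempt to leave eliminated
```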
To migrate SYSVOL replication from FRS to DFS-R, perform the following steps:
1. Open the Active Directory Domains and Trusts snap-in.
2. Right-click the domain and choose Raise Domain Functional Level.
3. If the Current domain functional level box does not indicate Windows Server 2008, select Windows Server 2008 or Windows Server 2008 R2 from the Select an available domain functional level list.
4. Click Raise. Click OK twice in response to the dialog boxes that appear.
5. Log on to a domain controller and open a command prompt.
6. Type dfsrmig /setglobalstate 1.
7. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the prepared global state. Repeat this step until the state has been attained by all domain controllers.
This can take 15 minutes to an hour or longer.
8. Type dfsrmig /setglobalstate 2.
9. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the redirected global state. Repeat this step until the state has been attained by all domain controllers.
This can take 15 minutes to an hour or longer.
10. Type dfsrmig /setglobalstate 3.
After you begin migration from state 2 (redirected) to state 3 (eliminated), any changes made to the SYSVOL folder will have to be replicated manually to the SYSVOL_DFSR folder.
11. Type dfsrmig /getmigrationstate to query the progress of domain controllers toward the eliminated global state. Repeat this step until the state has been attained by all domain controllers.
This can take 15 minutes to an hour or longer.
12. For more information about the dfsrmig.exe command, type dfsrmig.exe /?.
Migration States
Stable States: There are four migration states which are defined as ‘Stable Migration States’ as alluded to above. During the process of migration, the administrator uses the migration tool (dfsrmig.exe) to set a migration directive in Active Directory. This directive essentially sets a domain wide migration state (also called global migration state) in Active Directory. This global migration state can be any one of the four stable migration states shown in the table below.
Stable Migration States
 0         ‘START’ state
 1         ‘PREPARED’ state
 2         ‘REDIRECTED’ state
 3         ‘ELIMINATED’ state 
Transition States: During migration, each domain controller takes appropriate actions locally so that it can attain the migration stable state which has been selected for the domain by the administrator. This operation causes the domain controller to cycle through intermediate states called ‘Transition States’. These transition states are shown in the table below.
Transition States
 4         ‘PREPARING’ state
 5         ‘WAITING FOR INITIAL SYNC’ state
 6         ‘REDIRECTING’ state
 7         ‘ELIMINATING’ state
 8         ‘UNDO REDIRECTING’ state
 9         ‘UNDO PREPARING’ state
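For scripts that need to interpret these numbers, the two tables map naturally onto an enumeration. The Python sketch below copies the numbers and names from the tables above; the class name and the is_stable helper are my own additions for illustration.

```python
from enum import IntEnum


class MigrationState(IntEnum):
    # Stable states
    START = 0
    PREPARED = 1
    REDIRECTED = 2
    ELIMINATED = 3
    # Transition states
    PREPARING = 4
    WAITING_FOR_INITIAL_SYNC = 5
    REDIRECTING = 6
    ELIMINATING = 7
    UNDO_REDIRECTING = 8
    UNDO_PREPARING = 9

    @property
    def is_stable(self) -> bool:
        """Stable states are 0 to 3; everything above is a transition state."""
        return self <= MigrationState.ELIMINATED


for value in (1, 5, 9):
    state = MigrationState(value)
    print(value, state.name, "stable" if state.is_stable else "transition")
```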
