While working with a client, we ran into two front-end servers and an edge server that would not replicate. The CMS was hosted in another country with plenty of firewalls in between, which definitely complicated troubleshooting. However, the root cause turned out to be neither the network nor a firewall.
We started by tackling the front-end servers. Two of the four front-end servers in a pool were showing as out of date in the topology. From the server hosting the CMS, I verified that I could ping each front-end server by name and, to my surprise, I could also telnet to them on port 445.
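The telnet test on port 445 can also be scripted when there are several servers to check. The following is only a minimal sketch in Python, not what we ran on site; the function names `can_connect` and `check_servers` are made up for illustration, and any server names passed in are placeholders.

```python
import socket


def can_connect(host: str, port: int = 445, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake,
        # which is roughly what "telnet host 445" confirms.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_servers(hosts, port: int = 445, timeout: float = 3.0) -> dict:
    """Map each host name (hypothetical helper) to its reachability on the port."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling `check_servers(["fe01.example.local", "fe02.example.local"])` with the real pool members in place of these placeholder names would return a dictionary of True/False results. A True result only shows that the port accepts connections; it says nothing about whether the replication share can actually be accessed.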
Now that we knew the path used to connect was valid and the server was listening for the connection, I started a packet capture using NetMon and ran the Invoke-CsManagementStoreReplication cmdlet to kick off replication. I captured data for 30 seconds and applied a display filter so I could look at the SMB traffic first.
“Access Denied” errors were all over the capture. I tried to view the properties of the RTCReplicaRoot folder (by default this folder is at the root of the drive Lync is installed on), but didn’t have permission. Although this looks like an error, it is actually expected behavior, and it is best not to modify permissions on this directory. I discussed the build process with the client and determined that a security script meant to tighten NTFS permissions had inadvertently broken CMS replication for those two servers. Instead of trying to fix the NTFS issue and risking other problems, we removed the boxes from the topology, reinstalled the OS, and added them back as Lync servers after a clean rebuild without the security script.
By Kevin Peters