IBM Support

How a Sametime server establishes a community connection with other Sametime servers

Technote (FAQ)


Question

This document describes how a Lotus® Sametime® server establishes community connections with other Sametime servers. The document discusses resolving firewall issues and resolving network address translation (NAT) conflicts.

Answer

When there is a lack of awareness between servers, the sametime.log file on either server typically contains E Server entries such as "Failed to connect to server" with a reason code, or "has wrong net address".

Successful connections display as:
I Server 13/May, 07:01:30 Logged in to server 1.1.1.30




How a Sametime server joins a Sametime community

NOTE: Sametime servers should have Manager access to their own Domino Directory (names.nsf). Without this access, the server cannot properly read the Server documents to build its list of Sametime servers.

1. A Sametime server starts and checks for a Yes value in the "Is this a Sametime Server" field on the Basics tab of all Server documents located in its Domino Directory (names.nsf) and in any other names.nsf files resolved via Directory Assistance. Information gathered in this step is recorded in the CommunityConfig.txt file on each server. This file is the output listing the trusted Sametime server IPs for connections. When another Sametime server contacts a particular Sametime server, that server checks its trusted IPs to determine whether the IP requesting the connection is included as a trusted IP.

Note: The trusted list is stored in memory and is not read from community-config.txt, but this file shows you what this particular server trusts.

Errors such as "Rejecting server <IP Address> not in servers list" indicate that the connecting server was not trusted or recognized under the IP it connected from (see the NAT section below).

2. The stmux service of the Sametime server determines its own IP address via DNS or HOSTS file resolution and then does a lookup on the fully qualified Internet host names (Net Address values) of the other resolved Sametime servers. A comparison is done to determine which IP address the stmux services of the other Sametime servers should be loaded on.

NOTE: The Sametime server name can be set manually in the sametime.ini file.
The following entry under the [config] section of sametime.ini sets the Sametime server name:
SametimeCluster=CN=Server1/O=Acme

The configured name can be seen in the clusterName line of community-config.txt:
DATE TIME Reseting Trusted IPs
DATE TIME security level = 25
DATE TIME Server configured to listen on all ips of the machine.
DATE TIME Local ip =
DATE TIME Bind ip =
DATE TIME Local port = 1516
DATE TIME Local SSL port = 0
DATE TIME clusterName = cn=SametimeCluster/o=Acme
DATE TIME Local ips = 127.0.0.1

3. Based on the last octet of the IP addresses, the Sametime server decides which server should contact which server. A server initiates the connection to every server whose last octet is lower than its own, and waits to be contacted by every server whose last octet is higher than its own. In other words, lower octets listen for higher octets to call.

Without this algorithm in place, you would either have servers with multiple connections to and from each other or you would have servers that do not connect at all.
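The rule in step 3 can be illustrated with a short Python sketch. This is only a model of the behavior described above, not the product's implementation. The function names are hypothetical, and the handling of equal last octets ("tie") is an assumption, because this document does not describe that case.

```python
def last_octet(ip):
    """Return the final octet of a dotted IPv4 address as an integer."""
    return int(ip.strip().split(".")[-1])


def plan_connections(local_ip, peer_ips):
    """Map each peer IP to the role the local server takes toward it.

    'call'   = the local server initiates (its last octet is higher)
    'listen' = the local server waits to be contacted (its last octet is lower)
    'tie'    = equal last octets; not covered by this document (assumption)
    """
    local = last_octet(local_ip)
    plan = {}
    for peer in peer_ips:
        remote = last_octet(peer)
        if local > remote:
            plan[peer] = "call"
        elif local < remote:
            plan[peer] = "listen"
        else:
            plan[peer] = "tie"
    return plan


if __name__ == "__main__":
    # Illustrative addresses only.
    print(plan_connections("1.1.1.30", ["6.6.6.25", "1.1.1.40"]))
```

Because each pair of servers compares the same two octets, exactly one side of every pair calls and the other listens, provided both sides resolve each other to the same addresses.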



Firewall considerations


Server-to-server community connections always use port 1516, both internally and through firewalls. When connecting through a firewall, port 1516 traffic must be allowed in both directions. Typically, administrators do not allow a firewall to pass the initial handshake from external to internal. This initial call can be controlled or restricted once you determine which server is to initiate the call: configure the external server to have a lower resolved octet, so that the internal server initiates. Optionally, the firewall can be configured with a rule to allow the handshake over port 1516 specifically.



NAT conflicts

Network address translation (NAT) conflicts occur when a server's internal IP and its resolved NAT IP lead the two servers to opposite conclusions, so that both servers wait for each other or both call each other.

The following diagram demonstrates areas where NATting conflicts can occur and how NATting should be done to resolve the servers.

Note: The direction of the arrows shows how each server is resolving via DNS or HOSTS file lookup to the other server's IP, and is not a representation of Sametime community traffic.

Example

Diagram 1 below

The Sametime01.test.com server starts and determines that its IP is .30. The stmux service will load on this IP. When Sametime01 does a lookup of Sametime2.abc.com, it determines that Sametime2 is using .25. In this case, .30 is greater than .25; therefore, Sametime01 will initiate a call to Sametime2.

Sametime2.abc.com loads and determines its IP is .25 but Sametime01 is .2.
Since .25 is greater than .2, Sametime2 will initiate the call to Sametime01.
A conflict occurs as both servers are calling each other.
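The conflict in this example can be reproduced with a small Python sketch that evaluates the last-octet rule from each server's own point of view. It is a model of the reasoning above, not Sametime code. The function names are hypothetical, and the full addresses (1.1.1.30, 6.6.6.25, 2.2.2.2) are illustrative values borrowed from other examples in this document; only the last octets matter here.

```python
def role(own_ip, peer_ip_as_seen):
    """Role a server takes toward a peer, based on the last octets it sees."""
    own = int(own_ip.split(".")[-1])
    peer = int(peer_ip_as_seen.split(".")[-1])
    if own > peer:
        return "call"
    if own < peer:
        return "listen"
    return "tie"  # equal octets are not covered by this document (assumption)


def diagnose(role_a, role_b):
    """Combine the two servers' roles into a verdict for the pair."""
    if {role_a, role_b} == {"call", "listen"}:
        return "ok"
    if role_a == role_b == "call":
        return "conflict: both servers call"
    if role_a == role_b == "listen":
        return "conflict: both servers wait"
    return "undetermined"


if __name__ == "__main__":
    # Sametime01 binds to .30 and resolves Sametime2 to .25.
    sametime01 = role("1.1.1.30", "6.6.6.25")
    # Sametime2 binds to .25 but resolves Sametime01 to its NAT address .2.
    sametime2 = role("6.6.6.25", "2.2.2.2")
    print(sametime01, sametime2, diagnose(sametime01, sametime2))
```

If the NAT address of Sametime01 instead ended in .30 (for example 2.2.2.30), Sametime2 would see a higher octet and listen, and the pair would resolve cleanly.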


NOTE: An easy method to configure NAT rules is to have the internal IP and the NAT IP share a common last octet (for example, 1.1.1.30 and 2.2.2.30). This provides an easy reference to determine which server is communicating over the network. This is not a requirement if NAT is properly configured.


8.0.2 NAT special considerations for servers that have community-to-community awareness via NAT IPs
Sametime 8.0.2 changed the way IPs are compared. In 8.0.2 the comparison runs from left to right, whereas previous versions compared from right to left.
It is important to understand these calculations when upgrading or mixing versions of Sametime servers.

The following discusses a case where 8.0.2 servers fail to connect, and the reason why.
In mixed server environments it is important to understand your NAT mapping configuration to avoid conflicts.

Explanation of the problem
The algorithm of connection initiation for each server is as follows:
Each server iterates through the list of all known servers and initiates a connection to the servers with a "lesser" IP value than its own.


Examples below indicate a working then failing connection using the same IPs

Sample network layout:
ST1: the Sametime Mux binds on this server to 100.20.20.20
STcluster1: the Sametime Mux binds on this server to 10.2.50.80
Due to NAT, however, ST1 sees STcluster1 (via ping) as its NAT IP, 100.20.20.100


Working example
For pre-8.0.2 servers the IP addresses are compared from right to left, so the servers connect in the following manner:
STcluster1: .80 > .20, so STcluster1 calls ST1.
ST1: .20 < .100 (NAT), so ST1 listens.
Both servers agree that ST1's IP is less than STcluster1's IP, so STcluster1 calls ST1, which is listening for the connection.


Failing example due to NAT IP calculation
For 8.0.2 servers the IP addresses are compared from left to right, so the servers connect in the following manner:
STcluster1: 10 < 100, so STcluster1 waits for ST1.
ST1: 100.20.20.20 < 100.20.20.100 (NAT), so ST1 listens.
This connection fails because each server thinks the other should initiate the connection, and both wait for the other to call.
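The two examples can be compared side by side with a short Python sketch. It assumes that "right to left" means comparing the octets starting from the last one, and "left to right" means comparing them starting from the first one; the document does not spell out the exact comparison, so treat this as an approximation. All function names are hypothetical.

```python
def octets(ip):
    """Split a dotted IPv4 address into a tuple of integers."""
    return tuple(int(part) for part in ip.split("."))


def role(own_key, peer_key):
    """The server with the greater key calls; the lesser one listens."""
    if own_key > peer_key:
        return "call"
    if own_key < peer_key:
        return "listen"
    return "tie"


def role_pre_802(own_ip, peer_ip):
    """Assumed pre-8.0.2 behavior: compare octets from right to left."""
    return role(octets(own_ip)[::-1], octets(peer_ip)[::-1])


def role_802(own_ip, peer_ip):
    """Assumed 8.0.2 behavior: compare octets from left to right."""
    return role(octets(own_ip), octets(peer_ip))


if __name__ == "__main__":
    # (own bound IP, IP under which this server sees its peer)
    views = {
        "STcluster1": ("10.2.50.80", "100.20.20.20"),
        "ST1": ("100.20.20.20", "100.20.20.100"),
    }
    for name, (own, peer) in views.items():
        print(name, "pre-8.0.2:", role_pre_802(own, peer), "8.0.2:", role_802(own, peer))
```

Under the pre-8.0.2 ordering one server calls and the other listens. Under the 8.0.2 ordering both servers listen, which matches the failure described above.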








"Rejecting server <IP Address> not in servers list" errors

A common error in sametime.log when the connection fails is "Rejecting server <IP Address> not in servers list".

In NAT configurations this can occur when an unexpected IP is found.
Using the example above.

Sametime2 resolves Sametime1 and associates it with the IP 2.2.2.2. This shows in the community-config.txt for Sametime2.

If Sametime1 contacts Sametime2 using its internal IP address of 1.1.1.30, this IP is not trusted by Sametime2, so the connection will fail.

To resolve this, the Server document for Sametime1 should use the expected IP address. Commonly this is done by editing the Server document for Sametime1 in Sametime2's names.nsf and changing Net Address to the NAT value. This change should not be allowed to replicate out, so you will need to prevent it from replicating to other servers.

When Sametime2 loads, it will read in Sametime1's Server document and build the list using this NAT value.
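The trust check described in this section can be modeled with a few lines of Python. This is a sketch of the logic only: the function names and the dictionary that stands in for the resolved Server documents are hypothetical, and the rejection text simply mirrors the sametime.log message quoted above.

```python
def build_trusted_ips(resolved_servers):
    """Collect trusted IPs from a mapping of server name -> resolved IP.

    The mapping stands in for the Server documents read at startup
    (an assumption made for illustration).
    """
    return {ip for ip in resolved_servers.values() if ip}


def check_incoming(source_ip, trusted_ips):
    """Return None if the caller is trusted, else a log-style rejection."""
    if source_ip in trusted_ips:
        return None
    return f"Rejecting server {source_ip} not in servers list"


if __name__ == "__main__":
    # Sametime2 resolved Sametime1 to 2.2.2.2.
    trusted = build_trusted_ips({"Sametime1": "2.2.2.2"})
    print(check_incoming("2.2.2.2", trusted))   # accepted
    print(check_incoming("1.1.1.30", trusted))  # rejected
```

The sketch shows why the fix works from either side: the connection is accepted as soon as the IP the caller connects from and the IP recorded for it in the listener's directory are the same value.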




Troubleshooting

General Tips

1. Determine the true behavior of the issue. For example:

  • Were the servers once working and now have stopped? This often indicates a network or configuration change or an upgrade to software.
  • Is the issue intermittent? This often indicates a network-related issue, or a failure of a specific Sametime service.
  • Is this a new install where the servers have never worked? This often indicates either a network or configuration issue.

2. Understand how DNS is resolving to each server.

HOSTS files should be used to bypass any DNS table issues regardless of whether or not pings return the correct hostname. This would be for troubleshooting only and can be removed once awareness is established.

3. Directory Assistance should be temporarily disabled to confine lookup within the local names.nsf files only, as conflicting information can be found in secondary address books.

4. Troubleshooting steps (telnets, pings, nslookups, ipconfigs, netstats) should be performed on each server. It is always best to isolate two servers and resolve these first before continuing on to other servers having similar issues.

5. Draw a diagram of the connections between the servers to help map the information out for you and for Support if a PMR needs to be opened.
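Tips 2 and 4 can be partly scripted. The following Python sketch, which uses only the standard library, resolves a host name the way the operating system does (DNS or HOSTS file) and then attempts a TCP connection to port 1516. The function names are hypothetical, and a successful TCP connection only shows that the port is reachable; it does not prove that the Sametime handshake succeeds.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the OS resolves for hostname, or None."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


def port_open(host, port=1516, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python check_peer.py <fully qualified host name of the other server>
    for name in sys.argv[1:]:
        ip = resolve(name)
        reachable = port_open(ip) if ip else False
        print(f"{name}: resolves to {ip}, port 1516 reachable: {reachable}")
```

Run the script on each server against the other server's host name and note the resolved IP on both sides, since that is the address each server uses in the octet comparison.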

Some basic troubleshooting: (done on the servers)

1. Disable stsecurity or review information on vp_security_level settings and how they can affect server and client awareness.

2. Check the sametime.log file on both servers for any errors thrown during server startup and search Support Technotes about those errors.

3. Run ipconfig /all on each server and make a note of all available IPs. Compare these to the sametime.log file to confirm that the Stmux.exe service is loading on the correct IP address. This can be on port 1516, or on port 80 for a tunneled connection.

    Excerpt from a tunneled connection:
    I Mux 14/May/05, 07:08:03 Received Http server hostname update: hostname=Sametime01.test.com, ip=1.1.1.30
    I Mux 14/May/05, 07:08:03 Received Http server port update: port=8088
    I Mux 14/May/05, 07:08:03 Received meeting CBR port update: port=8081
    I Mux 14/May/05, 07:08:03 Received broadcast CBR port update: port=554
    I Mux 14/May/05, 07:08:04 Mux configured to listen on ports 1533 443 on all ips of the of machine,listenerType=VP
    I Mux 14/May/05, 07:08:04 Mux configured to listen on ports 80 on all ips of the of machine,listenerType=VP
    I Mux 14/May/05, 07:08:04 CBR configuration update:
    I Mux 14/May/05, 07:08:04 CBR entry: url=/meetingcbr, serverIp=1.1.1.30, port=8081
    I Mux 14/May/05, 07:08:04 CBR entry: url=/broadcastcbr, serverIp=1.1.1.30, port=554
4. Ping the IP for each of the servers and confirm that they answer successfully. Next ping the Fully Qualified Internet Hostname of both servers and make a note of the IP address returned.

5. Run a telnet test to the fully qualified host name of the other server over port 1516 and verify that it answers (telnet fqhn.otherserver.com 1516).

6. Open the CommunityConfig.txt file on both servers and search to verify that the servers are listed here. If the other servers do not appear in this file, this is typically due to a names.nsf lookup issue.
    Example CommunityConfig.txt:
    14/May/05, 07:07:46 security level = 25
    14/May/05, 07:07:46 Server configured to listen on all ips of the machine.
    14/May/05, 07:07:46 Local ip = 1.1.1.30
    14/May/05, 07:07:46 Bind ip = 1.1.1.30
    14/May/05, 07:07:46 Local port = 1516
    14/May/05, 07:07:46 Local SSL port = 0
    14/May/05, 07:07:46 clusterName = cn=Sametime01/test
    14/May/05, 07:07:46 Local ips = 1.1.1.30 127.0.0.1
    14/May/05, 07:07:46 Trusted ips =
    14/May/05, 07:07:46 Allowed Login Types =
    14/May/05, 07:07:48 Adding server clusterName=CN=aaa/O=alt, hostName=, ip=0.0.0.0
    14/May/05, 07:07:48 Adding server clusterName=CN=Sametime01/O=test, hostName=sametime01.test.com, ip=1.1.1.30
    14/May/05, 07:07:48 Adding server clusterName=CN=Sametime02/O=abc, hostName=sametime02.abc.com, ip=6.6.6.25
    14/May/05, 07:07:48 Adding Trusted Ip=
7. Run netstat -an on each server and confirm that the mux is listening on the correct IP and port. Alternatively, a third-party tool can be used to display this information. You should typically see 1516 listed in the local IP and port column, and any established connections to other servers in the remote IP and port column, depending on the direction in which the connection was made.
    Example of successful connection in netstat -an
    TCP 1.1.1.30:1516 1.1.1.30:3189 ESTABLISHED
    TCP 1.1.1.30:1516 1.1.1.30:3195 ESTABLISHED
    TCP 1.1.1.30:3189 1.1.1.30:1516 ESTABLISHED
    TCP 1.1.1.30:3195 1.1.1.30:1516 ESTABLISHED
    TCP 1.1.1.30:3259 127.0.0.1:1516 ESTABLISHED
    TCP 1.1.1.30:3384 6.6.6.25:1516 ESTABLISHED (.30 called to .25 and made a successful connection)
8. If servers are using multiple network cards (NIC) or multiple IP addresses, consider disabling these to test under a single NIC single IP environment. Load both servers and rerun the tests above.
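The check in step 6 can be automated with a short Python sketch that extracts the "Adding server" entries from CommunityConfig.txt and flags entries without a host name or with an IP of 0.0.0.0. The regular expression is based only on the example lines shown in step 6 and is an assumption about the file format; other versions may log these lines differently. The function names are hypothetical.

```python
import re

# Pattern derived from the example lines in this document (assumption).
ADDING_SERVER = re.compile(
    r"Adding server clusterName=(?P<cluster>.*?), "
    r"hostName=(?P<host>[^,]*), ip=(?P<ip>\S+)"
)

SAMPLE = """\
14/May/05, 07:07:48 Adding server clusterName=CN=aaa/O=alt, hostName=, ip=0.0.0.0
14/May/05, 07:07:48 Adding server clusterName=CN=Sametime01/O=test, hostName=sametime01.test.com, ip=1.1.1.30
14/May/05, 07:07:48 Adding server clusterName=CN=Sametime02/O=abc, hostName=sametime02.abc.com, ip=6.6.6.25
"""


def parse_servers(text):
    """Return one dict (cluster, host, ip) per 'Adding server' line."""
    servers = []
    for line in text.splitlines():
        match = ADDING_SERVER.search(line)
        if match:
            servers.append(match.groupdict())
    return servers


def unresolved(servers):
    """Names of servers with no host name or an IP of 0.0.0.0."""
    return [s["cluster"] for s in servers
            if not s["host"] or s["ip"] == "0.0.0.0"]


if __name__ == "__main__":
    entries = parse_servers(SAMPLE)
    for entry in entries:
        print(entry["cluster"], entry["host"] or "(none)", entry["ip"])
    print("Unresolved:", unresolved(entries))
```

To use it against a real file, read the file contents into a string and pass that string to parse_servers in place of SAMPLE. Comparing the resulting list on both servers shows quickly whether each server knows the other, and under which IP.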

Related information

A simplified Chinese translation is available

Document information

More support for: IBM Sametime
Install/Configuration

Software version: 8.0, 8.0.1, 8.0.2, 8.5, 8.5.1, 8.5.1.1, 8.5.1.2, 8.5.2, 8.5.2.1, 9.0, 9.0.0.1

Operating system(s): AIX, IBM i, Solaris, Windows

Reference #: 1206752

Modified date: 09 April 2014