Q2533 Tuning Windows NT's TCP/IP suite to suit Citect

Summary:

For the most part, TCP/IP on Windows NT is tuned to suit a desktop operating system or application server. Most of its typical network load comprises large, file-based blocks of data. The associated network architecture is usually fairly simple, and it is not often found in mission-critical situations. In contrast, Citect systems often exchange large amounts of data using many small network packets. Networked Citect often requires complex switched topologies, with redundancy on top of that. Even so, NT has desirable qualities for use with Citect: it is familiar, reliable and so on. How can I get the best out of TCP/IP on NT, and so be assured of optimal performance when it is used in Citect solutions?

Solution:

This subject never used to be an issue. Everybody used NetBEUI or IPX with their Citect systems, and those protocols are largely self-tuning or not tunable at all. In the meantime, the Internet has taken off, WAN requirements are up and Microsoft has done a fair bit of work on its TCP/IP suite to improve its performance. So much so that we now usually recommend TCP/IP, because it seems to be the fastest transport protocol available at present. Even though Citect's networking offers extensive tuning possibilities, it still depends on the networking services provided by NT. As a result, getting the best out of Citect on NT requires some knowledge of how NT operates.

Piggybacking Acknowledgments

In order to guarantee delivery of network traffic, TCP requires message acknowledgments to confirm the receipt of data. To kill two birds with one stone (and to cut down on network traffic), TCP will often wait for further traffic to a given destination so that it can include the ack for a previous transmission in a subsequent message. More information on this subject can be found in Internet RFC 1122.

The situation is as follows. A Citect client sends a data request to a Citect I/O server. The I/O server responds with the data and includes the acknowledgment of the request in the reply. The client has nothing further to ask for at that time, but NT's TCP holds the acknowledgment (of receipt of the data) for a short while in case any more traffic to the I/O server is forthcoming. This short while (100ms) is shorter than the TCP timeout, so no re-transmission occurs; nevertheless the I/O server is delayed while it waits for the transaction to complete. Despite much pestering by CiT, Microsoft has decided against allowing tuning of this behaviour. It is here to stay, and the timeout cannot even be shortened.

To allow some latitude, CiT has included in Citect a means of forcing an acknowledgment. This is accomplished by sending a dummy packet (called an 'oink' in deference to the original piggyback) to which NT can attach its acknowledgment. Using a network protocol analyser or sniffer it is possible to see these oinks as they are transmitted. They do cause extra network traffic; however, the feature can be turned off, and for optimal performance an extra packet (the pure ack) would have had to be transmitted separately anyway. In tests this feature has been found to improve network performance in some circumstances but by no means all; in some cases throughput is actually degraded.

In the latest version of Citect this method of 'killing' piggyback acks is turned on by default. If you are seriously concerned about your network performance, or just want to experiment to get better throughput, try turning this setting off using [Lan]KillPiggyBackAck=0 in the Citect.ini file.

Nagling

Windows NT TCP/IP implements the Nagle algorithm described in Internet RFC 896. When an application does two successive sends, each smaller than a full TCP segment (1460 bytes of data on Ethernet, that is, the 1500-byte MTU less 40 bytes of IP and TCP headers), the second send is delayed until an ACK for the first is received from the remote host. The delay occurs in case the application does another small send. TCP can then coalesce the small sends into one larger packet. This practice of collecting small sends into larger packets is called Nagling.

Unfortunately, it is not possible to change this setting from the registry, and because Citect networking does not use Windows sockets, Citect's developers do not have the option of setting the TCP_NODELAY socket option. This option is available to Citect device drivers which use sockets to communicate with I/O devices; it tells TCP/IP to send always, regardless of packet size. The result is sub-optimal use of the physical network, in that many small packets are transmitted where perhaps a larger one would suffice, but it avoids the delay of waiting for an ACK.
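As an illustration of the TCP_NODELAY option mentioned above, the following is a minimal sketch in Python rather than the C a Citect device driver would normally be written in. The function name make_nodelay_socket is hypothetical; the socket calls are the standard ones, which map onto the same setsockopt option in Windows sockets.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 asks the stack to send small segments immediately
    # instead of holding them until the outstanding ACK arrives.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

A driver would then connect and send on the returned socket as usual. Reading the option back with getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) gives a non-zero value when Nagling is off.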

Redundancy and LanA

New large installations of Citect are increasingly demanding. An increasing number of customers require complete LAN redundancy, so that if one LAN fails Citect can keep going on the other. In this case each NT machine must have two network interface cards (NICs) installed. This can be done in two ways, with either one subnet or two. In the case of a single subnet, the NICs in each NT machine are addressed so that both are on the same network (they share the same network address). The network does not need to route packets from one subnet to another; it simply requires a duplication of network hardware. Presumably the network will have switches using spanning tree to manage possible circular paths. With two subnets, each machine is defined as a multihomed computer. One of these machines can be designated as a router, or a dedicated router can be installed. The network still requires a duplication of network hardware (to provide proper redundancy), although the possibilities for circular paths or looping packets are reduced.

In either case, NT presents Citect with basically the same result: two LanA numbers, one for each card. Citect has always been able to swap LanA numbers as needed, and the same occurs here. As soon as a problem is detected on one LanA, Citect swaps to the other one, and then back and forth until a connection is made. In general, I feel that the dual subnet idea is conceptually purer and therefore probably the better way to go; however, it does not necessarily offer better performance and is probably more expensive (router costs). I have seen each solution work and work well. Suitability should be considered in each case before deciding upon one way or the other. Other factors may influence the decision, such as file server redundancy, ease of maintenance and so on. One thing I will mention, however, is that it is probably not worth the maintenance headache of having redundant paths to each of your redundant servers. Therefore I recommend you put just one NIC in each server. In the single subnet scenario, half your servers should be on one physical 'side' of your redundant LAN and the remainder on the other.

NT Networking in General

At this point it may be helpful to point out that Citect is completely beholden to NT as far as network resources are concerned. Citect applies to NT for network access, relies on NT to inform it of problems and waits while NT does its work. Therefore, even though Citect itself may be capable of very fast switchover in the event of a failure, it must wait for NT to tell it that the failure has occurred before taking action. This can make Citect appear slow, and is why you may need to tune NT rather than Citect for good network operation. The following information is the result of research conducted to better support large Citect deployments over vast and complex networks.

The TCP/IP protocol suite implementation for Windows NT 3.5x/4.0 reads all of its configuration data from the registry. This information is written to the registry by the Network Control Panel Applet (NCPA) as part of the setup process. Some of this information is also supplied by the Dynamic Host Configuration Protocol (DHCP) client service if it is enabled. This reference defines all of the registry parameters used to configure the protocol driver, TCPIP.SYS, which implements the standard TCP/IP network protocols.

The implementation of the protocol suite should perform properly and efficiently in most environments using only the configuration information gathered by the NCPA and DHCP. Optimal default values for all other configurable aspects of the protocols have been encoded into the drivers. There may be some circumstances in customer installations where changes to certain default values are appropriate. To handle these cases, optional registry parameters can be created to modify the default behaviour of some parts of the protocol drivers.

The parameters mentioned below are installed with default values by the NCPA during the installation of the TCP/IP components. They may be modified using the Registry Editor (regedt32.exe). WARNING - Modifying the registry of NT incorrectly may cause NT to behave unreliably or crash. As a result, it is possible you will be unable to change settings back to their original values. A complete reinstall of NT may be necessary to correct things. Therefore, take extra care when using the registry editor and when adding or changing parameters in the registry.

All of the TCP/IP parameters are registry values located under one of two different subkeys of HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services:

Tcpip\Parameters
NetBT\Parameters

In networked systems which feature a number of redundant components, catastrophic failure of certain switching hardware can cause NT to think a session is still okay when it most definitely is not. Under these circumstances, a Citect client may think it still has a healthy session to the I/O server when in fact the I/O server is turned off! The mechanism that detects such dead sessions is referred to as 'keepalive'.

There are two distinct keepalive concepts in NT. One is the TCP keepalive and the other is the NetBT keepalive. A TCP keepalive packet is simply an 'ack' with the sequence number set to one less than the current sequence number for the connection. A system receiving one of these acks will respond with an ack for the current sequence number. Keepalives can be used to verify that the computer at the remote end of a connection is still available. TCP keepalives can be sent once every KeepAliveTime (defaults to 7,200,000 milliseconds or two hours), if no other data or higher level keepalives have been carried over the TCP connection. If there is no response to a keepalive, it is repeated once every KeepAliveInterval milliseconds. KeepAliveInterval defaults to 1,000 milliseconds (one second).

KeepAliveInterval
Key: Tcpip\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 1 - 0xFFFFFFFF
Default: 1000 (one second)
Description: This parameter determines the interval between keep alive re-transmissions until a response is received. Once a response is received, the delay until the next keep alive transmission is again controlled by the value of KeepAliveTime. The connection will be aborted after the number of re-transmissions specified by TcpMaxDataRetransmissions have gone unanswered.

KeepAliveTime
Key: Tcpip\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 1 - 0xFFFFFFFF
Default: 7,200,000 (two hours)
Description: The parameter controls how often TCP attempts to verify that an idle connection is still intact by sending a keep alive packet. If the remote system is still reachable and functioning, it will acknowledge the keep alive transmission. Keep alive packets are not sent by default. This feature may be enabled on a connection by an application.
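To get a feel for how these two parameters combine, here is a rough sketch of the time an idle, dead connection could go undetected. It assumes (my simplification, not a statement of NT's exact timer behaviour) that the first keepalive goes out after KeepAliveTime and that each of the TcpMaxDataRetransmissions repeats then waits one full KeepAliveInterval. The function name is hypothetical.

```python
def keepalive_detection_ms(keep_alive_time_ms: int = 7_200_000,
                           keep_alive_interval_ms: int = 1_000,
                           max_data_retransmissions: int = 5) -> int:
    """Approximate time before TCP gives up on an idle, dead connection.

    Assumption: idle period of KeepAliveTime, followed by
    TcpMaxDataRetransmissions unanswered repeats spaced
    KeepAliveInterval apart.
    """
    return keep_alive_time_ms + keep_alive_interval_ms * max_data_retransmissions


# Defaults: the two-hour idle period dominates the estimate.
print(keepalive_detection_ms())        # 7205000 ms, just over two hours
# With KeepAliveTime lowered to 60 seconds the estimate drops to about 65 seconds.
print(keepalive_detection_ms(60_000))  # 65000 ms
```

The point of the arithmetic is that KeepAliveInterval and the retry count contribute only seconds; KeepAliveTime is the value that matters when tuning for faster detection.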

NetBT connections, such as those used by many Microsoft networking components, send NetBIOS keepalives more frequently, so normally no TCP keepalives will be sent on a NetBIOS connection. Since Citect uses NetBT, it is recommended that the SessionKeepAlive parameter be used instead of KeepAliveTime. As can be seen below, this parameter defaults to 1 hour, which is a long time for a client to wait before deciding that things have gone awry with its data connection. Therefore, try changing this to 90 seconds or so. Depending on available network bandwidth it can go even lower, but be aware that there is a trade-off between responsiveness and network utilisation. If you require a keepalive time of less than 60 seconds, use the TCP KeepAliveTime parameter instead.

SessionKeepAlive
Key: Netbt\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 60,000 - 0xFFFFFFFF
Default: 3,600,000 (1 hour)
Description: This value determines the time interval between keepalive transmissions on a session. Setting the value to 0xFFFFFFFF disables keepalives.
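Since the value is entered in milliseconds as a REG_DWORD, it is easy to mistype. The sketch below converts a keepalive time in seconds to the DWORD value and wraps it in a REGEDIT4-style text fragment. The helper name is hypothetical, and the use of a REGEDIT4 import file is my assumption for illustration; the same value can simply be typed into the Registry Editor (regedt32.exe). Check any generated file by eye before importing it.

```python
NETBT_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NetBT\Parameters"
MIN_MS = 60_000        # lower bound of the valid range quoted above
MAX_MS = 0xFFFFFFFF    # upper bound; this value disables keepalives


def session_keepalive_reg(seconds: int) -> str:
    """Return a REGEDIT4-style fragment setting SessionKeepAlive (hypothetical helper)."""
    ms = seconds * 1000
    if not MIN_MS <= ms <= MAX_MS:
        raise ValueError(f"{ms} ms is outside the valid range {MIN_MS} - {MAX_MS}")
    return (
        "REGEDIT4\n\n"
        f"[{NETBT_KEY}]\n"
        f'"SessionKeepAlive"=dword:{ms:08x}\n'
    )


# 90 seconds = 90,000 ms = 0x00015f90
print(session_keepalive_reg(90))
```

The range check mirrors the documented minimum of 60,000 ms, which is why a shorter keepalive has to be done with the TCP KeepAliveTime parameter instead.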

One parameter which is conspicuous by its absence is any kind of raw transmission timeout parameter. In fact, it is set to a minimum of 3 seconds and cannot be changed by any means that I know of. According to the following retry parameters, the timeout period is doubled for each successive re-transmission. This means that you could wait as long as 21 seconds before NT comes back and says it couldn't connect, thereby releasing Citect to its own recovery methods. A twenty second pause in comms may alarm some operators in the event of a Citect I/O server (or device) failure, and therefore these retry parameters can be lowered to 1 or even 0. This will speed up failure detection and cause Citect to respond more quickly to a problem on the network. Be advised though that one of the good things about TCP/IP as a transport is that it is very good at handling all manner of transmission speeds and problems. Removing any possibility of a re-transmission is probably not worth the extra 6 seconds it gives you in the long run. So stick with at least 1 in each of the following parameters.

TcpMaxConnectRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0xFFFFFFFF
Default: 3
Description: This parameter determines the number of times TCP will re-transmit a connect request (SYN) before aborting the attempt. The re-transmission timeout is doubled with each successive re-transmission in a given connect attempt. The initial timeout value is three seconds.

TcpMaxDataRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0xFFFFFFFF
Default: 5
Description: This parameter controls the number of times TCP will re-transmit an individual data segment (not connection request segments) before aborting the connection. The re-transmission timeout is doubled with each successive re-transmission on a connection. It is reset when responses resume. The base timeout value is dynamically determined by the measured round-trip time on the connection.
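The doubling rule is easy to tabulate. The sketch below lists successive timeouts starting from the three second initial value. It deliberately takes the number of timeouts as an input, because how that number maps onto the retry count (in particular, whether the wait after the final re-transmission is included) is not spelled out here and is left as an assumption. The function name is hypothetical.

```python
from typing import List


def backoff_timeouts(initial_s: int, count: int) -> List[int]:
    """Return `count` successive timeouts, each double the one before."""
    return [initial_s * 2 ** i for i in range(count)]


# Three successive timeouts from a 3 second start: 3 + 6 + 12 = 21 seconds.
waits = backoff_timeouts(3, 3)
print(waits, sum(waits))

# Dropping from two timeouts to one saves the 6 second second wait.
print(sum(backoff_timeouts(3, 2)) - sum(backoff_timeouts(3, 1)))
```

Because each wait doubles, the last permitted retry always costs more than all the earlier ones combined, so trimming the retry count by even one has a large effect on failure detection time.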

Citect uses NetBIOS names to achieve communications over a network. The I/O server has a name (usually "IOServer"), the alarm, trend and report servers have names, and the client has a name (usually a group name). On a simple TCP/IP network, NT will broadcast to resolve these names to IP addresses. The following parameters influence how persistent NT is in this operation.

BcastNameQueryCount
Key: Netbt\Parameters
Value Type: REG_DWORD - Count
Valid Range: 1 to 0xFFFF
Default: 3
Description: This value determines the number of times NetBT broadcasts a query for a given name without receiving a response.

BcastQueryTimeout
Key: Netbt\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 100 to 0xFFFFFFFF
Default: 0x2ee (750 decimal)
Description: This value determines the time interval between successive broadcast name queries for the same name.

Advanced network traffic management devices such as routers and switches usually squelch IP broadcasts in order to prevent broadcast storms on the WAN. This means that Citect machines must be configured to use other means for address resolution. One of the methods available is the use of name servers (WINS), which is governed by the parameters below. Of particular note is the last of them, WinsDownTimeout. This is set by default to 15 seconds, a considerable time if your system is relying on WINS for resolution. However, once a name is held in the local machine's NetBIOS name cache, reliance on the WINS machine is not as important.

NameSrvQueryCount
Key: Netbt\Parameters
Value Type: REG_DWORD - Count
Valid Range: 0 - 0xFFFF
Default: 3
Description: This value determines the number of times NetBT sends a query to a WINS server for a given name without receiving a response.

NameSrvQueryTimeout
Key: Netbt\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 100 - 0xFFFFFFFF
Default: 1500 (1.5 seconds)
Description: This value determines the time interval between successive name queries to WINS for a given name.

WinsDownTimeout
Key: Netbt\Parameters
Value Type: REG_DWORD - Time in milliseconds
Valid Range: 1000 - 0xFFFFFFFF
Default: 15,000 (15 seconds)
Description: This parameter determines the amount of time NetBT will wait before again trying to use WINS after it fails to contact any WINS server. This feature primarily allows computers that are temporarily disconnected from the network, such as laptops, to proceed through boot processing without waiting to time out each WINS name registration or query individually.

There are four 'node' types to choose from when determining how an NT machine resolves NetBIOS names over a WAN or LAN. You can specify the node type of a particular machine using the NodeType parameter. All node types use broadcasts except p-node, which uses WINS only.

NodeType
Key: Netbt\Parameters
Value Type: REG_DWORD - Number
Valid Range: 1, 2, 4, 8 (b-node, p-node, m-node, h-node)
Default: 1 or 8 based on the WINS server configuration
Description: This parameter determines what methods NetBT will use to register and resolve names. A b-node system uses broadcasts. A p-node system uses only point-to-point name queries to a name server (WINS). An m-node system broadcasts first, then queries the name server. An h-node system queries the name server first, then broadcasts. Resolution via LMHOSTS and/or DNS, if enabled, will follow these methods. If this key is present it will override the DhcpNodeType key. If neither key is present, the system defaults to b-node if there are no WINS servers configured for the client. The system defaults to h-node if there is at least one WINS server configured.
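The node types and query parameters can be put together to estimate how long an unresolvable name might hold things up. The following is a rough sketch only: it assumes each method runs its full count of queries at the default interval before the next method is tried, and it ignores LMHOSTS, DNS and WinsDownTimeout. All function and table names are hypothetical.

```python
from typing import List

# NodeType value -> (name, resolution methods in the order they are tried)
NODE_TYPES = {
    1: ("b-node", ["broadcast"]),
    2: ("p-node", ["wins"]),
    4: ("m-node", ["broadcast", "wins"]),
    8: ("h-node", ["wins", "broadcast"]),
}

# Default (query count, interval in ms) from the parameter entries above:
# BcastNameQueryCount/BcastQueryTimeout and NameSrvQueryCount/NameSrvQueryTimeout.
DEFAULTS = {"broadcast": (3, 750), "wins": (3, 1500)}


def default_node_type(wins_configured: bool) -> int:
    """h-node if at least one WINS server is configured, otherwise b-node."""
    return 8 if wins_configured else 1


def resolution_order(node_type: int) -> List[str]:
    """Methods NetBT tries, in order, for the given NodeType value."""
    return list(NODE_TYPES[node_type][1])


def worst_case_ms(node_type: int) -> int:
    """Rough time spent on a name that never resolves (count x interval per method)."""
    return sum(DEFAULTS[m][0] * DEFAULTS[m][1] for m in resolution_order(node_type))


for value, (name, _) in sorted(NODE_TYPES.items()):
    print(value, name, resolution_order(value), worst_case_ms(value))
```

Under these assumptions a b-node gives up after roughly 2.25 seconds, a p-node after 4.5 seconds, and the two mixed node types after about 6.75 seconds, which is worth bearing in mind when Citect is waiting on a server name that is no longer on the network.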

See Q1949 for more information as well as some Windows 95 settings.

Finally, I have found two tools to be of use in troubleshooting NT TCP/IP networks with Citect. One is Microsoft's Network Monitor, which allows you to capture, filter and examine individual network packets. You can also use it to analyse network timing, which is useful when trying to work out why a delay is happening. The full SMS version is much more useful than the default NT version because it shows all traffic on the wire, not just traffic pertinent to the local machine.

The other is the NBTStat command, run from the NT command line. NBTStat is a useful tool for troubleshooting NetBIOS name resolution problems.

NBTStat -? displays a list of commands.
NBTStat -n displays the names that were registered locally on the system by applications, such as Citect.
NBTStat -c shows the NetBIOS name cache, which contains name-to-address mappings for other computers.
NBTStat -R purges the name cache and reloads it from the LMHOSTS file.
NBTStat -a <name> performs a NetBIOS adaptor status command against the computer specified by name. The adaptor status command returns the local NetBIOS name table for that computer plus the MAC address of the adaptor card.
NBTStat -S lists the current NetBIOS sessions and their status, including statistics.

References:

Microsoft KnowledgeBase/Technet:

A White Paper from the Enterprise Technical Support and the Desktop and Business Systems Division

Microsoft Windows NT 3.5, 3.51, 4.0 - TCP/IP Implementation Details

