What makes CitectSCADA a high performance solution







Technical Paper

Presented by:

Jens Nasholm

Stephen Flannigan


This paper provides an in-depth understanding of CitectSCADA and its different parts with emphasis on what makes it a high performance SCADA system. You will also understand what tools are available to get the best possible data performance out of your CitectSCADA system.






Introduction

CitectSCADA Architecture

The Client

The Network (Client to Server)

The I/O Server

The Channel (I/O server to I/O device)

The I/O Device

Conclusion




Introduction

This paper covers a number of areas, including optimizing:

· Data transfer between the CitectSCADA I/O Server/s and device/s

· Network data transfer between CitectSCADA servers and clients

· Trend server performance

· Alarm server performance

CitectSCADA Architecture

Designed from the start with a true client-server architecture, CitectSCADA is a real-time monitoring and control system that ensures high performance response and integrity of data. To take full advantage of a client-server architecture, data must be understood and acted on at the task level. Each task works as a distinct client and/or server module, performing its own role, and interfacing with the other tasks through the client-server relationship.


CitectSCADA has five fundamental tasks, which handle communications with I/O Devices, collection and monitoring of alarm conditions, collection and storage of trend data, report and event management and execution, and display of data to the user. Each of these tasks is independent and performs its own processing. Because of this architecture, you control which computers in your system perform which tasks. For example, you can nominate one computer to perform the display and report tasks, while a second computer performs display, I/O, and trends.


Citect recommends that you use a centralized database when designing a networked CitectSCADA system. A single global database is beneficial because you make changes at only one location, and they are then updated everywhere. CitectSCADA also supports having separate configurations on each computer, or even a mixture of both. While CitectSCADA has a reputation for installations involving networks and large amounts of data, many users have smaller single server/client installations.


So how fast will my CitectSCADA run? How much load will it introduce onto the supporting hardware? Well, it depends. It depends on how you make use of the data the system can provide. In this paper we will consider each component, from client to PLC.

The Client

CitectSCADA as a system is almost entirely client driven, in that the users of data in a CitectSCADA system determine how fast data should be delivered and the nature of that data. A client displaying a graphic will require the data needed to animate the page at a rate defined by the scan time. It requests only this data of the associated I/O server(s) which endeavors to produce it. An alarm server or trend server is also a user of CitectSCADA data. These machines have a relatively constant requirement so that they can continue with their respective tasks. As it happens, some alarm information is pushed out to clients from the server but this is the only instance of this kind of data transaction.


Typically a client of data is one of a number of types. It can be a plain display client, which only requires data for displaying graphic pages, or occasionally alarm or trend pages. It can be a CitectSCADA server, i.e. an alarm or trend server which has a constant demand for certain data points. It can also be a combination of these, say an alarm server which is also used as a display, or a trend server which is also an alarm server. Each of these configurations has its own data requirements, and those requirements become transient as well once you also try to estimate the load introduced by the display.


Display clients are fairly basic. They simply ask for the (usually small) amount of data needed to continually draw the screen in front of the operator. The speed with which this happens is usually defined by the [Page]ScanTime parameter (default 250 ms), but if the I/O server cannot deliver the data within this time then the I/O server's turnaround time becomes the determining factor. The page refresh routine in the client submits a list of the points it needs to the request manager, which then sends the request to the I/O server. You can influence the speed of the request manager by modifying the delay under [Req]Delay (default 50 ms). By setting [Page]ScanTime and [Req]Delay to zero and one respectively, the screen will update as fast as the I/O server can furnish the necessary data. Each scan of the page generates an associated request to the I/O server, so aggressive settings here can cause excessive network load and significantly increased CPU usage.
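The interaction between the scan time and the I/O server turnaround can be sketched in a few lines of Python. This is a simplified model rather than anything taken from CitectSCADA itself: the function names and the 100 ms turnaround figure are assumptions for illustration, the effect of [Req]Delay is ignored, and only the 250 ms default comes from the text above.

```python
def effective_update_ms(scan_time_ms: float, turnaround_ms: float) -> float:
    """Approximate time between page updates.

    The page cannot refresh faster than its configured scan time, and it
    cannot refresh faster than the I/O server returns the data, so the
    slower of the two governs.
    """
    return max(scan_time_ms, turnaround_ms)


def updates_per_second(scan_time_ms: float, turnaround_ms: float) -> float:
    """Approximate number of page requests sent to the I/O server each second."""
    period_ms = effective_update_ms(scan_time_ms, turnaround_ms)
    if period_ms <= 0:
        raise ValueError("the I/O server turnaround must be greater than zero")
    return 1000.0 / period_ms


if __name__ == "__main__":
    # Default scan time of 250 ms, hypothetical 100 ms turnaround.
    print(updates_per_second(250, 100))
    # Scan time of zero: the hypothetical turnaround now sets the pace.
    print(updates_per_second(0, 100))
```

Under these assumed figures, dropping the scan time to zero raises the request rate from 4 to 10 per second for a single page, which is why the setting should be used with care.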


At project compile time, CitectSCADA prepares a number of requests which are associated with particular pages, alarms and trends. In general, CitectSCADA will prepare one request for all the page's digital data from unit1, another request for all the page's analog data from unit1, still another request for all the page's digital data from unit2 and so on. The compiler will also look at your trends and will try to create requests for trends with the same sample time; similar requests are made for the alarms. At runtime this set of requests is issued to supply data to the display page, trend server and alarm server. These requests are short and a large number of them can fit into one TCP/IP frame. Pages therefore need to be quite complex, and require data from many different sources, before this becomes an issue.


In the same way, the rate at which alarms are scanned can be altered through the [Alarm]ScanTime parameter (default 500 ms). Typically, it will make little or no difference to efficient and safe plant operation if this is set as high as 2000 (2 seconds). This setting has the potential to radically affect system performance due to the amount of data which is required by the alarm server. For example, setting this parameter to 1000 (1 second) will nominally reduce the amount of data required by the alarm server by 50%. The actual reduction in load on the I/O delivery system will be somewhat less than this, depending on various optimizations, but it will still be significant.
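The 50% figure follows directly from the ratio of the two scan times. A minimal sketch of that arithmetic is shown here; the function name is hypothetical, and the result is the nominal reduction only, since the text notes that the real saving is somewhat smaller.

```python
def nominal_reduction_percent(baseline_ms: int, new_ms: int) -> float:
    """Nominal reduction in alarm data requests when the scan time is raised.

    Requests are issued once per scan, so the request rate scales with
    baseline_ms / new_ms. This ignores the I/O server optimizations that
    make the real reduction smaller than the nominal one.
    """
    if baseline_ms <= 0 or new_ms <= 0:
        raise ValueError("scan times must be positive")
    return (1 - baseline_ms / new_ms) * 100


if __name__ == "__main__":
    # Default 500 ms raised to 1000 ms, then to 2000 ms.
    print(nominal_reduction_percent(500, 1000))
    print(nominal_reduction_percent(500, 2000))
```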


The Trend server has similar functionality, but to maintain the configured trend sample times it staggers its requests for data. The [Trend]StaggerRequestSubgroups parameter reduces peak loading on I/O Servers by spacing out trend sample requests. If you set this parameter to 1, all trends with the same sample period and on the same I/O device will request samples from the I/O Server at once, every sample period. This will cause large peaks of traffic both on the network between the trend server and the I/O server and on the media to the devices in the field. For values > 1, the trends will be divided into that number of sub-groups. The sub-groups will request samples at different times. Their sample period will remain the same, but their requests will be staggered. The stagger time is determined by dividing the sample period by the value of this parameter. Using a value above 1 will still cause the trend server to request the same amount of data, but the requests are spread over a period of time so that the peaks of traffic are reduced significantly.
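The stagger calculation described above can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the round-robin assignment of trends to sub-groups is an assumption, since the text does not say how CitectSCADA allocates trends to each sub-group.

```python
from typing import List


def stagger_offsets_ms(sample_period_ms: float, subgroups: int) -> List[float]:
    """Start offset of each sub-group within one sample period.

    The stagger time is the sample period divided by the number of
    sub-groups, as described in the text.
    """
    if subgroups < 1:
        raise ValueError("subgroups must be at least 1")
    stagger_ms = sample_period_ms / subgroups
    return [index * stagger_ms for index in range(subgroups)]


def assign_subgroups(trend_names: List[str], subgroups: int) -> List[List[str]]:
    """Spread trends across sub-groups (round-robin is an assumption)."""
    groups: List[List[str]] = [[] for _ in range(subgroups)]
    for index, name in enumerate(trend_names):
        groups[index % subgroups].append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical 1000 ms sample period split into 4 sub-groups.
    print(stagger_offsets_ms(1000, 4))
    print(assign_subgroups(["T1", "T2", "T3", "T4", "T5"], 2))
```

With a value of 1 there is a single offset of zero, which corresponds to every trend requesting its sample at the same moment.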


There are other ways a client can be a user of data, including report serving, Cicode and events. However, because these functions are user configurable their load is unpredictable, and they are not considered further in this paper.

The Network (Client to Server)

A large number of installed CitectSCADA systems are networked to provide redundancy and 'data everywhere'. Generally, even large CitectSCADA systems rarely suffer from network performance problems. This is because CitectSCADA is an efficient user of network bandwidth. However, you can apply ordinary network tuning principles to improve the networking performance further. High performance sites make use of switches, bridges and hubs to compartmentalize the load and provide dedicated trunks between I/O servers and heavy users of CitectSCADA data, such as alarm and trend servers. Citect always recommends that device networks and office networks are physically separated to avoid communication failures and reduced data access performance for both CitectSCADA and your office applications.

Under certain circumstances the protocol in use may be hindering good performance. Microsoft has tuned most of its protocols to deliver good performance when used in a typical office environment. File transfer and print sharing require a few large chunks of data, not the hundreds of small packets that CitectSCADA generates. NetBEUI, IPX and TCP/IP can sometimes be manually modified to provide better real-time response; however, more recent revisions of these protocols have adaptive algorithms which can modify behavior on the fly. In general terms, the protocols seek to save bandwidth by joining an acknowledgement packet with another packet which happens to be going in the same direction. If no convenient packet is forthcoming, the protocol eventually tires of waiting and sends an acknowledgement on its own. This wait period can be up to 100 milliseconds, translating into a substantial performance hit to real-time data distribution.

Recent versions of both CitectSCADA and Windows have raised (or completely removed) many network related memory limitations. For example, the [LAN]WritePool and [LAN]ReadPool buffers for CitectSCADA v6 are now 1024 by default, with a maximum of 32760.

Other settings which may be used to modify network performance are [LAN]SesSendBuf and [LAN]SesRecBuf. These settings determine the number of working buffers (inside CitectSCADA's NetBIOS layer) devoted to handling transmit operations and receive operations for each network session. For example, if SesSendBuf = 4, the system allocates 4 buffers to handle transmissions. The buffers are occupied as each send operation is performed, until none remain. CitectSCADA then waits until a response returns before continuing. In this way CitectSCADA can handle momentary surges in network traffic; setting this higher may not yield better performance under other circumstances. It also allows a kind of pending command process by which overhead incurred in the transmission of network messages is minimized. From CitectSCADA version 5 and Windows NT onward, SesSendBuf is 128 by default (in previous versions and under previous operating systems it is 2). SesRecBuf has an equivalent function for the receipt of messages, and defaults to 32. Raising this has no benefit and may depress performance. Conversely, setting SesRecBuf to less than 32 does not greatly affect performance but can improve network reliability.

The I/O Server

As the central data gathering task of your CitectSCADA system, the I/O server is critical to performance.


There are a number of ways to tune your I/O server(s), including the use of driver parameters, server distribution and load splitting.


The protocol driver is a chunk of software which manages communication between the CitectSCADA I/O server and the I/O device. This relatively small subsystem operates in semi-separation from the rest of CitectSCADA, receiving requests from the I/O server, queuing them, sending them and receiving the associated responses. Each driver shipped with CitectSCADA has a number of parameters which you can change to affect the way the driver performs each of the aforementioned tasks. As before with the network defaults, the driver settings are tested thoroughly to provide best possible performance under most circumstances and in general you should not need to change these settings. The CitectSCADA driver online help outlines the available parameters for specific drivers.


The most common driver parameters are explained below.

Block defines the basic chunk size which the I/O server can expect the I/O device to deal with. Due to the speeds of some proprietary communications media, protocol overheads sometimes add a fair bit of time to general communications. To get around this, the I/O server will attempt to ask for one big chunk of data which can satisfy several different requests. Typically this parameter is a trade-off between the number of requests between the I/O server and the I/O device and the size of those requests. The I/O server is capable of considering requests from several different clients, or even wildly different requests from the same client, and combining them into a few large 'Block' sized requests, thus making the most efficient use of the channel. Unless you are using a protocol in a manner not originally envisaged by our developers, do not modify this parameter.

TransmitDelay (or simply Delay) is a parameter to actually slow the driver down so that the I/O device can keep up. It is mainly designed for serial protocols - some yield a timeout if this is set too small. Ordinarily, drivers will submit a new request immediately after receiving a reply. This can bog some devices down and therefore this delay is introduced to keep the I/O device happy. The driver will wait this amount of time before proceeding with the next request. Most drivers have this set to zero (milliseconds) anyway but there are some notable exceptions.

MaxPending is an abbreviation of Maximum Pending Commands. Some I/O devices maintain a kind of internal queue of commands which are serviced one after another. The driver therefore knows it can send a certain number of requests before expecting any response. This means that while the I/O device is busy building the response to a previous request, the channel can be used to deliver another request ready for immediate action. It is like sending requests in parallel. This parameter can be especially effective when the channel is a little slower, i.e. when communication time is a meaningful fraction of the total request service time. The ability to queue pending commands also tends to distort the channel usage reported for the driver (it is not uncommon to see numbers over 100% in the Kernel).

Typically, drivers fall into one of three categories with respect to MaxPending. Category one contains those that do not support more than one pending command, and must respond to a request before another is sent. This is true of MODBUS and most others. In this case a MaxPending queue may be implemented within the driver to 'fake it', thereby at least saving some driver overhead. In fact, MODBUS performance will be hindered if the MaxPending is set to anything but 2. Regarding category two, some drivers are designed to interface to devices which do support a certain number of pending commands. Allen-Bradley devices are an example of this. Depending on the media over which communications is taking place, it may be justifiable to modify (raise) this number. If no benefits are forthcoming however, it is preferable to return to default settings. The third category contains more advanced drivers which allow the packing of multiple individual requests into one communication frame. TITCP/IP can do this, as up to 14 NITP frames are packed into one CAMP frame. This is the third form of max pending.
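The benefit of pending commands on a device that genuinely queues them (the second category above) can be estimated with a simple pipeline model. The sketch below is an idealized estimate, not a description of how any CitectSCADA driver schedules requests: the function name, the split into channel time and device time, and the sample figures are all assumptions. It does not apply to category-one protocols such as MODBUS, where the text advises leaving MaxPending at its default.

```python
def time_per_request_ms(channel_ms: float, device_ms: float, max_pending: int) -> float:
    """Idealized steady-state time per request with pipelined commands.

    channel_ms: time the channel is busy carrying one request and reply.
    device_ms:  time the device spends building one response.
    With only one pending command each request costs the full round trip.
    With more, the channel and device overlap, so the cost approaches
    whichever of the two is slower, and never less than round trip / depth.
    """
    if max_pending < 1:
        raise ValueError("max_pending must be at least 1")
    round_trip_ms = channel_ms + device_ms
    return max(channel_ms, device_ms, round_trip_ms / max_pending)


if __name__ == "__main__":
    # Hypothetical figures: 20 ms on the channel, 30 ms in the device.
    for depth in (1, 2, 4):
        print(depth, time_per_request_ms(20, 30, depth))
```

In this model the gain flattens out once the slower of the two stages is saturated, which is consistent with the advice to return to the default setting if raising the number brings no benefit.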

The Timeout parameter determines how long the driver will wait for a response before declaring it overdue and asking again (if Retries is set - see next section). If your communications method is particularly slow you may wish to push this time out somewhat to avoid getting too many timeout errors. Conversely, if you have adapted a protocol usually used over slower links for a high speed link you may wish to shorten this down to gain a truer report on link performance.

Retries is the number of attempts the driver will make to contact the I/O device before considering that it has not responded. As mentioned above, CitectSCADA will wait for the Timeout period before performing a retry. Once this Timeout x Retries period has expired, CitectSCADA will raise a hardware alarm notifying you that the device has entered an error state and is not responding. This parameter can be increased if you feel CitectSCADA needs to persist longer before raising an alarm, but normally the default setting is appropriate.
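How Timeout and Retries combine can be sketched as below. This is an illustration only: the function names and the `send_once` callable are hypothetical stand-ins for a driver's transmit routine, and the timeout value in the example is not a CitectSCADA default.

```python
from typing import Callable, Optional


def worst_case_wait_ms(timeout_ms: int, retries: int) -> int:
    """Time before the device is declared unresponsive (Timeout x Retries)."""
    return timeout_ms * retries


def request_with_retries(send_once: Callable[[], Optional[bytes]],
                         retries: int) -> Optional[bytes]:
    """Attempt a request up to `retries` times.

    `send_once` is a hypothetical function that sends the request and
    returns the reply, or None if the timeout expired with no response.
    A return value of None means every attempt timed out, which is the
    point at which a hardware alarm would be raised.
    """
    for _attempt in range(retries):
        reply = send_once()
        if reply is not None:
            return reply
    return None


if __name__ == "__main__":
    # Hypothetical 1000 ms timeout with 3 attempts.
    print(worst_case_wait_ms(1000, 3))
```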

PollTime determines how often the driver checks the port for incoming traffic. For the best performance this should be set to zero, which means the driver waits for the port to interrupt it when something comes in. This 'interrupt mode' is the most efficient means of operation, but not all drivers support it. If a driver requires a PollTime, reductions in this setting may deliver some benefit. Having this number set low (without setting it to zero) will be accompanied by an increase in CPU usage. As before, this setting is preset to deliver the best performance under most circumstances.

In many smaller networked systems, one machine will be an I/O, Alarm, Trend and Report server and the other machines will be pure display clients. Under these circumstances, your network will be very lightly loaded, while depending on the project, the server machine might be struggling a bit. In this case, performance may be improved by experimenting with different server locations. By this I mean try moving a large processing load like the trend server off the I/O server machine and onto one of your client machines. The same could be done for the alarm server. Very large systems sometimes have up to eight or ten servers dedicated to these tasks. Two will be I/O servers; two will be alarm servers etc.


This kind of task delegation is easy to do since all it requires is a modification to the .ini file. Use the Setup Wizard to do this. Also, if you are considering moving your trend server to another machine, the trend files (if they are held locally) will have to be moved also.


You may find that the system is requiring a large amount of data from a particular device and this is overloading the I/O server communications channel, even after you have moved your alarm and trend servers to different locations. In this sort of situation you may want to consider moving some of the load from one I/O server to another. This option will require a separate channel to the I/O device in question, but with the growth of Ethernet communications in process control this is trivial. You can then define the tags to be coming from the same device but via a different server. This will force clients to appeal to I/O server 1 for some of their data and I/O server 2 for the rest. This plan can be extended to (practically) any number of I/O servers, thereby limiting your communications performance bottleneck to the port on the I/O device itself.


Another thing you may want to consider is caching. The I/O server can cache data from any particular I/O device and service client requests out of the cache instead of sending a request to the device itself. Typically the cache is set to around 300 ms, but you can configure this to any number you desire. This can be tuned to correspond with how quickly you expect the data to change in the PLC. Using the cache can also be handy if you have data from different I/O devices displayed on the same page. One device might respond quickly and the other less so. To facilitate fast page displays but avoid having the fast device always waiting for the slow one, set the cache on the slow device(s) to a number a little less than the device response time. That way, when the client requests the data for the next display, the I/O server will be more likely to hit the cache for the 'slower' data.
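The idea of serving requests from a timed cache can be sketched as follows. This is a minimal model for illustration, not the I/O server's implementation: the class name, the `read_device` callable and the injectable clock are all assumptions, and the sketch does not attempt to model read-ahead behavior.

```python
import time
from typing import Callable, Dict, Tuple


class DeviceCache:
    """Serve repeated reads from memory while the cached value is fresh."""

    def __init__(self, read_device: Callable[[int], int], cache_ms: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read_device = read_device  # hypothetical call to the I/O device
        self._cache_ms = cache_ms        # e.g. 300 ms, as suggested in the text
        self._clock = clock              # returns seconds; injectable for tests
        self._entries: Dict[int, Tuple[float, int]] = {}

    def read(self, address: int) -> int:
        now = self._clock()
        entry = self._entries.get(address)
        if entry is not None and (now - entry[0]) * 1000.0 < self._cache_ms:
            return entry[1]              # cache hit: no device traffic
        value = self._read_device(address)
        self._entries[address] = (now, value)
        return value


if __name__ == "__main__":
    cache = DeviceCache(lambda address: 42, cache_ms=300)
    print(cache.read(1012), cache.read(1012))  # second read comes from the cache
```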

The Channel (I/O server to I/O device)

Each type of I/O Device uses a unique protocol to communicate with higher level equipment such as CitectSCADA. The speed with which data can be transferred depends on, and is limited by, the I/O Device and the protocol design. The limitation comes from the fact that I/O Devices do not respond immediately to requests for data, and many protocols are inefficient. The following strategies allow CitectSCADA to maximize data transfer.

CitectSCADA’s communication is demand based, reading only those points which are requested by the clients. More importantly, the I/O Server rationalizes requests from clients, for example, combining them into one request where possible. This reduces needless communication, giving screen update times up to eight times faster (than without).

Only a restricted volume of data can be returned in one request. If all requested data is grouped together, then fewer requests are required, and the response is faster. But what happens when two required registers are separated? CitectSCADA uses a blocking constant to calculate whether it is quicker to read them separately, or in the same ‘block’. By compiling a list of the registers that must be read in one scan, CitectSCADA automatically calculates the most efficient way of reading the data.

The client-server processing of CitectSCADA allows further performance increases, through the use of a cache on the I/O Server. When an I/O Server reads registers, their values are retained in its memory for a user defined period (typically 300 ms). If a client requests data that is stored in the cache, the data is provided without the register being re-read. In a typical two client system, this will occur 30% of the time. The potential performance increase is therefore 30%. CitectSCADA also uses read ahead caching, updating the cache if it gets accessed, predicting that the same information will be requested again!


BLOCKING EXAMPLE: Citect requires registers 1012 and 1020. The I/O device has a read overhead of 60 ms, which is independent of the number of registers read.
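The trade-off in this example can be worked through with a simple cost model. The register numbers and the 60 ms overhead come from the example above; the per-register transfer time is an assumption introduced for illustration, and the functions are a hypothetical sketch rather than the actual blocking constant calculation, which is not described here.

```python
from typing import List


def separate_reads_ms(registers: List[int], overhead_ms: float,
                      per_register_ms: float) -> float:
    """Cost of reading each register with its own request."""
    return len(registers) * (overhead_ms + per_register_ms)


def block_read_ms(registers: List[int], overhead_ms: float,
                  per_register_ms: float) -> float:
    """Cost of one request covering the whole span, unwanted registers included."""
    span = max(registers) - min(registers) + 1
    return overhead_ms + span * per_register_ms


def cheaper_strategy(registers: List[int], overhead_ms: float,
                     per_register_ms: float) -> str:
    """Return 'block' or 'separate', whichever this model estimates is quicker."""
    block = block_read_ms(registers, overhead_ms, per_register_ms)
    separate = separate_reads_ms(registers, overhead_ms, per_register_ms)
    return "block" if block <= separate else "separate"


if __name__ == "__main__":
    wanted = [1012, 1020]
    # Assumed 1 ms per register: one block of 9 registers beats two reads.
    print(separate_reads_ms(wanted, 60, 1), block_read_ms(wanted, 60, 1))
    print(cheaper_strategy(wanted, 60, 1))
```

With the fixed 60 ms overhead dominating, the single block read of registers 1012 to 1020 is quicker under this assumption; only when the per-register cost becomes large relative to the overhead does reading separately win.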


CitectSCADA supports many protocols and methods for communicating with I/O devices. Even within one manufacturer's offering, there may be half a dozen different methods of communicating with the device. Given this, your choice of hardware and software to facilitate communications may have a large effect on your system performance.


Ethernet in general offers superior performance and many device manufacturers have Ethernet communications solutions as part of their current product lineup. Using Ethernet allows flexibility and in some cases economy in communications. Some plants deploy fibre networks with hubs, switches and so on to manage I/O server to PLC communications and this is probably about as good as you can get.

The I/O Device

Last in the line, but most important, is the PLC itself. CitectSCADA is always trying to optimize communications, especially by asking for large chunks of data with a mind to satisfying multiple client requests in one go. You can significantly improve CitectSCADA’s chances by grouping similar data in the I/O device memory. By this I mean designating a set of registers specifically intended for alarms. Yet another set could hold all your trend tags. In these two blocks a large amount of your data resides.


CitectSCADA can then make one read request for all the digital alarms, and another for all the trends. Since these two requests are always going to be happening, you can make things easier by grouping them conveniently. Once CitectSCADA has to make multiple requests for these basic requirements you start to lose that best possible speed. This sort of thing must be done at the start of a project however, and is not usually an option for a retrofit on a legacy system. Remember, the aim here is to reduce the number of reads CitectSCADA has to do to support typical operation. This is probably the single most important thing you can do to improve the speed of your system.


In general, CitectSCADA can support a very fast update on a page, but the limiting factor will be the speed with which the associated device(s) can turn a request around. In fact typically it is this which limits a system's performance. You should not expect great communications from a PLC which is already struggling to maintain a set scan time. Some PLCs allow you to set aside a certain amount of time per scan to handle communications. You may be able to bias your machine in favor of communications if this really becomes a problem.


Conclusion

In general you will do well if you take a holistic approach to CitectSCADA performance. Remember that CitectSCADA is client driven, and eventually all that data has to come from somewhere. Consider which data is frequently used and start with that. You may need segmented networks and extra I/O servers to keep up. Some data loads are constant like alarms and trends and you can plan carefully for these.


Others are transient, like display page data. These are more difficult to optimize, but in general they do not impose a large load on the system. CitectSCADA will optimize at compile time and run time to allow the best possible performance from the I/O server, but you will have to back it up by providing good fast channels to your devices, optimizing the device programming so that data likely to be requested at the same time by CitectSCADA is located in the same physical memory area, and ensuring the devices are not too busy to answer.































