3-Tier for Improved Network Performance

Improved network performance is often cited as one of the minor benefits of 3-Tier software architectures, yet few (if any) authors choose to pursue this in any detail. This technical brief takes up this issue. (A separate technical brief takes up the issue of 3-Tier for Improved Database Throughput.)

Network Characteristics

The two characteristics of a network that are important to this discussion are bandwidth and latency. Bandwidth is simply the amount of data that can pass through the network in a particular period of time. Roughly speaking, then, the impact of an application with respect to network bandwidth is determined by the number of bytes of data that it requires to be moved. Latency is the amount of time between when a packet of network data is transmitted and the time it is received. There is some latency in the network protocol stack software, but this is typically small and becomes important only when very large amounts of data (numbers of packets) are being transmitted. Similarly, the latency in network routing and switching hardware should be very, very small. Most latency, then, comes from the actual transmission media, and is most noticeable in WAN environments. Applications typically do not impact latency, but rather are impacted by the latency that is intrinsic to the network itself.
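
The interplay of the two characteristics can be sketched with a very rough model: elapsed time is the time spent waiting on round trips plus the time spent moving bytes. The function name, the LAN and WAN figures (10 Mbit/s with 0.5 ms one-way latency, 56 kbit/s with 100 ms one-way latency) and the workload below are assumptions chosen for illustration, not measurements.

```python
def transfer_time(payload_bytes, round_trips, bandwidth_bytes_per_s, one_way_latency_s):
    """Rough seconds for an exchange: waiting on the wire plus moving the bytes.

    Assumes each round trip costs two one-way latencies and that the
    payload moves at the full stated bandwidth (no congestion, no loss).
    """
    waiting = round_trips * 2 * one_way_latency_s
    moving = payload_bytes / bandwidth_bytes_per_s
    return waiting + moving


# Illustrative (assumed) figures, not measurements.
LAN = {"bandwidth_bytes_per_s": 1_250_000, "one_way_latency_s": 0.0005}
WAN = {"bandwidth_bytes_per_s": 7_000, "one_way_latency_s": 0.1}

if __name__ == "__main__":
    for name, net in (("LAN", LAN), ("WAN", WAN)):
        seconds = transfer_time(payload_bytes=7_000, round_trips=10, **net)
        print(f"{name}: {seconds:.3f} s")
```

Under these assumptions the same exchange is dominated by byte movement and round trips in very different proportions on the two networks, which is why the sections that follow track both the bytes moved and the number of packets exchanged.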

Software Architectures

While there are a great number of architectural possibilities, this paper will discuss three common topologies: 2-Tier File Sharing, 2-Tier RDBMS and 3-Tier RDBMS.

2-Tier File Sharing

The 2-Tier File Sharing architecture was the first foray into client/server computing for many organizations. It represented a nearly transparent migration from standalone desktop software to shared data. The applications were typically written in C or an XBase variant and stored data and B+ tree indexes in files. If record locking was correctly implemented, the data files could be moved to a network drive on a LAN server to facilitate multiuser access.

To the extent that the user base was small, these applications performed quite well over the network. When a bottleneck arose, however, it was usually traced to network bandwidth. The architecture fundamentally relies on moving great amounts of data over the network, as both index and data records must be transported to the client system for any database access. The effect of latency, on the other hand, was gradually mitigated as NOS vendors, notably Novell, optimized their network I/O for WAN connections. The improvements were typically accomplished by using relatively large packet sizes, which reduces the number of packets that must be transmitted (and acknowledged).

2-Tier RDBMS

When the mass market press or corporate executives (or relatively naive IT professionals!) speak of client/server, they are usually referring to a 2-Tier RDBMS topology, in which the server runs an RDBMS engine that performs database access in response to requests from client software. Three major benefits to this are frequently cited. The first, that the client and server converse using the standard SQL language, while true several years ago, is now specious. Many file-based RDBMS engines, such as Access, Paradox and Visual FoxPro, support SQL for data manipulation. The second, that the system is more scalable, is true mainly to the extent that the third benefit (more efficient network utilization) applies, as disk I/O and CPU usage scale similarly for LAN file serving and RDBMS database access. In fact, the RDBMS solution requires greater scalability of the server CPU, since processing is shifted away from the desktops that are relatively idle and often very powerful.

The third major benefit, more efficient network utilization, is the real breakthrough for 2-Tier RDBMS topologies. In the file sharing model, looking up a single record in a large table requires that several index records (or pages) be transferred from the server to the client before the desired data record is fetched. The number of reads scales logarithmically with the number of data records. Using the RDBMS, the database engine processes the index records locally on the server, then passes only the results to the client. This is far better bandwidth utilization, in that fewer bytes of data must be transmitted, and since fewer packets are exchanged, high latency networks have a less noticeable impact on system response.
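
The logarithmic scaling can be made concrete with a small sketch that estimates how many index pages a client must pull across the network to find one record, and compares the bytes moved with an RDBMS that returns only the result. The function names and all sizes (100 keys per index page, 4096-byte pages, 200-byte records, a 100-byte request) are hypothetical values chosen for illustration.

```python
def index_levels(num_records, keys_per_page):
    """Approximate height of a B+ tree index: pages read to locate one key."""
    levels = 1
    capacity = keys_per_page
    while capacity < num_records:
        capacity *= keys_per_page
        levels += 1
    return levels


def file_sharing_bytes(num_records, keys_per_page, page_bytes, record_bytes):
    """Bytes sent to the client when it walks the index itself."""
    return index_levels(num_records, keys_per_page) * page_bytes + record_bytes


def rdbms_bytes(record_bytes, request_bytes):
    """Bytes on the network when the server walks the index locally."""
    return request_bytes + record_bytes


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 1_000_000_000):
        shared = file_sharing_bytes(n, keys_per_page=100, page_bytes=4096, record_bytes=200)
        served = rdbms_bytes(record_bytes=200, request_bytes=100)
        print(f"{n:>13,} records: file sharing {shared} bytes, RDBMS {served} bytes")
```

In this model the file sharing cost grows by one page (and one more request and reply) each time the table grows by a factor of 100, while the RDBMS cost stays flat. Session management and metadata traffic are deliberately left out here.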

3-Tier RDBMS

The networking advantages of 3-Tier RDBMS are realized only when the non-client tiers execute on the same physical platform. In the previous section, the 2-Tier RDBMS was described as needing to send only the result data across the network. At a finer level of detail, one finds that, in fact, there are three types of data that are transmitted. One is the data itself. Another is session management and/or flow control, which are data packets used to keep the client and server synchronized and informed of one another's state. The third is metadata, or "data about data", specifically information about things such as data type and data length for each column of data in each row in the result set.

While the session management and metadata are required to support some of the advanced functionality in client/server RDBMS systems, they come with a price: network utilization. Clearly, it is necessary to send more data, so more bandwidth is used. Fortunately, the overhead is relatively small, especially as the amount of real data in the result increases. The impact of network latency, however, can be very large, as session management and metadata are typically transmitted in separate packets from the data. The acknowledgment handshaking that must occur, whether coded explicitly in the RDBMS protocol or existing implicitly in TCP/IP, magnifies the effect of latency on system responsiveness.

The 3-Tier RDBMS architecture addresses these network issues by isolating the RDBMS traffic to a single machine. (Some RDBMS engines provide direct, no-network access to processes running on the same machine as the engine, eliminating all network overhead.) The client can converse with the middle tier using a protocol that is highly optimized for the network. There is no need for metadata, since the client and middle tier can use data formats that are hard-coded. Session management can be non-existent in many applications, or can be specifically tweaked for WAN environments and seldom-connected mobile computing.
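
A minimal sketch of the hard-coded format idea follows. It compares a fixed record layout that both client and middle tier are assumed to know in advance with a self-describing encoding that repeats column name, type and length for every value. The record layout, column names and both encoders are hypothetical; they are not the wire format of any actual RDBMS or middle tier product.

```python
import struct

# Hypothetical layout agreed on in advance by client and middle tier:
# 4-byte id, 12-byte name, 8-byte balance, network byte order, no padding.
RECORD = struct.Struct("!i12sd")

# Hypothetical column descriptions used only by the self-describing encoder.
COLUMNS = (("id", b"i"), ("name", b"s"), ("balance", b"d"))


def encode_fixed(rows):
    """Pack rows back to back; no column names, types or lengths are sent."""
    return b"".join(RECORD.pack(cid, name.encode("ascii"), bal) for cid, name, bal in rows)


def decode_fixed(payload):
    """Unpack rows using the layout the receiver already knows."""
    return [(cid, name.rstrip(b"\x00").decode("ascii"), bal)
            for cid, name, bal in RECORD.iter_unpack(payload)]


def encode_self_describing(rows):
    """Send name, type code and length with every value (per-row metadata)."""
    out = bytearray()
    for row in rows:
        for (col_name, type_code), value in zip(COLUMNS, row):
            if type_code == b"i":
                raw = struct.pack("!i", value)
            elif type_code == b"d":
                raw = struct.pack("!d", value)
            else:
                raw = value.encode("ascii")
            name = col_name.encode("ascii")
            out += struct.pack("!B", len(name)) + name + type_code
            out += struct.pack("!H", len(raw)) + raw
    return bytes(out)


if __name__ == "__main__":
    rows = [(1, "Ada", 10.5), (2, "Grace", 99.25)]
    print(len(encode_fixed(rows)), "bytes fixed,",
          len(encode_self_describing(rows)), "bytes self-describing")
    print(decode_fixed(encode_fixed(rows)))
```

The byte saving in a sketch like this is modest, consistent with the observation that metadata overhead is relatively small. The larger gain described above comes from the client and middle tier needing fewer separately acknowledged packets to complete a request.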

Industry Summary

All major client/server development tool vendors are providing "next generation" environments that support application partitioning. While the marketing push is to use 3-Tier topologies to separate business logic or implement business servers, the performance improvements that are possible have not been lost on the technical staffs at companies like Forte. As long as 18 months ago, their product demonstrations included vivid examples of the performance benefits of placing a middle tier software component on the same hardware platform as the RDBMS, where it marshals the results before forwarding them more efficiently to the requesting client.

ODBC vendor OpenLink is capitalizing on this as well, with an implementation that puts a lightweight component on the client which uses a proprietary, RDBMS-independent RPC mechanism to connect to RDBMS-specific middle tiers running on the RDBMS platform. More recently, Visigenic and NobleNet have introduced similar technologies.


Copyright (c) 1996 Scott Nichol.
16-Jun-96