Transport Your Application to Hyper Performance
The performance of an application depends on a large number of factors, many of which are beyond the programmer's control. Although software continues to benefit from processor performance doubling roughly every eighteen months, the same cannot be said of I/O bus technologies, whose performance has advanced far more slowly. Allan McNaughton explains HyperTransport, an exciting technology that significantly boosts I/O bandwidth in the AMD Opteron processor and leaves PCI-X in the dust.

As processors get faster, multiple-processor systems become more common, and applications grow more demanding, traditional bus technologies come under significant strain. The result is an urgent need for a better communications link between interconnected devices and a faster transport mechanism between the processor and main memory. Keeping AMD's high-performance 64-bit Opteron processor fed with data and instructions, for example, is a task that can quickly saturate conventional I/O busses based on older PCI technology. Add in the bandwidth demands of a new generation of multi-processor-capable applications, and you have a real bottleneck.

To overcome the problem of bus saturation, AMD designed the Opteron chip not only to be a fast processor, but also to support the efficient transport of data between interconnected processors, supporting chips, and I/O devices. To reach this goal AMD, working with a consortium of industry vendors, created the HyperTransport technology I/O bus—which transports data at speeds up to 6.4 GB/s. Figure 1 shows some of the basic statistics about HyperTransport—see http://www.hypertransport.org/ for details on the supporting organization.


Figure 1: Basic statistics about HyperTransport.

The HyperTransport bus has benefits that go beyond the obvious. It uses a "packetized" design, which means that addresses, data, and commands are sent along the same wires, allowing for a much narrower link. PCI and its derivatives, by contrast, are wider, slower busses that require dedicated pins and traces for data, addresses, and sideband information. Although the HyperTransport bus may require more cycles to move a given amount of data, the sheer speed of the HyperTransport connection ensures a far higher effective data transfer rate (6.4 GB/s versus 1 GB/s), leaving the poor old PCI-X bus in the dust.
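The trade-off between link width and signaling rate can be checked with a little arithmetic. The sketch below is only illustrative: the link parameters are assumptions that the article does not state (a 16-bit HyperTransport link at 1.6 billion transfers per second in each of two directions, and a 64-bit PCI-X bus at 133 MHz), chosen because they reproduce the 6.4 GB/s and 1 GB/s figures quoted above. It also ignores command and address overhead.

```python
def peak_bandwidth_gb_s(width_bits, transfers_per_sec, directions=1):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes), ignoring protocol overhead."""
    return (width_bits / 8) * transfers_per_sec * directions / 1e9


def transfers_for_payload(payload_bytes, width_bits):
    """Number of bus transfers needed to move a payload (data only)."""
    bytes_per_transfer = width_bits // 8
    return -(-payload_bytes // bytes_per_transfer)  # ceiling division


# Assumed link parameters (not taken from the article).
HT_WIDTH_BITS = 16        # narrow link
HT_TRANSFERS = 1.6e9      # transfers per second, per direction
PCIX_WIDTH_BITS = 64      # wide parallel bus
PCIX_TRANSFERS = 133e6    # transfers per second

ht_gb_s = peak_bandwidth_gb_s(HT_WIDTH_BITS, HT_TRANSFERS, directions=2)
pcix_gb_s = peak_bandwidth_gb_s(PCIX_WIDTH_BITS, PCIX_TRANSFERS)

# The narrow link needs more transfers for the same 64-byte payload...
ht_transfers = transfers_for_payload(64, HT_WIDTH_BITS)
pcix_transfers = transfers_for_payload(64, PCIX_WIDTH_BITS)

# ...but each transfer is so much faster that the payload still arrives sooner.
ht_time_ns = ht_transfers / HT_TRANSFERS * 1e9
pcix_time_ns = pcix_transfers / PCIX_TRANSFERS * 1e9

print(f"HyperTransport: {ht_gb_s:.1f} GB/s, {ht_transfers} transfers, {ht_time_ns:.0f} ns")
print(f"PCI-X:          {pcix_gb_s:.2f} GB/s, {pcix_transfers} transfers, {pcix_time_ns:.0f} ns")
```

Under these assumptions the narrow link takes four times as many transfers to move the same 64 bytes, yet finishes in roughly a third of the time.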

The simplicity of HyperTransport technology enables hardware designers to build less complex systems, as routing a narrow bus is far easier than routing a wider bus. Narrower busses also reduce the need to add layers and additional costs to system board designs, so lower-cost four-layer circuit boards can be used. Although all this may be interesting from a hardware architect's point of view, what it really means to you, the software developer, is that systems using HyperTransport technology offer the outstanding performance at low cost that you've been looking for.

Another problem that the Opteron processor addresses is the relatively slow connection between the processor and supporting circuitry. This connection, commonly called the "front-side bus," is the transport mechanism for all data traveling between the processor and main memory, the graphics card, and all types of I/O devices. The front-side bus transfer rate on prior-generation AMD processors is on the order of 2.1 GB/s. That is fast, but it can still be saturated by the demands of a server configured with multiple processors, high-speed network cards, and fast storage devices. So the Opteron processor replaces the front-side bus with a HyperTransport connection that raises communication bandwidth to as much as 6.4 GB/s.
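To see how a shared bus saturates, it helps to add up the demands placed on it. The following sketch compares one made-up workload against the two capacity figures quoted above. Only the 2.1 GB/s and 6.4 GB/s capacities come from the article; the per-device demand values, and the names used for them, are hypothetical and are there purely to make the arithmetic concrete.

```python
FRONT_SIDE_BUS_GB_S = 2.1   # capacity quoted in the article
HYPERTRANSPORT_GB_S = 6.4   # capacity quoted in the article


def utilization(demands_gb_s, capacity_gb_s):
    """Fraction of the capacity consumed by the summed demands (above 1.0 means saturated)."""
    return sum(demands_gb_s.values()) / capacity_gb_s


# Hypothetical server workload, in GB/s. These numbers are illustrative only.
demands = {
    "memory traffic from two processors": 1.6,
    "two Gigabit Ethernet ports, both directions": 2 * 2 * 0.125,
    "storage controller": 0.32,
}

fsb_load = utilization(demands, FRONT_SIDE_BUS_GB_S)
ht_load = utilization(demands, HYPERTRANSPORT_GB_S)

print(f"Front-side bus: {fsb_load:.0%} of capacity")
print(f"HyperTransport: {ht_load:.0%} of capacity")
```

With these example numbers the workload asks for more than the front-side bus can deliver, while the same traffic uses well under half of the HyperTransport figure.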

Even with the Opteron processor's forward-looking design, the past has not been forgotten. Just as 64-bit Opteron chips can handle prior-generation 32-bit x86 applications with confidence, AMD's implementation of HyperTransport cleanly supports existing I/O technologies such as PCI-X, AGP-8x, USB 2.0, 10/100 Ethernet, and EIDE/ATA.

With HyperTransport, system architects are also freed from the design constraints imposed by traditional bus architectures, specifically the inherent limitations of the popular Northbridge/Southbridge design. Using HyperTransport technology as a building block, one can easily construct a daisy-chained interconnect between system components. With this approach, illustrated by Figure 2, a server could support as many high-speed interfaces (such as Fibre Channel, IEEE-1394 FireWire, Gigabit Ethernet, or InfiniBand) as desired.


Figure 2: A daisy-chained HyperTransport interconnect between system components.
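The daisy-chain idea can be modeled very simply as an ordered list of devices, each linked to its neighbor. The sketch below is a toy model rather than a description of any real chipset: the `Device` class, the helper functions, and the particular order of the chain are all hypothetical, and the device names are borrowed from the interfaces mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    """One component in the chain (hypothetical model)."""
    name: str


def hops_from_host(chain: List[Device], name: str) -> int:
    """Number of point-to-point links a packet crosses from the host to the device."""
    for position, device in enumerate(chain, start=1):
        if device.name == name:
            return position
    raise KeyError(name)


def append_device(chain: List[Device], device: Device) -> List[Device]:
    """Extend the chain by one link at the far end; existing links are untouched."""
    return chain + [device]


# An example chain, ordered from the host outward (order chosen arbitrarily).
chain = [Device("Gigabit Ethernet"), Device("Fibre Channel"), Device("IEEE-1394 FireWire")]
longer_chain = append_device(chain, Device("InfiniBand"))

print(hops_from_host(chain, "Fibre Channel"))
print(hops_from_host(longer_chain, "InfiniBand"))
```

The point of the model is that adding an interface means adding one more link at the end of the chain, not redesigning a shared bus. The cost is that devices further down the chain sit more hops away from the host.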

Faster Applications
Applications that are optimized for multi-processor environments are often constructed using a message-passing architecture. Keeping large numbers of application threads in sync can result in high levels of bus traffic as messages are sent back and forth between processors. Prior to the advent of the Opteron processor's HyperTransport architecture, these messages had to compete with other bus traffic.

And when you add in SMP, everything just gets better: With a multi-processor Opteron system, message-passing applications can achieve their true potential as HyperTransport technology provides a high-speed, chip-to-chip interconnect that significantly reduces the I/O performance bottleneck, with ample performance headroom for future growth. Figure 3 shows one possible way of architecting an SMP system using HyperTransport.


Figure 3: One possible SMP system architecture using HyperTransport.
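For readers less familiar with the message-passing style described in this section, here is a minimal software sketch of the pattern using threads and queues from the Python standard library. It is not taken from any particular application: the `worker` and `run` functions are hypothetical, and the work done on each message (squaring a number) is a stand-in for real computation. On a multi-processor machine, each message exchanged between threads running on different processors becomes traffic on the processor interconnect.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive messages until a None sentinel arrives, replying to each one."""
    while True:
        message = inbox.get()
        if message is None:
            break
        outbox.put(message * message)  # stand-in for real work


def run(values, n_workers=2):
    """Send each value as a message to a pool of workers and collect the replies."""
    inbox = queue.Queue()
    outbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for value in values:
        inbox.put(value)
    for _ in threads:
        inbox.put(None)  # one shutdown sentinel per worker
    for thread in threads:
        thread.join()
    replies = []
    while not outbox.empty():  # safe here: all workers have finished
        replies.append(outbox.get())
    return sorted(replies)     # arrival order depends on thread scheduling


results = run([1, 2, 3, 4])
print(results)  # [1, 4, 9, 16]
```

Even this tiny example sends two messages per work item, plus one shutdown message per worker, which hints at how quickly message traffic grows as the number of threads and work items rises.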

Application performance is further enhanced by the fact that the Opteron processor has a direct connection to main memory, with no bus needed. The integration of a memory controller into the processor core significantly reduces memory latency because memory transactions no longer need to traverse the traditional memory access path through the "Northbridge" chip. This reduction in memory latency, coupled with the increase in memory bandwidth available directly to the processor, benefits system performance across all application segments.

With HyperTransport technology it is now possible to build servers and technical workstations that are faster, cheaper, and simpler than ever before. Not only will your typical application benefit from the Opteron processor's high-performance I/O bus and direct-to-memory interface, you'll also find that the HyperTransport technology-based processor interconnect can yield significant improvements in the performance of multi-processor-capable applications. Want to learn more about how it works? See HyperTransport Technology I/O Link: A High-Bandwidth I/O Architecture (PDF).

Allan McNaughton is the principal analyst at Technical Insight LLC. He can be reached at allan@technical-insight.com.