Brief: Advances in caching architectures have improved storage performance on the AS/400 system and have enabled a high-performance, RAID-5 data-protection technique. This article defines cache and explains how the multiple-level cache implementation on the AS/400 works.
The function of a cache is to stage data into a region that is faster but more costly than the region being served by the cache. In computer systems, cache structures can be associated with main storage, software, communications I/O, and disk I/O; there are instruction caches, data caches and more. In the case of virtual memory systems, the disk can be viewed as the region being served by the cache, and the mainstore (memory) can be viewed as the cache. This structure has been modified by IBM and others by adding read caches and write caches to the disk subsystem to enhance I/O performance.
Cache Effectiveness
Cache effectiveness in the context of an individual cache experiment can be defined as the number of cache hits divided by the total number of requests to the region being serviced by the cache. A hit means that a data access was satisfied by the cache, theoretically providing better performance at the system level.
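To make the definition concrete, here is a minimal Python sketch (not from the original article; the function name hit_ratio and the LRU replacement policy are illustrative assumptions) that replays a request stream against a small cache and reports hits divided by total requests:

    from collections import OrderedDict

    def hit_ratio(requests, cache_size):
        """Replay a request stream against a simple LRU cache and
        return hits / total requests -- the effectiveness measure
        defined above."""
        cache = OrderedDict()              # keys are block addresses
        hits = 0
        for block in requests:
            if block in cache:
                hits += 1
                cache.move_to_end(block)   # refresh LRU position
            else:
                cache[block] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict least recently used
        return hits / len(requests)

    # A workload with strong locality of reference scores well; a random
    # stream of the same length over a large address space scores poorly.
    local = [0, 1, 2, 0, 1, 2, 0, 1, 2, 3]
    print(hit_ratio(local, cache_size=4))  # 0.6 -- 6 of 10 requests hit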
Cache effectiveness at a system level is determined by the cache design, the workload it is operating under, and system resource utilization and availability. Workload is the work being done by the system and the sequence in which it is done. Workload is specified by a set of parameters that describe the data and the system during execution time. The important parameters for cache effectiveness are locality of reference, sequence, priority, record size, and workload synchronization. The system resources that influence performance are the CPU, mainstore, input/output processors (IOPs), and disks. All of these variables have a measurable effect on the operation of a cache and are unique for each customer and application.
Locality of reference is a measure of the data's randomness. It describes how physical references to data are located over a given period of time or in a range of storage space. Workload synchronization describes the degree of independence of a program from the I/O it performs. Synchronous access requires the program to wait for the I/O step to complete before executing the next instruction. Asynchronous access allows the program to continue processing while the I/O operations are in progress. A cache is more beneficial to synchronous access; however, it benefits both synchronous and asynchronous workloads, since both access the same set of disks.
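As a rough illustration of the distinction, this hypothetical Python sketch contrasts a blocked, synchronous I/O step with an asynchronous one that overlaps computation (the timings and names are illustrative, not from the article):

    import threading, time

    def disk_io():
        time.sleep(0.05)          # stand-in for a ~50 ms disk access

    # Synchronous access: the program waits for the I/O step to
    # complete before executing the next instruction.
    start = time.time()
    disk_io()                     # the program is blocked here
    print("sync elapsed: %.0f ms" % ((time.time() - start) * 1000))

    # Asynchronous access: the program continues processing while
    # the I/O operation is in process.
    start = time.time()
    io = threading.Thread(target=disk_io)
    io.start()
    useful_work = sum(range(100_000))   # computation overlapped with I/O
    io.join()
    print("async elapsed: %.0f ms" % ((time.time() - start) * 1000))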
If a workload is highly sequential and has a high percentage of reads, a large read cache can be beneficial. If the workload is characterized by small, randomly scattered records, a large read cache can be a hindrance. If the CPU or the mainstore is 100 percent utilized, no amount of cache will help because the disk I/O is not limiting system performance. The challenge is to design an architecture that manages storage resources in a manner that delivers optimal system performance. Workload often covers the extremes during the course of an hour and varies greatly depending on the application and system configuration. For this reason, a cache design must be considered as part of a system architecture.
Once a system cache architecture is chosen, the designer must consider several factors:
o Cache management overhead vs. cache size.
o The number of segments or channels in the cache.
o Cache adaptability to varying workload conditions.
o The cache management architecture (which data gets moved out of the cache and when, which data should be logically grouped for optimal data movement, and which data is left alone).
It is important to realize that cache effectiveness should be viewed only in the context of overall system performance. Making assessments of subunits within an overall architecture is not a reliable way to predict system performance.
Read Cache
The function of read cache in disk I/O operations is to fetch data from disk storage and hold this data resident in solid-state storage in anticipation of a future request. If a hit occurs, the response is based on microsecond access times as opposed to millisecond access times. The net effect is a decrease in the average response time and an increase in throughput from the associated disk array.
The first of three types of read cache covered here is located at the disk level. Referred to as arm buffers, these caches range in size up to 1 megabyte (MB) and are highly efficient compared to a controller cache. The efficiency is based on the arm buffer's ability to abort current read-ahead activity with very little penalty. It works by reading ahead and gathering more data than the system is requesting from the current operation.
The second, and more typical, implementation of read cache places a fairly large cache between disk arrays and the system bus (on the disk subsystem controller). This read cache operates independently from the system. Dynamic optimization is limited to sampling the data patterns for the subset of system disk it is bound to. For example, if the host issues a 1 kilobyte (KB) read request, the controller read cache may perform a 4KB read; if the host issues a 16KB read, the controller read cache may perform a 64KB read. The amount it reads is based on averaging the characteristics of the data passing through it.
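That sampling behavior can be caricatured in a few lines; the following is a hypothetical Python sketch of average-based read-ahead sizing, not the actual controller microcode:

    class ReadAheadSizer:
        """Hypothetical sketch of a controller read cache that sizes its
        read-ahead by averaging recent host request sizes, as described
        above (e.g., 1KB host reads -> 4KB fetches, 16KB -> 64KB)."""

        def __init__(self, multiplier=4, window=32):
            self.multiplier = multiplier   # fetch this multiple of the average
            self.window = window
            self.recent = []               # sliding window of request sizes (KB)

        def record(self, request_kb):
            self.recent.append(request_kb)
            if len(self.recent) > self.window:
                self.recent.pop(0)

        def fetch_size(self):
            avg = sum(self.recent) / len(self.recent)
            return avg * self.multiplier

    sizer = ReadAheadSizer()
    for _ in range(8):
        sizer.record(1)        # stream of 1KB host reads
    print(sizer.fetch_size())  # 4.0 -- the controller would fetch ~4KB

    for _ in range(32):
        sizer.record(16)       # workload shifts to 16KB reads
    print(sizer.fetch_size())  # 64.0 -- read-ahead grows with the average

Note that the sizing decision uses only the averaged request sizes; nothing in it reflects the origin of the data, the type of data, or the state of the host, which is the limitation discussed later in this article.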
Data integrity is not an issue with read cache because a copy of the data is secure on the disk. The typical, controller-based read cache pictured in Figure 1 is fairly easy to implement and has been widely available. The disadvantages of this approach are the independence of the system or lack of "knowledge" about what the system is doing or is going to do, the limited span of optimization (being tied to a physical subset of the total storage space), and the cache's inability to react effectively to changes in system workload activity.
The third and most sophisticated type of read cache takes advantage of virtual memory and mainstore to provide a global read-cache function for the entire system. In the case of the AS/400, this implementation is called expert cache.
Write Cache
The function of the write cache is complementary to, but different from, the read cache. While the read cache anticipates future reads from the disk, the write cache is working to reduce the total number of write operations going to the disk. Most importantly, the write cache provides a fast response time to the host for write operations. This task is accomplished by storing data in solid-state storage in anticipation that an upcoming storage operation will go to an address present in the write cache.
When a write occurs to a location resident in the write cache, the data in this location is replaced with the new data and a write to the disk device is avoided. The cache collects sequential data to create larger blocks of data for more efficient arm utilization and services read requests when the data happens to be present in the cache. All this has the effect of reducing the total number of accesses to the disk array, therefore reducing average response time and increasing throughput.
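A minimal sketch of this write-absorption behavior follows, assuming a simple block-addressed cache; the class and its destage step are illustrative, not the AS/400 implementation:

    class WriteCache:
        """Sketch of a write cache that absorbs overwrites: a write to an
        address already resident replaces the cached data in place, so no
        additional disk write is generated for it."""

        def __init__(self):
            self.dirty = {}        # block address -> pending data
            self.disk_writes = 0   # writes that actually reach the disk

        def write(self, addr, data):
            # Overwriting a resident block costs no extra disk I/O.
            self.dirty[addr] = data

        def read(self, addr):
            # Reads are serviced from the cache when the data happens
            # to be present, as the article notes.
            return self.dirty.get(addr)

        def destage(self):
            # Flush dirty blocks in address order; a real cache would also
            # group adjacent blocks into larger, arm-efficient operations.
            for _addr in sorted(self.dirty):
                self.disk_writes += 1
            self.dirty.clear()

    wc = WriteCache()
    for i in range(100):
        wc.write(addr=42, data=i)   # 100 host writes to the same block...
    wc.destage()
    print(wc.disk_writes)           # 1 -- ...collapse into a single disk write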
A special case of a write cache is a write buffer. A write buffer does not reduce the total number of I/Os reaching the disk but rather holds the data until it can be stored. Write buffers can help performance to a limited extent because data flow is variable in time. The buffer acts as a fast-response holding stage while the disk arm is busy.
With a write cache or write buffer, a design consideration even more important than performance is data integrity. The system operates on the assumption that all completed write operations are secure on a disk device. Therefore, the level of data security in the write cache must be the same as or greater than that of the disk. To ensure data integrity, a write cache design should have the following characteristics: dual copy, nonvolatility (battery support), and portability.
Another important parameter is error recovery. If something should go wrong immediately after the system has been notified that a write operation is complete, the write cache must be capable of recovering as though the data were totally secure on the disk. Suppose the system is notified of a write complete and the I/O controller fails as data is being moved from the write cache to the disk. The system knows that there has been an I/O controller failure, but it assumes the data is safely on the disk. Thus the I/O controller microcode must be capable of establishing the conditions the system is expecting immediately after installation of the new I/O controller.
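The dual-copy requirement can be sketched as follows; this hypothetical Python example acknowledges a write only after two independent nonvolatile copies exist, so a surviving copy can re-establish the conditions the system expects:

    class DualCopyWriteCache:
        """Hypothetical sketch: each cached write is held in two independent
        (battery-backed) stores before the host is told it is complete, so a
        single failure never loses data the system believes is on disk."""

        def __init__(self):
            self.primary = {}    # first nonvolatile copy
            self.mirror = {}     # second, independent nonvolatile copy

        def write(self, addr, data):
            self.primary[addr] = data
            self.mirror[addr] = data
            return "complete"    # host may now assume the data is secure

        def recover(self, surviving_copy):
            # After a controller failure, the replacement controller must
            # re-establish the state the system expects: every acknowledged
            # write is destaged to disk from the surviving copy.
            return dict(surviving_copy)

    cache = DualCopyWriteCache()
    cache.write(0x10, b"payroll record")
    # Simulate losing the primary copy during destage:
    pending = cache.recover(cache.mirror)
    print(pending)  # {16: b'payroll record'} -- acknowledged data survives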
The AS/400 Storage I/O Cache Architecture
Among the attributes associated with the AS/400 are simplicity and ease of use. These attributes often hide the fact that beneath the covers of the AS/400 is a sophisticated storage architecture, one of the most advanced in the industry. This architecture, which features single-level storage with persistent addressing and an integrated database, has been exploited to create a cache architecture.
The single-level storage provides applications with a continuous address space where all data is visible and available to any authorized user. This feature, in combination with the integrated database, allows applications access to data without necessarily demanding I/O operations. It also allows storage management to predictably retrieve stored data and to retrieve this data in the most efficient manner.
The AS/400 storage I/O cache architecture has three key components: disk- device read cache, disk-array nonvolatile write cache, and expert cache. Figure 2 is a conceptual picture to help you visualize the architecture. The noticeable element eliminated from this architecture is the traditional, array controller read cache (see Figure 1). Instead, the AS/400 has deployed expert cache as part of storage management. As a result, the AS/400 is able to selectively enhance storage I/O performance using the mainstore already present on the system.
Mainstore is partitioned into storage pools. Expert cache manages the caching for the jobs within the pools. Its artificial intelligence caching algorithm continuously monitors physical and logical storage activity, dynamically adjusting the caching parameters for a given pool.
At a logical level, the caching algorithm is able to predict future demand and to cross-reference data for maximum sharing or, conversely, for minimum disk access. At the physical level, it is continuously managing the mainstore and disk behavior. At one instant, it may request a small record from a particular disk; at the next instant, a long sequential group of records from a group of disks. It may choose to retain the small record in mainstore for an anticipated request from another job, while immediately dumping the long sequential data after it has been used. Expert cache manages the data flow in a way that is most beneficial overall to total performance. It must have full control of the disk storage and cannot function properly if it must compete with a controller read cache.
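The actual expert cache algorithm is proprietary, but the retain-versus-discard decision just described can be caricatured in a few lines; everything below is a hypothetical Python illustration, not the AS/400 microcode:

    def plan_fetch(access_history, shared_by_other_jobs):
        """Hypothetical caricature of the retain-vs-discard decision
        described above; the real expert cache algorithm is far more
        sophisticated and monitors both logical and physical activity."""
        # Detect a sequential run in the recent accesses for this object.
        sequential = all(b - a == 1
                         for a, b in zip(access_history, access_history[1:]))
        if sequential:
            # Long sequential data: fetch a large group of records from
            # several disks, then release the pages once they are used.
            return {"fetch": "large sequential group", "retain": False}
        if shared_by_other_jobs:
            # Small record another job is expected to want: keep it in the
            # mainstore pool to avoid a second disk access.
            return {"fetch": "single record", "retain": True}
        return {"fetch": "single record", "retain": False}

    print(plan_fetch([100, 101, 102, 103], shared_by_other_jobs=False))
    # {'fetch': 'large sequential group', 'retain': False}
    print(plan_fetch([7, 250, 31], shared_by_other_jobs=True))
    # {'fetch': 'single record', 'retain': True}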
Before developing the AS/400 cache architecture, the IBM development team conducted extensive research on cache design and effectiveness. Models were developed to simulate various cache configurations, and actual customer workloads were used as input to these models. The first study investigated read caches. The analysis revealed that the segmented, disk arm caches or buffers are highly effective.
Figures 3 and 4 show two different workloads. Workload A is read-intensive and sequential; workload B has a higher percentage of write operations. The horizontal line in each figure shows the effectiveness of a read buffer on the disk arm. The other two curves show the effect of adding a controller read cache with and without the arm cache. As you can see, the workload with the higher percentage of writes benefits less from a read cache than the read-intensive, sequential workload does.
This analysis was performed across many workloads. The additional price/performance benefit that could be gained by adding controller read caches above and beyond the arm caches was often small and, from a DASD controller perspective, unpredictable. As you look at Figures 3 and 4 again, you'll see the relationship between the size of the cache and its effectiveness, as well as the effect of the workload. The higher the number of hits, the more effective the cache.
Observe workload B in Figure 4; read cache in addition to the arm caches would not be cost beneficial. The read cache would need to be very large before any gain could be detected at the system level. In workload A, more can be gained from additional read cache. The optimal way to cover cases where additional read cache is beneficial is to add it selectively. The answer to this challenge is expert cache.
Benchmark measurements taken with expert cache turned off show that controller read cache benefits certain types of workloads while impeding others from reaching full performance potential. This is because the controller-based read cache designs treat all data the same. Optimization is typically based on averaging the size of data passing through the cache and then adjusting the caching parameters accordingly. This procedure occurs without regard to the origin of the data, the type of data, or the state of the host. Consequently, unless the workload is very well-behaved as defined by the cache, the caching is simply not optimized.
The only way the controller-based read cache designs can minimize conflicting resource requirements is to become very large. As the cache size becomes very large, the overhead to manage the cache becomes an issue, as do cost and reliability. The net effect on system performance is that the controller-based read cache offers little price/performance benefit above disk arm caches except in cases where the workload happens to fit the design point optimization.
The choice of expert cache or controller read cache must be made up front. Controller-based read cache can conflict with expert cache, resulting in degraded system performance. Figure 5 shows an actual test of expert cache and a controller-based read cache. As the application proceeds, cache conflicts develop. Expert cache is requesting a particular set of data based on changing system activity, while the controller-based read cache is requesting a set of data based on its sampling algorithm. The results show that the job run time increased when both caches were active.
The benchmark in Figure 6 demonstrates the efficiency of expert cache as well as its ability to service two potentially conflicting workloads. Notice that expert cache achieved faster run time than the controller-based read cache. Even when the controller-based read cache size was increased, expert cache still performed better. When benchmarking was done, the sizes of the storage pools for expert cache and controller read cache were equal. The size of the controller read cache was chosen to give the best results. Thus, the total storage dedicated to the controller read cache measurement would actually be 116MB when the 20MB storage pool in mainstore is included.
Expert cache is highly scalable because the storage-pool size is variable; if beneficial, mainstore size can grow as well. The workload defines the optimal caching potential achievable for a given cache design. Expert cache and controller-based read cache are thus on equal footing in this sense. Expert cache exceeds controller-based read cache capability because, in addition to the caching attributes it shares with its simpler cousin, it has the significant system-view attributes that give it the competitive advantage in the final analysis. The AS/400 development team has proven this fact in many benchmark experiments. In general, expert cache offers better performance with far less physical caching storage than the I/O controller methods.
Figure 7 contains a conceptual view of expert cache. The picture is drawn to emphasize the point that expert cache is the collection of system components operating under the command of the system's internal licensed microcode. The physical location of cached data is in mainstore; but, unlike controller-based read cache, the entire system is part of the implementation.
In a manner similar to expert cache, write caching can gain maximum optimization at the mainstore level. The problem with this approach is that mainstore would be required to be as secure as disk, which forces very expensive technology and system real estate. The approach chosen by the IBM developers was to place a nonvolatile write cache above the storage array it services. The write cache research provided dramatic results; Figure 8 shows three different workloads and the effectiveness as a function of size. It is clear that performance is highly dependent on workload; and again, the effectiveness as seen in the context of this experiment does not portray system performance.
With the 9337-2xx models or the new AS/400 Advanced Series with internal RAID, and OS/400 V2R3 (containing expert cache), the AS/400 provides all the elements necessary to enhance system performance through caching. The benefit of this solution has been verified in actual benchmark analysis done with competitive products. As the original modeling predicted, the disk with arm caches performed very favorably compared to the controller-based read cache in many of the cases, without the benefit of expert cache. When expert cache is invoked, the margin is increased in those cases where additional read caching is beneficial.
Conclusion
The value that the AS/400 brings to the table is the integration of storage management complexity and function. The AS/400 architecture enables sophisticated caching techniques that offer customers total optimization of system performance. Future direction will include increased integration of storage function as well as greater levels of fault tolerance and performance optimization. IBM's goal is to have the AS/400 storage "taken for granted" so that customers are able to focus on business optimization and not on storage optimization.
Steven J. Finnes has spent the majority of his IBM career in the AS/400 development lab. He was responsible for the AS/400 DASD strategy and product plan that led to the introduction of RAID and the associated cache architecture.
He would like to acknowledge the team that worked through the long winter doing the mathematical modeling and analysis that led to the current implementation on the AS/400. This team included B. Glanzt, T. Mullins, B. Collins, W. Larson, G. Bartels, B. Nelson, M. Johnson, A. Waltz, B. Clark, S. Johnson, and others.
Figures (graphics not reproduced; captions retained):
Figure 1 Controller-based Read Cache
Figure 2 AS/400 Cache Architecture
Figure 3 Read Cache Effectiveness - Workload A
Figure 4 Read Cache Effectiveness - Workload B
Figure 5 Expert Cache Test Results
Figure 6 Expert Cache Benchmarks
Figure 7 Expert Cache Conceptual View
Figure 8 Write Cache Effectiveness