DB2 System Performance Metrics
Responsive Systems Company
Morganville, NJ 07751
DB2 is a complex database management system that has a major impact upon your overall processing environment. Installations are constantly looking for ways to measure, analyze, and control its resource consumption. This presentation assumes that the reader has a basic understanding of DB2, and does not define basic DB2 terminology. It addresses the origins of, and data available for, analyzing, tuning, and tracking the performance of the DB2 system and applications, and the interdependencies and relationships between the various data elements. Many of the system tuning options are explained and the trade-offs discussed, since systems and applications differ somewhat at each corporation. The Data Sharing Sysplex is not discussed in this presentation.
DB2 is viewed by many people as unpredictable, and sometimes as a "black hole" that consumes all available resources. However, DB2 does not have to be unpredictable, or consume all available resources. There is adequate information available from DB2 and other sources to accurately track resource consumption, determine when the system or applications are not performing properly, and determine the cause so corrective action may be taken.
This paper focuses on the data in the DB2 Statistics and Accounting records, and discusses how this data can be used together with other MVS system information sources, such as RMF, for performance analysis and tuning of the DB2 system and applications.
DB2 Record Structures
The Statistics (SMF Type 100) and Accounting (SMF Type 101) records both use self-defining DSECTs that may or may not occur, depending upon the functions and types of activity at the system and application level. The SMF Type 102 records fall into two categories. The first two records are general in nature, and contain useful system information such as the parameters DB2 is currently using (ZPARM and dynamic system specifications), and tables that map Database ID (DBID) and Object ID (OBID) to the actual Database and Tablespace/Table names. The balance of the 102 records produced by DB2 are specific performance trace records. Many of these are also useful for several performance tuning approaches.
The first place to start is with an assembly of the applicable macros to produce record formats. The macros exist in the DB2 Maclib, and are:
STATS DSNDQWST SUBTYPE=ALL
PERF DSNDQWIW SUBTYPE=ALL
Note: The Statistics macro requires the SUBTYPE parameter because there are actually two records produced by DB2. Both records use SMF type 100; however, each has a different DB2 Instrumentation Facility Counter ID (IFCID). IFCID 1 contains the system-related functions handled by the DB2 System Services address space, commonly called the DB2 Master. IFCID 2 contains database-related functions handled by the DB2 Database Services address space, commonly known as the DB2 Database Manager. The performance trace records use IFCIDs from 4 through 241 in DB2 V3.1, with an expansion up to IFCID 306 in V4.1. The Accounting record is IFCID 3. To obtain a total picture of all possible records produced by a DB2 system, you should also assemble DSNDQWIW using SUBTYPE=ALL.
The Statistics records are produced when DB2 is started or stopped, at timed intervals as specified in the ZPARM, when Performance or Statistics traces are started, and when DB2 takes a checkpoint. The normal and default interval for Statistics is 30 minutes, plus or minus a few seconds. It is not possible to 'synch' these occurrences to an interval as with RMF.
The Statistics records are much more difficult to deal with than the Accounting records because most of the fields contain values accumulated since DB2 was last started. The values from the preceding record must be saved and subtracted from the current values to obtain the activity for the current time interval. Given this, it is important to know which fields are not accumulated. Without listing all of them, the more important ones are the EDM Pool data, the RID Pool data, and the number of active buffers for each Buffer Pool that has been used.
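The interval-delta step described above can be sketched as follows (the field names here are hypothetical, not actual SMF field labels); the snapshot fields such as the EDM Pool data and active buffer counts are passed through rather than subtracted:

```python
# Sketch: derive per-interval activity from cumulative DB2 Statistics
# counters by subtracting the prior record's values. Snapshot fields
# (EDM Pool, RID Pool, active buffers) are point-in-time values, not
# counters, so they are copied through unchanged.

SNAPSHOT_FIELDS = {"edm_pool_pages", "rid_pool_blocks", "active_buffers"}

def interval_activity(previous, current):
    """Return the activity for one Statistics interval.

    previous/current: dicts of field name -> value from consecutive
    SMF 100 records. A counter smaller than its predecessor means DB2
    was restarted, so the current cumulative value is taken as-is.
    """
    delta = {}
    for field, value in current.items():
        if field in SNAPSHOT_FIELDS:
            delta[field] = value              # snapshot, not a counter
        elif value >= previous.get(field, 0):
            delta[field] = value - previous.get(field, 0)
        else:
            delta[field] = value              # counter reset: DB2 restart
    return delta
```

For example, a getpage counter moving from 1000 to 1500 between records yields 500 getpages for the interval, while an active-buffer snapshot of 480 is reported as 480.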
The first thing necessary to understand is when Accounting records are produced, since this is dependent upon the attaching environment and the options of the Allied Agent. Accounting records are produced when a thread is terminated (i.e., a batch job ends), or when the Authorization ID (AUTHID) changes if an existing thread is reused.
Whenever an AUTHID changes, DB2 considers this a SIGNON process, and must validate the authorization of the AUTHID to access the system, the PLAN, and the underlying resources. This is obviously overhead we would like to reduce as much as possible in high-volume transaction environments. DB2 provides an 'Authorization Cache' as a block in the EDM Pool, 1K by default, where up to 124 of the most frequently used AUTHIDs are maintained for each PLAN. The cache size can be specified (when a PLAN is bound) up to 4K, which can cache 508 AUTHIDs. The Authorization Cache eliminates Catalog access for validation if the AUTHID is in the cache; for large systems with thousands of users, there will still be a high level of Catalog access even with the 4K cache. A PLAN may be granted to 'Public' and not require authorization checking, but this may create a security exposure. The various CICS RCT options can dramatically reduce this overhead by passing the same AUTHID to DB2 for all users. This can improve throughput; however, it causes a loss of accountability and performance tracking facilities because the activity and resource consumption of many users are combined into one DB2 Accounting record. A CICS application 'Syncpoint' causes a DB2 COMMIT, but does not cause thread termination or an Accounting record to be produced. However, specifying TOKENE=YES in the CICS RCT causes a signon to take place, and will produce an Accounting record for each transaction. DB2 provides a 'token id' as part of the record that can be matched to activity records on the CICS side.
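The entry counts quoted above (124 AUTHIDs at the 1K default, 508 at the 4K maximum) are consistent with a layout of roughly 8 bytes per cached AUTHID plus a 32-byte header. A sketch of that arithmetic, with the 8-byte and 32-byte figures being assumptions that reproduce the quoted counts:

```python
# Sketch: estimate how many AUTHIDs fit in a PLAN's authorization cache,
# assuming 8 bytes per cached AUTHID plus a 32-byte cache header.

def auth_cache_entries(cache_bytes):
    """Approximate number of AUTHIDs a cache of cache_bytes can hold."""
    return (cache_bytes - 32) // 8

# The 1K default holds 124 AUTHIDs; the 4K maximum holds 508.
```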
Looking at Performance Indicators
There are two areas that must be considered before we look at actual performance indicators: first, the functions and responsibilities of the DB2 System address spaces and the Allied Agent address spaces; and second, where the CPU components accrue and the type of component based upon the function. The following chart (Chart 1) will begin to shed some light upon where the charges accrue, why, and what is considered normal versus an indicator of unusual activity. The chart does not include information for the Version 4 Stored Procedures address space, because of a lack of measurement data at this time.
System Performance Indicators
The first area to address is the CPU component of the three major DB2 system address spaces. This paper will not address the DDF address space or distributed facilities. DB2 uses both TCB and SRB mode for its system-level operations. It uses SRB mode extensively for the performance-critical functions because it requires less overhead to dispatch an SRB than a TCB, and because SRBs are non-preemptable (not considering the Enclave SRB available with MVS/ESA V5). Let's start with the DB2 System Services address space, commonly referred to as the DB2 Master, MSTR, or SSAS. The normal high-priority tasks, Logging and Checkpointing, use SRB mode. They usually do not consume large amounts of CPU, with the impact of the logging and checkpointing facilities directly related to the insert/update/delete activity of the applications. The other function that uses SRB mode is BACKOUT. Backout is a critical function for maintaining data integrity. When a backout situation is encountered by DB2, it is handled as the highest-priority task in the system, and will consume major amounts of CPU at the dispatching priority of the MSTR address space (refer to my paper in the CMG90 proceedings for an analysis of this activity). The other tasks performed by the MSTR address space are BSDS access, Archiving, and allocating the Archive Log datasets. If no Archiving or Backout activity takes place within a Statistics interval, the total CPU time will be quite small compared to the Database Manager address space, and should be about 90% SRB time. If Archiving takes place, the ratio may be about 50% each of TCB and SRB, or may weigh slightly more to the TCB side; there is quite a bit of latitude here. If a Backout of any size occurs, there will be little doubt in your mind when you look at the SRB component, and you may have seen the lights dim at your processing complex while this address space consumed most of your CPU resources during the Backout process.
Chart 1 shows the address space incurring the cost of the various functions in a DB2 environment.
Where Do the CPU Charges Accrue?
The Database Manager (DBAS or DBM1) address space should show the largest CPU component, and most of it should be indicated as SRB time during normal processing hours. Note the "normal processing hours". This is normally your prime shift, possibly 9AM to 5PM, possibly longer depending upon your installation. Unless your system has decision support applications or something similar that accesses some tablespaces infrequently, there should not be any TCB time reflected in the CPU component. If there is, it means that some tablespaces and/or indices have the default Closerule of YES. This will certainly have a major impact upon response time. If there is true open/close activity, it will noticeably degrade overall system performance and may tend to fragment the EDM Pool. A Closerule of YES causes additional overhead for the DBAS address space even when objects are not actually closed and reopened, because they are managed on a "slow close" or drain queue. Additionally, DB2 will switch an object to "read only" status and close the SYSLGRNG entry (based on the ZPARM specification) to avoid backout/recovery implications, and to decrease the amount of work and the number of records written at checkpoint intervals.
The IRLM (IMS/VS Resource Lock Manager) handles only locking, and uses SRB mode 99.9% of the time. There is an occasional slight tick of TCB mode; however, it is so small that it probably doesn't matter. Normally, the total amount of time accrued to the IRLM, compared to the DBAS and SSAS, is between small and insignificant - somewhere in the range of 1% or less of the SRB component of the DBAS. Obviously, there are exceptions. Since the IRLM must maintain all locks on all objects in the system, the greater the total number of locks, the more work it must do. There are several parameters that can affect IRLM overhead. The first is MAXLKTS (Maximum Locks per Tablespace), the second is IRLMRWT (Timeout for Lock Waits), the third is the Deadlock Detection Cycle time specified on the IRLM Proc, and the last is the PC parameter on the IRLM Proc. The Deadlock Detection cycle time should be set at 50% or less of the Timeout value, so deadlocks will be detected before a Timeout occurs. However, if it is set too low, the IRLM may consume more resources than necessary. Additionally, while the IRLM is performing its detection cycle, it cannot handle any other locking requests, and they must wait. As the number of locked resources increases, the overhead to perform the detection cycle also increases. Whenever a timeout is detected, the IRLM will perform Deadlock Detection twice to ensure that no Deadlock situations exist that may be affecting and/or causing the Timeout. The trade-off here is to balance the overhead with overall system and application performance: get rid of deadlock situations quickly, while keeping the overhead as low as possible. Some installations with high numbers of deadlocks have reduced the detection cycle time as low as one (1) second. DB2 Version 3 lock avoidance techniques, and especially Version 4 Type 2 indexes, substantially reduce the number of locks in a system.
Some installations experienced a 50%+ reduction in the number of locks system-wide when Type 2 indexes were implemented. The last parameter is PC=N (the recommended setting) on the JCL Proc. This places all locks (250 bytes each) into ECSA instead of the private area of the IRLM, reducing overall locking overhead by about 5%. It is important that ECSA is sized large enough to hold 3-5 MB of locks. If it isn't, DB2 becomes very unfriendly: the IRLM will simply get whatever additional space it requires from CSA below the 16M line (up to the maximum amount specified in the startup proc), and never tell you or issue any warning messages. If your CSA utilization is somewhat on the "tight" side, this will certainly ruin your day. Now let's start to look at specific performance indicators available in the Statistics record(s).
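At 250 bytes per lock, the ECSA sizing guidance above can be sanity-checked with a little arithmetic (a sketch; only the 250-byte-per-lock figure comes from the discussion above):

```python
# Sketch: relate ECSA sizing to IRLM lock capacity at 250 bytes per lock
# (the per-lock size cited above for PC=N).

LOCK_BYTES = 250

def ecsa_for_locks(lock_count):
    """Approximate ECSA bytes consumed by lock_count concurrent locks."""
    return lock_count * LOCK_BYTES

def locks_in_ecsa(ecsa_bytes):
    """Approximate number of concurrent locks that fit in ecsa_bytes."""
    return ecsa_bytes // LOCK_BYTES

# 5 MB of ECSA holds roughly 20,000 concurrent locks.
```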
A Prefetch Request may often read fewer pages than the maximum prefetch quantity if some of the pages in the requested block are already in the pool. The hit ratio should be used in conjunction with the I/O rate per second, which can be calculated by adding all the I/O events and dividing by the duration of the statistics interval in seconds.
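As a sketch (field names are hypothetical), the hit ratio and I/O rate described above can be computed from the Statistics interval counters like this:

```python
# Sketch: buffer pool hit ratio and I/O rate from interval counters.
# Pages actually read from DASD = synchronous reads plus all pages
# brought in by the three prefetch types; everything else was found
# in the pool.

def hit_ratio(getpages, sync_reads, prefetch_pages):
    """Overall pool hit ratio: fraction of getpages satisfied without I/O."""
    pages_read = sync_reads + prefetch_pages
    return (getpages - pages_read) / getpages

def io_rate(sync_reads, prefetch_ios, interval_seconds):
    """I/O events per second over the Statistics interval."""
    return (sync_reads + prefetch_ios) / interval_seconds
```

Applied to the Buffer Pool 0 figures in the report shown later (8,775,990 getpages; 327,016 synchronous reads; 4,501,459 total prefetch pages; 157,343 prefetch I/Os over one hour), this yields a hit ratio of roughly 45% and an I/O rate of about 135 per second.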
A major impact, although not encountered too often, is Prefetch Disabled - No Read Engine. This is based upon the maximum number of Media Manager Request Blocks (MMRBs) that DB2 has available. The number is presently fixed at 300; however, this can change at IBM's discretion, and will take effect when you apply that maintenance to your system. This situation has been encountered several times; however, it could not be repeated to determine its cause. It is often an indication of I/O subsystem contention that is not allowing Prefetch to complete fast enough, so that all the "engines" have requests queued waiting for completion. This will show up in the application Accounting record as "wait for asynchronous I/O"; however, these wait times cover all asynchronous reads the application waited for, not only or necessarily those caused by a lack of read engines. Another indicator that is not fully understood by many people is Synchronous Writes. These are triggered when the IWTH is reached; however, there are two other situations that increment this counter and do not have the same overall performance impact: closing a tablespace while updated buffers are still in the pool, and a second checkpoint occurring while updated pages that have not been written are still in the buffer pool. So, if prefetch has not been disabled, and DMTH has not been reached, then low numbers of synchronous writes are nothing to worry about.
Aside from application design and SQL coding, buffer pool tuning is the single greatest tuning lever available to improve application performance. Hopefully, the existence of sixty pools dispels any myth that using one large pool is the best way to optimize performance. HiperPools are an effective way to leverage less expensive expanded storage to obtain better overall hit ratios and eliminate I/O. However, many installations have implemented HiperPools with minimal benefit - not because they don't work, but because the basics of pool tuning were not done. It is important to understand what types of objects can benefit from larger amounts of pool memory, whether it is all VP, or a mix of VP and HP. Very large objects that are accessed using sequential prefetch will not benefit very much, if at all. Very large objects with very random access usually will not see great improvements. Objects with very high update rates will not get much improvement from an HP. When HPs are implemented, it becomes just as important to evaluate their benefit as to track overall performance. A few different approaches should be used to evaluate the benefit from an HP.
The first formula - the number of pages read back into the VP from the HP, as a percentage of the pages written to the HP - shows the amount of benefit provided by the HP. While low percentages usually imply little benefit, the number of pages read into the VP from the HP should also be divided by the measurement period to determine the I/O per second that the HP saved. If this is very low, then the HP is not providing any benefit. However, for very high volume systems, it is possible to have a percentage in the low teens and still save several I/Os per second. The VP hit ratio shows the percentage of pages found in the VP only, differing from the overall system hit ratio provided earlier. The last formula provides the percentage of requested pages that were found in the HP.
Sysid=MVSB, Date=07/10/1996, Hour=17
Buffer Pool = 0
BP Parameters: VP Size = 7000 VP Sequential Thresh = 80% Defer Write Thresh = 50%
HP Size = 21000 HP Sequential Thresh = 80% Vert Defer Wrt Thresh = 10%
Castout Attribute = YES VP Parallel Seq Thresh = 50%
General: VP Buffers Allocated = 28000 Active Buffers = 1991 VP Expand/Contract = 0
HP Buffers Allocated = 84000 HP Buffers Backed = 84000 HP Expand/Contract = 0
Recall Migrated Dataset = 0 Dataset Opens = 16 VP/HP Expand Failed = 0
Recall Timeout = 0
Read Operations: Getpage Requests = 8775990 Getpage Reqs - Seq = 4131386 Getpage Reqs - Random = 4644604
GetP Fail - VP Full = 0 GetP Fail - Cond Req = 0 GetP/Sync Read I/O = 27
Sync Read I/Os = 327016 Sync Read I/Os - Seq = 90551 Sync Read I/Os - Random = 236465
Seq Prefetch Reqs = 135172 List Prefetch Reqs = 28795 Dynamic Prefetch Reqs = 144208
Seq Prefetch Pages = 3725599 List Prefetch Pages = 7660 Dynamic Prefetch Pages = 768200
Seq Prefetch I/Os = 120453 List Prefetch I/Os = 1104 Dynamic Prefetch I/Os = 35786
Pgs Read/SPrefetch I/O = 31 Pgs Read/LPrefetch I/O = 7 Pgs Read/DPrefetch I/O = 21
PFetch Fail - No Buff = 0 PFetch Fail - No Engine = 0 Page-ins Due to Read = 7
HP Pgs Read Sync = 21965 HP Pgs Rd Async-No ADM = 13936 HP Pgs Rd Async - ADM = 100262
HP Read Fail - No ADM = 0 HP Read Fail - ADM = 0
Write Operations: Pages Updated = 807356 Pages Written = 66035 Updates/Pages Written = 12
Write I/O Operations = 8801 Immed (Sync) Writes = 1294 Pages/Write I/O = 7
VDWTH Reached = 81 DWTH Reached = 10 DMTH Reached = 0
Write Engine Not Avail = 0 Page-ins Due to Write = 84 HP Pages Written Sync = 0
HP Pgs Wrt Async-No ADM = 3077 HP Pgs Wrt Async - ADM = 4972455 HP Write Fail - No ADM = 0
HP Write Fail - ADM = 0
Parallel I/O: Parallel Group Reqs = 0 I/O Parallel Downgraded = 0 Prefetch I/O Strm Reduce= 0
Max Prefetch I/O = 0 Prefetch Reduced to 1/2 = 0 Prefetch Reduced to 1/4 = 0
Sort/Merge: Sort Merge Passes Req = 0 Workfile Requests = 0 Workfile Req Denied = 0
Merge Pass - Low Buff = 0 WF Prefetch Disabled = 0 WFCreate Fail - No Buff = 0
Workfile Maximum = 0 Destructive Read Req = 0 Destruct Rd Page
There are many reporting facilities available to track performance. The example above, Report 1, is from one of the widely used online monitors. Using the information above, the percentage of pages read into the VP from the HP is only 2.7% - extremely low. If we just take the number of pages read back from the HP and divide by 3600 seconds for one hour, this shows a saving of 37 I/Os per second. However, if we add the pages read back asynchronously, both with and without ADMF, divide by 32 (a normal prefetch quantity), then add the synchronous reads and divide this sum by 3600, we get a saving of 7 I/Os per second. While this I/O saving might be significant, it has to be viewed against the 84 megabytes of expanded storage used to provide it.
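The arithmetic above can be reproduced directly from the Report 1 figures for Buffer Pool 0 (a sketch; the 3600-second hour and the 32 pages per prefetch I/O are taken from the discussion above):

```python
# Sketch: HP benefit calculations using the Report 1 counters for BP0.

hp_read_sync = 21965                    # HP Pgs Read Sync
hp_read_async = 13936 + 100262          # async reads, without and with ADMF
hp_written = 3077 + 4972455             # pages written out to the HP

pages_read_back = hp_read_sync + hp_read_async

# Benefit: pages read back from the HP as a percentage of pages written to it.
benefit_pct = 100.0 * pages_read_back / hp_written        # ~2.7%

# Naive saving: every page read back counted as one avoided I/O.
naive_saving = pages_read_back / 3600                     # ~37 per second

# Realistic saving: async reads grouped into 32-page prefetch I/Os.
real_saving = (hp_read_async / 32 + hp_read_sync) / 3600  # ~7 per second
```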
Application Performance Indicators
One of the most common calls from an application project is a request for a Performance Trace. This is usually not necessary to diagnose most performance problems. The first place to start is with the Accounting record (SMF 101, IFCID 3). There is a great deal of information available in these records. One vital point is that Accounting Trace Classes 2 and 3 (plus 7 and 8 if using packages) should be on, in addition to Class 1. This is my general recommendation for all systems. Yes, Classes 2 and 3 add more overhead, about 3-5%. However, if you are using an online monitor, most of them turn on Monitor Classes 2 and 3, or other specific IFCIDs that cause the same overhead to be generated, and will produce this data in the records anyway - so you might as well code it in the ZPARM. The only exception is when your online monitor can write all the records to its own datasets instead of SMF; eliminating SMF usage provides a significant performance improvement.
Since online transactions often have more noticeable performance problems, the following discussion primarily addresses this area; however, the same approach applies to TSO and batch applications. First we look at Class 1, 2, and 3 times. If the difference between Class 1 and Class 2 elapsed time is large, the problem is usually within portions of the application code not related to DB2. TSO is an exception here, since Class 1 time continues to accrue as long as the thread exists. Typically, a well-performing online transaction will have a Class 2 elapsed time in the range of 95-98% of Class 1 elapsed. The Class 3 time is shown as synchronous I/O elapsed time, Lock/Latch wait time, and several other wait events. This is often the primary cause of the problem, and is related to the type and amount of work performed by the application. Divide the I/O elapsed time by the number of synchronous read I/Os. This provides the average I/O elapsed time, and indicates an I/O subsystem problem if greater than 28 ms for 3380, or 23 ms for 3390. Another obvious consideration here is the number of read requests. This is directly related to the Access Path and the Buffer Pool hit ratio. How many Getpage requests and Prefetch requests were there (and why is it using sequential Prefetch)? Does the application appear to be performing too much work? These may be reduced if a more efficient Access Path is chosen. How about Lock and Latch elapsed time? If there are Lock Suspensions indicated in the Locking section, they are impacting response time. If the Lock/Latch time is only for Latches, it should be much less than .01 seconds; if it is greater, the IRLM may not be at the proper dispatching priority, or may not be receiving a proper level of service.
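The synchronous I/O check described above can be sketched as follows (function and field names are hypothetical; the 28 ms and 23 ms guidelines are the device figures given above):

```python
# Sketch: average synchronous I/O time from Accounting Class 3 fields,
# flagged against the per-device guideline (28 ms for 3380, 23 ms for 3390).

DEVICE_GUIDELINE_MS = {"3380": 28, "3390": 23}

def avg_sync_io_ms(class3_io_elapsed_sec, sync_read_ios):
    """Average synchronous read I/O time in milliseconds."""
    return 1000.0 * class3_io_elapsed_sec / sync_read_ios

def io_subsystem_suspect(class3_io_elapsed_sec, sync_read_ios, device="3390"):
    """True if the average I/O time exceeds the device guideline."""
    avg = avg_sync_io_ms(class3_io_elapsed_sec, sync_read_ios)
    return avg > DEVICE_GUIDELINE_MS[device]
```

For example, 3.0 seconds of synchronous I/O wait across 100 reads is 30 ms per I/O, which exceeds both guidelines and points at the I/O subsystem rather than the application.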
Subtract the total Class 3 time(s) from the Class 2 elapsed time, then compare the remaining Class 2 elapsed to Class 2 CPU. You should see a factor of 1.5 or less in a transaction environment. If it is higher, the CICS or IMS MPP region is not receiving an adequate level of service, or perhaps the processor complex is just too heavily utilized. Look for other locking exceptions, such as Maximum Concurrent Locks - any online application that holds more than 30 or so concurrent locks is a potential impact upon other users, and possibly other users of the same PLAN. Were there any failures when using MIAP or List Prefetch? These will affect CPU and elapsed time, and generate much higher numbers of Getpages and I/O requests. Two last items to consider. First, how many Commits were issued? If there were several, find out why. Second, what kind of relationship is there between Open/Close Cursors and Fetch requests? If there are several Open/Closes, and the number of Fetches is only a few higher, there may be inefficiencies in the application code.
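The elapsed-versus-CPU comparison above can be sketched as a single calculation (names are hypothetical; the 1.5 guideline is the figure given above):

```python
# Sketch: remove Class 3 suspension time from Class 2 elapsed, then
# compare the remainder to Class 2 CPU. A factor above ~1.5 in a
# transaction environment points at dispatching or CPU contention
# rather than DB2 itself.

def not_accounted_factor(class2_elapsed, class3_total, class2_cpu):
    """(Class 2 elapsed - Class 3 suspensions) / Class 2 CPU."""
    return (class2_elapsed - class3_total) / class2_cpu

# e.g. 0.9 s elapsed, 0.5 s suspended, 0.3 s CPU -> factor 1.33: acceptable.
```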
The first line of defense is the Design Review. There should be several of these, by knowledgeable DB2 personnel, to ensure the basic design is correct from a performance perspective. Performance measurement and tracking of an application should begin when the first SQL statements are being tested, and continue through systems test and implementation. Catch the problems early, and get them fixed. The Explain facility must be used for every SQL statement to evaluate its Access Path, and PLANs should always be bound using EXPLAIN(YES). The key to success is having someone responsible for performance, with the authority to ensure that the monitoring and necessary application tuning are done in a timely manner.
There is a wealth of information available outside the DB2 Statistics and Accounting records that can be used to monitor, evaluate, and track performance - not only of the DB2 system itself, but of how it affects the rest of your MVS environment. The DB2 system address spaces should be placed into their own Reporting Performance Groups, so that their storage and CPU consumption can be tracked.
Performance History Database
A Performance History Database (PHDB) is necessary for trending, capacity planning, and helping analyze the causes of gradual or sudden performance degradations. One of the key words here is History. This means a period of time longer than a few days. Transaction volumes and mixes can vary considerably from day to day, and are impacted by normal business activity such as end-of-month or end-of-quarter processing. We probably need at least a few months of data to effectively perform trend analysis and capacity planning. We also do not need, nor can we afford, to keep every individual Accounting record. These should be summarized by PLAN, and possibly Connection, and can be maintained not only on a daily basis but also as summaries for specific measurement periods, such as the AM and PM peaks. Where should the data be stored? DB2 might be an attractive choice, except for some immediate problems: you will have to write the programs to combine the data prior to adding it to the DB2 database(s). The alternative would be to use the DB2 Load Utility - it's fast, it's easy, and no application code is necessary. However, this only allows you to load each detail record. Many online systems process more than a hundred thousand transactions per day; that is more than 2 million detail records per month, and loads of this size on a daily basis will cause problems. Should you have one or more indices on the table? They will severely impact the load time, and require frequent reorganizations and Runstats utility executions. Not having any index requires a tablespace scan. It also seems logical that you will perform your analyses during normal business hours, so your access of this data will impact the performance of the same applications you are measuring and tuning. So, it seems to me that the data belongs outside of DB2, where its use will not impact your critical online applications. It would also be nice to be able to combine DB2 performance data with MVS performance data to obtain a complete picture of the environment and determine the impacts of each system upon the other. This is all quite attainable; it's just a matter of writing some code.
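The summarization argued for above can be sketched as follows (all record keys and field names are hypothetical); each batch of Accounting records collapses to one row per PLAN/Connection combination before being stored:

```python
# Sketch: roll up per-thread Accounting records into one summary row per
# PLAN/Connection, summing every numeric field and counting the records
# combined, so the PHDB holds summaries rather than millions of details.

from collections import defaultdict

def summarize(records, keys=("plan", "connection")):
    """Collapse accounting records into one row per key combination."""
    totals = defaultdict(lambda: defaultdict(float))
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group]["occurrences"] += 1
        for field, value in rec.items():
            if isinstance(value, (int, float)):
                totals[group][field] += value
    return {k: dict(v) for k, v in totals.items()}
```

Summaries by hour or by AM/PM peak period would simply add the period identifier to the key tuple.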
There is a wealth of information available within the DB2 records (100, 101, 102) that is necessary for monitoring and tuning the system and applications. The area that the IBM and other publications have neglected thus far is the relationship between different thresholds. A great many DB2 performance criteria depend upon several other functions in the system. It is necessary to obtain the entire picture before charging off and simply changing parameters. For example, adding several megabytes to your Buffer Pool may improve the hit ratio and decrease your I/O rate, but may increase the system paging rate so that your overall performance, and the performance of other applications, degrades rather than improves. Paging to expanded storage is not usually too serious (depending upon the rate); however, if DB2 suffers a page fault when the needed page is on auxiliary storage, DB2 and all attached allied agents must wait for the page fault to be resolved. DB2, as a system, does not have as many tuning parameters as other DBMSs, such as IMS/VS. This certainly does not mean that it is less tunable; it means that you have to understand how it functions, and how your changes will impact the rest of the MVS environment, before you can tune it effectively.
At least 70% of DB2 performance (and resource consumption) lies within the application design and SQL coding. SQL coding is not simple; there are many nuances and subtleties that can cause the Optimizer to select an efficient or inefficient Access Path. If your overall application design is fully normalized, and many of your online transactions must join tables to satisfy the SQL request, you WILL have performance problems. These application design and performance problems are easily preventable with an adequate level of understanding and commitment from management, and the use of available tools and knowledge.
It is much more cost-effective to spend the time in design, and to correct inefficiencies before an application reaches production, than to be forced into a multi-million dollar processor upgrade because the application cannot provide adequate response to your end-user community. The alternative to proper design is spending millions on hardware, attempting to place a band-aid on a severed artery...