Oracle® Database System Administration Guide 10g Release 2 (10.2) for IBM z/OS (OS/390) Part Number B25398-01 |
|
|
View PDF |
This chapter discusses the diagnosis of suspected Oracle database errors and the requirements for documenting these errors to Oracle Support Services. Specific topics in this chapter include documentation requirements, categorization of errors, system dump requirements, and methods of reporting to Oracle Support Services.
For information on z/OS-specific error messages, refer to the Oracle Database Messages Guide for IBM z/OS.
The following topics are included:
Oracle Support Services acts as the interface to the Oracle database user community. Refer to the applicable Oracle Support Services publications for a discussion of policies and procedures for using their services.
During the error resolution cycle, Oracle Support Services might request you provide them with machine readable data. Send machine readable data, not formatted or printed data. Use FTP (File Transfer Protocol) if you plan to send large amounts of data.
If you are requested to send data to Oracle Support Services, then follow the documentation requirements provided in the section "General Documentation Requirements". Failure to follow these requirements might result in the inability to process your information. This could delay the resolution of any errors you are reporting.
When you report a suspected error, you might be asked to describe the Oracle database subsystem and z/OS operating system environments in detail. Provide the full version number of each component that has an error. The full version number includes important PUT levels for your z/OS system.
Before you contact Oracle Support Services, ensure this information is available:
Oracle database library naming conventions
Method of accessing the Oracle database utilities (batch, TSO, or z/OS UNIX System Services)
Oracle database subsystem name
Full version of the Oracle database kernel
Full version of the Oracle database utility
Release level of z/OS
PUT level
RMID of any relevant OS module
In addition to describing the Oracle database operational environment, detailed documentation specific to the error might be required. This might include:
Console logs
Utility SYSOUT
Utility input files
System diagnostic messages
Oracle database error messages
System dumps
Database engine trace data sets
Keep in mind that often more than one error is associated with a single failure. Describe all errors for the failure being reported. If your application uses Pro*C, Pro*COBOL, or another Oracle database Precompiler, then ensure your application displays or prints out all errors it encounters. Without a complete and consistent set of information, diagnosing the problem can be impossible.
When investigating a potential Oracle server error, start by determining which component is failing, where it is failing, and the error category.
When reporting a problem to Oracle Support Services, identify the component suspected of failure, along with its full version and correct release level. A list of components and their version numbers is documented in the Oracle Database Installation Guide for IBM z/OS and any maintenance tape release bulletin.
Use these error categories to describe the error:
Documentation
Incorrect output
Oracle database external error
Abend
Program loop
Performance
Missing functionality
Wait state
When reporting documentation errors, you are asked to provide this information:
Document name
Document part number
Date of publication
Page number
Describe the error in detail.
Documentation errors can include both erroneous documentation and omission of required information.
In general, an incorrect output error exists whenever an Oracle database utility produces a result that differs from written Oracle documentation. When describing errors of incorrect output, you need to describe, in detail, the operation of the function in error. Be prepared to describe your understanding of the proper function, the specific Oracle documentation that describes the proper operation of the function, and a detailed description of the incorrect operation.
If you think you have found a software bug, then be prepared to answer these questions:
Does the problem occur in more than one Oracle tool? (Examples of Oracle tools are SQL*Plus and Oracle Developer/2000).
What are the exact SQL statements used to reproduce the problem?
What are the full version numbers of the Oracle database and related Oracle software?
What is the problem and how is it reproduced?
Oracle database error messages are produced whenever an Oracle database utility or the Oracle database kernel detects an error condition. Depending on the circumstances, error messages might be fatal or nonfatal to the utility or kernel.
Be prepared to identify the exact error message and message number received and the complete circumstances surrounding the error.
Any program check in an Oracle database utility or the Oracle database kernel address is considered an error. A system dump is required as documentation in the event of a program check.
Ensure the system dump contains all of the private area of the Oracle database address space. Without it, diagnosis is sometimes impossible.
System abends might or might not indicate a failure of the Oracle database subsystem depending upon circumstances.
The following abends are not considered Oracle database failures:
013 - open failure
122 - cancelled by operator
222 - cancelled by operator
322 - CPU time exceeded
722 - SYSOUT lines exceeded
A program loop is evident when the Oracle server task consumes CPU time, but no actual work is performed. This situation is substantially different from an Oracle server task that performs most of its operations in cache (also known as a CPU bound job). CPU bound operations might include large batch sorts, sort merge joins, nested loop joins where the driving table is small enough to fit into the SGA, and so forth.
Any program loop that occurs within an Oracle server or utility address space is considered an Oracle database failure. Loop conditions are rarely experienced and are considered serious errors. The initial diagnostic approach with a loop consists of a system dump. If a task is in a program loop, then ensure the system dump includes all of the private area of the Oracle database address space.
Further diagnosis might be required using z/OS SLIP commands. Oracle Support Services furnishes specific instructions on the use of SLIP depending upon circumstances.
Oracle database system performance is determined by many factors, most of which are not within the control of Oracle Corporation. Considerations such as system load, I/O topology, and database design make the documentation of performance errors difficult.
Provide detailed information about the state of your environment when reporting an error.
Specific documentation might include:
CPU type and memory configuration
Database topology
I/O topology
System workload by type
Oracle database workload characterization
Query execution plans
Enhancement requests can be opened with Oracle Support Services to request the inclusion of functions and features that Oracle products do not currently have. When opening an enhancement request, describe the specific feature or function to be added to the product and provide a business case to justify the enhancement.
A wait state occurs whenever a required system resource (for example, an enqueue) is unavailable. Because the task requiring the resource is in a blocked or wait state, little or no CPU time is consumed. However, wait states are not limited to a lack of operating system resources. A wait state can occur within the Oracle server due to an incompatible lock request, high contention on an internal latch, an archive task that halts, and so forth.
A wait state in an Oracle database utility might or might not be considered an Oracle database error. A wait state that occurs due to operating system resource conflicts (for example, multiple requests for a tape mount) is not considered an Oracle database error. If a wait state occurs within an Oracle database utility, then a system dump is required of both the utility and kernel address spaces. If a wait state occurs in the Oracle server address space, then a system dump is required of that address space.
Before you contact Oracle Support Services, try these recommended approaches:
Check the Oracle database alert log and system console logs for any error messages.
Query the V$LOCK dynamic view to determine whether there are any lock conflicts.
For a high volume online transaction processing (OLTP) system, query the V$WAITSTAT dynamic view to determine whether there is contention for a class of data blocks. To perform this query, invoke SQL*Plus and enter these commands:
SQL> CONNECT / AS SYSDBA SQL> SELECT * FROM V$WAITSTAT;
For more information about the V$WAITSTAT view, refer to the Oracle Database Reference.
If your database is in ARCHIVELOG mode, then ensure your archive data set destination is not full. If the destination is full, then the archiver task cannot copy any more full redo log data sets, thus preventing any further changes to the database. In this case, all tasks attempting to make changes are eventually placed in a wait state.
If you cannot diagnose the wait state problem using the previous suggestions, then first obtain these dumps and contact Oracle Support Services:
An SVC system dump that includes all of the private area of the Oracle database address space.
A SYSTEMSTATE dump from within Oracle database. When you contact Oracle Support Services, tell them you performed this task.
To obtain a SYSTEMSTATE dump, invoke SQL*Plus and perform these commands:
SQL> CONNECT / AS SYSDBA SQL> ALTER SESSION SET EVENTS 'immediate trace name systemstate level 10';
A SYSTEMSTATE dump can be quite large, depending on the number of concurrent connections to the Oracle server, the number of resources each connection holds or requests in any mode, and the size of the SGA. The current cursor and object state for each concurrent connection is also included in the SYSTEMSTATE dump. The size of these dumps can easily be in the multiple of megabytes. A SYSTEMSTATE dump is synonymous with a SYSTEMSTATE trace data set.
When providing documentation on suspected Oracle database failures, it might be necessary for you to provide a system dump of the Oracle server or utility address spaces. Dumps are initiated through the z/OS operator interface using the DUMP and SLIP commands, or automatically by Oracle database if it detects a problem.
Dumps sent to Oracle Support Services as documentation for suspected errors must not be formatted. Formatted dumps will not be accepted.
When specifying dump parameters in response to a z/OS DUMP COMM=(' ') command, you must include this specification:
PSA,TRT,RGN,CSA
Additional parameters may be required.
Once a SYS1.DUMPxx
data set is created, the system operator is notified whenever a dump to that data set occurs. Because all Oracle database abends are dumped to SYS1.DUMP data sets and are not dynamically allocated, you must ensure a SYS1.DUMP data set is always available.
You must also ensure the SYS1.DUMP data set is large enough to accommodate a dump of two address spaces (the Oracle database address space and the client address space). This allows for a complete dump if an Oracle database utility abends while in cross memory mode. Refer to "Oracle Server Storage Requirements" in Chapter 15, "Oracle Database Performance", for more information about estimating the size of an Oracle database address space.
If a SYS1.DUMP data set is not available, then a dump might be lost.
Operator initiated dumps are accomplished with the z/OS DUMP command, where text
is the title you want the dump to have:
DUMP COMM=(text)
After the DUMP command has been issued, you must respond to the system WTOR with the following command:
R xx,[JOBNAME=(jobn)|ASID=(nnn),]SDATA=(PSA,TRT,RGN,CSA)
where:
Table 16-1 Variable Descriptions for Code Example
Variable | Description |
---|---|
|
is the reply identification number. |
|
is the name of the started task or batch job. |
|
is the hexadecimal address space identifier of the address space you want to dump. |
New operands can be requested.