E10K: Debugging Drain Failures
- Title: E10K: Debugging Drain Failures
- Author: Douglas O’Leary <dkoleary@olearycomputers.com>
- Description: E10K: Debugging Drain Failures
- Date created: 08/1999
- Date updated: 09/1999
- Disclaimer: Standard: Use the information that follows at your own risk. If you screw up a system, don’t blame it on me…
Enable the kernel variable dr_mem_debug by setting its value to 1, either by patching the running kernel with adb or by setting the value in /etc/system and rebooting:
# adb -kw
physmem 13af5d
dr_mem_debug/W0x1
dr_mem_debug: 0x0 = 0x1
$q
Capture the console output from a failed DR drain session. The failed address will be readily apparent; the message will be something to the effect of:
hold_pfns: page not held: <some address>
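If the console capture is long, the failed addresses can be pulled out with a short script. The following is a minimal sketch, not part of the original procedure: it assumes the message has the shape shown above with a hexadecimal address (optionally prefixed with 0x), and the function name failed_page_addresses is hypothetical. Adjust the pattern if your console output is worded differently.

```python
import re

# Assumed message shape: "hold_pfns: page not held: <hex address>".
# The real console wording may differ; edit the pattern to match your capture.
PAGE_NOT_HELD = re.compile(r"hold_pfns: page not held:\s*(?:0x)?([0-9a-fA-F]+)")


def failed_page_addresses(console_text):
    """Return the unique page addresses reported by hold_pfns, in first-seen order."""
    seen = []
    for match in PAGE_NOT_HELD.finditer(console_text):
        addr = match.group(1).lower()  # normalize case so duplicates collapse
        if addr not in seen:
            seen.append(addr)
    return seen
```

Each address returned is a candidate for the adb page lookup that follows.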
In an adb session, enter the following command, using the page address from the hold_pfns message:
<page address>$<page
Look for the field p_selock. If the value in this field is 1, the problem is possibly related to swap.
If there is a value in the p_vnode field, then enter the following:
<vnode address>$<vnode
Look for the vop field; this tells us which virtual operation is in progress.
If the value in p_selock is an address, then we need to adjust it by subtracting 8 from the high-order hex digit, which clears the high-order bit. For example, if the p_selock field = c0000000, then the value we need for the next step is 40000000. This is a thread address. To check this, enter the following command:
<thread address>$<thread
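The p_selock adjustment is easy to get wrong by hand, so here is a minimal sketch of the arithmetic. It assumes a 32-bit p_selock value with the high-order bit set, as in the c0000000 example; the function name selock_to_thread_address is hypothetical and is not an adb macro.

```python
# Mask that keeps every bit except the high-order bit of a 32-bit value.
# Assumption: p_selock is 32 bits wide, as in the c0000000 example.
LOW_31_BITS = 0x7FFFFFFF


def selock_to_thread_address(p_selock):
    """Clear the high-order bit of p_selock to recover the thread address."""
    return p_selock & LOW_31_BITS


print(format(selock_to_thread_address(0xC0000000), "x"))  # prints 40000000
```

Clearing the bit with a mask gives the same result as subtracting 8 from the leading hex digit whenever that digit is 8 or greater.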
Look for the field called procp and get that address. Enter the following:
<proc address>$<proc2u
Search through the screen output for the psargs field. This will indicate the process that is holding the lock.
If the process belongs to a third-party vendor's product, we need to know which vendor. In any case, mail the screen output to us and we will look further at it.
Notes on adb: Be careful of the columns. Things don’t always align.