----- Original Message -----
>
> On RHEL5 Xen kernel, it shows multiple garbage graphic characters, and I
> cannot even transpose it, because it must have a CR, LF, or something else
> embedded in the garbarge.
>
> If you want to re-post, fix the module list truncation issue, try to do
> something to handle kernel versions that do not have the same #define's
> as what you're hardwiring, and perhaps make it into a "mod -t" option
> that displays just the module name and its TAINT flag.
>
> Dave
>
Hi Dave,
Firstly, sorry for the delay.
Here's another attempt.
An example:
crash> linux_banner
linux_banner = $3 = "Linux version 2.6.32-220.28.1.el6.x86_64
(mockbuild(a)x86-022.build.eng.bos.redhat.com) (gcc version 4.4.6 20110731
(Red Hat 4.4.6-3) (GCC) ) #1 SMP Wed Oct 3 12:26:28 EDT 2012\n"
crash> mod -T
NAME TAINT
vxspec (P)(U)
dmpaa (P)(U)
emcpgpx (P)(U)
dmpap (P)(U)
dmpjbod (P)(U)
emcp (P)(U)
emcpmpx (P)(U)
emcpdm (P)(U)
emcpxcrypt (P)(U)
emcpvlumd (P)(U)
vxfs (P)(U)
fdd (P)(U)
vxportal (P)(U)
vxdmp (P)(U)
vxio (P)(U)
llt (P)(U)
gab (P)(U)
vxfen (P)(U)
vxglm (P)(U)
vxgms (P)(U)
vxodm (P)(U)
crash> mod -t vxfs
NAME TAINT
vxfs (P)(U)
Hi Aaron,
Before getting into my somewhat long-winded review, I do want to
emphasize that I think this is a valuable option for kernel debugging,
and it absolutely warrants being folded into the "mod" command.
The v2 patch tested OK on a wide variety of kernel versions without
running into any of the garbage display issues I saw with the first
version.
Here are my issues/comments with the v2 patch. Let's get the simple
things out of the way first.
At some point during your development, run the build through
"make warn", and address any issues there:
$ make warn
...
cc -c -g -DX86_64 -DGDB_7_3_1 kernel.c -Wall -O2 -Wstrict-prototypes
-Wmissing-prototypes -fstack-protector
kernel.c:3512:1: warning: no previous prototype for 'is_module_taint'
[-Wmissing-prototypes]
kernel.c:3577:1: warning: no previous prototype for 'show_module_taint'
[-Wmissing-prototypes]
kernel.c: In function 'show_module_taint':
kernel.c:3579:9: warning: unused variable 'bx' [-Wunused-variable]
...
I wouldn't bother with the "mod -t <module>" vs. "mod -T".
The number of modules that are tainted would not be so onerous
that it warrants a per-module option. (Just pipe it to grep if
you *really* feel it's necessary.) So I would just make one simple
"mod -t" option that checks all the modules and displays those with
taint flags.
If no modules are tainted, then indicate that -- instead of just this:
crash> mod -T
NAME TAINT
crash>
You could change it to only show the header if there are actually tainted
modules, and if there are none, then show something like "no tainted modules".
And the "help mod" page would obviously need documentation...
Now to the meat of the matter, as I mentioned when reviewing the
first patch, I really prefer that hardwiring of kernel #define's
be avoided if at all possible. Your patch seems to pick a
subset of the possible flags:
/*
+* Module taint flags
+*/
+
+#define TAINT_PROPRIETARY_MODULE 0
+#define TAINT_FORCED_MODULE 1
+#define TAINT_UNSIGNED_MODULE 6
+#define TAINT_CRAP 10
+#define TAINT_OOT_MODULE 12
+#define TAINT_TECH_PREVIEW 29
+
And looking at the kernel history, say, starting at upstream 2.6.18:
#define TAINT_PROPRIETARY_MODULE (1<<0)
#define TAINT_FORCED_MODULE (1<<1)
#define TAINT_UNSAFE_SMP (1<<2)
#define TAINT_FORCED_RMMOD (1<<3)
#define TAINT_MACHINE_CHECK (1<<4)
#define TAINT_BAD_PAGE (1<<5)
where RHEL5 expounded upon the base 2.6.18 kernel like this:
#define TAINT_PROPRIETARY_MODULE (1<<0)
#define TAINT_FORCED_MODULE (1<<1)
#define TAINT_UNSAFE_SMP (1<<2)
#define TAINT_FORCED_RMMOD (1<<3)
#define TAINT_MACHINE_CHECK (1<<4)
#define TAINT_BAD_PAGE (1<<5)
#define TAINT_UNSIGNED_MODULE (1<<6)
#define TAINT_7 (1<<7)
#define TAINT_8 (1<<8)
#define TAINT_9 (1<<9)
#define TAINT_10 (1<<10)
#define TAINT_11 (1<<11)
#define TAINT_12 (1<<12)
#define TAINT_13 (1<<13)
#define TAINT_14 (1<<14)
#define TAINT_15 (1<<15)
#define TAINT_16 (1<<16)
#define TAINT_17 (1<<17)
#define TAINT_18 (1<<18)
#define TAINT_19 (1<<19)
#define TAINT_20 (1<<20)
#define TAINT_21 (1<<21)
#define TAINT_22 (1<<22)
#define TAINT_23 (1<<23)
#define TAINT_24 (1<<24)
#define TAINT_25 (1<<25)
#define TAINT_26 (1<<26)
#define TAINT_27 (1<<27)
/* Reserving bits for vendor specific uses */
#define TAINT_HARDWARE_UNSUPPORTED (1<<28)
#define TAINT_TECH_PREVIEW (1<<29)
#define TAINT_RESERVED30 (1<<30)
#define TAINT_RESERVED31 (1<<31)
Then if we look at upstream 2.6.32, a few more bits
were added:
#define TAINT_PROPRIETARY_MODULE 0
#define TAINT_FORCED_MODULE 1
#define TAINT_UNSAFE_SMP 2
#define TAINT_FORCED_RMMOD 3
#define TAINT_MACHINE_CHECK 4
#define TAINT_BAD_PAGE 5
#define TAINT_USER 6
#define TAINT_DIE 7
#define TAINT_OVERRIDDEN_ACPI_TABLE 8
#define TAINT_WARN 9
#define TAINT_CRAP 10
But RHEL6 (2.6.32-based) went a bit further:
#define TAINT_PROPRIETARY_MODULE 0
#define TAINT_FORCED_MODULE 1
#define TAINT_UNSAFE_SMP 2
#define TAINT_FORCED_RMMOD 3
#define TAINT_MACHINE_CHECK 4
#define TAINT_BAD_PAGE 5
#define TAINT_USER 6
#define TAINT_DIE 7
#define TAINT_OVERRIDDEN_ACPI_TABLE 8
#define TAINT_WARN 9
#define TAINT_CRAP 10
#define TAINT_FIRMWARE_WORKAROUND 11
#define TAINT_12 12
#define TAINT_13 13
#define TAINT_14 14
#define TAINT_15 15
#define TAINT_16 16
#define TAINT_17 17
#define TAINT_18 18
#define TAINT_19 19
#define TAINT_20 20
#define TAINT_21 21
#define TAINT_22 22
#define TAINT_23 23
#define TAINT_24 24
#define TAINT_25 25
#define TAINT_26 26
#define TAINT_BIT_BY_ZOMBIE 27
/* Reserving bits for vendor specific uses */
#define TAINT_HARDWARE_UNSUPPORTED 28
#define TAINT_TECH_PREVIEW 29
/* Bits 30 - 31 are reserved for Red Hat use only */
#define TAINT_RESERVED30 30
#define TAINT_RESERVED31 31
And here's the state in the current upstream linux git tree:
#define TAINT_PROPRIETARY_MODULE 0
#define TAINT_FORCED_MODULE 1
#define TAINT_UNSAFE_SMP 2
#define TAINT_FORCED_RMMOD 3
#define TAINT_MACHINE_CHECK 4
#define TAINT_BAD_PAGE 5
#define TAINT_USER 6
#define TAINT_DIE 7
#define TAINT_OVERRIDDEN_ACPI_TABLE 8
#define TAINT_WARN 9
#define TAINT_CRAP 10
#define TAINT_FIRMWARE_WORKAROUND 11
#define TAINT_OOT_MODULE 12
So how would be the best way to deal with handling the other bits
that you are not checking?
When I reviewed your first post, I didn't notice that the
solution -- at least for 2.6.28 and later kernels -- is pretty
much implemented in the kernel by the introduction of the
kernel's "tnts" structure and the print_tainted() function:
struct tnt {
u8 bit;
char true;
char false;
};
static const struct tnt tnts[] = {
{ TAINT_PROPRIETARY_MODULE, 'P', 'G' },
{ TAINT_FORCED_MODULE, 'F', ' ' },
{ TAINT_UNSAFE_SMP, 'S', ' ' },
{ TAINT_FORCED_RMMOD, 'R', ' ' },
{ TAINT_MACHINE_CHECK, 'M', ' ' },
{ TAINT_BAD_PAGE, 'B', ' ' },
{ TAINT_USER, 'U', ' ' },
{ TAINT_DIE, 'D', ' ' },
{ TAINT_OVERRIDDEN_ACPI_TABLE, 'A', ' ' },
{ TAINT_WARN, 'W', ' ' },
{ TAINT_CRAP, 'C', ' ' },
{ TAINT_FIRMWARE_WORKAROUND, 'I', ' ' },
{ TAINT_OOT_MODULE, 'O', ' ' },
};
Why not just read the "tnts[]" array and then use the same
logic as the kernel's print_tainted() function? Doing that
would even handle the vendor-specific issue, where for example,
RHEL6 has this version of the tnts array:
static const struct tnt tnts[] = {
{ TAINT_PROPRIETARY_MODULE, 'P', 'G' },
{ TAINT_FORCED_MODULE, 'F', ' ' },
{ TAINT_UNSAFE_SMP, 'S', ' ' },
{ TAINT_FORCED_RMMOD, 'R', ' ' },
{ TAINT_MACHINE_CHECK, 'M', ' ' },
{ TAINT_BAD_PAGE, 'B', ' ' },
{ TAINT_USER, 'U', ' ' },
{ TAINT_DIE, 'D', ' ' },
{ TAINT_OVERRIDDEN_ACPI_TABLE, 'A', ' ' },
{ TAINT_WARN, 'W', ' ' },
{ TAINT_CRAP, 'C', ' ' },
{ TAINT_FIRMWARE_WORKAROUND, 'I', ' ' },
{ TAINT_12, '?', '-' },
{ TAINT_13, '?', '-' },
{ TAINT_14, '?', '-' },
{ TAINT_15, '?', '-' },
{ TAINT_16, '?', '-' },
{ TAINT_17, '?', '-' },
{ TAINT_18, '?', '-' },
{ TAINT_19, '?', '-' },
{ TAINT_20, '?', '-' },
{ TAINT_21, '?', '-' },
{ TAINT_22, '?', '-' },
{ TAINT_23, '?', '-' },
{ TAINT_24, '?', '-' },
{ TAINT_25, '?', '-' },
{ TAINT_26, '?', '-' },
{ TAINT_BIT_BY_ZOMBIE, 'Z', ' ' },
{ TAINT_HARDWARE_UNSUPPORTED, 'H', ' ' },
{ TAINT_TECH_PREVIEW, 'T', ' ' },
};
For kernels earlier than 2.6.28, you could still hardwire
the values that were valid at that point in time, or you
could use the subset of bits that you've chosen. But for
bits that are set that are *not* part of your chosen
subset (if you do it that way), you should probably put
the bit number in brackets, and then the user could refer
to the relevant kernel source for details on what the bit
number means.
But from 2.6.28 on, please use the kernel data structure
contents.
And lastly, what about the transition from module.gpgsig_ok to
module.sig_ok? And BTW, if the relevant "ok" bit is not set,
will the (U) be displayed twice? Maybe there should be a
special display to differentiate the two (U) possibilities?
Thanks,
Dave