Hi Lianbo,
On Fri, Jul 12, 2024 at 6:32 PM Lianbo Jiang <lijiang(a)redhat.com> wrote:
 Hi, Tao
 On 7/5/24 9:26 AM, devel-request(a)lists.crash-utility.osci.io wrote:
 > Date: Thu,  4 Jul 2024 17:00:56 +1200
 > From: Tao Liu<ltao(a)redhat.com>
 > Subject: [Crash-utility] [PATCH] Fix "irq -a" exceeding the memory
 >       range issue
 > To:devel@lists.crash-utility.osci.io
 > Cc: Tao Liu<ltao(a)redhat.com>
 > Message-ID:<20240704050056.17375-1-ltao@redhat.com>
 > Content-Type: text/plain; charset="US-ASCII"; x-default=true
 >
 > Previously without the patch, there was an error observed as follows:
 >
 > crash> irq -a
 > IRQ NAME                 AFFINITY
 >    0 timer                0-191
 >    4 ttyS0                0-23,96-119
 > ...
 >   84 smartpqi             72-73,168
 > irq: page excluded: kernel virtual address: ffff97d03ffff000  type: "irq_desc
affinity"
 >
 > The reason is the reading of irq affinity exceeded the memory range, see
 > the following debug info:
 >
 > Thread 1 "crash" hit Breakpoint 1, generic_get_irq_affinity (irq=85) at
kernel.c:7373
 > 7375          irq_desc_addr = get_irq_desc_addr(irq);
 > (gdb) p/x irq_desc_addr
 > $1 = 0xffff97d03f21e800
 >
 > crash> struct irq_desc 0xffff97d03f21e800
 > struct irq_desc {
 >    irq_common_data = {
 >      state_use_accessors = 425755136,
 >      node = 3,
 >      handler_data = 0x0,
 >      msi_desc = 0xffff97ca51b83480,
 >      affinity = 0xffff97d03fffee60,
 >      effective_affinity = 0xffff97d03fffe6c0
 >    },
 >
 > crash> whatis cpumask_t
 > typedef struct cpumask {
 >      unsigned long bits[128];
 > } cpumask_t;
 > SIZE: 1024
 >
 > In order to get the affinity, crash will read the memory range 0xffff97d03fffee60
 > ~ 0xffff97d03fffee60 + 1024(0x400) by line:
 >
 >       readmem(affinity_ptr, KVADDR, affinity, len,
 >               "irq_desc affinity", FAULT_ON_ERROR);
 >
 > However the reading will exceed the effective memory range:
 >
 > crash> kmem 0xffff97d03fffee60
 > CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
 > ffff97c900044400       32     123297    162944   1273     4k  kmalloc-32
 >    SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
 >    fffffca460ffff80  ffff97d03fffe000     3    128         81    47
 >    FREE / [ALLOCATED]
 >    [ffff97d03fffee60]
 >
 >        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
 > fffffca460ffff80 83fffe000 dead000000000001 ffff97d03fffe340  1 d7ffffe0000800 slab
 >
 > crash> kmem ffff97d03ffff000
 >        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
 > fffffca460ffffc0 83ffff000                0        0  1 d7ffffe0004000 reserved
 >
 > crash> dmesg
 > ...
 > [    0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fe00ffff] reserved
 > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000083fffefff] usable
 > [    0.000000] BIOS-e820: [mem 0x000000083ffff000-0x000000083fffffff] reserved
 > ...
 >
 > The beginning physical address, aka 0x83fffe000, is located in the usable
 > area and is readable, however the later physical address, starting from
 > 0x83ffff000, is located in reserved region and not readable. In fact,
 > the affinity member is allocated by alloc_cpumask_var_node(), for the 192 CPUs
 > system, the allocated size is only 24, and we can see it is within
 > the kmalloc-32 slab. So it is incorrect to read 1024 length(given by
 > STRUCT_SIZE("cpumask_t")), only 24 is enough.
 >
 > Since there are plenty of places in crash which takes the value of
 > STRUCT_SIZE("cpumask_t"), and works fine for the past, this patch will
 > not modify them all, but only this place which encountered the issue.
 >
 > Signed-off-by: Tao Liu<ltao(a)redhat.com>
 > ---
 >   kernel.c | 9 ++++++---
 >   1 file changed, 6 insertions(+), 3 deletions(-)
 >
 > diff --git a/kernel.c b/kernel.c
 > index 8a9d498..464e877 100644
 > --- a/kernel.c
 > +++ b/kernel.c
 > @@ -7362,7 +7362,7 @@ void
 >   generic_get_irq_affinity(int irq)
 >   {
 >       ulong irq_desc_addr;
 > -     long len;
 > +     long len, len_cpumask;
 >       ulong affinity_ptr;
 >       ulong *affinity;
 >       ulong tmp_addr;
 > @@ -7382,8 +7382,11 @@ generic_get_irq_affinity(int irq)
 >       if (!action)
 >               return;
 >
 > -     if ((len = STRUCT_SIZE("cpumask_t")) < 0)
 > -             len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong);
 > +     len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong);
 > +     len_cpumask = STRUCT_SIZE("cpumask_t");
 > +     if (len_cpumask > 0) {
 > +             len = len_cpumask > len ? len : len_cpumask;
 > +     }
 >
 This change looks good, but I still have two comments below:
 [1] Can we drop the evaluation of "STRUCT_SIZE("cpumask_t")" and
just
 use the size of "DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong)"
 ? Are there any regression issues? 
I'm not sure about the change, I will run a regression against it.
 [2] There are the similar case in the get_cpumask_buf(), see tools.c,
 can you make the same change? 
Yes, I will give it a try to see if regressions are found.
Thanks,
Tao Liu
 ulong *
 get_cpumask_buf(void)
 {
          int cpulen;
          if ((cpulen = STRUCT_SIZE("cpumask_t")) < 0)
                  cpulen = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) *
 sizeof(ulong);
          return (ulong *)GETBUF(cpulen);
 }
 Any thoughts?
 Thanks
 Lianbo
 >       affinity = (ulong *)GETBUF(len);
 >       if (VALID_MEMBER(irq_common_data_affinity))
 > -- 2.40.1