Hi Lianbo,
On Wed, Jul 17, 2024 at 2:24 PM lijiang <lijiang(a)redhat.com> wrote:
On Wed, Jul 17, 2024 at 6:01 AM Tao Liu <ltao(a)redhat.com> wrote:
>
> Hi Lianbo,
>
> On Tue, Jul 16, 2024 at 11:19 AM Tao Liu <ltao(a)redhat.com> wrote:
> >
> > Hi Lianbo,
> >
> > On Fri, Jul 12, 2024 at 6:32 PM Lianbo Jiang <lijiang(a)redhat.com> wrote:
> > >
> > > Hi, Tao
> > >
> > > On 7/5/24 9:26 AM, devel-request(a)lists.crash-utility.osci.io wrote:
> > > > Date: Thu, 4 Jul 2024 17:00:56 +1200
> > > > From: Tao Liu<ltao(a)redhat.com>
> > > > Subject: [Crash-utility] [PATCH] Fix "irq -a" exceeding the
memory
> > > > range issue
> > > > To:devel@lists.crash-utility.osci.io
> > > > Cc: Tao Liu<ltao(a)redhat.com>
> > > > Message-ID:<20240704050056.17375-1-ltao@redhat.com>
> > > > Content-Type: text/plain; charset="US-ASCII";
x-default=true
> > > >
> > > > Previously without the patch, there was an error observed as follows:
> > > >
> > > > crash> irq -a
> > > > IRQ NAME AFFINITY
> > > > 0 timer 0-191
> > > > 4 ttyS0 0-23,96-119
> > > > ...
> > > > 84 smartpqi 72-73,168
> > > > irq: page excluded: kernel virtual address: ffff97d03ffff000 type:
"irq_desc affinity"
> > > >
> > > > The reason is the reading of irq affinity exceeded the memory range,
see
> > > > the following debug info:
> > > >
> > > > Thread 1 "crash" hit Breakpoint 1, generic_get_irq_affinity
(irq=85) at kernel.c:7373
> > > > 7375 irq_desc_addr = get_irq_desc_addr(irq);
> > > > (gdb) p/x irq_desc_addr
> > > > $1 = 0xffff97d03f21e800
> > > >
> > > > crash> struct irq_desc 0xffff97d03f21e800
> > > > struct irq_desc {
> > > > irq_common_data = {
> > > > state_use_accessors = 425755136,
> > > > node = 3,
> > > > handler_data = 0x0,
> > > > msi_desc = 0xffff97ca51b83480,
> > > > affinity = 0xffff97d03fffee60,
> > > > effective_affinity = 0xffff97d03fffe6c0
> > > > },
> > > >
> > > > crash> whatis cpumask_t
> > > > typedef struct cpumask {
> > > > unsigned long bits[128];
> > > > } cpumask_t;
> > > > SIZE: 1024
> > > >
> > > > In order to get the affinity, crash will read the memory range
0xffff97d03fffee60
> > > > ~ 0xffff97d03fffee60 + 1024(0x400) by line:
> > > >
> > > > readmem(affinity_ptr, KVADDR, affinity, len,
> > > > "irq_desc affinity", FAULT_ON_ERROR);
> > > >
> > > > However the reading will exceed the effective memory range:
> > > >
> > > > crash> kmem 0xffff97d03fffee60
> > > > CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME
> > > > ffff97c900044400 32 123297 162944 1273 4k
kmalloc-32
> > > > SLAB MEMORY NODE TOTAL ALLOCATED FREE
> > > > fffffca460ffff80 ffff97d03fffe000 3 128 81 47
> > > > FREE / [ALLOCATED]
> > > > [ffff97d03fffee60]
> > > >
> > > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > > > fffffca460ffff80 83fffe000 dead000000000001 ffff97d03fffe340 1
d7ffffe0000800 slab
> > > >
> > > > crash> kmem ffff97d03ffff000
> > > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > > > fffffca460ffffc0 83ffff000 0 0 1 d7ffffe0004000
reserved
> > > >
> > > > crash> dmesg
> > > > ...
> > > > [ 0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fe00ffff]
reserved
> > > > [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000083fffefff]
usable
> > > > [ 0.000000] BIOS-e820: [mem 0x000000083ffff000-0x000000083fffffff]
reserved
> > > > ...
> > > >
> > > > The beginning physical address, aka 0x83fffe000, is located in the
usable
> > > > area and is readable, however the later physical address, starting
from
> > > > 0x83ffff000, is located in reserved region and not readable. In fact,
> > > > the affinity member is allocated by alloc_cpumask_var_node(), for the
192 CPUs
> > > > system, the allocated size is only 24, and we can see it is within
> > > > the kmalloc-32 slab. So it is incorrect to read 1024 length(given by
> > > > STRUCT_SIZE("cpumask_t")), only 24 is enough.
> > > >
> > > > Since there are plenty of places in crash which takes the value of
> > > > STRUCT_SIZE("cpumask_t"), and works fine for the past, this
patch will
> > > > not modify them all, but only this place which encountered the issue.
> > > >
> > > > Signed-off-by: Tao Liu<ltao(a)redhat.com>
> > > > ---
> > > > kernel.c | 9 ++++++---
> > > > 1 file changed, 6 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/kernel.c b/kernel.c
> > > > index 8a9d498..464e877 100644
> > > > --- a/kernel.c
> > > > +++ b/kernel.c
> > > > @@ -7362,7 +7362,7 @@ void
> > > > generic_get_irq_affinity(int irq)
> > > > {
> > > > ulong irq_desc_addr;
> > > > - long len;
> > > > + long len, len_cpumask;
> > > > ulong affinity_ptr;
> > > > ulong *affinity;
> > > > ulong tmp_addr;
> > > > @@ -7382,8 +7382,11 @@ generic_get_irq_affinity(int irq)
> > > > if (!action)
> > > > return;
> > > >
> > > > - if ((len = STRUCT_SIZE("cpumask_t")) < 0)
> > > > - len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) *
sizeof(ulong);
> > > > + len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong);
> > > > + len_cpumask = STRUCT_SIZE("cpumask_t");
> > > > + if (len_cpumask > 0) {
> > > > + len = len_cpumask > len ? len : len_cpumask;
> > > > + }
> > > >
> > >
> > > This change looks good, but I still have two comments below:
> > >
> > > [1] Can we drop the evaluation of
"STRUCT_SIZE("cpumask_t")" and just
> > > use the size of "DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) *
sizeof(ulong)"
> > > ? Are there any regression issues?
>
> I made a regression test, if all STRUCT_SIZE("cpumask_t") are replaced
> by DIV_ROUND_UP(...), there are regression issues found, but I didn't
> dive into the root cause of the failing reason.
>
If so, let's keep it.
Anyway, I'm still curious about the regression issue.
I re-read your comments, and I think I have misunderstood your request.
The regressions are found when all places of STRUCT_SIZE("cpumask_t")
in crash were replaced by DIV_ROUND_UP(...). You see there are plenty
of places in crash which used STRUCT_SIZE("cpumask_t"), including the
2(the one which this patch is dealing with, and the one in tools.c)
However if only the 2 places (the one which this patch is dealing
with, and the one in tools.c) are replaced, there is no regression
being noticed.
So I'm going to deal with the only 2 places in v2.
Thanks,
Tao Liu
> >
> > I'm not sure about the change, I will run a regression against it.
> > >
> > > [2] There are the similar case in the get_cpumask_buf(), see tools.c,
> > > can you make the same change?
>
> Yes, unlike [1], with only the similar case modified, no regressions
> found. I will post v2 to include it.
Ok, sounds good. Thank you for the regression test.
Lianbo
>
>
> Thanks,
> Tao Liu
>
> >
> > Yes, I will give it a try to see if regressions are found.
> >
> > Thanks,
> > Tao Liu
> > >
> > > ulong *
> > > get_cpumask_buf(void)
> > > {
> > > int cpulen;
> > > if ((cpulen = STRUCT_SIZE("cpumask_t")) < 0)
> > > cpulen = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) *
> > > sizeof(ulong);
> > > return (ulong *)GETBUF(cpulen);
> > > }
> > >
> > > Any thoughts?
> > >
> > >
> > > Thanks
> > >
> > > Lianbo
> > >
> > >
> > > > affinity = (ulong *)GETBUF(len);
> > > > if (VALID_MEMBER(irq_common_data_affinity))
> > > > -- 2.40.1
> > >
>