On Fri, Dec 20, 2024 at 12:50 AM Lucas Oakley <soakley@redhat.com> wrote:
Hi Lianbo,

Thank you for your input. It looks good to me, and it tests just fine. Sorry, not calling the routine twice should 

Thanks for the confirmation, Lucas.
 
have come to my mind before submitting. Would you like for me to resubmit a patch with the modifications, 
or would you like to take care of the next steps? This is the first patch I've submitted so I'm not totally sure.

No need to repost, I can help to modify it when merging this one(just make slight changes).
 
Hopefully more to come in the future!

Welcome, Lucas.

Thanks
Lianbo
 

Best regards,
Lucas


On Wed, Dec 18, 2024 at 10:46 PM lijiang <lijiang@redhat.com> wrote:
Hi, Lucas
Thank you for the fix.
On Tue, Dec 17, 2024 at 12:14 PM <devel-request@lists.crash-utility.osci.io> wrote:
Date: Sat, 14 Dec 2024 18:01:14 -0500
From: Lucas Oakley <soakley@redhat.com>
Subject: [Crash-utility] [PATCH] Fix incorrect 'bt -v' output
        suggesting overflow
To: devel@lists.crash-utility.osci.io
Message-ID: <20241214230114.2854910-1-soakley@redhat.com>
Content-Type: text/plain; charset="US-ASCII"; x-default=true

Change check_stack_overflow() to check if the thread_info's cpu
member is smaller than possible existing CPUs, rather than the
kernel table's cpu number (kt->cpus). The kernel table's cpu number
is changed on some architectures to reflect the highest numbered
online cpu + 1. This can cause a false positive in
check_stack_overflow() if the cpu member of a parked task's
thread_info structure, assigned to an offlined cpu, is larger than
the kt->cpus but lower than the number of existing logical cpus.
An example of this is RHEL 7 on s390x or RHEL 8 on ppc64le when
the highest numbered CPU is offlined.

Signed-off-by: Lucas Oakley <soakley@redhat.com>
---
 task.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/task.c b/task.c
index 33de7da..93dab0e 100644
--- a/task.c
+++ b/task.c
@@ -11253,12 +11253,12 @@ check_stack_overflow(void)
                                cpu = 0;
                                break;
                        }
-                       if (cpu >= kt->cpus) {
+                       if (cpu >= get_cpus_present()) {
                                if (!overflow)
                                        print_task_header(fp, tc, 0);
                                fprintf(fp,
                                    "  possible stack overflow: thread_info.cpu: %d >= %d\n",
-                                       cpu, kt->cpus);
+                                       cpu, get_cpus_present());
                                overflow++; total++;
                        }
                }

To avoid calling get_cpus_present() twice, I would tend to modify it as below:

diff --git a/task.c b/task.c
index 33de7da2a692..49f771e275c1 100644
--- a/task.c
+++ b/task.c
@@ -11238,6 +11238,8 @@ check_stack_overflow(void)
                }
 
                if (VALID_MEMBER(thread_info_cpu)) {
+                       int cpus = get_cpus_present();
+
                        switch (cpu_size)
                        {
                        case 1:
@@ -11253,12 +11255,12 @@ check_stack_overflow(void)
                                cpu = 0;
                                break;
                        }
-                       if (cpu >= kt->cpus) {
+                       if (cpu >= cpus) {
                                if (!overflow)
                                        print_task_header(fp, tc, 0);
                                fprintf(fp,
                                    "  possible stack overflow: thread_info.cpu: %d >= %d\n",
-                                       cpu, kt->cpus);
+                                       cpu, cpus);
                                overflow++; total++;
                        }
                }

What do you think?


 Lianbo

--
2.47.1