Re: [Crash-utility] [PATCH] x86_64: Fix check for __per_cpu_offset initialisation

Wednesday, 11 August 2021

...

 Date: Wed, 11 Aug 2021 14:24:30 +0200
 From: Philipp Rudo <prudo(a)redhat.com>
 To: lijiang <lijiang(a)redhat.com>
 Cc: "Discussion list for crash utility usage,   maintenance and
         development" <crash-utility(a)redhat.com>
 Subject: Re: [Crash-utility] [PATCH] x86_64: Fix check for
         __per_cpu_offset initialisation
 Message-ID: <20210811142430.5e3e1a86@rhtmp>
 Content-Type: text/plain; charset=US-ASCII

 Hi Lianbo,

 On Wed, 11 Aug 2021 17:05:26 +0800
 lijiang <lijiang(a)redhat.com> wrote:

 > >
 > > Date: Thu,  5 Aug 2021 15:19:37 +0200
 > > From: Philipp Rudo <prudo(a)redhat.com>
 > > To: crash-utility(a)redhat.com
 > > Subject: [Crash-utility] [PATCH] x86_64: Fix check for
 > >         __per_cpu_offset        initialisation
 > > Message-ID: <20210805131937.5051-1-prudo(a)redhat.com>
 > >
 > > Since at least kernel v2.6.30 the __per_cpu_offset gets initialized to
 > > __per_cpu_load. So first check if the __per_cpu_offset was set to a
 > > proper value before reading any per cpu variable to prevent potential
 > > bugs.
 > >
 > >
 > Hi, Philipp
 >
 > Thank you for the patch. Could you describe the potential risks in more
 > detail, and what conditions might trigger the potential bugs?

 the bug is always triggered during initialization of the per-cpu data
 on x86_64. At least for kernels not using struct x8664_pda, which
 AFAIK was also removed with kernel v2.6.30.

 The risk for crash is low. Right after the superfluous read there is a
 check if the read cpunumber matches the expected one.

                          if (cpunumber != cpus)
                                  break;

 So the worst-case scenario I see is that crash initializes one
 additional cpu with nonsense data. But given that the bug has existed for
 ~12 years and nobody has reported such a bug, I assume that the check has
 worked well so far.
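
 To illustrate the ordering discussed above, here is a minimal,
 self-contained sketch. It is not the code from the patch: the struct,
 the function and all names prefixed with "sketch_" are hypothetical
 stand-ins, and the arrays only simulate what crash would read from the
 dump. The assumption is that a slot of __per_cpu_offset which still
 equals __per_cpu_load has not been set up, so the loop stops there
 before reading any per cpu variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from crash's sources. */
#define SKETCH_NR_CPUS 4

struct sketch_kernel {
	unsigned long per_cpu_load;                   /* value of __per_cpu_load */
	unsigned long per_cpu_offset[SKETCH_NR_CPUS]; /* copy of __per_cpu_offset[] */
	int cpu_number[SKETCH_NR_CPUS];               /* simulated per cpu reads */
};

/*
 * Count the cpus whose per cpu area looks initialized.
 * *reads reports how many per cpu variables were actually read.
 */
static int sketch_count_cpus(const struct sketch_kernel *k, int *reads)
{
	int cpus;

	assert(k != NULL && reads != NULL);
	*reads = 0;

	for (cpus = 0; cpus < SKETCH_NR_CPUS; cpus++) {
		/* Check the offset first: no read for uninitialized slots. */
		if (k->per_cpu_offset[cpus] == k->per_cpu_load)
			break;

		/* Only now read the per cpu variable. */
		int cpunumber = k->cpu_number[cpus];
		(*reads)++;

		/* The pre-existing sanity check quoted above. */
		if (cpunumber != cpus)
			break;
	}
	return cpus;
}
```

 With two initialized slots followed by slots that still hold the
 __per_cpu_load value, the sketch stops after two reads instead of
 performing a third, superfluous one.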

 Thank you for the explanation in detail, Philipp.

...
 > Did you mean that it's related to the crash live analysis
 > issue(1978032)? I tried to reproduce it, but so far I haven't
 > reproduced it with the upstream kernel.

 Yes, this bug is related to bz1978032. For whatever reason the
 superfluous read triggered the panic.

 I could reproduce the bug upstream with CONFIG_IO_URING _disabled_.
 Unfortunately there is a RHEL-only patch [1] that tampers with the
 Kconfig for IO_URING. So when you copy a kernel-ark config to the
 upstream repo and run 'make oldconfig' the IO_URING will silently be
 _enabled_.

 You are right.

...
 BTW, I tried to reproduce the panic yesterday on kernel-5.14.0-0.rc4
 but failed. Not sure if the bug was fixed in the meantime or I was
 simply "lucky"...

 This issue may have been fixed in kernel-5.14.0-0.rc4. However, this
 patch is still meaningful and can prevent potential risks.

Acked-by: Lianbo Jiang <lijiang(a)redhat.com>
