Re: Patches for zram, swap cache fixes
by HAGIO KAZUHITO(萩尾 一仁)
On 2023/11/17 18:10, Johan.Erlandsson(a)sony.com wrote:
>>> Hi
>>> Sharing 3 changes for zram regarding swap cache handling. Please have a look.
>>>
>>> Subject: [PATCH 1/3] zram, swap cache missing page tree offset
>>> Subject: [PATCH 2/3] zram, swap cache entries are pointer to struct page
>>> Subject: [PATCH 3/3] zram, exclude shadow entries from swap cache lookup
>>
>> Thank you for the patches.
>>
>> > /* this already exists in maple_tree.h add to defs.h ? */
>>
>> Is it ok to add '#include maple_tree.h' ?
>
> Yes, that should work perfectly for 'xa_is_value'.
Thanks for the reply.
I'd like to squash the patches into a single patch and add our
Signed-off-by tags. Please let me know if there is any trouble with the
attached patch.
One more thing: do you get any error message without the patch? I'd
like to add it to the commit log, if possible.
Thanks,
Kazu
>
>>
>> and the warning below is emitted, this can be fixed when merging.
>>
>> diskdump.c: In function 'lookup_swap_cache':
>> diskdump.c:2890:29: warning: unused variable 'page' [-Wunused-variable]
>> ulong swp_type, swp_space, page;
>> ^~~~
>
> Sorry, I missed that one. Thanks for reviewing.
>
> Johan
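
For readers following the "shadow entries" part of this thread: in the
kernel's XArray, a value entry is told apart from a pointer by its lowest
bit, and that is the test xa_is_value() performs. Below is a minimal sketch
of how a swap cache lookup might skip such entries. It is not taken from
the patches; the helper name swap_cache_entry_to_page() is hypothetical,
and addresses are modelled as uintptr_t rather than the kernel's
const void *.

```c
#include <stddef.h>
#include <stdint.h>

/* Same test as the kernel's xa_is_value(): value entries have bit 0 set. */
static int xa_is_value(uintptr_t entry)
{
	return (entry & 1) != 0;
}

/*
 * Hypothetical helper: return the struct page address stored in a swap
 * cache slot, or 0 if the slot is empty or holds a shadow (value) entry.
 */
static uintptr_t swap_cache_entry_to_page(uintptr_t entry)
{
	if (entry == 0 || xa_is_value(entry))
		return 0;
	return entry;
}
```

A shadow entry therefore never gets treated as a pointer to struct page.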
Re: Patches for zram, swap cache fixes
by HAGIO KAZUHITO(萩尾 一仁)
On 2023/11/10 2:33, Johan.Erlandsson(a)sony.com wrote:
> Hi
> Sharing 3 changes for zram regarding swap cache handling. Please have a look.
>
> Subject: [PATCH 1/3] zram, swap cache missing page tree offset
> Subject: [PATCH 2/3] zram, swap cache entries are pointer to struct page
> Subject: [PATCH 3/3] zram, exclude shadow entries from swap cache lookup
Thank you for the patches.
> /* this already exists in maple_tree.h add to defs.h ? */
Is it ok to add '#include maple_tree.h' ?
and the warning below is emitted, this can be fixed when merging.
diskdump.c: In function 'lookup_swap_cache':
diskdump.c:2890:29: warning: unused variable 'page' [-Wunused-variable]
ulong swp_type, swp_space, page;
^~~~
Thanks,
Kazu
[PATCH] symbols: expand kernel modules symtable when loading by mod -s/-S
by Tao Liu
For kernel modules loaded by mod -s/-S, "dis -rl" fails to display the
module's code line number data once the "bt" command has been executed
in crash.
Without the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
[exception RIP: lpfc_nlp_get+210]
RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi
With the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#7 [ff2c9f725c39fc00] page_fault at ffffffff8ea0114e
[exception RIP: lpfc_nlp_get+210]
RIP: ffffffffc0f60f82 RSP: ff2c9f725c39fcb0 RFLAGS: 00010046
RAX: 0000000000000046 RBX: ff2bd8d8ac056000 RCX: 0000000000fffffc
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
RBP: ff2bd8d8ac056090 R8: 0000000000000000 R9: 0000000000000000
R10: ff2bd90d1f8701c0 R11: 0000000000000001 R12: ff2bd93320482ae0
R13: ff2bd93051a80524 R14: ff2bd93051a80000 R15: ff2bd9332079fc00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
The root cause is that, after a kernel module has been loaded by the mod
command, its symtable is not expanded on the gdb side. The crash commands
bt and dis both trigger such an expansion, but the expansion differs
between the two commands:
The stack trace of "dis -rl" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0xe6a77a0) at symfile.c:2914
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=1, expandable=expandable@entry=0) at buildsym.c:1072
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0xfbe4b60, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
#3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x8ee4c60) at /usr/include/c++/8/bits/unique_ptr.h:716
#4 process_queue (per_objfile=0xc54c870) at dwarf2/read.c:9220
#5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2448
#6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd0a320, per_objfile=0xc54c870, skip_partial=<optimized out>) at dwarf2/read.c:2472
#7 0x000000000077f75e in dw2_expand_all_symtabs (objfile=<optimized out>) at dwarf2/read.c:3768
#8 0x00000000008f254d in gdb_get_line_number (req=0x7fffffffb1f0) at symtab.c:7112
#9 0x00000000008f22af in gdb_command_funnel_1 (req=0x7fffffffb1f0) at symtab.c:7023
#10 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffb1f0) at symtab.c:6965
#11 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffb1f0) at gdb_interface.c:409
#12 0x00000000005f5bd8 in get_line_number (addr=18446744072651935408, buf=buf@entry=0x7fffffffd460 "", reserved=reserved@entry=0) at symbols.c:4440
#13 0x000000000059e574 in cmd_dis () at kernel.c:2143
The stack trace of "bt" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile (cu=cu@entry=0x1ad15630) at symfile.c:2914
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=1, expandable=expandable@entry=0) at buildsym.c:1072
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block (this=<optimized out>, static_block=static_block@entry=0x1db0be30, section=<optimized out>, expandable=expandable@entry=0) at buildsym.c:1106
#3 0x000000000077e8e9 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x7465240) at /usr/include/c++/8/bits/unique_ptr.h:716
#4 process_queue (per_objfile=0xc113810) at dwarf2/read.c:9220
#5 dw2_do_instantiate_symtab (per_cu=<optimized out>, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2448
#6 0x000000000077ed67 in dw2_instantiate_symtab (per_cu=0xdd069d0, per_objfile=0xc113810, skip_partial=<optimized out>) at dwarf2/read.c:2472
#7 0x000000000077f8ed in dw2_lookup_symbol (objfile=<optimized out>, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at dwarf2/read.c:3669
#8 0x00000000008e6d03 in lookup_symbol_via_quick_fns (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2392
#9 0x00000000008e7153 in lookup_symbol_in_objfile (objfile=0xdd277a0, block_index=STATIC_BLOCK, name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2541
#10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb (objfile=0xdd277a0, cb_data=0x7fffffffc470) at symtab.c:2615
#11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order (gdbarch=<optimized out>, cb=0x8e7342 <lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>, cb_data=0x7fffffffc470, current_objfile=0x0) at solib-svr4.c:3248
#12 0x00000000008e754e in lookup_global_or_static_symbol (name=0x7fffffffc890 "cpumask_t", block_index=STATIC_BLOCK, objfile=0x0, domain=STRUCT_DOMAIN) at symtab.c:2660
#13 0x00000000008e75da in lookup_static_symbol (name=0x7fffffffc890 "cpumask_t", domain=STRUCT_DOMAIN) at symtab.c:2678
#14 0x00000000008e632c in lookup_symbol_aux (name=0x7fffffffc890 "cpumask_t", match_type=symbol_name_match_type::FULL, block=0x0, domain=STRUCT_DOMAIN, language=language_c, is_a_field_of_this=0x0) at symtab.c:2122
#15 0x00000000008e5a7a in lookup_symbol_in_language (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, lang=language_c, is_a_field_of_this=0x0) at symtab.c:1889
#16 0x00000000008e5b30 in lookup_symbol (name=0x7fffffffc890 "cpumask_t", block=0x0, domain=STRUCT_DOMAIN, is_a_field_of_this=0x0) at symtab.c:1915
#17 0x00000000008f2a4a in gdb_get_datatype (req=0x7fffffffc730) at symtab.c:7229
#18 0x00000000008f22c0 in gdb_command_funnel_1 (req=0x7fffffffc730) at symtab.c:7027
#19 0x00000000008f2003 in gdb_command_funnel (req=0x7fffffffc730) at symtab.c:6965
#20 0x00000000005b7f02 in gdb_interface (req=req@entry=0x7fffffffc730) at gdb_interface.c:409
#21 0x00000000005f8a9f in datatype_info (name=name@entry=0xa8454d "cpumask_t", member=member@entry=0x0, dm=dm@entry=0xfffffffffffffffc) at symbols.c:5715
#22 0x0000000000599947 in cpu_map_size (type=<optimized out>) at kernel.c:913
#23 0x00000000005a975d in get_cpus_online () at kernel.c:9556
#24 0x0000000000637a8b in diskdump_get_prstatus_percpu (cpu=16) at diskdump.c:2277
#25 0x000000000062f0e4 in get_netdump_regs_x86_64 (bt=0x7fffffffd950, ripp=0x7fffffffd130, rspp=0x7fffffffd138) at netdump.c:3471
#26 0x000000000059fe68 in back_trace (bt=bt@entry=0x7fffffffd950) at kernel.c:3092
#27 0x00000000005ab1cb in cmd_bt () at kernel.c:2859
In the stack trace of "dis -rl", dw2_expand_all_symtabs() is called to
expand all symtables of the objfile, "*.ko.debug" in our case. In the
stack trace of "bt", however, not all symtables are expanded, only the
subset that is enough to find a symbol via dw2_lookup_symbol(). As a
result, objfile->compunit_symtabs, which is the head of a singly linked
list of struct compunit_symtab, is not NULL but does not contain all
symtables. A later "dis -rl" will not expand the rest in
gdb_get_line_number(), because the !objfile_has_full_symbols(objfile)
check fails, so it cannot display the proper code line number data.
This patch forces all symtables of a module to be expanded during the
mod load phase, so no matter what commands follow,
objfile->compunit_symtabs always contains all symtabs.
Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
PS: This patch is standalone and is not a follow-up of
[PATCH v2] symbols: skip load .init.* sections if module was successfully initialized
---
gdb-10.2.patch | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/gdb-10.2.patch b/gdb-10.2.patch
index d81030d..0a9a4e1 100644
--- a/gdb-10.2.patch
+++ b/gdb-10.2.patch
@@ -3187,3 +3187,14 @@ exit 0
result = stringtab + symbol_entry->_n._n_n._n_offset;
}
else
+--- gdb-10.2/gdb/symtab.c.orig
++++ gdb-10.2/gdb/symtab.c
+@@ -7537,6 +7537,8 @@ gdb_add_symbol_file(struct gnu_request *req)
+ lm->loaded_objfile = objfile->separate_debug_objfile;
+ else
+ lm->loaded_objfile = objfile;
++ if (lm->loaded_objfile->sf)
++ lm->loaded_objfile->sf->qf->expand_all_symtabs(lm->loaded_objfile);
+ break;
+ }
+ }
--
2.40.1
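
The failure mode described in the commit message can be illustrated with
a toy model. This is not gdb code: toy_objfile, expand_one(), expand_all(),
has_symtabs() and count_symtabs() are all hypothetical names. The sketch
only shows why "the list head is non-NULL" does not imply "every
compilation unit has been expanded".

```c
#include <stddef.h>

#define NUM_UNITS 4

/* Toy stand-in for gdb's compunit_symtab: one node per expanded unit. */
struct toy_symtab {
	int unit;
	struct toy_symtab *next;
};

/* Toy stand-in for an objfile with a singly linked list of symtabs. */
struct toy_objfile {
	struct toy_symtab *symtabs;		/* list head */
	struct toy_symtab pool[NUM_UNITS];	/* storage, one slot per unit */
	int expanded[NUM_UNITS];
};

/* Expand a single unit, as a symbol lookup would. */
static void expand_one(struct toy_objfile *o, int unit)
{
	if (o->expanded[unit])
		return;
	o->pool[unit].unit = unit;
	o->pool[unit].next = o->symtabs;
	o->symtabs = &o->pool[unit];
	o->expanded[unit] = 1;
}

/* Expand every unit, which is what the patch does at module load time. */
static void expand_all(struct toy_objfile *o)
{
	for (int i = 0; i < NUM_UNITS; i++)
		expand_one(o, i);
}

/* The weak check: only asks whether the list is non-empty. */
static int has_symtabs(const struct toy_objfile *o)
{
	return o->symtabs != NULL;
}

/* Count how many units are actually on the list. */
static int count_symtabs(const struct toy_objfile *o)
{
	int n = 0;

	for (const struct toy_symtab *s = o->symtabs; s; s = s->next)
		n++;
	return n;
}
```

After a single expand_one() call the weak check already passes although
only one of the four units is on the list; calling expand_all() up front
removes that ambiguity.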
[PATCH] Fix 'rd' command for zram data display in Linux 6.2+
by Chengen Du
Kernel commit 7ac07a26dea7 ("zram: preparation for multi-zcomp support")
replaces the "compressor" member with "comp_algs" in struct zram.
Without this fix, the "rd" command fails with the following error:
rd: WARNING: Some pages are swapped out to zram. Please run mod -s zram.
rd: invalid user virtual address: ffff7d23f010 type: "64-bit UVADDR"
Signed-off-by: Chengen Du <chengen.du(a)canonical.com>
---
defs.h | 1 +
diskdump.c | 61 +++++++++++++++++++++++++++++++++++-------------------
2 files changed, 41 insertions(+), 21 deletions(-)
diff --git a/defs.h b/defs.h
index 788f63a..2cae5b6 100644
--- a/defs.h
+++ b/defs.h
@@ -2227,6 +2227,7 @@ struct offset_table { /* stash of commonly-used offsets */
long module_memory_size;
long irq_data_irq;
long zspage_huge;
+ long zram_comp_algs;
};
struct size_table { /* stash of commonly-used sizes */
diff --git a/diskdump.c b/diskdump.c
index 0fe46f4..0103221 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -2757,6 +2757,8 @@ diskdump_device_dump_info(FILE *ofp)
static ulong ZRAM_FLAG_SHIFT;
static ulong ZRAM_FLAG_SAME_BIT;
+static ulong ZRAM_COMP_PRIORITY_BIT1;
+static ulong ZRAM_COMP_PRIORITY_MASK;
static void
zram_init(void)
@@ -2764,7 +2766,10 @@ zram_init(void)
long zram_flag_shift;
MEMBER_OFFSET_INIT(zram_mempoll, "zram", "mem_pool");
- MEMBER_OFFSET_INIT(zram_compressor, "zram", "compressor");
+ if (THIS_KERNEL_VERSION >= LINUX(6, 2, 0))
+ MEMBER_OFFSET_INIT(zram_comp_algs, "zram", "comp_algs");
+ else
+ MEMBER_OFFSET_INIT(zram_compressor, "zram", "compressor");
MEMBER_OFFSET_INIT(zram_table_flag, "zram_table_entry", "flags");
if (INVALID_MEMBER(zram_table_flag))
MEMBER_OFFSET_INIT(zram_table_flag, "zram_table_entry", "value");
@@ -2782,6 +2787,8 @@ zram_init(void)
ZRAM_FLAG_SHIFT = 1 << zram_flag_shift;
ZRAM_FLAG_SAME_BIT = 1 << (zram_flag_shift+1);
+ ZRAM_COMP_PRIORITY_BIT1 = ZRAM_FLAG_SHIFT + 7;
+ ZRAM_COMP_PRIORITY_MASK = 0x3;
if (CRASHDEBUG(1))
fprintf(fp, "zram_flag_shift: %ld\n", zram_flag_shift);
@@ -2980,13 +2987,17 @@ try_zram_decompress(ulonglong pte_val, unsigned char *buf, ulong len, ulonglong
unsigned char *outbuf = NULL;
ulong zram, zram_table_entry, sector, index, entry, flags, size,
outsize, off;
+ int comp_alg_unavail;
- if (INVALID_MEMBER(zram_compressor)) {
+ comp_alg_unavail = (THIS_KERNEL_VERSION >= LINUX(6, 2, 0))
+ ? INVALID_MEMBER(zram_comp_algs) : INVALID_MEMBER(zram_compressor);
+ if (comp_alg_unavail) {
zram_init();
- if (INVALID_MEMBER(zram_compressor)) {
- error(WARNING,
- "Some pages are swapped out to zram. "
- "Please run mod -s zram.\n");
+ comp_alg_unavail = (THIS_KERNEL_VERSION >= LINUX(6, 2, 0))
+ ? INVALID_MEMBER(zram_comp_algs) : INVALID_MEMBER(zram_compressor);
+ if (comp_alg_unavail) {
+ error(WARNING, "some pages are swapped out to zram. "
+ "please run mod -s zram.\n");
return 0;
}
}
@@ -2997,8 +3008,29 @@ try_zram_decompress(ulonglong pte_val, unsigned char *buf, ulong len, ulonglong
if (!get_disk_name_private_data(pte_val, vaddr, NULL, &zram))
return 0;
- readmem(zram + OFFSET(zram_compressor), KVADDR, name,
- sizeof(name), "zram compressor", FAULT_ON_ERROR);
+ if (THIS_KERNEL_VERSION >= LINUX(2, 6, 0)) {
+ swp_offset = (ulonglong)__swp_offset(pte_val);
+ } else {
+ swp_offset = (ulonglong)SWP_OFFSET(pte_val);
+ }
+
+ sector = swp_offset << (PAGESHIFT() - 9);
+ index = sector >> SECTORS_PER_PAGE_SHIFT;
+ readmem(zram, KVADDR, &zram_table_entry,
+ sizeof(void *), "zram_table_entry", FAULT_ON_ERROR);
+ zram_table_entry += (index * SIZE(zram_table_entry));
+ readmem(zram_table_entry + OFFSET(zram_table_flag), KVADDR, &flags,
+ sizeof(void *), "zram_table_flag", FAULT_ON_ERROR);
+ if (THIS_KERNEL_VERSION >= LINUX(6, 2, 0)) {
+ ulong comp_alg_addr;
+ uint32_t prio = (flags >> ZRAM_COMP_PRIORITY_BIT1) & ZRAM_COMP_PRIORITY_MASK;
+ readmem(zram + OFFSET(zram_comp_algs) + sizeof(const char *) * prio, KVADDR,
+ &comp_alg_addr, sizeof(comp_alg_addr), "zram comp_algs", FAULT_ON_ERROR);
+ read_string(comp_alg_addr, name, sizeof(name));
+ } else {
+ readmem(zram + OFFSET(zram_compressor), KVADDR, name, sizeof(name),
+ "zram compressor", FAULT_ON_ERROR);
+ }
if (STREQ(name, "lzo")) {
#ifdef LZO
if (!(dd->flags & LZO_SUPPORTED)) {
@@ -3019,12 +3051,6 @@ try_zram_decompress(ulonglong pte_val, unsigned char *buf, ulong len, ulonglong
return 0;
}
- if (THIS_KERNEL_VERSION >= LINUX(2, 6, 0)) {
- swp_offset = (ulonglong)__swp_offset(pte_val);
- } else {
- swp_offset = (ulonglong)SWP_OFFSET(pte_val);
- }
-
zram_buf = (unsigned char *)GETBUF(PAGESIZE());
/* lookup page from swap cache */
off = PAGEOFFSET(vaddr);
@@ -3034,15 +3060,8 @@ try_zram_decompress(ulonglong pte_val, unsigned char *buf, ulong len, ulonglong
goto out;
}
- sector = swp_offset << (PAGESHIFT() - 9);
- index = sector >> SECTORS_PER_PAGE_SHIFT;
- readmem(zram, KVADDR, &zram_table_entry,
- sizeof(void *), "zram_table_entry", FAULT_ON_ERROR);
- zram_table_entry += (index * SIZE(zram_table_entry));
readmem(zram_table_entry, KVADDR, &entry,
sizeof(void *), "entry of table", FAULT_ON_ERROR);
- readmem(zram_table_entry + OFFSET(zram_table_flag), KVADDR, &flags,
- sizeof(void *), "zram_table_flag", FAULT_ON_ERROR);
if (!entry || (flags & ZRAM_FLAG_SAME_BIT)) {
int count;
ulong *same_buf = (ulong *)GETBUF(PAGESIZE());
--
2.40.1
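
As a rough illustration of the lookup this patch introduces, the sketch
below picks a compression algorithm name from a per-device array using a
two-bit priority stored in a table entry's flags. It is a simplified
model, not the crash implementation: comp_priority() and
comp_alg_for_entry() are hypothetical names, and the position of the
priority bits is passed in as a parameter because its real value depends
on the kernel.

```c
#include <stddef.h>

#define COMP_PRIORITY_MASK 0x3UL	/* two bits, as in the patch */

/*
 * Extract the compression priority from a table entry's flags.
 * 'priority_shift' is the position of the first priority bit
 * (an assumption of this sketch, supplied by the caller).
 */
static unsigned int comp_priority(unsigned long flags,
				  unsigned int priority_shift)
{
	return (unsigned int)((flags >> priority_shift) & COMP_PRIORITY_MASK);
}

/* Pick the algorithm name used for this entry; NULL if that slot is unset. */
static const char *comp_alg_for_entry(const char *const comp_algs[4],
				      unsigned long flags,
				      unsigned int priority_shift)
{
	return comp_algs[comp_priority(flags, priority_shift)];
}
```

With a single configured algorithm every entry has priority 0, so the
lookup reduces to what the old "compressor" member provided.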
[ANNOUNCE] crash-8.0.4 is available
by HAGIO KAZUHITO(萩尾 一仁)
Download from:
https://crash-utility.github.io/
or
https://github.com/crash-utility/crash/releases
The GitHub master branch serves as a development branch that will
contain all patches that are queued for the next release:
$ git clone https://github.com/crash-utility/crash.git
Changelog:
a6832f608cb5 crash-8.0.3 -> crash-8.0.4
262b1c71b485 Fix printing incorrect panic string issue
55a43bcefa20 Fix compilation error and warning with gcc-4.8.5
fc6ed525407f gdb: Verify COFF symbol stringtab offset
a8e5e4cbae54 gdb: Avoid buffer overflow in ada_decode
0172e35083b5 Fix "rd" command to display data on zram on Linux 5.17 and later
ac097d6cb157 diskdump: add hook for additional checks on prstatus notes validity
5e758aaa0fd8 Make "clear" external command runnable without "!" and alias-able
578fc08b8255 memory_driver: Support overriding kernel directory
1cfd513ea9c8 memory_driver: Use designated initializer for 'crash_dev'
3c44056efef2 memory_driver: Ensure PWD points to the current directory
c9a732d0f6ab arm64: Fix "vtop" command to display swap information on Linux 5.19 and later
a9291fc1bf61 ppc64: do page traversal if vmemmap_list not populated
27f3ccd6c296 In verify_version() don't require specific syment type values for linux_banner symbol to get its address
3253e5ac87c6 Fix "ps/vm" commands to display the memory usage for exiting tasks
1aa93cd33fa1 RISCV64: Add KASLR support
f774fe0f59b4 deduplicate kernel_version open-coded parser
eeaed479a438 Fix "kmem -s|-S" not working properly when CONFIG_SLAB_FREELIST_HARDENED is enabled
bc145861bfeb Revert "Fix "kmem -s|-S" not working properly on RHEL8.6 and later"
ff963b795b3f RISCV64: Use va_kernel_pa_offset in VTOP()
69f38d777450 Fix "ps/vm" commands to display correct memory usage
558aecc98987 Fix "foreach" command with "DE" state to display only expected tasks
c74f375e0ef7 Fix get_linux_banner_from_vmlinux() for vmlinux without ".rodata" symbol
aa5763800d61 Fix warning about kernel version inconsistency during crash startup
f0b59524624b Fix segmentation fault by "tree -s" option with Maple Tree
38d35bd1423c Fix "irq [-a|-s]" options on Linux 6.5-rc1 and later
d17d51a92a3a Exclude zero entries from do_maple_tree() return value
b76e116c50ff vmware: Improve output when we fail to read vmware 'vmsn' file
6d0be1316aa3 Fix "irq -a" option on Linux 6.0 and later
4ee56105881d Fix compilation error due to new strlcpy function that glibc added
88580068b7dd Fix failure of gathering task table on Linux 6.5-rc1 and later
7750e61fdb2a Support module memory layout change on Linux 6.4
8b24b2025fb4 ppc64: Remove redundant PTE checks
6c8cd9b5dcf4 arm64: Fix again segfault in arm64_is_kernel_exception_frame() when corrupt stack pointer address is given
91a76958e4a8 Revert "Fix segfault in arm64_is_kernel_exception_frame() when corrupt stack pointer address is given"
ec1e61b33a70 Fix invalid structure size error during crash startup on ppc64
77d8621876c1 x86_64: Fix "bt" command printing stale entries on Linux 6.4 and later
8527bbff71cb Output prompt when stdin is not a TTY
9868ebc8e648 Fix segfault in arm64_is_kernel_exception_frame() when corrupt stack pointer address is given
db8c030857b4 diskdump/netdump: fix segmentation fault caused by failure of stopping CPUs
a0eceb041dfa arm64/x86_64: Enhance "vtop" command to show zero_pfn information
342cf340ed03 Fix "kmem -v" option displaying no regions on Linux 6.3 and later
58c1816521c2 Fix failure of "dev -d|-D" options on Linux 6.4 and later kernels
040a56e9f9d0 Fix kernel version macros for revision numbers over 255
2505a65ff547 Mark start of 8.0.4 development phase with version 8.0.3++
Full changelog:
https://crash-utility.github.io/changelog/ChangeLog-8.0.4.txt
or
https://github.com/crash-utility/crash/compare/8.0.3...8.0.4
[PATCH v2] symbols: skip load .init.* sections if module was successfully initialized
by Tao Liu
The addresses of one module's .init.text symbols may overlap with
another module's .text symbols. As a result, gdb fails to translate an
address to the correct symbol name:
crash> sym -m virtio_blk | grep MODULE
ffffffffc00a4000 MODULE START: virtio_blk
ffffffffc00a86ec MODULE END: virtio_blk
crash> gdb info address floppy_module_init
Symbol "floppy_module_init" is a function at address 0xffffffffc00a4131.
Since the kernel frees the .init.* sections of a module once the module
has been initialized successfully, there is no need to load the .init.*
section data from "*.ko.debug" into gdb and create such an overlap.
lm->mod_init_module_ptr is used as a flag indicating whether the
module's init sections have been freed.
Without the patch:
crash> mod -S
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <floppy_module_init+1151>, <-- symbol translated from module floppy
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <floppy_module_init+575>,
init_request = 0xffffffffc00a4260 <floppy_module_init+303>,
...snip...
}
With the patch:
crash> mod -S
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <virtio_queue_rq>, <-- symbol translated from module virtio_blk
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <virtblk_request_done>,
init_request = 0xffffffffc00a4260 <virtblk_init_request>,
...snip...
}
Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
v1: [PATCH 1/2] symbols: expand kernel modules symtable before symbols translation
[PATCH 2/2] symbols: fix the error belonging of the kernel modules symbols
v1 -> v2: Used a different solution and re-drafted the patch based on
Kazu's comments, so v1 can be discarded.
---
symbols.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/symbols.c b/symbols.c
index 8e8b4c3..dae5b04 100644
--- a/symbols.c
+++ b/symbols.c
@@ -13283,7 +13283,7 @@ add_symbol_file_kallsyms(struct load_module *lm, struct gnu_request *req)
shift_string_right(req->buf, strlen(buf));
BCOPY(buf, req->buf, strlen(buf));
retval = TRUE;
- } else {
+ } else if (lm->mod_init_module_ptr || !STRNEQ(section_name, ".init.")) {
sprintf(buf, " -s %s 0x%lx", section_name, section_vaddr);
while ((len + strlen(buf)) >= buflen) {
RESIZEBUF(req->buf, buflen, buflen * 2);
--
2.40.1
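
The condition added by this patch can be read as a small predicate. The
sketch below restates it in standalone C, assuming that a nonzero
lm->mod_init_module_ptr means the init memory is still present;
should_load_section() and its parameter names are hypothetical and do not
exist in crash.

```c
#include <string.h>

/*
 * Decide whether a module section should be handed to gdb's
 * add-symbol-file. 'init_still_mapped' is nonzero while the module's
 * init memory has not been freed (assumption of this sketch).
 */
static int should_load_section(const char *section_name,
			       int init_still_mapped)
{
	/* Prefix match on ".init." (6 characters). */
	int is_init = strncmp(section_name, ".init.", 6) == 0;

	return init_still_mapped || !is_init;
}
```

Only sections whose name starts with ".init." are ever skipped, and only
when the init memory is gone; every other section is loaded as before.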
[PATCH 1/2] symbols: expand kernel modules symtable before symbols translation
by Tao Liu
The symbol translation for kernel modules may change after a C
expression is evaluated.
without patch:
crash> mod -S
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <virtio_queue_rq>, <--symbol translated from kernel
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <virtblk_request_done>,
init_request = 0xffffffffc00a4260 <virtblk_init_request>,
...snip...
}
crash> px ((struct request *)0xffff880fdb246000)->q->mq_ops
$1 = (struct blk_mq_ops *) 0xffffffffc00a7160 <virtio_mq_ops>
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <floppy_module_init+1151>, <--symbol translated from module
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <floppy_module_init+575>,
init_request = 0xffffffffc00a4260 <floppy_module_init+303>,
...snip...
}
with patch:
crash> mod -S
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <floppy_module_init+1151>, <--symbol translated from module
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <floppy_module_init+575>,
init_request = 0xffffffffc00a4260 <floppy_module_init+303>,
...snip...
}
crash> px ((struct request *)0xffff880fdb246000)->q->mq_ops
$1 = (struct blk_mq_ops *) 0xffffffffc00a7160 <virtio_mq_ops>
crash> struct blk_mq_ops 0xffffffffc00a7160
struct blk_mq_ops {
queue_rq = 0xffffffffc00a45b0 <floppy_module_init+1151>, <--symbol translated from module
map_queue = 0xffffffff813015c0 <blk_mq_map_queue>,
...snip...
complete = 0xffffffffc00a4370 <floppy_module_init+575>,
init_request = 0xffffffffc00a4260 <floppy_module_init+303>,
...snip...
}
The root cause of the change in symbol translation is that, after
"mod -S", the kernel module files "*.ko.debug" are loaded, but the
compile unit symtables of the kernel modules may not get expanded. As a
result, the modules' symtable list, obj_file->compunit_symtabs, is
nullptr and has no effect on gdb symbol translation, which is
unexpected. A C expression evaluation triggers such an expansion.
This patch makes sure the symtable always gets expanded before gdb
symbol translation.
Signed-off-by: Tao Liu <ltao(a)redhat.com>
---
gdb-10.2.patch | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/gdb-10.2.patch b/gdb-10.2.patch
index d81030d..31135ca 100644
--- a/gdb-10.2.patch
+++ b/gdb-10.2.patch
@@ -3187,3 +3187,20 @@ exit 0
result = stringtab + symbol_entry->_n._n_n._n_offset;
}
else
+--- gdb-10.2/gdb/symtab.c.orig
++++ gdb-10.2/gdb/symtab.c
+@@ -2931,6 +2931,14 @@ find_pc_sect_compunit_symtab (CORE_ADDR pc, struct obj_section *section)
+
+ for (objfile *obj_file : current_program_space->objfiles ())
+ {
++#ifdef CRASH_MERGE
++ std::string objfile_name = objfile_filename(obj_file);
++
++ if (objfile_name.find(".ko") != std::string::npos) {
++ if (obj_file->sf && obj_file->compunit_symtabs == nullptr)
++ obj_file->sf->qf->expand_all_symtabs(obj_file);
++ }
++#endif
+ for (compunit_symtab *cust : obj_file->compunits ())
+ {
+ const struct block *b;
\ No newline at end of file
--
2.40.1
[PATCH] Fix printing incorrect panic string issue
by Lianbo Jiang
When panic_on_oops is disabled and a BUG is hit in the code, the system
continues to run and does not panic. If a hard lockup is hit a short
time later, the system does panic. Even though the system panicked on
the hard lockup, the panic string shown is still that of the first BUG.
For example:
Without the patch:
crash> sys|grep PANIC
PANIC: "BUG: unable to handle kernel paging request at ffffab835d7f9d50"
With the patch:
crash> sys|grep PANIC
PANIC: "Kernel panic - not syncing: Hard LOCKUP"
Let's search for the panic string based on the severity of the panic
event, and also refactor get_panicmsg() a little bit to improve
readability.
Reported-by: John Pittman <jpittman(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
task.c | 109 +++++++++++++++++++++++----------------------------------
1 file changed, 44 insertions(+), 65 deletions(-)
diff --git a/task.c b/task.c
index 4018a543b715..6809c27c4b91 100644
--- a/task.c
+++ b/task.c
@@ -6301,6 +6301,31 @@ get_active_task(int cpu)
return NO_TASK;
}
+/*
+ * Arrange the panic strings based on the severity of the panic
+ * events.
+ */
+static const char* panic_msg[] = {
+ "SysRq : Crash",
+ "SysRq : Trigger a crash",
+ "SysRq : Netdump",
+ "Kernel panic: ",
+ "Kernel panic - ",
+ "Kernel BUG at",
+ "kernel BUG at",
+ "Unable to handle kernel paging request",
+ "Unable to handle kernel NULL pointer dereference",
+ "BUG: unable to handle kernel ",
+ "general protection fault: ",
+ "double fault: ",
+ "divide error: ",
+ "stack segment: ",
+ "[Hardware Error]: ",
+ "Bad mode in ",
+ "Oops: ",
+};
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
/*
* Read the panic string.
@@ -6308,7 +6333,7 @@ get_active_task(int cpu)
char *
get_panicmsg(char *buf)
{
- int msg_found;
+ int msg_found, i;
BZERO(buf, BUFSIZE);
msg_found = FALSE;
@@ -6332,76 +6357,30 @@ get_panicmsg(char *buf)
* active-task flag appropriately. The message may or
* may not be used as the panic message.
*/
- rewind(pc->tmpfile);
- while (fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "SysRq : Crash") ||
- strstr(buf, "SysRq : Trigger a crash")) {
- pc->flags |= SYSRQ;
- break;
- }
- }
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "general protection fault: ") ||
- strstr(buf, "double fault: ") ||
- strstr(buf, "divide error: ") ||
- strstr(buf, "stack segment: ")) {
- msg_found = TRUE;
- break;
- }
- }
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "SysRq : Netdump") ||
- strstr(buf, "SysRq : Crash") ||
- strstr(buf, "SysRq : Trigger a crash")) {
- pc->flags |= SYSRQ;
- msg_found = TRUE;
- break;
+ for (i = 0; i < ARRAY_SIZE(panic_msg); i++) {
+ rewind(pc->tmpfile);
+ while (fgets(buf, BUFSIZE, pc->tmpfile)) {
+ if (strstr(buf, panic_msg[i])) {
+ msg_found = TRUE;
+ if(strstr(buf, "SysRq :"))
+ pc->flags |= SYSRQ;
+ goto FOUND;
+ }
}
- }
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "Oops: ") ||
- strstr(buf, "Kernel BUG at") ||
- strstr(buf, "kernel BUG at") ||
- strstr(buf, "Unable to handle kernel paging request") ||
- strstr(buf, "Unable to handle kernel NULL pointer dereference") ||
- strstr(buf, "BUG: unable to handle kernel "))
- msg_found = TRUE;
+
}
+
rewind(pc->tmpfile);
while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "sysrq") &&
- symbol_exists("sysrq_pressed")) {
- get_symbol_data("sysrq_pressed", sizeof(int),
- &msg_found);
- break;
- }
+ if (strstr(buf, "sysrq") &&
+ symbol_exists("sysrq_pressed")) {
+ get_symbol_data("sysrq_pressed", sizeof(int),
+ &msg_found);
+ break;
+ }
}
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "Kernel panic: ") ||
- strstr(buf, "Kernel panic - ")) {
- msg_found = TRUE;
- break;
- }
- }
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "[Hardware Error]: ")) {
- msg_found = TRUE;
- break;
- }
- }
- rewind(pc->tmpfile);
- while (!msg_found && fgets(buf, BUFSIZE, pc->tmpfile)) {
- if (strstr(buf, "Bad mode in ")) {
- msg_found = TRUE;
- break;
- }
- }
+FOUND:
close_tmpfile();
if (!msg_found)
--
2.41.0
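
The idea of searching by severity rather than by position in the log can
be sketched in standalone C. The example below is a simplified model that
works on an in-memory array of lines instead of crash's tmpfile;
find_panic_line() is a hypothetical name, and the pattern order is
whatever the caller passes in.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the first log line that matches the highest-priority pattern,
 * or NULL if no pattern matches. 'patterns' is ordered from most to
 * least severe, so the outer loop runs over patterns and the inner loop
 * rescans the whole log for each one.
 */
static const char *find_panic_line(const char *const lines[], size_t nlines,
				   const char *const patterns[],
				   size_t npatterns)
{
	for (size_t p = 0; p < npatterns; p++)
		for (size_t l = 0; l < nlines; l++)
			if (strstr(lines[l], patterns[p]))
				return lines[l];
	return NULL;
}
```

Because the pattern loop is the outer one, a "Kernel panic - " line wins
over an earlier "BUG: unable to handle kernel " line whenever the former
is listed first.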
Re: [Crash-utility] Kernel Crash Analysis on Android
by Shankar, AmarX
Hi Dave,
Thanks for your info regarding kexec tool.
I am unable to download kexec from below link.
http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/kexec-too...
It says HTTP 404 Page Not Found.
Could you please guide me on this?
Thanks & Regards,
Amar Shankar
> On Wed, Mar 21, 2012 at 06:00:00PM +0000, Shankar, AmarX wrote:
>
> > I want to do kernel crash Analysis on Android Merrifield Target.
> >
> > Could someone please help me how to do it?
>
> Merrifield is pretty much similar to Medfield, e.g. it has an x86 core. So I
> guess you can follow the instructions on how to set up kdump on x86 (see
> Documentation/kdump/kdump.txt) unless you already have that configured.
>
> crash should support this directly presuming you have vmlinux/vmcore files to
> feed it. You can configure crash to support x86 on x86_64 host by running:
>
> % make target=X86
> % make
>
> (or something along those lines).
Right -- just the first make command will suffice, i.e., when running
on an x86_64 host:
$ wget http://people.redhat.com/anderson/crash-6.0.4.tar.gz
$ tar xzf crash-6.0.4.tar.gz
...
$ cd crash-6.0.4
$ make target=X86
...
$ ./crash <path-to>/vmlinux <path-to>/vmcore
Dave
From: Shankar, AmarX
Sent: Wednesday, March 21, 2012 11:30 PM
To: 'crash-utility(a)redhat.com'
Subject: Kernel Crash Analysis on Android
Hi,
I want to do kernel crash Analysis on Android Merrifield Target.
Could someone please help me how to do it?
Thanks & Regards,
Amar Shankar