 
                                        
                                
                         
                        
                                
                                
                                        
                                                
                                        
                                        
Re: [Crash-utility] [PATCH v3 1/6] Port the maple tree data structures and main functions
by lijiang

Thank you for the update, Tao.
On Tue, Dec 6, 2022 at 4:40 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Tue,  6 Dec 2022 16:40:17 +0800
> From: Tao Liu <ltao(a)redhat.com>
> To: crash-utility(a)redhat.com
> Subject: [Crash-utility] [PATCH v3 1/6] Port the maple tree data
>         structures and main functions
> Message-ID: <20221206084022.58693-2-ltao(a)redhat.com>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> There are two ways to iterate over vm_area_struct: 1) by rbtree,
> aka vma.vm_rb; 2) by linked list, aka vma.vm_prev/next.
> However, with the linux maple tree patches [1][2], vm_rb and vm_prev/next
> are removed from vm_area_struct. memory.c:vm_area_dump
> in crash mainly uses the linked list for vma iteration,
> which will not work in this case. So maple tree iteration
> needs to be ported to crash.
>
> Currently crash only reads the maple tree iteratively;
> no RCU safety or maple tree modification features are
> needed. So we port only a subset of the kernel maple tree
> features. In addition, we need to modify the ported kernel
> source code to make it compatible with crash.
>
> The formal crash way of resolving a vmcore struct member is:
>
>     readmem(node, KVADDR, buf, SIZE(buf), "", flag);
>     return buf + OFFSET(member);
>
> which is a reimplementation of the kernel way of resolving a member:
>
>     return node->member;
>
> The first is arch independent: it uses gdb to resolve the OFFSET
> of members, so crash doesn't need to know the internals of the
> struct, even if the struct changes in a new kernel version. The second
> is arch dependent: the struct needs to be ported to crash, and the
> OFFSET of members may differ between crash and the kernel due to
> padding, alignment, or optimization.
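To make the contrast between the two approaches concrete, here is a minimal
standalone sketch of offset-based member resolution. It is only an
illustration under stated assumptions: struct demo_node, demo_readmem() and
demo_read_ulong_member() are hypothetical names, demo_readmem() merely copies
local memory and is not crash's readmem(), and the offset is passed in by the
caller where crash would obtain it from OFFSET().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel struct; not a real kernel layout. */
struct demo_node {
	void *parent;
	unsigned long pivot[3];
};

/*
 * Hypothetical stand-in for crash's readmem(): copies "size" bytes of the
 * "dump" (here just local memory) into a caller-supplied buffer.
 */
static void demo_readmem(const void *addr, void *buf, size_t size)
{
	assert(addr != NULL && buf != NULL);
	memcpy(buf, addr, size);
}

/*
 * Offset-based resolution: the caller only supplies the struct size and
 * the member offset (crash would take both from debuginfo), so this
 * function never needs the struct definition itself.
 */
static unsigned long demo_read_ulong_member(const void *addr,
					    size_t struct_size,
					    size_t member_offset,
					    size_t index)
{
	unsigned char buf[64];
	unsigned long val;

	assert(struct_size <= sizeof(buf));
	demo_readmem(addr, buf, struct_size);
	/* Equivalent of "buf + OFFSET(member)", then index into the array. */
	memcpy(&val, buf + member_offset + index * sizeof(val), sizeof(val));
	return val;
}
```

For a struct demo_node n, calling
demo_read_ulong_member(&n, sizeof(n), offsetof(struct demo_node, pivot), 1)
gives the same value as the direct access n.pivot[1], without the reading
code depending on the struct layout at compile time.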
>
> This patch deals with two issues: 1) porting the mt_dump() function and
> all its dependencies from the kernel source [3] to crash, to enable
> maple tree iteration in crash; 2) adapting the ported code to crash.
>
> [1]: https://github.com/oracle/linux-uek/commit/d19703645b80abe35dff1a88449d07...
> [2]: https://github.com/oracle/linux-uek/commit/91dee01f1ebb6b6587463b6ee6f7bb...
> [3]: https://github.com/oracle/linux-uek, maple/mainline branch
>
> Signed-off-by: Tao Liu <ltao(a)redhat.com>
> ---
>  Makefile     |  10 +-
>  defs.h       |  19 +++
>  maple_tree.c | 433 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  maple_tree.h |  81 ++++++++++
>  4 files changed, 540 insertions(+), 3 deletions(-)
>  create mode 100644 maple_tree.c
>  create mode 100644 maple_tree.h
>
> diff --git a/Makefile b/Makefile
> index 79aef17..6f19b77 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -59,6 +59,7 @@ IBM_HFILES=ibm_common.h
>  SADUMP_HFILES=sadump.h
>  UNWIND_HFILES=unwind.h unwind_i.h rse.h unwind_x86.h unwind_x86_64.h
>  VMWARE_HFILES=vmware_vmss.h
> +MAPLE_TREE_HFILES=maple_tree.h
>
>  CFILES=main.c tools.c global_data.c memory.c filesys.c help.c task.c \
>         kernel.c test.c gdb_interface.c configure.c net.c dev.c bpf.c \
> @@ -73,12 +74,12 @@ CFILES=main.c tools.c global_data.c memory.c filesys.c help.c task.c \
>         xen_hyper.c xen_hyper_command.c xen_hyper_global_data.c \
>         xen_hyper_dump_tables.c kvmdump.c qemu.c qemu-load.c sadump.c ipcs.c \
>         ramdump.c vmware_vmss.c vmware_guestdump.c \
> -       xen_dom0.c kaslr_helper.c sbitmap.c
> +       xen_dom0.c kaslr_helper.c sbitmap.c maple_tree.c
>
>  SOURCE_FILES=${CFILES} ${GENERIC_HFILES} ${MCORE_HFILES} \
>         ${REDHAT_CFILES} ${REDHAT_HFILES} ${UNWIND_HFILES} \
>         ${LKCD_DUMP_HFILES} ${LKCD_TRACE_HFILES} ${LKCD_OBSOLETE_HFILES}\
> -       ${IBM_HFILES} ${SADUMP_HFILES} ${VMWARE_HFILES}
> +       ${IBM_HFILES} ${SADUMP_HFILES} ${VMWARE_HFILES} ${MAPLE_TREE_HFILES}
>
>  OBJECT_FILES=main.o tools.o global_data.o memory.o filesys.o help.o task.o \
>         build_data.o kernel.o test.o gdb_interface.o net.o dev.o bpf.o \
> @@ -93,7 +94,7 @@ OBJECT_FILES=main.o tools.o global_data.o memory.o filesys.o help.o task.o \
>         xen_hyper.o xen_hyper_command.o xen_hyper_global_data.o \
>         xen_hyper_dump_tables.o kvmdump.o qemu.o qemu-load.o sadump.o ipcs.o \
>         ramdump.o vmware_vmss.o vmware_guestdump.o \
> -       xen_dom0.o kaslr_helper.o sbitmap.o
> +       xen_dom0.o kaslr_helper.o sbitmap.o maple_tree.o
>
>  MEMORY_DRIVER_FILES=memory_driver/Makefile memory_driver/crash.c memory_driver/README
>
> @@ -536,6 +537,9 @@ kaslr_helper.o: ${GENERIC_HFILES} kaslr_helper.c
>  bpf.o: ${GENERIC_HFILES} bpf.c
>         ${CC} -c ${CRASH_CFLAGS} bpf.c ${WARNING_OPTIONS} ${WARNING_ERROR}
>
> +maple_tree.o: ${GENERIC_HFILES} ${MAPLE_TREE_HFILES} maple_tree.c
> +       ${CC} -c ${CRASH_CFLAGS} maple_tree.c ${WARNING_OPTIONS} ${WARNING_ERROR}
> +
>  ${PROGRAM}: force
>         @$(MAKE) all
>
> diff --git a/defs.h b/defs.h
> index afdcf6c..792b007 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2181,6 +2181,21 @@ struct offset_table {                    /* stash of commonly-used offsets */
>         long blk_mq_tags_nr_reserved_tags;
>         long blk_mq_tags_rqs;
>         long request_queue_hctx_table;
> +       long mm_struct_mm_mt;
> +       long maple_tree_ma_root;
> +       long maple_tree_ma_flags;
> +       long maple_node_parent;
> +       long maple_node_ma64;
> +       long maple_node_mr64;
> +       long maple_node_slot;
> +       long maple_arange_64_pivot;
> +       long maple_arange_64_slot;
> +       long maple_arange_64_gap;
> +       long maple_arange_64_meta;
> +       long maple_range_64_pivot;
> +       long maple_range_64_slot;
> +       long maple_metadata_end;
> +       long maple_metadata_gap;
>  };
>
>  struct size_table {         /* stash of commonly-used sizes */
> @@ -2351,6 +2366,8 @@ struct size_table {         /* stash of commonly-used sizes */
>         long sbitmap_queue;
>         long sbq_wait_state;
>         long blk_mq_tags;
> +       long maple_tree_struct;
> +       long maple_node_struct;
>  };
>
>  struct array_table {
> @@ -5557,6 +5574,8 @@ int file_dump(ulong, ulong, ulong, int, int);
>  int same_file(char *, char *);
>  int cleanup_memory_driver(void);
>
> +void maple_init(void);
> +int do_mptree(struct tree_data *);
>
>  /*
>   *  help.c
> diff --git a/maple_tree.c b/maple_tree.c
> new file mode 100644
> index 0000000..e27369b
> --- /dev/null
> +++ b/maple_tree.c
> @@ -0,0 +1,433 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Maple Tree implementation
> + * Copyright (c) 2018-2022 Oracle Corporation
> + * Authors: Liam R. Howlett <Liam.Howlett(a)oracle.com>
> + *         Matthew Wilcox <willy(a)infradead.org>
> + *
> + * The following are copied and modified from lib/maple_tree.c
> + */
> +
> +#include "maple_tree.h"
> +#include "defs.h"
> +
> +unsigned char *mt_slots = NULL;
> +unsigned char *mt_pivots = NULL;
> +unsigned long mt_max[4] = {0};
> +
> +#define MAPLE_BUFSIZE                  512
> +
> +static inline void *mte_to_node(void *maple_enode_entry)
> +{
> +       return (void *)((unsigned long)maple_enode_entry & ~MAPLE_NODE_MASK);
> +}
> +
> +static inline enum maple_type mte_node_type(void *maple_enode_entry)
> +{
> +       return ((unsigned long)maple_enode_entry >> MAPLE_NODE_TYPE_SHIFT) &
> +               MAPLE_NODE_TYPE_MASK;
> +}
> +
> +static inline void *mt_slot(void *maple_tree_mt, void **slots,
> +                           unsigned char offset)
> +{
> +       return slots[offset];
> +}
The "maple_tree_mt" argument is unused. Can it be removed from
mt_slot() above? That would simplify the implementation of the
functions that follow, such as do_mt_entry(), do_mt_node() and
do_mt_range64().

In the kernel, mt_slot() takes the "mt" parameter because it is
needed by the lockdep engine. For the crash utility, it should be
safe to drop it.
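
A minimal sketch of the simplified helper, assuming every caller is updated
to drop the first argument. It is an illustration only, not part of the
posted patch, and the assert() is an addition for the sketch:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of mt_slot() with the unused tree argument dropped: crash works
 * on a copy of the node read from the dump, so only the slot array and
 * the slot index are needed.
 */
static inline void *mt_slot(void **slots, unsigned char offset)
{
	assert(slots != NULL);
	return slots[offset];
}
```

With this signature, a call such as mt_slot(maple_tree_mt, slots, i) would
become mt_slot(slots, i).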
> +
> +static inline bool ma_is_leaf(const enum maple_type type)
> +{
> +       return type < maple_range_64;
> +}
> +
> +/***************For cmd_tree********************/
> +
> +struct maple_tree_ops {
> +       void (*entry)(ulong node, ulong slot, const char *path,
> +                     ulong index, void *private);
> +       uint radix;
It seems that "radix" is unused?
Thanks.
Lianbo
> +       void *private;
> +       bool is_td;
> +};
> +
> +static const char spaces[] = "                                ";
> +
> +static void do_mt_range64(void *, void *, unsigned long, unsigned long,
> +                         unsigned int, char *, unsigned long *,
> +                         struct maple_tree_ops *);
> +static void do_mt_arange64(void *, void *, unsigned long, unsigned long,
> +                          unsigned int, char *, unsigned long *,
> +                          struct maple_tree_ops *);
> +static void do_mt_entry(void *, unsigned long, unsigned long, unsigned int,
> +                       unsigned int, char *, unsigned long *,
> +                       struct maple_tree_ops *);
> +static void do_mt_node(void *, void *, unsigned long, unsigned long,
> +                      unsigned int, char *, unsigned long *,
> +                      struct maple_tree_ops *);
> +struct req_entry *fill_member_offsets(char *);
> +void dump_struct_members_fast(struct req_entry *, int, ulong);
> +void dump_struct_members_for_tree(struct tree_data *, int, ulong);
> +
> +static void mt_dump_range(unsigned long min, unsigned long max,
> +                         unsigned int depth)
> +{
> +       if (min == max)
> +               fprintf(fp, "%.*s%lu: ", depth * 2, spaces, min);
> +       else
> +               fprintf(fp, "%.*s%lu-%lu: ", depth * 2, spaces, min, max);
> +}
> +
> +static inline bool mt_is_reserved(const void *entry)
> +{
> +       return ((unsigned long)entry < MAPLE_RESERVED_RANGE) &&
> +              xa_is_internal(entry);
> +}
> +
> +static inline bool mte_is_leaf(void *maple_enode_entry)
> +{
> +       return ma_is_leaf(mte_node_type(maple_enode_entry));
> +}
> +
> +static unsigned int mt_height(void *maple_tree_mt)
> +{
> +       return (*(unsigned int *)(maple_tree_mt + OFFSET(maple_tree_ma_flags)) &
> +               MT_FLAGS_HEIGHT_MASK)
> +              >> MT_FLAGS_HEIGHT_OFFSET;
> +}
> +
> +static void dump_mt_range64(void *maple_range_64_node)
> +{
> +       int i;
> +
> +       fprintf(fp, " contents: ");
> +       for (i = 0; i < mt_slots[maple_range_64] - 1; i++)
> +               fprintf(fp, "%p %lu ",
> +                       *((void **)(maple_range_64_node +
> +                                   OFFSET(maple_range_64_slot)) + i),
> +                       *((unsigned long *)(maple_range_64_node +
> +                                           OFFSET(maple_range_64_pivot)) + i));
> +       fprintf(fp, "%p\n", *((void **)(maple_range_64_node +
> +                                       OFFSET(maple_range_64_slot)) + i));
> +}
> +
> +static void dump_mt_arange64(void *maple_arange_64_node)
> +{
> +       int i;
> +
> +       fprintf(fp, " contents: ");
> +       for (i = 0; i < mt_slots[maple_arange_64]; i++)
> +               fprintf(fp, "%lu ",
> +                       *((unsigned long *)(maple_arange_64_node +
> +                                           OFFSET(maple_arange_64_gap)) + i));
> +
> +       fprintf(fp, "| %02X %02X| ",
> +               *((unsigned char *)maple_arange_64_node +
> +                 OFFSET(maple_arange_64_meta) +
> +                 OFFSET(maple_metadata_end)),
> +               *((unsigned char *)maple_arange_64_node +
> +                 OFFSET(maple_arange_64_meta) +
> +                 OFFSET(maple_metadata_gap)));
> +
> +       for (i = 0; i < mt_slots[maple_arange_64] - 1; i++)
> +               fprintf(fp, "%p %lu ",
> +                       *((void **)(maple_arange_64_node +
> +                                   OFFSET(maple_arange_64_slot)) + i),
> +                       *((unsigned long *)(maple_arange_64_node +
> +                                           OFFSET(maple_arange_64_pivot)) + i));
> +       fprintf(fp, "%p\n",
> +               *((void **)(maple_arange_64_node +
> +                           OFFSET(maple_arange_64_slot)) + i));
> +}
> +
> +static void dump_mt_entry(void *entry, unsigned long min, unsigned long max,
> +                         unsigned int depth)
> +{
> +       mt_dump_range(min, max, depth);
> +
> +       if (xa_is_value(entry))
> +               fprintf(fp, "value %ld (0x%lx) [%p]\n", xa_to_value(entry),
> +                               xa_to_value(entry), entry);
> +       else if (xa_is_zero(entry))
> +               fprintf(fp, "zero (%ld)\n", xa_to_internal(entry));
> +       else if (mt_is_reserved(entry))
> +               fprintf(fp, "UNKNOWN ENTRY (%p)\n", entry);
> +       else
> +               fprintf(fp, "%p\n", entry);
> +}
> +
> +static void dump_mt_node(void *maple_node, char *node_data, unsigned int type,
> +                        unsigned long min, unsigned long max, unsigned int depth)
> +{
> +       mt_dump_range(min, max, depth);
> +
> +       fprintf(fp, "node %p depth %d type %d parent %p", maple_node, depth, type,
> +               maple_node ? *(void **)(node_data + OFFSET(maple_node_parent))
> +                    : NULL);
> +}
> +
> +static void do_mt_range64(void *maple_tree_mt, void *entry,
> +                         unsigned long min, unsigned long max,
> +                         unsigned int depth, char *path,
> +                         unsigned long *global_index, struct maple_tree_ops *ops)
> +{
> +       void *maple_node_m_node = mte_to_node(entry);
> +       unsigned char tmp_node[MAPLE_BUFSIZE];
> +       bool leaf = mte_is_leaf(entry);
> +       unsigned long first = min, last;
> +       int i;
> +       int len = strlen(path);
> +       struct tree_data *td = ops->is_td ? (struct tree_data *)ops->private : NULL;
> +       void *maple_range_64_node;
> +
> +       if (SIZE(maple_node_struct) > MAPLE_BUFSIZE)
> +               error(FATAL, "MAPLE_BUFSIZE should be larger than maple_node_struct");
> +
> +       readmem((ulong)maple_node_m_node, KVADDR, tmp_node, SIZE(maple_node_struct),
> +               "mt_dump_range64 read maple_node", FAULT_ON_ERROR);
> +
> +       maple_range_64_node = tmp_node + OFFSET(maple_node_mr64);
> +
> +       for (i = 0; i < mt_slots[maple_range_64]; i++) {
> +               last = max;
> +
> +               if (i < (mt_slots[maple_range_64] - 1))
> +                       last = *((unsigned long *)(maple_range_64_node +
> +                                                  OFFSET(maple_range_64_pivot)) + i);
> +               else if (!*((void **)(maple_range_64_node +
> +                                     OFFSET(maple_range_64_slot)) + i) &&
> +                        max != mt_max[mte_node_type(entry)])
> +                       break;
> +               if (last == 0 && i > 0)
> +                       break;
> +               if (leaf)
> +                       do_mt_entry(mt_slot(maple_tree_mt,
> +                                           (void **)(maple_range_64_node +
> +                                                     OFFSET(maple_range_64_slot)), i),
> +                                   first, last, depth + 1, i, path, global_index, ops);
> +               else if (*((void **)(maple_range_64_node +
> +                                    OFFSET(maple_range_64_slot)) + i)) {
> +                       sprintf(path + len, "/%d", i);
> +                       do_mt_node(maple_tree_mt,
> +                                  mt_slot(maple_tree_mt,
> +                                          (void **)(maple_range_64_node +
> +                                                    OFFSET(maple_range_64_slot)), i),
> +                                  first, last, depth + 1, path, global_index, ops);
> +               }
> +
> +               if (last == max)
> +                       break;
> +               if (last > max) {
> +                       fprintf(fp, "node %p last (%lu) > max (%lu) at pivot %d!\n",
> +                               maple_range_64_node, last, max, i);
> +                       break;
> +               }
> +               first = last + 1;
> +       }
> +}
> +
> +static void do_mt_arange64(void *maple_tree_mt, void *entry,
> +                          unsigned long min, unsigned long max,
> +                          unsigned int depth, char *path,
> +                          unsigned long *global_index, struct maple_tree_ops *ops)
> +{
> +       void *maple_node_m_node = mte_to_node(entry);
> +       unsigned char tmp_node[MAPLE_BUFSIZE];
> +       bool leaf = mte_is_leaf(entry);
> +       unsigned long first = min, last;
> +       int i;
> +       int len = strlen(path);
> +       struct tree_data *td = ops->is_td ? (struct tree_data *)ops->private : NULL;
> +       void *maple_arange_64_node;
> +
> +       if (SIZE(maple_node_struct) > MAPLE_BUFSIZE)
> +               error(FATAL, "MAPLE_BUFSIZE should be larger than maple_node_struct");
> +
> +       readmem((ulong)maple_node_m_node, KVADDR, tmp_node, SIZE(maple_node_struct),
> +               "mt_dump_arange64 read maple_node", FAULT_ON_ERROR);
> +
> +       maple_arange_64_node = tmp_node + OFFSET(maple_node_ma64);
> +
> +       for (i = 0; i < mt_slots[maple_arange_64]; i++) {
> +               last = max;
> +
> +               if (i < (mt_slots[maple_arange_64] - 1))
> +                       last = *((unsigned long *)(maple_arange_64_node +
> +                                                  OFFSET(maple_arange_64_pivot)) + i);
> +               else if (! *((void **)(maple_arange_64_node +
> +                                      OFFSET(maple_arange_64_slot)) + i))
> +                       break;
> +               if (last == 0 && i > 0)
> +                       break;
> +
> +               if (leaf)
> +                       do_mt_entry(mt_slot(maple_tree_mt,
> +                                           (void **)(maple_arange_64_node +
> +                                                     OFFSET(maple_arange_64_slot)), i),
> +                                   first, last, depth + 1, i, path, global_index, ops);
> +               else if (*((void **)(maple_arange_64_node +
> +                                    OFFSET(maple_arange_64_slot)) + i)) {
> +                       sprintf(path + len, "/%d", i);
> +                       do_mt_node(maple_tree_mt,
> +                                  mt_slot(maple_tree_mt,
> +                                          (void **)(maple_arange_64_node +
> +                                                    OFFSET(maple_arange_64_slot)), i),
> +                                  first, last, depth + 1, path, global_index, ops);
> +               }
> +
> +               if (last == max)
> +                       break;
> +               if (last > max) {
> +                       fprintf(fp, "node %p last (%lu) > max (%lu) at pivot %d!\n",
> +                               maple_arange_64_node, last, max, i);
> +                       break;
> +               }
> +               first = last + 1;
> +       }
> +}
> +
> +static void do_mt_entry(void *entry, unsigned long min, unsigned long max,
> +                       unsigned int depth, unsigned int index, char *path,
> +                       unsigned long *global_index, struct maple_tree_ops *ops)
> +{
> +       int print_radix = 0, i;
> +       static struct req_entry **e = NULL;
> +       struct tree_data *td = ops->is_td ? (struct tree_data *)ops->private : NULL;
> +
> +       if (!td)
> +               return;
> +}
> +
> +static void do_mt_node(void *maple_tree_mt, void *entry,
> +                      unsigned long min, unsigned long max,
> +                      unsigned int depth, char *path,
> +                      unsigned long *global_index, struct maple_tree_ops *ops)
> +{
> +       void *maple_node = mte_to_node(entry);
> +       unsigned int type = mte_node_type(entry);
> +       unsigned int i;
> +       char tmp_node[MAPLE_BUFSIZE];
> +       struct tree_data *td = ops->is_td ? (struct tree_data *)ops->private : NULL;
> +
> +       if (SIZE(maple_node_struct) > MAPLE_BUFSIZE)
> +               error(FATAL, "MAPLE_BUFSIZE should be larger than maple_node_struct");
> +
> +       readmem((ulong)maple_node, KVADDR, tmp_node, SIZE(maple_node_struct),
> +               "mt_dump_node read maple_node", FAULT_ON_ERROR);
> +
> +       switch (type) {
> +       case maple_dense:
> +               for (i = 0; i < mt_slots[maple_dense]; i++) {
> +                       if (min + i > max)
> +                               fprintf(fp, "OUT OF RANGE: ");
> +                       do_mt_entry(mt_slot(maple_tree_mt,
> +                                           (void **)(tmp_node + OFFSET(maple_node_slot)), i),
> +                                   min + i, min + i, depth, i, path, global_index, ops);
> +               }
> +               break;
> +       case maple_leaf_64:
> +       case maple_range_64:
> +               do_mt_range64(maple_tree_mt, entry, min, max,
> +                             depth, path, global_index, ops);
> +               break;
> +       case maple_arange_64:
> +               do_mt_arange64(maple_tree_mt, entry, min, max,
> +                              depth, path, global_index, ops);
> +               break;
> +       default:
> +               fprintf(fp, " UNKNOWN TYPE\n");
> +       }
> +}
> +
> +static int do_maple_tree_traverse(ulong ptr, int is_root,
> +                                 struct maple_tree_ops *ops)
> +{
> +       char path[BUFSIZE] = {0};
> +       unsigned char tmp_tree[MAPLE_BUFSIZE];
> +       void *entry;
> +       struct tree_data *td = ops->is_td ? (struct tree_data *)ops->private : NULL;
> +       unsigned long global_index = 0;
> +
> +       if (SIZE(maple_tree_struct) > MAPLE_BUFSIZE)
> +               error(FATAL, "MAPLE_BUFSIZE should be larger than maple_tree_struct");
> +
> +       if (!is_root) {
> +               strcpy(path, "direct");
> +               do_mt_node(NULL, (void *)ptr, 0,
> +                          mt_max[mte_node_type((void *)ptr)], 0,
> +                          path, &global_index, ops);
> +       } else {
> +               readmem((ulong)ptr, KVADDR, tmp_tree, SIZE(maple_tree_struct),
> +                       "mt_dump read maple_tree", FAULT_ON_ERROR);
> +               entry = *(void **)(tmp_tree + OFFSET(maple_tree_ma_root));
> +
> +               if (!xa_is_node(entry))
> +                       do_mt_entry(entry, 0, 0, 0, 0, path, &global_index, ops);
> +               else if (entry) {
> +                       strcpy(path, "root");
> +                       do_mt_node(tmp_tree, entry, 0,
> +                                  mt_max[mte_node_type(entry)], 0,
> +                                  path, &global_index, ops);
> +               }
> +       }
> +       return 0;
> +}
> +
> +int do_mptree(struct tree_data *td)
> +{
> +       struct maple_tree_ops ops = {
> +               .entry          = NULL,
> +               .private        = td,
> +               .radix          = 0,
> +               .is_td          = true,
> +       };
> +
> +       int is_root = !(td->flags & TREE_NODE_POINTER);
> +
> +       do_maple_tree_traverse(td->start, is_root, &ops);
> +
> +       return 0;
> +}
> +
> +/***********************************************/
> +void maple_init(void)
> +{
> +       int array_len;
> +
> +       STRUCT_SIZE_INIT(maple_tree_struct, "maple_tree");
> +       STRUCT_SIZE_INIT(maple_node_struct, "maple_node");
> +
> +       MEMBER_OFFSET_INIT(maple_tree_ma_root, "maple_tree", "ma_root");
> +       MEMBER_OFFSET_INIT(maple_tree_ma_flags, "maple_tree", "ma_flags");
> +
> +       MEMBER_OFFSET_INIT(maple_node_parent, "maple_node", "parent");
> +       MEMBER_OFFSET_INIT(maple_node_ma64, "maple_node", "ma64");
> +       MEMBER_OFFSET_INIT(maple_node_mr64, "maple_node", "mr64");
> +       MEMBER_OFFSET_INIT(maple_node_slot, "maple_node", "slot");
> +
> +       MEMBER_OFFSET_INIT(maple_arange_64_pivot, "maple_arange_64", "pivot");
> +       MEMBER_OFFSET_INIT(maple_arange_64_slot, "maple_arange_64", "slot");
> +       MEMBER_OFFSET_INIT(maple_arange_64_gap, "maple_arange_64", "gap");
> +       MEMBER_OFFSET_INIT(maple_arange_64_meta, "maple_arange_64", "meta");
> +
> +       MEMBER_OFFSET_INIT(maple_range_64_pivot, "maple_range_64", "pivot");
> +       MEMBER_OFFSET_INIT(maple_range_64_slot, "maple_range_64", "slot");
> +
> +       MEMBER_OFFSET_INIT(maple_metadata_end, "maple_metadata", "end");
> +       MEMBER_OFFSET_INIT(maple_metadata_gap, "maple_metadata", "gap");
> +
> +       array_len = get_array_length("mt_slots", NULL, sizeof(char));
> +       mt_slots = calloc(array_len, sizeof(char));
> +       readmem(symbol_value("mt_slots"), KVADDR, mt_slots,
> +               array_len * sizeof(char), "maple_init read mt_slots",
> +               RETURN_ON_ERROR);
> +
> +       array_len = get_array_length("mt_pivots", NULL, sizeof(char));
> +       mt_pivots = calloc(array_len, sizeof(char));
> +       readmem(symbol_value("mt_pivots"), KVADDR, mt_pivots,
> +               array_len * sizeof(char), "maple_init read mt_pivots",
> +               RETURN_ON_ERROR);
> +
> +       mt_max[maple_dense]           = mt_slots[maple_dense];
> +       mt_max[maple_leaf_64]         = ULONG_MAX;
> +       mt_max[maple_range_64]        = ULONG_MAX;
> +       mt_max[maple_arange_64]       = ULONG_MAX;
> +}
> \ No newline at end of file
> diff --git a/maple_tree.h b/maple_tree.h
> new file mode 100644
> index 0000000..c423e45
> --- /dev/null
> +++ b/maple_tree.h
> @@ -0,0 +1,81 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +#ifndef _MAPLE_TREE_H
> +#define _MAPLE_TREE_H
> +/*
> + * Maple Tree - An RCU-safe adaptive tree for storing ranges
> + * Copyright (c) 2018-2022 Oracle
> + * Authors:     Liam R. Howlett <Liam.Howlett(a)Oracle.com>
> + *              Matthew Wilcox <willy(a)infradead.org>
> + *
> + * eXtensible Arrays
> + * Copyright (c) 2017 Microsoft Corporation
> + * Author: Matthew Wilcox <willy(a)infradead.org>
> + *
> + * See Documentation/core-api/xarray.rst for how to use the XArray.
> + */
> +#include <stdbool.h>
> +#include <limits.h>
> +
> +/*
> + * The following are copied and modified from include/linux/maple_tree.h
> + */
> +
> +enum maple_type {
> +       maple_dense,
> +       maple_leaf_64,
> +       maple_range_64,
> +       maple_arange_64,
> +};
> +
> +#define MAPLE_NODE_MASK                255UL
> +
> +#define MT_FLAGS_HEIGHT_OFFSET 0x02
> +#define MT_FLAGS_HEIGHT_MASK   0x7C
> +
> +#define MAPLE_NODE_TYPE_MASK   0x0F
> +#define MAPLE_NODE_TYPE_SHIFT  0x03
> +
> +#define MAPLE_RESERVED_RANGE   4096
> +
> +/*
> + * The following are copied and modified from include/linux/xarray.h
> + */
> +
> +#define XA_ZERO_ENTRY          xa_mk_internal(257)
> +
> +static inline void *xa_mk_internal(unsigned long v)
> +{
> +       return (void *)((v << 2) | 2);
> +}
> +
> +static inline bool xa_is_internal(const void *entry)
> +{
> +       return ((unsigned long)entry & 3) == 2;
> +}
> +
> +static inline bool xa_is_node(const void *entry)
> +{
> +       return xa_is_internal(entry) && (unsigned long)entry > 4096;
> +}
> +
> +static inline bool xa_is_value(const void *entry)
> +{
> +       return (unsigned long)entry & 1;
> +}
> +
> +static inline bool xa_is_zero(const void *entry)
> +{
> +       return entry == XA_ZERO_ENTRY;
> +}
> +
> +static inline unsigned long xa_to_internal(const void *entry)
> +{
> +       return (unsigned long)entry >> 2;
> +}
> +
> +static inline unsigned long xa_to_value(const void *entry)
> +{
> +       return (unsigned long)entry >> 1;
> +}
> +
> +#endif /* _MAPLE_TREE_H */
> \ No newline at end of file
> --
> 2.33.1
                                
                         
                        
                                
[PATCH v3 0/6] Add maple tree vma iteration support for crash
by Tao Liu

Patchset [1] introduces the maple tree data structure to Linux, along with
the corresponding modifications to the mm subsystem.

The main impact on the crash utility is the change to vm_area_struct.
Patches [2][3] removed the rbtree and linked-list iteration of
vm_area_struct, making it impossible for crash to iterate over vmas
in the traditional way. For example, the crash commands vm and fuser
fail on a kernel that has integrated patchset [1].

This patchset addresses the issue by porting and adapting the
kernel's maple tree vma iteration code to the crash utility. It has been
tested on linux-next-next-20220914 [4].
[1]: https://lore.kernel.org/all/20220906194824.2110408-1-Liam.Howlett@oracle....
[2]: https://github.com/oracle/linux-uek/commit/d19703645b80abe35dff1a88449d07...
[3]: https://github.com/oracle/linux-uek/commit/91dee01f1ebb6b6587463b6ee6f7bb...
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/snaps...
v1 -> v2:
1) Move xarray.h and maple_tree_vma.h into maple.h.
2) Remove variable-length array for maple_tree.c.
3) Add tree cmd and do_maple_tree() support for maple tree.
4) Other small modifications.
v2 -> v3:
1) Remove the for_each_vma() macro and all the functions it depends on, such
   as mas_find(), and use mt_dump() (aka do_maple_tree_traverse()) for
   maple tree iteration instead.
2) Make do_maple_tree_info and maple_tree_ops local variables instead of
   global variables.
3) Show only valid maple entries in the tree cmd output.
4) Remove empty structures, such as maple_tree{}/maple_metadata{}, and use
   void * instead.
5) Other changes based on Kazu's and Lianbo's comments.
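
As a rough illustration of items 1) and 3) above, the sketch below walks a
single leaf in the mt_dump() style: slot i covers the range from the previous
pivot + 1 up to pivot[i], and only non-NULL slots are reported. It is a
simplified standalone example, not the code in this series; DEMO_SLOTS,
demo_entry_fn and demo_walk_leaf() are hypothetical names, and the real slot
count is read from the kernel's mt_slots[] array.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 4	/* hypothetical; the real count comes from mt_slots[] */

/* Hypothetical callback type, invoked once per non-NULL entry. */
typedef void (*demo_entry_fn)(void *entry, unsigned long first,
			      unsigned long last, void *priv);

/*
 * Walk one leaf: slot i covers [first, pivots[i]], and the last slot
 * extends to the node's max. Returns the number of entries visited.
 */
static int demo_walk_leaf(void **slots, const unsigned long *pivots,
			  unsigned long min, unsigned long max,
			  demo_entry_fn fn, void *priv)
{
	unsigned long first = min, last;
	int i, visited = 0;

	assert(slots != NULL && pivots != NULL);

	for (i = 0; i < DEMO_SLOTS; i++) {
		last = (i < DEMO_SLOTS - 1) ? pivots[i] : max;
		if (last == 0 && i > 0)
			break;		/* unused pivot: no more entries */
		if (slots[i]) {		/* skip holes, report valid entries only */
			if (fn)
				fn(slots[i], first, last, priv);
			visited++;
		}
		if (last >= max)
			break;		/* reached the end of this node's range */
		first = last + 1;
	}
	return visited;
}
```

The callback plays the role that the tree cmd or the vma dump code would play
in crash: it receives each valid entry together with the index range it
covers, so no iterator state such as mas_find() has to be ported.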
Tao Liu (6):
  Port the maple tree data structures and main functions
  Add tree cmd support for maple tree
  Add do_maple_tree support for maple tree
  Introduce maple tree vma iteration to memory.c
  Update the maple tree help info for tree cmd
  Dump maple tree offset variables by help -o
 Makefile     |  12 +-
 defs.h       |  26 +++
 help.c       |  46 ++--
 maple_tree.c | 641 +++++++++++++++++++++++++++++++++++++++++++++++++++
 maple_tree.h |  81 +++++++
 memory.c     | 327 +++++++++++++++-----------
 symbols.c    |  32 +++
 tools.c      |  64 +++--
 8 files changed, 1057 insertions(+), 172 deletions(-)
 create mode 100644 maple_tree.c
 create mode 100644 maple_tree.h
-- 
2.33.1
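
The v3 change from an iterator macro to a dump-style traversal can be illustrated with a small sketch. This is not the maple tree layout or the code in maple_tree.c: `toy_node`, `MAX_SLOTS`, `toy_traverse()` and `count_cb()` are hypothetical names, and the tree is a plain n-ary tree. It only shows the shape of a depth-first walk that passes each valid (non-NULL) leaf entry to a callback.

```c
#include <stddef.h>

#define MAX_SLOTS 4	/* hypothetical fan-out, not the kernel's node size */

/* Toy node: either an internal node with children or a leaf with entries. */
struct toy_node {
	int is_leaf;
	size_t nr;					/* number of used slots */
	union {
		struct toy_node *child[MAX_SLOTS];	/* internal node */
		void *entry[MAX_SLOTS];			/* leaf node; NULL = empty */
	} u;
};

typedef void (*entry_cb)(void *entry, void *private_data);

/* Depth-first walk; hands each non-NULL leaf entry to the callback
 * and returns how many entries were visited. */
static size_t toy_traverse(const struct toy_node *node, entry_cb cb, void *priv)
{
	size_t seen = 0;

	if (!node)
		return 0;
	for (size_t i = 0; i < node->nr; i++) {
		if (node->is_leaf) {
			if (node->u.entry[i]) {
				if (cb)
					cb(node->u.entry[i], priv);
				seen++;
			}
		} else {
			seen += toy_traverse(node->u.child[i], cb, priv);
		}
	}
	return seen;
}

/* Example callback: count entries through the private pointer. */
static void count_cb(void *entry, void *priv)
{
	(void)entry;
	(*(size_t *)priv)++;
}
```

A caller that needs the entries themselves (for example, vma addresses) would pass a callback that stores each entry into its own buffer instead of counting.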
                                
                         
                        
                                
                                        
                                        [PATCH] gdb: Fix an assertion failure in the gdb's copy_type()
                                
                                
                                
                                    
                                        by Lianbo Jiang
                                    
                                
                                
This is a patch backported from gdb. Without it, the following
crash command may abort due to an assertion failure in gdb's
copy_type():
  crash> px __per_cpu_start:0
  gdbtypes.c:5505: internal-error: type* copy_type(const type*): Assertion `TYPE_OBJFILE_OWNED (type)' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n)
The gdb commit 8e2da1651879 ("Fix assertion failure in copy_type")
solved this issue.
Reported-by: Buland Kumar Singh <bsingh(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 gdb-10.2.patch | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)
diff --git a/gdb-10.2.patch b/gdb-10.2.patch
index 7055f6e0fb0b..aa34743501ad 100644
--- a/gdb-10.2.patch
+++ b/gdb-10.2.patch
@@ -2039,3 +2039,42 @@ exit 0
                  }
                  nextfield++;
          }
+--- gdb-10.2/gdb/gdbtypes.c.orig
++++ gdb-10.2/gdb/gdbtypes.c
+@@ -5492,27 +5492,25 @@ copy_type_recursive (struct objfile *objfile,
+ }
+ 
+ /* Make a copy of the given TYPE, except that the pointer & reference
+-   types are not preserved.
+-   
+-   This function assumes that the given type has an associated objfile.
+-   This objfile is used to allocate the new type.  */
++   types are not preserved. */
+ 
+ struct type *
+ copy_type (const struct type *type)
+ {
+-  struct type *new_type;
+-
+-  gdb_assert (TYPE_OBJFILE_OWNED (type));
++  struct type *new_type = alloc_type_copy (type);
+ 
+-  new_type = alloc_type_copy (type);
+   TYPE_INSTANCE_FLAGS (new_type) = TYPE_INSTANCE_FLAGS (type);
+   TYPE_LENGTH (new_type) = TYPE_LENGTH (type);
+   memcpy (TYPE_MAIN_TYPE (new_type), TYPE_MAIN_TYPE (type),
+ 	  sizeof (struct main_type));
+   if (type->main_type->dyn_prop_list != NULL)
+-    new_type->main_type->dyn_prop_list
+-      = copy_dynamic_prop_list (&TYPE_OBJFILE (type) -> objfile_obstack,
+-				type->main_type->dyn_prop_list);
++    {
++      struct obstack *storage = (TYPE_OBJFILE_OWNED (type)
++                                ? &TYPE_OBJFILE (type)->objfile_obstack
++                                : gdbarch_obstack (TYPE_OWNER (type).gdbarch));
++      new_type->main_type->dyn_prop_list
++       = copy_dynamic_prop_list (storage, type->main_type->dyn_prop_list);
++    }
+ 
+   return new_type;
+ }
-- 
2.37.1
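
The idea behind the fix, choosing the allocation storage from whichever owner the type has instead of asserting that the owner is an objfile, can be sketched with a toy model. None of the types below are gdb's real definitions: `obstack_model`, `objfile_model`, `gdbarch_model`, `type_model` and `storage_for()` are hypothetical stand-ins used only for illustration.

```c
#include <stddef.h>

/* Hypothetical, much-simplified stand-ins for gdb's internal types. */
struct obstack_model { const char *label; };
struct objfile_model { struct obstack_model objfile_obstack; };
struct gdbarch_model { struct obstack_model arch_obstack; };

struct type_model {
	int objfile_owned;		/* nonzero: owner.objfile is valid */
	union {
		struct objfile_model *objfile;
		struct gdbarch_model *gdbarch;
	} owner;
};

/* Pick the storage that matches the type's owner, so that
 * arch-owned types are handled instead of tripping an assertion. */
static struct obstack_model *storage_for(struct type_model *t)
{
	return t->objfile_owned ? &t->owner.objfile->objfile_obstack
				: &t->owner.gdbarch->arch_obstack;
}
```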
                                
                         
                        
                                
                                        
                                        [PATCH] Fix build failure due to no EM_RISCV with glibc-2.23 and earlier
                                
                                
                                
                                    
                                        by HAGIO KAZUHITO(萩尾 一仁)
                                    
                                
                                
With glibc-2.23 and earlier (e.g. RHEL7), the crash build fails with errors
like the following because EM_RISCV is undeclared:
  $ make -j 24 warn
  TARGET: X86_64
  CRASH: 8.0.2++
  GDB: 10.2
  ...
  symbols.c: In function 'is_kernel':
  symbols.c:3746:8: error: 'EM_RISCV' undeclared (first use in this function)
     case EM_RISCV:
          ^
  ...
Define EM_RISCV as 243 [1][2] if not defined.
[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=94e73c95d9b5
[2] http://www.sco.com/developers/gabi/latest/ch4.eheader.html
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
---
 defs.h | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/defs.h b/defs.h
index d3d837631632..08ac4dc96a92 100644
--- a/defs.h
+++ b/defs.h
@@ -3493,6 +3493,10 @@ struct arm64_stackframe {
 #define _MAX_PHYSMEM_BITS       48
 #endif  /* MIPS64 */
 
+#ifndef EM_RISCV
+#define EM_RISCV		243
+#endif
+
 #ifdef RISCV64
 #define _64BIT_
 #define MACHINE_TYPE		"RISCV64"
-- 
1.8.3.1
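
The guard pattern used by the patch can be tried in isolation. The sketch below assumes a Linux system that provides <elf.h>; `is_riscv_machine()` is a hypothetical helper, not a crash function. With newer glibc headers the `#ifndef` block is skipped, and with older ones the fallback value 243 quoted above is used.

```c
#include <elf.h>	/* may or may not define EM_RISCV, depending on glibc */

/* Fallback for older headers; 243 is the value given in the patch. */
#ifndef EM_RISCV
#define EM_RISCV 243
#endif

/* Hypothetical helper: report whether an ELF e_machine value is RISC-V. */
static int is_riscv_machine(unsigned int e_machine)
{
	switch (e_machine) {
	case EM_RISCV:
		return 1;
	default:
		return 0;
	}
}
```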
                                
                         
                        
                                
                                        
                                        [PATCH] Fix for 'kmem -i' to display correct SLAB statistics
                                
                                
                                
                                    
                                        by Lianbo Jiang
                                    
                                
                                
                                        Kernel commit d42f3245c7e2 ("mm: memcg: convert vmstat slab counters to
bytes"), which is contained in linux v5.9-rc1 and later kernels, renamed
NR_SLAB_{RECLAIMABLE,UNRECLAIMABLE} to NR_SLAB_{RECLAIMABLE,UNRECLAIMABLE}_B.
Without the patch, the "kmem -i" command displays incorrect SLAB
statistics:
  crash> kmem -i | grep -e PAGES -e SLAB
                   PAGES        TOTAL      PERCENTAGE
           SLAB    89458     349.4 MB    0% of TOTAL MEM
                   ^^^^^     ^^^^^
With the patch, the actual result is:
  crash> kmem -i | grep -e PAGES -e SLAB
                   PAGES        TOTAL      PERCENTAGE
           SLAB   261953    1023.3 MB    0% of TOTAL MEM
Reported-by: Buland Kumar Singh <bsingh(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 memory.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/memory.c b/memory.c
index 9d003713534b..a8f08f1a4d09 100644
--- a/memory.c
+++ b/memory.c
@@ -8382,9 +8382,11 @@ dump_kmeminfo(void)
 	if (vm_stat_init()) {
 		if (dump_vm_stat("NR_SLAB", &nr_slab, 0))
 			get_slabs = nr_slab;
-		else if (dump_vm_stat("NR_SLAB_RECLAIMABLE", &nr_slab, 0)) {
+		else if (dump_vm_stat("NR_SLAB_RECLAIMABLE", &nr_slab, 0) ||
+				dump_vm_stat("NR_SLAB_RECLAIMABLE_B", &nr_slab, 0)) {
 			get_slabs = nr_slab;
-			if (dump_vm_stat("NR_SLAB_UNRECLAIMABLE", &nr_slab, 0))
+			if (dump_vm_stat("NR_SLAB_UNRECLAIMABLE", &nr_slab, 0) ||
+					dump_vm_stat("NR_SLAB_UNRECLAIMABLE_B", &nr_slab, 0))
 				get_slabs += nr_slab;
 		}
 	}
-- 
2.37.1
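
The lookup order in the patch (old counter name first, then the renamed "_B" variant) can be sketched outside of crash. The table type and the helpers `lookup_vm_stat()`, `lookup_either()` and `total_slab()` are hypothetical; crash's real dump_vm_stat() reads the names from the kernel's debug data rather than from a static table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel's vmstat item table. */
struct vm_stat_item {
	const char *name;
	long value;
};

/* Look up one counter by name; returns 1 and stores the value on success. */
static int lookup_vm_stat(const struct vm_stat_item *tbl, size_t n,
			  const char *name, long *out)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tbl[i].name, name) == 0) {
			*out = tbl[i].value;
			return 1;
		}
	}
	return 0;
}

/* Try the old name first, then the renamed variant. */
static int lookup_either(const struct vm_stat_item *tbl, size_t n,
			 const char *old_name, const char *new_name, long *out)
{
	return lookup_vm_stat(tbl, n, old_name, out) ||
	       lookup_vm_stat(tbl, n, new_name, out);
}

/* Sum the reclaimable and unreclaimable counters under either naming. */
static long total_slab(const struct vm_stat_item *tbl, size_t n)
{
	long v = 0, total = 0;

	if (lookup_either(tbl, n, "NR_SLAB_RECLAIMABLE",
			  "NR_SLAB_RECLAIMABLE_B", &v)) {
		total = v;
		if (lookup_either(tbl, n, "NR_SLAB_UNRECLAIMABLE",
				  "NR_SLAB_UNRECLAIMABLE_B", &v))
			total += v;
	}
	return total;
}
```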
                                
                         
                        
                                
                                        
                                        Re: [Crash-utility] [PATCH] SLUB: Fix for offset change of struct slab members on Linux 6.2-rc1
                                
                                
                                
                                    
                                        by lijiang
                                    
                                
                                
                                        On Fri, Dec 16, 2022 at 8:00 PM <crash-utility-request(a)redhat.com> wrote:
> Date: Fri, 16 Dec 2022 05:03:46 +0000
> From: HAGIO KAZUHITO(萩尾 一仁)  <k-hagio-ab(a)nec.com>
> To: "crash-utility(a)redhat.com" <crash-utility(a)redhat.com>,
>         "lijiang(a)redhat.com" <lijiang(a)redhat.com>
> Subject: [Crash-utility] [PATCH] SLUB: Fix for offset change of struct
>         slab members on Linux 6.2-rc1
> Message-ID: <1671167016-29225-1-git-send-email-k-hagio-ab(a)nec.com>
> Content-Type: text/plain; charset="iso-2022-jp"
>
> From: Kazuhito Hagio <k-hagio-ab(a)nec.com>
>
> The following kernel commits split slab info from struct page into
> struct slab in Linux 5.17.
>
>   d122019bf061 ("mm: Split slab into its own type")
>   07f910f9b729 ("mm: Remove slab from struct page")
>
> Crash commit 5f390ed811b0 followed the change for SLUB, but crash still
> uses the offset of page.lru inappropriately.  It could happen to work
> well because it was the same value as the offset of slab.slab_list until
> Linux 6.1.
>
> However, kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab
> fields to allow larger rcu_head") in Linux 6.2-rc1 changed the offset of
> slab.slab_list.  As a result, without the patch, "kmem -s|-S" options
> print the following errors and fail to print values correctly for
> kernels configured with CONFIG_SLUB.
>
>   crash> kmem -S filp
>   CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
>   kmem: filp: partial list slab: ffffcc650405ab88 invalid page.inuse: -1
>   ffff8fa0401eca00      232       1267      1792     56     8k  filp
>   ...
>   KMEM_CACHE_NODE   NODE  SLABS  PARTIAL  PER-CPU
>   ffff8fa0401cb8c0     0     56       24        8
>   NODE 0 PARTIAL:
>     SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
>   kmem: filp: invalid partial list slab pointer: ffffcc650405ab88
>
>
Thank you for the fix, Kazu.
After applying the patch, I got another error with a kernel based on the
latest commit 9d2f6060fe4c3b49d0cdc1dce1c99296f33379c8:
 crash> kmem -S filp
CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
ffff9d80c030a100      232       1125      3936    123     8k  filp
CPU 0 KMEM_CACHE_CPU:
  ffff9d81e7a38470
CPU 0 SLAB:
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  fffff0f3841a7700  ffff9d80c69dc000     0     32         10    22
  FREE / [ALLOCATED]
kmem: filp: slab: fffff0f3841a7700 invalid freepointer: 4404c55079ecb7fe
CPU 1 KMEM_CACHE_CPU:
  ffff9d81e7a78470
CPU 1 SLAB:
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  fffff0f38446a600  ffff9d80d1a98000     0     32          0    32
  FREE / [ALLOCATED]
...
This issue cannot always be reproduced: I tested it more than ten times
and observed the above error only once or twice on my side.
In any case, I'm curious whether this is a separate issue. Could you please
double-check it?
Thanks.
Lianbo
> Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com>
> ---
>  defs.h    |  1 +
>  memory.c  | 16 ++++++++++------
>  symbols.c |  1 +
>  3 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/defs.h b/defs.h
> index 04476b3ff62e..57c1acc4e8df 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2182,6 +2182,7 @@ struct offset_table {                    /* stash of
> commonly-used offsets */
>         long blk_mq_tags_rqs;
>         long request_queue_hctx_table;
>         long percpu_counter_counters;
> +       long slab_slab_list;
>  };
>
>  struct size_table {         /* stash of commonly-used sizes */
> diff --git a/memory.c b/memory.c
> index 9d003713534b..d05737cc1429 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -781,6 +781,8 @@ vm_init(void)
>                 if (INVALID_MEMBER(page_slab))
>                         MEMBER_OFFSET_INIT(page_slab, "slab",
> "slab_cache");
>
> +               MEMBER_OFFSET_INIT(slab_slab_list, "slab", "slab_list");
> +
>                 MEMBER_OFFSET_INIT(page_slab_page, "page", "slab_page");
>                 if (INVALID_MEMBER(page_slab_page))
>                         ANON_MEMBER_OFFSET_INIT(page_slab_page, "page",
> "slab_page");
> @@ -19474,6 +19476,7 @@ do_node_lists_slub(struct meminfo *si, ulong
> node_ptr, int node)
>  {
>         ulong next, last, list_head, flags;
>         int first;
> +       long list_off = VALID_MEMBER(slab_slab_list) ?
> OFFSET(slab_slab_list) : OFFSET(page_lru);
>
>         if (!node_ptr)
>                 return;
> @@ -19487,7 +19490,7 @@ do_node_lists_slub(struct meminfo *si, ulong
> node_ptr, int node)
>                 next == list_head ? "  (empty)\n" : "");
>         first = 0;
>          while (next != list_head) {
> -               si->slab = last = next - OFFSET(page_lru);
> +               si->slab = last = next - list_off;
>                 if (first++ == 0)
>                         fprintf(fp, "  %s", slab_hdr);
>
> @@ -19510,7 +19513,7 @@ do_node_lists_slub(struct meminfo *si, ulong
> node_ptr, int node)
>
>                 if (!IS_KVADDR(next) ||
>                     ((next != list_head) &&
> -                    !is_page_ptr(next - OFFSET(page_lru), NULL))) {
> +                    !is_page_ptr(next - list_off, NULL))) {
>                         error(INFO,
>                             "%s: partial list slab: %lx invalid
> page.lru.next: %lx\n",
>                                 si->curname, last, next);
> @@ -19537,7 +19540,7 @@ do_node_lists_slub(struct meminfo *si, ulong
> node_ptr, int node)
>                 next == list_head ? "  (empty)\n" : "");
>         first = 0;
>          while (next != list_head) {
> -               si->slab = next - OFFSET(page_lru);
> +               si->slab = next - list_off;
>                 if (first++ == 0)
>                         fprintf(fp, "  %s", slab_hdr);
>
> @@ -19754,6 +19757,7 @@ count_partial(ulong node, struct meminfo *si,
> ulong *free)
>         short inuse, objects;
>         ulong total_inuse;
>         ulong count = 0;
> +       long list_off = VALID_MEMBER(slab_slab_list) ?
> OFFSET(slab_slab_list) : OFFSET(page_lru);
>
>         count = 0;
>         total_inuse = 0;
> @@ -19765,12 +19769,12 @@ count_partial(ulong node, struct meminfo *si,
> ulong *free)
>         hq_open();
>
>         while (next != list_head) {
> -               if (!readmem(next - OFFSET(page_lru) + OFFSET(page_inuse),
> +               if (!readmem(next - list_off + OFFSET(page_inuse),
>                     KVADDR, &inuse, sizeof(ushort), "page.inuse",
> RETURN_ON_ERROR)) {
>                         hq_close();
>                         return -1;
>                 }
> -               last = next - OFFSET(page_lru);
> +               last = next - list_off;
>
>                 if (inuse == -1) {
>                         error(INFO,
> @@ -19796,7 +19800,7 @@ count_partial(ulong node, struct meminfo *si,
> ulong *free)
>                 }
>                 if (!IS_KVADDR(next) ||
>                     ((next != list_head) &&
> -                    !is_page_ptr(next - OFFSET(page_lru), NULL))) {
> +                    !is_page_ptr(next - list_off, NULL))) {
>                         error(INFO, "%s: partial list slab: %lx invalid
> page.lru.next: %lx\n",
>                                 si->curname, last, next);
>                         break;
> diff --git a/symbols.c b/symbols.c
> index e279cfa68490..66158dcf1744 100644
> --- a/symbols.c
> +++ b/symbols.c
> @@ -9700,6 +9700,7 @@ dump_offset_table(char *spec, ulong makestruct)
>                  OFFSET(slab_inuse));
>          fprintf(fp, "                     slab_free: %ld\n",
>                  OFFSET(slab_free));
> +        fprintf(fp, "                slab_slab_list: %ld\n",
> OFFSET(slab_slab_list));
>
>          fprintf(fp, "               kmem_cache_size: %ld\n",
>                  OFFSET(kmem_cache_size));
> --
> 2.31.1
>
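
The core of the quoted patch is plain pointer arithmetic: a list walk yields the address of an embedded list node, and subtracting the node's offset gives the address of the containing structure. The sketch below uses hypothetical layouts (`old_slab`, `new_slab`) and helpers (`pick_list_offset()`, `node_to_container()`) to show why using the wrong offset produces a bogus slab address, and how a fallback offset can be selected when the preferred one is unavailable.

```c
#include <stddef.h>
#include <stdint.h>

struct list_node { struct list_node *next, *prev; };

/* Hypothetical layouts: the list node sits at different offsets. */
struct old_slab {
	unsigned long flags;
	struct list_node list;
};

struct new_slab {
	unsigned long flags;
	void *cache;
	struct list_node list;	/* moved further into the struct */
};

/* -1 models an offset that could not be resolved from debug data. */
#define OFF_INVALID (-1L)

/* Prefer the dedicated offset when it is valid, else fall back. */
static long pick_list_offset(long preferred, long fallback)
{
	return preferred != OFF_INVALID ? preferred : fallback;
}

/* Convert the address of an embedded list node to its container's address. */
static uintptr_t node_to_container(uintptr_t node_addr, long list_off)
{
	return node_addr - (uintptr_t)list_off;
}
```

With `new_slab`, subtracting the `old_slab` offset would yield an address one pointer past the start of the structure, which is the kind of mismatch that leads to errors such as "invalid partial list slab pointer".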
                                
                         
                        
                                
                                        
                                        [PATCH V4 0/9] Support RISCV64 arch and common commands
                                
                                
                                
                                    
                                        by Xianting Tian
                                    
                                
                                
This patch series is for the crash utility. It makes crash support the
RISCV64 arch and the common commands (*, bt, p, rd, mod, log, set, struct, task,
dis, help -r, help -m, and so on).
For crash to work properly on the RISCV64 arch, a Linux kernel patch is needed
that exports the kernel virtual memory layout, va_bits, and phys_ram_base
to vmcoreinfo, which simplifies the development of crash.
The Linux kernel patch set:
https://lore.kernel.org/lkml/20221019103623.7008-1-xianting.tian@linux.al...
 
This patch series was tested in a QEMU RISCV64 environment and on an SoC
platform with the T-head Xuantie 910 RISCV64 CPU.
====================================
  Some test examples are listed below
====================================
... ...
      KERNEL: vmlinux
    DUMPFILE: vmcore
        CPUS: 1
        DATE: Fri Jul 15 10:24:25 CST 2022
      UPTIME: 00:00:33
LOAD AVERAGE: 0.05, 0.01, 0.00
       TASKS: 41
    NODENAME: buildroot
     RELEASE: 5.18.9
     VERSION: #30 SMP Fri Jul 15 09:47:03 CST 2022
     MACHINE: riscv64  (unknown Mhz)
      MEMORY: 1 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 113
     COMMAND: "sh"
        TASK: ff60000002269600  [THREAD_INFO: ff60000002269600]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
crash>
crash> p mem_map
mem_map = $1 = (struct page *) 0xff6000003effbf00
crash> p /x *(struct page *) 0xff6000003effbf00
$5 = {
  flags = 0x1000,
  {
    {
      {
        lru = {
          next = 0xff6000003effbf08,
          prev = 0xff6000003effbf08
        },
        {
          __filler = 0xff6000003effbf08,
          mlock_count = 0x3effbf08
        }
      },
      mapping = 0x0,
      index = 0x0,
      private = 0x0
    },
  ... ...
crash> mod
     MODULE       NAME             BASE         SIZE  OBJECT FILE
ffffffff0113e740  nvme_core  ffffffff01133000  98304  (not loaded)  [CONFIG_KALLSYMS]
ffffffff011542c0  nvme       ffffffff0114c000  61440  (not loaded)  [CONFIG_KALLSYMS]
crash> rd ffffffff0113e740 8
ffffffff0113e740:  0000000000000000 ffffffff810874f8   .........t......
ffffffff0113e750:  ffffffff011542c8 726f635f656d766e   .B......nvme_cor
ffffffff0113e760:  0000000000000065 0000000000000000   e...............
ffffffff0113e770:  0000000000000000 0000000000000000   ................
crash> vtop ffffffff0113e740
VIRTUAL           PHYSICAL
ffffffff0113e740  8254d740
   PGD: ffffffff810e9ff8 => 2ffff001
  P4D: 0000000000000000 => 000000002fffec01
  PUD: 00005605c2957470 => 0000000020949801
  PMD: 00007fff7f1750c0 => 0000000020947401
   PTE: 0 => 209534e7
 PAGE: 000000008254d000
  PTE     PHYSICAL  FLAGS
209534e7  8254d000  (PRESENT|READ|WRITE|GLOBAL|ACCESSED|DIRTY)
      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
ff6000003f0777d8 8254d000                0        0  1 0
crash> bt
PID: 113      TASK: ff6000000226c200  CPU: 0    COMMAND: "sh"
 #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
 #1 [ff20000010333cf0] panic at ffffffff806578c6
 #2 [ff20000010333d50] sysrq_reset_seq_param_set at ffffffff8038c03c
 #3 [ff20000010333da0] __handle_sysrq at ffffffff8038c604
 #4 [ff20000010333e00] write_sysrq_trigger at ffffffff8038cae4
 #5 [ff20000010333e20] proc_reg_write at ffffffff801b7ee8
 #6 [ff20000010333e40] vfs_write at ffffffff80152bb2
 #7 [ff20000010333e80] ksys_write at ffffffff80152eda
 #8 [ff20000010333ed0] sys_write at ffffffff80152f52
-------
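
The vtop output above can be cross-checked by decoding the leaf PTE by hand. The sketch assumes the standard RISC-V Sv39/Sv48 PTE layout (flags in bits 7:0, PPN in bits 53:10) and a 4 KiB page; the macro and function names are illustrative and are not taken from riscv64.c.

```c
#include <stdint.h>

#define PTE_PPN_SHIFT		10			/* PPN starts at bit 10 */
#define PTE_PPN_MASK		((1ULL << 44) - 1)	/* PPN occupies bits 53:10 */
#define PAGE_SHIFT_4K		12
#define PAGE_OFFSET_MASK_4K	((1ULL << PAGE_SHIFT_4K) - 1)

/* Low flag bits of a RISC-V PTE. */
#define PTE_V	(1u << 0)	/* valid (shown as PRESENT by vtop) */
#define PTE_R	(1u << 1)	/* readable */
#define PTE_W	(1u << 2)	/* writable */
#define PTE_X	(1u << 3)	/* executable */
#define PTE_U	(1u << 4)	/* user accessible */
#define PTE_G	(1u << 5)	/* global */
#define PTE_A	(1u << 6)	/* accessed */
#define PTE_D	(1u << 7)	/* dirty */

/* Physical address of the page that a leaf PTE maps. */
static uint64_t pte_to_page_phys(uint64_t pte)
{
	return ((pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK) << PAGE_SHIFT_4K;
}

/* Full physical address: page base plus the offset within the page. */
static uint64_t pte_translate(uint64_t pte, uint64_t vaddr)
{
	return pte_to_page_phys(pte) | (vaddr & PAGE_OFFSET_MASK_4K);
}

/* The eight flag bits of the PTE. */
static unsigned int pte_flags(uint64_t pte)
{
	return (unsigned int)(pte & 0xff);
}
```

For the PTE 209534e7 and the virtual address ffffffff0113e740 shown above, this gives the page 8254d000, the physical address 8254d740, and the flags V, R, W, G, A and D, matching the vtop output.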
Changes V1 -> V2:
 1, Do the below fixes based on HAGIO KAZUHITO's comments: 
    Fix build warnings,
    Use MACRO for Linux version,
    Add description of x86_64 binary for riscv64 in README,
    Fix build error for the "sticky" target for build on x86_64,
    Fix the mixed indent.
 2, Add 'help -m/M' support patch to this patch set.
 3, Support native compiling approach, which means the host OS distro
    is also a riscv64 (lp64d) Linux, based on Yixun Lan's comments.
 4, Use __riscv and __riscv_xlen instead of __riscv64__ based on Yixun Lan's comments.
Changes V2 -> V3:
 1, Fix coding style, avoid including the header twice, move free() to the right place,
    introduce VM_FLAGS and so on based on Li Jiang's comments.
 2, Adjust the implementation of riscv64_verify_symbol() (following the logic of
    x86_64_verify_symbol()), as KSYMS_START isn't set when verifying symbols in some cases.
Changes V3 -> V4:
 1, Rebase this patch set onto the latest crash code.
 2, Remove the code that gets the value of ADDRESS_SPACE_END, which is not used in the current implementation.
Xianting Tian (9):
  Add RISCV64 framework code support
  RISCV64: Make crash tool enter command line and support some commands
  RISCV64: Add 'dis' command support
  RISCV64: Add irq command support
  RISCV64: Add 'bt' command support
  RISCV64: Add 'help -r' command support
  RISCV64: Add 'help -m/M' command support
  RISCV64: Add 'mach' command support
  RISCV64: Add the implementation of symbol verify
 Makefile            |    7 +-
 README              |    4 +-
 configure.c         |   43 +-
 defs.h              |  251 +++++++-
 diskdump.c          |   21 +-
 help.c              |    2 +-
 lkcd_vmdump_v1.h    |    8 +-
 lkcd_vmdump_v2_v3.h |    8 +-
 netdump.c           |   22 +-
 ramdump.c           |    2 +
 riscv64.c           | 1485 +++++++++++++++++++++++++++++++++++++++++++
 symbols.c           |   10 +
 12 files changed, 1841 insertions(+), 22 deletions(-)
 create mode 100644 riscv64.c
-- 
2.17.1
                                
                         
                        
                                
                                        
                                        [PATCH v2] Fix mount command to appropriately display the mount dumps
                                
                                
                                
                                    
                                        by Lianbo Jiang
                                    
                                
                                
                                        Recently the following failure has been observed on some vmcores when
using the mount command:
  crash> mount
       MOUNT           SUPERBLK     TYPE   DEVNAME   DIRNAME
  ffff97a4818a3480 ffff979500013800 rootfs none      /
  ffff97e4846ca700 ffff97e484653000 sysfs  sysfs     /sys
  ...
  ffff97b484753420                0 mount: invalid kernel virtual address: 0  type: "super_block buffer"
The kernel virtual address of the super_block is zero when the mount
command fails at the address 0xffff97b484753420, and the remaining
mount information is discarded, which is not expected.
Check the address and, if it is an invalid kernel virtual address, skip
the entry with a warning. This avoids truncating the remaining mount
dumps.
Reported-by: Dave Wysochanski <dwysocha(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 filesys.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/filesys.c b/filesys.c
index c2ea78de821d..d64b54a9b822 100644
--- a/filesys.c
+++ b/filesys.c
@@ -1491,6 +1491,10 @@ show_mounts(ulong one_vfsmount, int flags, struct task_context *namespace_contex
 		}
 
 		sbp = ULONG(vfsmount_buf + OFFSET(vfsmount_mnt_sb)); 
+		if (!IS_KVADDR(sbp)) {
+			error(WARNING, "cannot get super_block from vfsmnt: 0x%lx\n", *vfsmnt);
+			continue;
+		}
 
 		if (flags)
 			fprintf(fp, "%s", mount_hdr);
-- 
2.37.1
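
The skip-and-continue behaviour described above can be sketched as a standalone loop. `mount_rec`, `looks_like_kvaddr()` and `show_mount_recs()` are hypothetical names, and the address check is a simplified placeholder: the real IS_KVADDR() in crash is machine-specific.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical mount record: the two addresses printed by "mount". */
struct mount_rec {
	unsigned long long mount_addr;
	unsigned long long sb_addr;
};

/* Simplified validity check, assumed here for illustration only. */
static int looks_like_kvaddr(unsigned long long addr)
{
	return addr >= 0xffff800000000000ULL;
}

/* Print every record with a usable super_block address; warn about
 * and skip the rest. Returns the number of records printed. */
static size_t show_mount_recs(const struct mount_rec *recs, size_t n, FILE *out)
{
	size_t shown = 0;

	for (size_t i = 0; i < n; i++) {
		if (!looks_like_kvaddr(recs[i].sb_addr)) {
			fprintf(stderr,
				"WARNING: cannot get super_block from mount: 0x%llx\n",
				recs[i].mount_addr);
			continue;	/* keep going instead of aborting the listing */
		}
		fprintf(out, "%llx %llx\n", recs[i].mount_addr, recs[i].sb_addr);
		shown++;
	}
	return shown;
}
```

Because the invalid record is skipped rather than treated as fatal, the records after it are still printed.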
                                
                         
                        
                                
                                        
                                        [PATCH] Fix mount command to appropriately display the mount dumps
                                
                                
                                
                                    
                                        by Lianbo Jiang
                                    
                                
                                
                                        Recently the following failure has been observed on some vmcores when
using the mount command:
  crash> mount
       MOUNT           SUPERBLK     TYPE   DEVNAME   DIRNAME
  ffff97a4818a3480 ffff979500013800 rootfs none      /
  ffff97e4846ca700 ffff97e484653000 sysfs  sysfs     /sys
  ...
  ffff97b484753420                0 mount: invalid kernel virtual address: 0  type: "super_block buffer"
The kernel virtual address of the super_block is zero when the mount
command fails at the address 0xffff97b484753420, and the remaining
mount information is discarded, which is not expected.
Check the address and, if it is an invalid kernel virtual address, skip
the entry. This avoids truncating the remaining mount dumps.
Reported-by: Dave Wysochanski <dwysocha(a)redhat.com>
Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
---
 filesys.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/filesys.c b/filesys.c
index c2ea78de821d..8c2d4e316208 100644
--- a/filesys.c
+++ b/filesys.c
@@ -1491,6 +1491,8 @@ show_mounts(ulong one_vfsmount, int flags, struct task_context *namespace_contex
 		}
 
 		sbp = ULONG(vfsmount_buf + OFFSET(vfsmount_mnt_sb)); 
+		if (!IS_KVADDR(sbp))
+			continue;
 
 		if (flags)
 			fprintf(fp, "%s", mount_hdr);
-- 
2.37.1
                                
                         
                        
                                