This patch series optimizes symbol handling in the crash utility for the
RISCV64 architecture. It is particularly beneficial when debugging
kernels with large symbol tables (900K+ symbols).
Background:
On a 32-core RISCV64 physical machine running Linux 6.12.13, the kernel
symbol table contains over 900,000 entries (wc -l /proc/kallsyms). When
the crash utility analyzes a vmcore or a live system, significant time
is spent in symbol-related operations, specifically:
1. Symbol name hash lookups - the original simple hash function causes
high collision rates with large symbol tables
2. Processing of linker-generated mapping symbols (.L*, L0*, $*) which
are unnecessary for debugging but consume processing time
Patch 1: Optimize symname_hash with larger table and FNV-1a hash
- Increases SYMNAME_HASH from 512 to 16384 (32x) to reduce collisions
- Replaces simple hash with FNV-1a algorithm for better distribution
- Removes the strlen() call by computing the hash in a single pass
This reduces average hash bucket chain length significantly, improving
symbol lookup performance at the cost of ~248KB additional memory.
Patch 2: Add mapping symbol filter in riscv64_verify_symbol
- Filters out linker mapping symbols (.L*, L0*, $*) that should not be
in the symbol list
- Also optimizes riscv64_verify_symbol() by consolidating name validity
checks
Together these patches improve crash utility startup and symbol lookup
performance on RISCV64 systems with large kernel symbol tables.
Rui Qi (2):
symbols: optimize symname_hash with larger table and FNV-1a hash
RISCV64: add mapping symbol filter in riscv64_verify_symbol
defs.h | 2 +-
riscv64.c | 14 +++++++++++---
symbols.c | 20 ++++++++++++++------
3 files changed, 27 insertions(+), 9 deletions(-)
--
2.43.0
Optimize the symbol name hash table to reduce collisions and improve
performance, especially on the RISC-V architecture, where symbol lookup
is a hotspot.
Changes:
- Increase SYMNAME_HASH from 512 to 16384 (32x) to reduce collisions
- Replace simple hash with FNV-1a algorithm for better distribution
- Remove strlen() call, compute hash in single pass
This reduces the average chain length in hash buckets significantly,
improving symbol lookup performance at the cost of ~248KB additional
memory.
Signed-off-by: Rui Qi <qirui.001(a)bytedance.com>
---
defs.h | 2 +-
symbols.c | 15 +++++++++------
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/defs.h b/defs.h
index a6f43725b6b8..4cf062894ebe 100644
--- a/defs.h
+++ b/defs.h
@@ -2900,7 +2900,7 @@ struct downsized {
#define SYMVAL_HASH_INDEX(vaddr) \
(((vaddr) >> machdep->pageshift) % SYMVAL_HASH)
-#define SYMNAME_HASH (512)
+#define SYMNAME_HASH (16384)
#define PATCH_KERNEL_SYMBOLS_START ((char *)(1))
#define PATCH_KERNEL_SYMBOLS_STOP ((char *)(2))
diff --git a/symbols.c b/symbols.c
index e6865cabef74..afdf4a61cea2 100644
--- a/symbols.c
+++ b/symbols.c
@@ -1170,16 +1170,19 @@ symname_hash_init(void)
static unsigned int
symname_hash_index(char *name)
{
- unsigned int len, value;
- unsigned char *array = (unsigned char *)name;
+ unsigned int hash = 2166136261U;
+ unsigned char *p = (unsigned char *)name;
- len = strlen(name);
- if (!len)
+ if (!*p)
error(FATAL, "The length of the symbol name is zero!\n");
- value = array[len - 1] * array[len / 2];
+ /* FNV-1a hash algorithm for better distribution */
+ while (*p) {
+ hash ^= *p++;
+ hash *= 16777619;
+ }
- return (array[0] ^ value) % SYMNAME_HASH;
+ return hash % SYMNAME_HASH;
}
/*
--
2.20.1
Add mapping symbol filter in riscv64_verify_symbol() to filter out
linker mapping symbols like '.L*', 'L0*' and '$*'. These symbols
should not end up in the symbol list.
Also optimize riscv64_verify_symbol() by consolidating name validity
checks at the function entry.
Changes:
- riscv64.c: Add mapping symbol filter and optimize name checks
- symbols.c: Add machine_type("RISCV64") to enable verify_symbol()
call in store_module_kallsyms_v2()
Signed-off-by: Rui Qi <qirui.001(a)bytedance.com>
---
riscv64.c | 14 +++++++++++---
symbols.c | 5 ++++-
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/riscv64.c b/riscv64.c
index d1c7d4a36109..c8b51ea2bbac 100644
--- a/riscv64.c
+++ b/riscv64.c
@@ -209,15 +209,23 @@ riscv64_cmd_mach(void)
static int
riscv64_verify_symbol(const char *name, ulong value, char type)
{
- if (CRASHDEBUG(8) && name && strlen(name))
+ if (!name || !strlen(name))
+ return FALSE;
+
+ if (CRASHDEBUG(8))
fprintf(fp, "%08lx %s\n", value, name);
+ /* Filter out mapping symbols */
+ if ((name[0] == '.' && name[1] == 'L') ||
+ (name[0] == 'L' && name[1] == '0') ||
+ (name[0] == '$'))
+ return FALSE;
+
if (!(machdep->flags & KSYMS_START)) {
if (STREQ(name, "_text") || STREQ(name, "_stext"))
machdep->flags |= KSYMS_START;
- return (name && strlen(name) && !STRNEQ(name, "__func__.") &&
- !STRNEQ(name, "__crc_"));
+ return (!STRNEQ(name, "__func__.") && !STRNEQ(name, "__crc_"));
}
return TRUE;
diff --git a/symbols.c b/symbols.c
index afdf4a61cea2..8eb8b37abc23 100644
--- a/symbols.c
+++ b/symbols.c
@@ -2990,9 +2990,12 @@ store_module_kallsyms_v2(struct load_module *lm, int start, int curr,
* or '$x' for ARM64, and '$d'.
* On LoongArch we have linker mapping symbols like '.L'
* or 'L0'.
+ * On RISCV64 we have linker mapping symbols like '.L',
+ * 'L0' or '$'.
* Make sure that these don't end up into our symbol list.
*/
- if ((machine_type("ARM") || machine_type("ARM64") || machine_type("LOONGARCH64")) &&
+ if ((machine_type("ARM") || machine_type("ARM64") || machine_type("LOONGARCH64") ||
+ machine_type("RISCV64")) &&
!machdep->verify_symbol(nameptr, ec->st_value, ec->st_info))
continue;
--
2.20.1