On 2023/04/12 18:04, lijiang wrote:
>>> On 2023/03/25 13:12, Tao Liu wrote:
>>>> Previously crash use lseek/read for kcore and vmcore reading, this
> involves 2
>>>> syscalls. And we can replace them with pread, only 1 syscall is needed
> for
>>>> kcore/vmcore reading, and we can have a better performance. Please
> note there
>>>> are plenty of places in crash using lseek/read, this patch doesn't
> modify all
>>>> of them, just the most commonly used kcore and diskdump vmcore reading.
>>>
>>> This is nice as an optimization, but can we distinguish a seek error and
>>> read error with errno as currently we can do?
>>>
>>
>> Sorry I didn't understand your question...
>>
>> The functions which I modified in this patch didn't use a global errno
>> to distinguish error types, they used a returned value instead.
>> Currently if read error, READ_ERROR will be returned, same as seek
>> error returns SEEK_ERROR. Do you want to distinguish pread error by
>> introducing a new error number as PREAD_ERROR?
>
> No, I would like to know which causes an error in pread(), seek or read.
> In other words, is there any way to know which is wrong, the specified
> offset or file/media?
>
>
Good questions.
In fact, the SEEK_ERROR is simply returned to the top-level call, but it
seems that there is no corresponding processing for the type of errors at
the top-level code. Only saw two cases(please correct me if I missed
anything):
[1] readmem()
...
case SEEK_ERROR:
if (PRINT_ERROR_MESSAGE)
error(INFO, SEEK_ERRMSG,
memtype_string(memtype, 0), addr, type);
goto readmem_error;
case READ_ERROR:
if (PRINT_ERROR_MESSAGE)
error(INFO, READ_ERRMSG,
memtype_string(memtype, 0), addr, type);
if ((pc->flags & DEVMEM) && (kt->flags &
PRE_KERNEL_INIT) &&
!(error_handle & NO_DEVMEM_SWITCH) &&
devmem_is_restricted() &&
switch_to_proc_kcore()) {
error_handle &= ~QUIET;
return(readmem(addr, memtype, bufptr, size,
type, error_handle));
}
goto readmem_error;..
[2] read_diskdump()
...
if ((ret = cache_page(curpaddr)) < 0) {
if (CRASHDEBUG(8))
fprintf(fp, "read_diskdump: "
"%s: cannot cache page: %llx\n",
ret == SEEK_ERROR ?
"SEEK_ERROR" :
"READ_ERROR",
(ulonglong)curpaddr);
return ret;
}
...
Given that, only recognizing the type of errors, it can not do anything.
Would it really make sense to distinguish the type of errors(seek/read
errors)?
I can see some seek errors on GitHub issues [1] and I don't think that
it's useless at all for the first step of debugging.
[1]
https://github.com/crash-utility/crash/issues?q=is%3Aissue+seek+error
But according to the man page and some experiments, lseek() allows an
offset beyond the end of the file for normal dumpfiles (a vmcore,
/proc/kcore and etc.) and just does not allow a negative offset.
So, my concern will be mostly solved by adding a negative offset check,
for example,
if (offset < 0) {
if (CRASHDEBUG(8))
fprintf(fp, " ... offset: %llx\n", offset);
return SEEK_ERROR;
}
How about adding something like this, instead of just removing lseek?
BTW: Could it be helpful to print an actual error when the pread() fails?
For example:
+ ssize_t prdcnt = pread(fd, bufptr, readcnt, offset);
+ if (prdcnt != readcnt) {
+ if (prdcnt == -1)
+ error(WARNING, "/proc/kcore: %s\n",
strerror(errno));
return READ_ERROR;
+ }
Yes, it's also sounds good to print errno for better debugging.
I would prefer it with CRASHDEBUG(8) same as the other error messages.
Thanks,
Kazu