Hi Lianbo,
On 2022/06/08 22:27, lijiang wrote:
Hi, Kazu
Thank you for the fix.
On Tue, Jun 7, 2022 at 4:42 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab(a)nec.com
<mailto:k-hagio-ab@nec.com>> wrote:
There have been some reports that the "dev -d|-D" options displayed
incorrect I/O stats due to racy blk_mq_ctx.rq_completed value.
To fix it, make the options use sbitmap to count I/O stats on RHEL8
and adjust to its blk_mq_tags structure.
Signed-off-by: Kazuhito Hagio <k-hagio-ab(a)nec.com
<mailto:k-hagio-ab@nec.com>>
---
The patch tested ok with upstream 4.0 to 5.18 kernels and several
RHEL7 and RHEL8 ones.
dev.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/dev.c b/dev.c
index 0172c83ffaea..733e3a8a40cd 100644
--- a/dev.c
+++ b/dev.c
@@ -4339,6 +4339,10 @@ static void bt_for_each(ulong q, ulong tags, ulong sbq, uint
reserved, uint nr_r
static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint cnt, struct diskio
*dio)
{
uint i;
+ int bitmap_tags_is_ptr = 0;
+
+ if (MEMBER_TYPE("blk_mq_tags", "bitmap_tags") ==
TYPE_CODE_PTR)
+ bitmap_tags_is_ptr = 1;
The above change is related to the kernel(v5.10-rc1) commit:
222a5ae03cdd ("blk-mq: Use pointers for blk_mq_tags bitmap tags")
Could you please add it to the patch log?
Sure, I'll add them. Thank you for the information.
I did not search for the upstream patches.
And later, kernel(v5.16-rc1) stops using pointers again:
ae0f1a732f4a ("blk-mq: Stop using pointers for blk_mq_tags bitmap tags")
for (i = 0; i < cnt; i++) {
ulong addr = 0, tags = 0;
@@ -4357,9 +4361,17 @@ static void queue_for_each_hw_ctx(ulong q, ulong *hctx, uint
cnt, struct diskio
if (nr_reserved_tags) {
addr = tags + OFFSET(blk_mq_tags_breserved_tags);
+ if (bitmap_tags_is_ptr &&
+ !readmem(addr, KVADDR, &addr, sizeof(ulong),
+ "blk_mq_tags.bitmap_tags",
RETURN_ON_ERROR))
+ break;
bt_for_each(q, tags, addr, 1, nr_reserved_tags, dio);
}
addr = tags + OFFSET(blk_mq_tags_bitmap_tags);
+ if (bitmap_tags_is_ptr &&
+ !readmem(addr, KVADDR, &addr, sizeof(ulong),
+ "blk_mq_tags.bitmap_tags",
RETURN_ON_ERROR))
+ break;
bt_for_each(q, tags, addr, 0, nr_reserved_tags, dio);
}
}
@@ -4423,8 +4435,13 @@ get_mq_diskio(unsigned long q, unsigned long *mq_count)
unsigned long mctx_addr;
struct diskio tmp = {0};
- if (INVALID_MEMBER(blk_mq_ctx_rq_dispatched) ||
- INVALID_MEMBER(blk_mq_ctx_rq_completed)) {
+ /*
+ * Currently it does not support earlier sbitmap and blk-mq
+ * implementation e.g. RHEL7, so filter them out.
+ */
+ if (VALID_STRUCT(sbitmap) &&
+ VALID_MEMBER(sbitmap_word_cleared) &&
+ VALID_MEMBER(request_state)) {
The struct sbitmap was introduced in the kernel v4.9-rc1:
88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
The member of cleared in the struct sbitmap_word was added in the kernel v5.0-rc1:
ea86ea2cdced ("sbitmap: ammortize cost of clearing bits")
The member of state in struct request was added in the kernel v4.18-rc1:
12f5b9314545 ("blk-mq: Remove generation seqeunce")
It indicates that the current "dev -d" command won't work on the vmcores
generated with the kernel v5.0-rc1 and
earlier(still use the rq_dispatched and rq_completed to count the IOs on the old
vmcores). These three symbols
exist in the different kernel versions and put them together, it seems to be confused.
And it would increase the
maintenance difficulty in the future.>
To be on the safe side, I would suggest doing that in the distribution, because the
developers often backport some
upstream patches into the distribution, it could be more reasonable. Any ideas?
sorry, but I cannot see what is confusing and causes maintenance difficulty.
Do you mean that we should use only one macro checking if the latest
supported version v5.0, i.e. VALID_MEMBER(sbitmap_word_cleared) here?
But as you say, distributions can backport only a part of upstream patches,
so having several checks is safer, I think.
Yes, VALID_STRUCT(sbitmap) is unnecessary and can be removed. I added it
for readability, it says that we cannot use the function without sbitmap.
Is it better to remove this?
btw, I'm preparing a patch to support sbitmap without ea86ea2cdced,
to extend the supported version of 'sbitmapq' command. (attached)
We can fix it first if you think it's better. And then, probably
we will keep only VALID_MEMBER(request_state).
Thanks,
Kazu