Thank you for the comment, Kazu.

On Fri, May 27, 2022 at 5:11 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com> wrote:
On 2022/05/27 15:32, lijiang wrote:
>      >> +    /* static_rqs */
>      >> +    ret = 0;
>      >> +    addr = tag + OFFSET(blk_mq_tags_static_rqs);
>      >> +    if (!readmem(addr, KVADDR, &rqs_addr, sizeof(void *), "blk_mq_tags.static_rqs", RETURN_ON_ERROR))
>      >> +            return ret;
>      >> +    addr = rqs_addr + bitnr * sizeof(ulong); /* static_rqs[bitnr] */
>      >> +    if (!readmem(addr, KVADDR, &rq, sizeof(ulong), "blk_mq_tags.static_rqs[]", RETURN_ON_ERROR))
>      >> +            return ret;
>      >> +    ret = tag_iter->fn(rq, tag_iter->data);
>      >> +
>      >> +    return ret;
>      >
>      > I'm not familiar with blk-mq, I'd like some information.
>      >
>      > I think that the "dev -d" displays the same inflight stats in /proc/diskstats
>      > for devices without mq:
>      >
>      >     diskstats_show
>      >       part_in_flight
>      >         count up per_cpu disk_stats->in_flight
>      >
>      > so, if we do the same thing for mq devices, we should imitate this path:
>      >
>      >     diskstats_show
>      >       blk_mq_in_flight
>      >         blk_mq_queue_tag_busy_iter
>      >           queue_for_each_hw_ctx
>      >             bt_for_each
>      >               sbitmap_for_each_set
>      >                 bt_iter
>      >                   blk_mq_check_inflight
>      >
>      > On the path, I cannot see the static_rqs being counted.
>      > Why does it need to be counted?
>      >
>
> You are right. Basically, the implementation imitates the above path in the kernel. But not exactly, there are
> two changes in this patch:
> [1] add additional statistics for the static_rqs[] and sched_tags
> [2] the calculate_disk_io() imitates the blk_mq_check_inflight(), but doesn't check if the status of rq is in flight.
>
> For the [1], because the rqs[] only contains the requests from the driver tag, once the io scheduler is used, crash
> may miss the check of sched_tags and can not watch its value . You mean that crash shouldn't cover this case,
> or do I misunderstand the sched_tags and static_rqs[]?

hmm, if so, I would like to know the reason why kernel doesn't count them.

> For the [2], the intent is to calculate all requests in the current queue from the bitmap(not only inflight, but also handling
> rq), that can help to get actual read and write request counts. But, to be honest, I'm not sure whether this is a reasonable
> method. Are you concerned that it might have repeat read/write request counts? Or should the crash follow the above
> kernel path exactly? Any idea about this?

Yes, it's one of my concerns.  It does not count a request twice?

I didn't watch them, seems that the blk-mq bypass the io scheduler?
 
IMHO, basically crash should follow the kernel path exactly, including the
check for MQ_RQ_IN_FLIGHT.  The "dev -d" option displays the same stats as
/proc/diskstats for non-mq devices, so displaying different stats for mq
devices will be confusing.  And it will make tracking kernel changes easier.

 
The various scenarios of the block layer are complicated, it might be safe to keep exactly the same as the kernel path.
Let me adjust it a bit. Thank you for the suggestions.

Thanks.
Lianbo
 
Thanks,
Kazu