Taskbar Latency and Kernel Calls

I work quickly on my computer and I get frustrated when I am forced to wait on an operation that should be fast. A persistent nuisance on my over-powered home laptop is that closing windows on the taskbar is slow. I right-click on an entry, wait for the menu to appear, and then select “Close window”. The mouse movement should be the slow part of this but instead I find that the delay before the menu appears is the longest component.

This has been bothering me for a long time but I had been showing uncharacteristic self control and had resisted being distracted. Until today, when I finally broke down and grabbed anETW trace.

Disclaimer: I didn’t find a solution, but I did find some crazy code

The ETW trace records me right-clicking on the task bar to close two Explorer windows. I usedUIforETW’s Tracing to file with the default options, giving me a 20.9 MB trace .

Sometimes the challenge in trace analysis is to find where the issue is, but for this issue that part of the analysis was trivial. There were three clear signals that all pointed to the right place, and a painfully obvious culprit.

The first signal is the input events. UIforETW contains an integrated input logger (anonymized enough so that I don’t accidentally steal passwords or personal information) so I could just drill down to the MouseUp events with a Button Type of 2, which represents the right mouse button. This draws highlights on theWPA timeline for when those events happened – see the vertical lines below:

That tells me when I released the right mouse button, and there is very suggestive window focus change that happens about 600 ms later which I’m guessing is when the menu appeared. Finally there is a clear block of CPU activity in RuntimeBroker.exe between the mouse up and window focus events.

The window focus change and CPU activity are not proven to show the end time, but measurements with a screen recorder app showed the menu taking about 660 ms to appear so I’m going to trust them.

The next step was to see what RuntimeBroker.exe was doing. WhileCPU Usage (Precise) is great for seeing how much CPU time a process is using, and why it is sitting idle, the CPU Usage (Sampled) table is the right tool for figuring out where CPU time is being spent. I drilled down and quickly found 264 samples hitting in KernelBase.dll!ReadFile :

Looking around a bit I found other call stacks that also ended up in that function so I right-clicked on it and went View Callers-> By Function . The view this gave me (with stacks now inverted) shows that a total of 628 samples, along various call stacks, ended up going through that function, accounting for 70% of the 899 total samples in that process:

271 samples in that thread didn’t go through that function, and the remaining samples (not shown) were in other threads.

Note that the 899 samples in thread 10,252 represent two mouse clicks, so about 450 samples or about 450 ms (using the default 1 kHz sampling rate) per mouse-click.

Sometimes file I/O is CPU time

CPU Usage (Sampled) shows CPU time, so disk I/O usually doesn’t show up there because then the thread goes to sleep and waits on the disk. The fact that this shows up as CPU time suggests that the reads were all hitting in the system cache and the CPU time was the kernel overhead (note ntoskrnl.exe on the first sampled call stack) of grabbing data from the cache.

Now that we suspect file I/O it’s necessary to go to Graph Explorer-> Storage-> File I/O . With a bit of column rearranging we get this impressive result:

What this says is that, over the course of two right-mouse clicks, RuntimeBroker.exe , thread 10,252, issued 229,604 ReadFile calls, reading a total of 15,686,586 bytes. That is an average read of 68 bytes each time.

Think about that for a moment.

Remember that these are calls to the operating system – kernel calls. This means that there is no caching between RuntimeBroker.exe and this file. In fact, the file itself is only 4,027,904 bytes long, so each time I right-click an explorer icon on the task bar this file is read 1.9 times in the slowest possible way.

I don’t know what f01b4d95cf55d32a.automaticDestinations-ms is, and I don’t know why RuntimeBroker.exe reads from it so inefficiently. I also don’t know why right-clicking on other programs’ icons on the task bar is also a bit slow – it’s apparently a different issue, or an odd design decision.

Given that the menu always takes a long time to appear it is even possible that the 600 ms of inefficient file I/O from RuntimeBroker.exe is unrelated and is happening in parallel, but I doubt it. And it is sufficiently egregious that I decided to write it up as is. Tracking down what processes ShellExperienceHost.exe is waiting on quickly goes down a rabbit hole of indirection that made me decide to leave that to the code owners. Analysis of clicking on non-explorer programs shows that ShellExperienceHost.exe becomes active much faster when RuntimeBroker.exe is out of the picture, but the visible menu delays are still not great.

Summary

  • Don’t call ReadFile to get 68 bytes. Or, at least, don’t do it a hundred thousand times. I get cranky when databases do 4-KiB reads but this is amazing.
  • I hope somebody fixes this
  • I hope somebody investigates to find out why context menus are slow in general. Hey, I shared an ETW trace and everything!
我来评几句
登录后评论

已发表评论数()

相关站点

+订阅
热门文章