UFS Disk performance measurement

Why?

Please explain what your goal is. Don't you trust Seagate/WD/Hitachi/... to actually deliver what it says in their specification? Are you a file system implementor who wants to optimize their code? If you use your computer for some real task, and it runs too slowly, then the correct benchmark is your application.

And SirDice is correct, the most generic and accepted disk benchmark is indeed Bonnie.
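For reference, a minimal bonnie++ run (the maintained successor in ports, benchmarks/bonnie++) looks roughly like this; the test directory, sizes and user below are placeholders, not anything taken from this thread:

    # sketch only: -s should be at least twice your RAM (in MiB) so the
    # file cache can't hide the disk; -n 0 skips the small-file tests
    pkg install bonnie++
    bonnie++ -d /usr/obj/bench -s 32768 -r 16384 -n 0 -u nobody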
 
Why?

Please explain what your goal is.

Mainly curiosity. I have a number of systems and have no real idea how each of them performs. It came as a bit of a shock when it took seven hours to make buildworld on one of them, so I'm interested in how each of the factors (disk type, CPU, memory) affects performance.
 
Now it gets really hard. If your question had been "how fast is my disk", my answer (see above) would have been: "Download the spec sheet for it from the manufacturer's web site", because that's generally the best answer. If your question had been "how fast is my storage subsystem" (from user space, through the file system, perhaps through block layers like RAID, through storage networking like SAS or FC, to the disks), then SirDice's answer "run a synthetic benchmark like bonnie" would have been correct. Note that both answers are fascinating, but irrelevant for the real world, because (a) disks don't exist in isolation, and (b) real workloads are not synthetic benchmarks.
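(For the narrow "how fast is my disk" question, the base system already has a quick sanity check; ada0 below is just an example device, substitute your own:)

    # raw-device identification and a simple transfer/seek benchmark
    diskinfo -v /dev/ada0
    diskinfo -t /dev/ada0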

So now you have a real-world performance problem: you run a certain workload (make buildworld) and you want to find out why it is as slow as it is. This is performance analysis and tuning, and there are dozens of textbooks on the topic. The general workflow is roughly the following: instrument your system with measurement points. For example, measure whether your CPU is idle or busy, and whether your disks are idle or busy. In general, the thing you measure is the "utilization" of a "resource", like disk or CPU. For CPUs that's pretty easy: programs like vmstat or top tell you whether the CPU is busy or idle, and in general a busy CPU runs slower than an idle one. So you find out what your workload spends most of its time on, and then start optimizing that. This is a corollary of Amdahl's law. For example, if you find that your program keeps all 8 cores continuously 100% busy, then you stop worrying about disk and either use a more efficient compiler, buy a faster CPU, run a smaller workload, or learn how to be more patient.

So the trick is: find the BOTTLENECK. That's an important term, because it determines why your code is slow and what you need to improve first. Then fix the narrowest bottleneck you found, and measure again.
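As a concrete starting point, something along these lines (all base-system tools; the -j value is only an illustration) usually tells you whether a build is CPU-bound or I/O-bound:

    # in one terminal, watch utilization while the build runs
    top -SP        # per-CPU load, system processes included
    vmstat 5       # 'id' = CPU idle %, 'b' = processes blocked on I/O

    # in another terminal, time the actual workload
    time make -j8 buildworld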

Where it gets hard: certain things are hard to measure because of feedback effects. For example, if you don't have enough memory, your workload may still run, but less efficiently: you might run with less buffer cache in the file system, so disks have to be read or written more often. Or your code knows how much memory is available and adapts its behavior accordingly. The other thing is that you can't directly measure the time spent accessing memory, since to vmstat or top this looks like either CPU usage (if you are using memory inefficiently, for example stomping on the cache, or your NUMA setup is wrong) or like disk usage (if you are swapping or not using the buffer cache). Where it gets really nasty is optimizing disk, because disks are one of the few things on the planet that work more efficiently (in terms of throughput) the more overloaded they are, as long as you have parallelism and can handle the latencies. If your disk can, under ideal conditions, handle 100 IOps, then your overall workload might run much faster if you tune it (for example with parallel make) to run the disks continuously at 98 IOps, rather than being latency-sensitive and running your workload at only 70 IOps, where the disks are coasting (and wasting $$$).
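To see the disk side of that picture, gstat and iostat (again base-system tools) show how busy each provider is and at what latency; if %busy sits near 100 while the CPUs are idle, the disks are your bottleneck:

    gstat -p        # %busy and ms per transaction, physical providers only
    iostat -x 5     # extended stats: queue length, transfers/s, service times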

The important part is: This needs to be continuously data driven. Always take measurements, record them, record the conditions under which the measurements were taken, and only change one thing at a time.
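A low-tech way to keep it data driven is to log the counters alongside every run; the file names, interval and -j value below are arbitrary choices, not recommendations:

    # run interactively: record conditions and measurements for one build
    uname -a                   > run-01.info
    sysctl hw.ncpu hw.physmem >> run-01.info
    vmstat 10    > run-01.vmstat.log &
    iostat -x 10 > run-01.iostat.log &
    /usr/bin/time -l make -j4 buildworld > run-01.build.log 2>&1
    kill %1 %2                 # stop the background monitors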

And observe that synthetic benchmarks and manufacturers' spec sheets hardly enter into this process. In practice they are nearly useless, and frequently even detrimental.
 
it took seven hours to make buildworld on one of them, so I'm interested in how each of the factors,
Cores and disk speed are the way to get faster compiling.
Memory, not so much.
Throw 48 cores and an NVMe drive at it and you're done in 20-30 minutes.

You might want to familiarize yourself with some build options like make -DNOCLEAN.
If you're doing u-boot changes there is no need to rebuild everything.
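Putting those two suggestions together, an incremental rebuild on a multi-core box might look like this (NO_CLEAN is the spelling in newer source trees; older trees used NOCLEAN as in the post above):

    # use every core and skip the clean step
    make -j"$(sysctl -n hw.ncpu)" -DNO_CLEAN buildworld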
 