Discussion:
Testing the performance impact of kernel modifications
Carter Cheng
2018-10-15 15:42:03 UTC
Hi,

I was wondering what some good ways are to assess the performance impact of
kernel modifications. Are there papers in the literature where this is
done? Does one need to differentiate between CPU-bound and different types
of I/O-bound processes, etc.?

Regards,

Carter.
v***@vt.edu
2018-10-15 17:07:54 UTC
Post by Carter Cheng
I was wondering what some good ways are to assess the performance impact of
kernel modifications. Are there papers in the literature where this is
done? Does one need to differentiate between CPU-bound and different types
of I/O-bound processes, etc.?
That is *so* totally dependent on exactly what the modification is, that
there's no right answer here.

The things you will want to measure for a new TCP congestion control module
(to measure the difference between, say, cubic and new_reno and your new
module, or a queueing discipline like fq_codel) will be *totally* different
from changes to an LSM, which again will be different from an overhaul of a
disk I/O scheduler.

And then, the environment matters as well. The performance metrics that I care
about on my laptop (which is used as a desktop replacement) are "can I do a
kernel build and my desktop environment still work well" type things. But the
numbers I care about on the machines I maintain across the hall in the data
center are different - those are disk storage, backup, and archive - so I'm
willing to burn a lot of CPU in both kernel and userspace if it gets me more
IOPs and throughput - important when you have 500+ million files in a single
petabyte-plus file system. Meanwhile, the guys a few cubicles down are doing
HPC, which means they want as little kernel CPU usage as possible because that
gets in the way of user computations.

And sometimes, it doesn't matter in the slightest what the performance impact is,
because the change is required for correctness - running incorrect code faster is
still running incorrect code. See the recent Spectre patches for an example.
Carter Cheng
2018-10-15 17:23:45 UTC
I am actually looking at some changes that litter the kernel with short
code snippets and thus, according to papers I have read, can result in CPU
hits of around 48% when applied in userspace. I am curious how you would
best measure the impact of similar modifications (since obviously one isn't
always in kernel code when executing a process). My interest is in
testing different approaches to runtime pointer checks.
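To make that concrete, here is a minimal sketch of the kind of inserted
check I mean (purely illustrative; check_ptr and its bounds scheme are my
own invention, not from any particular paper):

#include <linux/kernel.h>	/* panic() */

/*
 * Hypothetical runtime pointer check: verify that p points inside the
 * object [base, base + size) before it is dereferenced. Instrumentation
 * would emit a call like this at every guarded load/store, which is
 * where the per-access overhead comes from.
 */
static inline void check_ptr(const void *p, const void *base, size_t size)
{
	if ((const char *)p < (const char *)base ||
	    (const char *)p >= (const char *)base + size)
		panic("out-of-bounds pointer %p", p);
}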

My thought is that one could write some test code that exercises
different extremes (calling the syscall API continually to do a range of
tasks) and see how much slower the safer version is than the old one.
This, however, might not reflect real-world performance for I/O (sockets),
such as web servers that spend less time in the kernel. I have seen a
paper that benchmarked KVM against Xen, but I haven't seen any
kernel-space papers measuring degradations in overall system performance
when adding safety checks (perhaps sometimes redundant) into the kernel.
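As a rough sketch of that "call the syscall API continually" idea (my own
illustration; the getpid() workload is arbitrary), one could time a tight
loop of syscalls on the stock and the modified kernel and compare the
per-call cost:

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	enum { N = 10 * 1000 * 1000 };
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < N; i++)
		syscall(SYS_getpid);	/* force a real kernel entry */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double ns = (t1.tv_sec - t0.tv_sec) * 1e9
		  + (t1.tv_nsec - t0.tv_nsec);
	printf("%.1f ns per syscall\n", ns / N);
	return 0;
}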
v***@vt.edu
2018-10-15 19:19:46 UTC
Post by Carter Cheng
I am actually looking at some changes that litter the kernel with short
code snippets and thus, according to papers I have read, can result in CPU
hits of around 48% when applied in userspace.
You're going to need to be more specific. Note that a 48% increase in a micro-benchmark
doesn't necessarily translate to a measurable performance change - for example, I have a
kernel build running right now with a cold file cache, and it's only using 6-8% of the CPU in
kernel mode (the rest being gcc in userspace and waiting for the spinning-oxide disk). If the
entire kernel slowed down by 50% that would only be 3-4% change visible at the macro level.
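(To spell out the arithmetic: the visible slowdown is roughly the fraction
of time spent in the kernel times the kernel slowdown - Amdahl's law in
reverse - so 0.07 * 0.50 = 3.5%, right in that 3-4% range.)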
Post by Carter Cheng
but I haven't seen any kernel-space papers measuring degradations in overall
system performance when adding safety checks (perhaps sometimes redundant) into
the kernel
Well... here's the thing. Papers are usually written by academics and trade
journal pundits, not people who write code for a living. As a result, they end
up comparing released code versions. As a worked example, see how the whole
Spectre thing turned out - the *initial* fears were that we'd see a huge
performance drop. But the patches that finally shipped for the Linux kernel
came after a bunch of clever people had thought about it and come up with less
intrusive ways to close the security issue.

(Having said that, the guys at Phoronix do a reasonable job of doing
macro-level benchmarks of each kernel release and pointing out if there's a big
hit in a subsystem).

And as I said earlier - sometimes it doesn't matter, because correctness trumps performance.
Carter Cheng
2018-10-15 19:48:20 UTC
Basically I am looking for methodology guidelines for doing my own testing
of a bunch of techniques from different papers and seeing what the overall
performance impact is. Are there established guidelines for doing such
things?
SeyedAlireza Sanaee
2018-10-23 13:59:46 UTC
I believe there is no definite methodology; it is all experimental and
depends on the applications you are running and, as Valdis said, on the
changes you would like to make. Basically, when a paper offers a new
algorithm or design, the authors should have tested it on their own testbed.
They may report their experimental methodology in the paper or even publish
the experiment scripts on GitHub. However, beyond the applications and your
changes, the testbed itself is also important. Their system may run at a
CPU frequency different from yours, so you might not see the performance
gain they reported and achieved in the paper.

For instance, concerning some network enhancement in the TCP stack, people
may improve the end-to-end latency by just tens of microseconds, and you
are supposed to capture those few microseconds in your experiments. That is
really hard but not impossible; it basically takes time and effort plus
*extensive evaluations*.
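To give a flavour of what capturing that takes, here is a sketch of a
sampling harness (my own illustration; the getpid() call is just a
stand-in for whatever operation you actually measure). Reporting the
median and a high percentile over many samples is far more robust to
scheduler and interrupt noise than one averaged run:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* ascending sort order for qsort() */
static int cmp(const void *a, const void *b)
{
	double x = *(const double *)a, y = *(const double *)b;
	return (x > y) - (x < y);
}

int main(void)
{
	enum { N = 100000 };
	static double us[N];	/* per-sample latency in microseconds */
	struct timespec t0, t1;

	for (int i = 0; i < N; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		syscall(SYS_getpid);	/* operation under test */
		clock_gettime(CLOCK_MONOTONIC, &t1);
		us[i] = (t1.tv_sec - t0.tv_sec) * 1e6
		      + (t1.tv_nsec - t0.tv_nsec) / 1e3;
	}
	qsort(us, N, sizeof(us[0]), cmp);
	printf("median %.3f us, p99 %.3f us\n",
	       us[N / 2], us[N * 99 / 100]);
	return 0;
}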

I know that nowadays systems-software papers are pretty practical, and they
try to build working systems. I'm particularly talking about SOSP and OSDI
papers.
_______________________________________________
Kernelnewbies mailing list
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies