Until a couple of years ago, when we talked about performance measurement of an application, we meant the time it took the job to run against the specific resources it used: the number of cores, the number of servers if you are running on a cluster, the characteristics of the cores and memory and other server specs, plus I/O and storage resources and specs.
Basically, we only measured one thing: the elapsed time of the job. Then, using the resources and specs, we computed lots of derived metrics - throughput efficiency, parallel scalability and efficiency, performance per core or per server, I/O metrics, and so on.
Now we make an additional measurement - power utilization - which we correlate in time with the execution of an application. We want to know the average power used while a single job runs, and we also look at how the power varies during the job and at the maximum power used.
Of course, lots of people measure power used by computers. But, since most of these people are system managers or system designers, they don't have a reason to correlate power with specific applications and compute jobs. They want to know the average and peak power used to run their overall workload, so they can plan for current and future power requirements. This is important work, but it does not give them the ability to optimize their workload.
If you measure the average power used during the execution of one compute job, and you multiply that power by the elapsed time of the job, you have Application Energy - the electrical energy used to run that specific job. This is a very convenient quantity, since it gives you a single number that relates power usage to compute jobs. You can use Application Energy to optimize your workload, just as you use elapsed time.
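As a minimal sketch of that calculation (the function and the sample data below are illustrative, not a real measurement tool or real readings), Application Energy can be computed from power samples taken while the job runs:

```python
# Sketch: compute Application Energy from power samples collected during a job.
# The sample values and interval here are made up for illustration.

def application_energy(power_samples_watts, sample_interval_s):
    """Approximate energy (joules) as average power times elapsed time."""
    elapsed_s = len(power_samples_watts) * sample_interval_s
    avg_power_w = sum(power_samples_watts) / len(power_samples_watts)
    return avg_power_w * elapsed_s  # joules; divide by 3.6e6 for kWh

# Example: one power sample every 10 seconds over a 50-second job
samples = [410.0, 455.2, 470.8, 462.5, 430.1]   # watts
energy_j = application_energy(samples, sample_interval_s=10)
print(f"Average power: {sum(samples) / len(samples):.1f} W")
print(f"Application Energy: {energy_j:.0f} J ({energy_j / 3.6e6:.5f} kWh)")
```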
A couple of examples:
1. You can measure Application Energy for a given set of applications on two or more different server models, and then select the most energy-efficient model. You can use this Application Energy comparison together with an elapsed-time comparison and make speed vs. energy tradeoffs. If a job runs 30% faster but consumes 50% more Application Energy on Server A than it does on Server B, which is the better choice for your requirements?
2. We are also using Application Energy to determine the most efficient way to run applications that run in parallel on a cluster of servers - a common way to run HPC codes. For one common HPC application, we ran the same job at three levels of parallelization and compared the elapsed times and Application Energy. We showed that the job used only 4% more Application Energy running 32-way parallel (on 32 cores) vs. 16-way parallel, but 20% more Application Energy running 64-way parallel vs. 32-way parallel. In other words, there is very little energy cost to using 32 cores and returning the results to the user much faster than with 16 cores, but a substantial energy cost to using 64 cores, which returns the results faster still. A small sketch of this kind of comparison follows this list.
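As an illustration of the comparisons in both examples (the times and power figures below are placeholders, not the measured results described above), a short sketch of how the tradeoff can be tabulated:

```python
# Sketch: compare elapsed time and Application Energy across run configurations.
# The numbers are placeholders; substitute your own measured values.

runs = {
    # configuration: (elapsed time in seconds, average power in watts)
    "16-way": (3600, 500.0),
    "32-way": (1900, 980.0),
    "64-way": (1050, 2150.0),
}

# Application Energy = average power * elapsed time (joules)
energy = {cfg: t * p for cfg, (t, p) in runs.items()}

baseline = "16-way"
for cfg, (t, p) in runs.items():
    speedup = runs[baseline][0] / t
    extra_energy_pct = (energy[cfg] / energy[baseline] - 1) * 100
    print(f"{cfg}: {speedup:.2f}x faster, "
          f"{extra_energy_pct:+.0f}% Application Energy vs {baseline}")
```

The same loop works for comparing server models instead of parallelization levels: swap the configuration names for server names and fill in the measured times and average power for each.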
Does anyone find this interesting, or agree (or disagree) with this approach?