EAR 4.3
Reference Manual
The EAR Library

Overview

The EAR Library (EARL) is the core of the EAR package. It offers a lightweight and simple way to select the optimal frequency for applications at runtime.

EARL is dynamically loaded alongside the running application by the EAR Loader. The loader detects whether the application uses MPI and, if so, whether the implementation is Intel MPI or OpenMPI. For MPI applications it also intercepts the MPI symbols through the PMPI interface and saves the next symbols in the chain, which keeps EARL compatible with MPI and with other profiling tools. The EAR Library is divided into several stages, summarized in the following picture:

  1. Automatic detection of application outer loops. This is done by intercepting MPI calls and invoking the Dynamic Application Iterative Structure detector (DynAIS) algorithm. DynAIS is highly optimized for new Intel architectures and reports low overhead. For non-MPI applications, EAR implements a time-guided approach.
  2. Computation of the application signature. Once DynAIS starts reporting iterations for the outer loop, EAR starts to compute the application signature. This signature includes iteration time, DC power consumption, bandwidth, cycles, instructions, etc. Since the error of DC power measurements depends heavily on the hardware, EAR automatically detects the hardware characteristics and sets a minimum time to compute the signature in order to minimize the average error.

The loop signature is used to classify the application activity into different phases. The current EAR version supports the following phases: IO bound, CPU computation and GPU idle, CPU busy waiting and GPU computing, CPU-GPU computation, and CPU computation (for CPU-only nodes). For phases that include CPU computation, the optimization policy is applied. For the other phases, the EAR library applies predefined CPU/Memory/GPU frequency settings.

  3. Power and performance projection. EAR has its own performance and power models, which require the application and the system signatures as input. The system signature is a set of coefficients characterizing each node in the system. They are computed during the learning phase at the EAR configuration step. EAR projects the power used and the computing time (performance) of the running application for all the available frequencies in the system. These models are applied to CPU metrics and project CPU performance and power when varying the CPU frequency. Using these projections, the optimization policy can select the optimal CPU frequency.
  4. Apply the selected energy optimization policy. EAR includes two power policies that can be selected at runtime, if permitted by the system administrator: minimize time to solution and minimize energy to solution. At this point, EAR executes the power policy, using the projections computed in the previous phase, and selects the optimal frequency for an application and its particular run. An additional policy, monitoring only, can also be used; in this case the running frequency is not changed, and EAR only computes and stores the application signature and metrics. The short version of the names is used when submitting jobs (min_energy, min_time, monitoring). Current policies already include memory frequency selection, but in this case it is not based on models; it is a guided search. Check whether memory frequency optimization is enabled by default in your installation. If the application is MPI, the policies also classify the processes as balanced or unbalanced. If they are unbalanced, a per-process CPU frequency is applied.

Some specific configurations are modified when jobs share nodes with other jobs. For example, the memory frequency optimization is disabled. See the environment variables page for more information on how to tune the EAR library optimization using environment variables.

Configuration

The EAR Library is configured through the $(EAR_ETC)/ear.conf file. Please visit the EAR configuration file page for more information about the options of EARL and other components.

The library receives its specific settings through shared memory regions initialized by EARD.

Usage

For information on how to run applications alongside EARL, read the User guide. The next section contains more information regarding EAR's optimisation policies.

Policies

EAR offers three energy policy plugins: min_energy, min_time and monitoring. The last one is not a power policy; it is used only for application monitoring, and it does not modify the CPU frequency (nor the memory or GPU frequency). For application analysis, monitoring can be used with specific CPU, memory and/or GPU frequencies.

The energy policy is selected by setting the --ear-policy=policy option when submitting a SLURM job. A policy parameter, which is a particular value or threshold depending on the policy, can be set using the flag --ear-policy-th=value. Its default value is defined in the configuration file; check the configuration page for more information.

Plugin min_energy

The goal of this policy is to minimise the energy consumed while limiting the performance degradation. This limit is set with the SLURM --ear-policy-th option or in the configuration file. The min_energy policy selects the frequency that minimizes energy while enforcing performance degradation <= parameter. When executing with this policy, applications start at the default frequency (specified in ear.conf).

PerfDegr = (CurrTime - PrevTime) / (PrevTime)
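
The selection rule above can be sketched in C as follows. This is a minimal illustration and not EAR's implementation: the projection_t type, the function names, and the choice of measuring the degradation of each candidate against the time projected at the default (reference) frequency are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical projection for one candidate frequency (not an EAR type). */
typedef struct {
    unsigned long freq_khz; /* candidate CPU frequency */
    double time_s;          /* projected iteration time */
    double power_w;         /* projected average power */
} projection_t;

/* PerfDegr = (CurrTime - PrevTime) / PrevTime */
static double perf_degradation(double prev_time, double curr_time)
{
    assert(prev_time > 0.0);
    return (curr_time - prev_time) / prev_time;
}

/*
 * Return the index of the candidate with the lowest projected energy
 * (power * time) whose degradation relative to the reference entry
 * proj[ref] does not exceed max_degr. The reference itself always
 * qualifies (degradation 0), so a valid index is always returned.
 */
static size_t select_min_energy(const projection_t *proj, size_t n,
                                size_t ref, double max_degr)
{
    assert(proj != NULL && ref < n);
    assert(max_degr >= 0.0);

    size_t best = ref;
    double best_energy = proj[ref].power_w * proj[ref].time_s;

    for (size_t i = 0; i < n; i++) {
        double degr = perf_degradation(proj[ref].time_s, proj[i].time_s);
        double energy = proj[i].power_w * proj[i].time_s;
        if (degr <= max_degr && energy < best_energy) {
            best = i;
            best_energy = energy;
        }
    }
    return best;
}
```

For instance, with a 5% limit and three candidates projected at 10.0 s and 300 W, 10.25 s and 280 W, and 11.5 s and 240 W, the sketch picks the second one: the third uses less energy but slows the application by 15%.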

Plugin min_time

The goal of this policy is to improve the execution time while guaranteeing a minimum ratio between performance benefit and frequency increment that justifies the increased energy consumption from this frequency increment. The policy uses the SLURM parameter option mentioned above as a minimum efficiency threshold.

Example: if --ear-policy-th=0.75, EAR will prevent scaling to upper frequencies unless the performance gain is at least 75% of the frequency gain (PerfGain >= FreqGain * threshold).

PerfGain = (PrevTime - CurrTime) / PrevTime
FreqGain = (CurFreq - PrevFreq) / PrevFreq

When launched with the min_time policy, applications start at a default frequency (defined in ear.conf). Check the configuration page for more information.

Example: given a system with a nominal frequency of 2.3GHz and default P_STATE set to 3, an application executed with min_time will start with frequency F[i]=2.0GHz (3 P_STATEs less than nominal). When application metrics are computed, the library will compute the performance projection for F[i+1] and the performance gain, as shown in Figure 1. If the performance gain is greater than or equal to the threshold, the policy will check the next performance projection, F[i+2]. If the computed performance gain is less than the threshold, the policy will select the last frequency where the performance gain was sufficient, preventing the waste of energy.

Figure 1: min_time uses the threshold value as the minimum value for the performance gain between F[i] and F[i+1].
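
The frequency walk described in the example can be sketched in C as shown here. This is an illustration under stated assumptions, not EAR's code: the function names, the freq_khz and proj_time_s arrays (candidate frequencies in ascending order with their projected times), and the choice of comparing each frequency with the one immediately below it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* PerfGain = (PrevTime - CurrTime) / PrevTime */
static double perf_gain(double prev_time, double curr_time)
{
    assert(prev_time > 0.0);
    return (prev_time - curr_time) / prev_time;
}

/* FreqGain = (CurFreq - PrevFreq) / PrevFreq */
static double freq_gain(double prev_freq, double cur_freq)
{
    assert(prev_freq > 0.0);
    return (cur_freq - prev_freq) / prev_freq;
}

/*
 * freq_khz[] holds the candidate frequencies in ascending order and
 * proj_time_s[] the projected time at each one (hypothetical inputs).
 * Starting from index start, move to the next higher frequency while
 * PerfGain >= FreqGain * threshold, and return the last index that passed.
 */
static size_t select_min_time(const double *freq_khz, const double *proj_time_s,
                              size_t n, size_t start, double threshold)
{
    assert(freq_khz != NULL && proj_time_s != NULL && start < n);

    size_t sel = start;
    while (sel + 1 < n) {
        double pg = perf_gain(proj_time_s[sel], proj_time_s[sel + 1]);
        double fg = freq_gain(freq_khz[sel], freq_khz[sel + 1]);
        if (pg < fg * threshold)
            break; /* the next step does not pay off; keep the current one */
        sel++;
    }
    return sel;
}
```

As a worked case, take a threshold of 0.75, a start frequency of 2.0GHz, and projected times of 100 s, 96 s and 95 s at 2.0, 2.1 and 2.2GHz. The step to 2.1GHz gives PerfGain = 0.04 against a required 0.05 * 0.75 = 0.0375, so it is accepted. The step to 2.2GHz gives a PerfGain of about 0.010 against a required value of about 0.036, so the sketch stays at 2.1GHz.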

EAR API

EAR offers a user API for applications. Besides connecting to and disconnecting from EAR, the current version offers functions to read the accumulated energy and time, to compute the difference between two measurements, and to set CPU and GPU frequencies.

  • int ear_connect()
  • int ear_energy(unsigned long *energy_mj, unsigned long *time_ms)
  • void ear_energy_diff(unsigned long ebegin, unsigned long eend, unsigned long *ediff, unsigned long tbegin, unsigned long tend, unsigned long *tdiff)
  • int ear_set_cpufreq(cpu_set_t *mask, unsigned long cpufreq)
  • int ear_set_gpufreq(int gpu_id, unsigned long gpufreq)
  • int ear_set_gpufreq_list(int num_gpus, unsigned long *gpufreqlist)
  • void ear_disconnect()

EAR's header file and library can be found at $EAR_INSTALL_PATH/include/ear.h and $EAR_INSTALL_PATH/lib/libEAR_api.so respectively. The following example reports the energy, time, and average power for each iteration of a simple loop containing a sleep(5).

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <ear.h>

int main(int argc, char *argv[])
{
    unsigned long e_mj = 0, t_ms = 0;
    unsigned long e_mj_init, t_ms_init, e_mj_end, t_ms_end = 0;
    time_t ts, ts_e;
    struct tm *tstamp, *tstamp2;
    char s[128], s2[128];
    int i = 0;

    /* Connecting with ear */
    if (ear_connect() != EAR_SUCCESS)
    {
        printf("error connecting eard\n");
        exit(1);
    }
    /* Reading the initial energy and time */
    if (ear_energy(&e_mj_init, &t_ms_init) != EAR_SUCCESS)
    {
        /* Without a starting sample the differences below are meaningless */
        printf("Error in ear_energy\n");
        ear_disconnect();
        exit(1);
    }
    while (i < 5)
    {
        sleep(5);
        /* Reading energy and time after the sleep */
        if (ear_energy(&e_mj_end, &t_ms_end) != EAR_SUCCESS)
        {
            printf("Error in ear_energy\n");
        }
        else
        {
            /* Times are in ms; convert to seconds for localtime() */
            ts = (time_t)(t_ms_init / 1000);
            ts_e = (time_t)(t_ms_end / 1000);
            tstamp = localtime(&ts);
            strftime(s, sizeof(s), "%c", tstamp);
            tstamp2 = localtime(&ts_e);
            strftime(s2, sizeof(s2), "%c", tstamp2);
            printf("Start time %s End time %s\n", s, s2);
            ear_energy_diff(e_mj_init, e_mj_end, &e_mj, t_ms_init, t_ms_end, &t_ms);
            /* mJ divided by ms gives W */
            printf("Time consumed %lu (ms), energy consumed %lu(mJ), "
                   "Avg power %lf(W)\n", t_ms, e_mj, (double)e_mj / (double)t_ms);
            /* The end of this interval is the start of the next one */
            e_mj_init = e_mj_end;
            t_ms_init = t_ms_end;
        }
        i++;
    }
    ear_disconnect();
    return 0;
}