EAR is composed of five main components:
The following image shows the main interactions between components:
For more detailed information about EAR components, visit the Architecture page.
This section provides a summarized, step-by-step installation and execution guide for EAR. For a more in-depth explanation of the necessary steps, see the Installation from source page or the Installing from RPM section, followed by the Configuration guide, or contact us at ear-support@bsc.es.
Requirements to compile EAR are:
When EAR is installed from an RPM (binaries only), all these dependencies are removed except mysqlclient. However, they are still needed when running EAR.
SLURM must also be present if the SLURM plugin is going to be used. Since the current EAR version only supports automatic execution of applications with the EAR library through the SLURM plugin, SLURM must be running whenever the EAR library is used (it is not needed for node monitoring).
Last but not least:
`sudo sh -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"` in compute nodes.
Run `./configure --help` to see all the flags and options.
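The `perf_event_paranoid` setting written by the command earlier in this section can be verified before going further. The following is a minimal sketch, not part of EAR: `check_perf_paranoid` is a hypothetical helper name, its optional file argument exists only to make the check easy to try on a copy, and it assumes that a value of 2 or lower is acceptable (the guide itself only shows setting the value to 2).

```shell
# Hypothetical helper: report whether perf_event_paranoid is low enough.
# $1 (optional): file to read instead of the real kernel setting.
check_perf_paranoid() {
    ppfile="${1:-/proc/sys/kernel/perf_event_paranoid}"
    if [ ! -r "$ppfile" ]; then
        echo "cannot read $ppfile" >&2
        return 2
    fi
    ppvalue="$(cat "$ppfile")"
    # Assumption: 2 or lower is acceptable; the guide only shows writing 2.
    if [ "$ppvalue" -le 2 ]; then
        echo "ok: perf_event_paranoid=$ppvalue"
    else
        echo "too restrictive: perf_event_paranoid=$ppvalue"
        return 1
    fi
}

# Example: check_perf_paranoid
```

Running it on each compute node (for instance through the same parallel shell used to manage the daemons) gives a quick view of which nodes still need the setting applied.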
Once you have downloaded the code from the repository, execute `autoreconf -i`.
In addition to the Makefile, `MAKE_NAME` forces a copy of the generated Makefile to be created with the name Makefile._make_extension_. This simplifies keeping multiple configurations (one for each library version needed). Other relevant options are:
- `--disable-mpi` must be set to generate a configuration for the non-MPI version of the library.
- `MPI_VERSION=ompi` for the OpenMPI-compatible version.

Before running `make`, review the Makefile and the configuration log to validate that all the requirements of your installation have been automatically detected, in particular if you need to use a specific library such as likwid, freeipmi or CUDA. If the CUDA path is specified, EAR will be compiled with GPU support. Also check that the MySQL or PostgreSQL paths have been detected. You can use the `USER` and `GROUP` options if you want to install EAR with a specific user/group.
The following shows how to configure EAR to be compiled with Intel MPI:
At this point, the EAR binaries will be installed, including one version of the EAR library for MPI (default), the EAR documentation, the service files for the EAR daemons, and templates for the `ear.conf` file and the SLURM plugin. The configure tool tries to automatically detect the paths to MySQL and/or PostgreSQL, the scheduler sources, etc. Detecting the scheduler path is mandatory; SLURM is assumed by default. After running configure, check in the Makefile that all the options have been detected. After `make install`, you should have the following folders in the ear-install-path: bin, sbin, etc, lib, include, man. The bin directory includes commands and tools, sbin includes the EAR services, lib includes all the libraries and plugins, and etc includes templates and examples for the EAR service files, the `ear.conf` file, the EAR module, etc.
Prepare the configuration
Whether installed from sources or from an RPM, EAR installs templates for the `ear.conf` file in `$EAR_ETC/ear/ear.conf.template` and `$EAR_ETC/ear/ear.conf.full.template`. The full version includes all fields. Copy only one of them as `$EAR_ETC/ear/ear.conf` and update it with the desired configuration. Go to the configuration section to see how to do it. The `ear.conf` file is used by all the services. It is recommended to keep it in a shared folder to simplify changes to the configuration.
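The template copy described above can be scripted so that an already edited `ear.conf` is never overwritten. This is only a sketch: `prepare_ear_conf` is a hypothetical helper name, and it picks the short template; use `ear.conf.full.template` instead if you prefer to start from the full list of fields.

```shell
# Hypothetical helper: create ear.conf from the short template,
# keeping any ear.conf that already exists.
# $1: the EAR etc path (the guide's $EAR_ETC).
prepare_ear_conf() {
    conf_dir="$1/ear"
    if [ -e "$conf_dir/ear.conf" ]; then
        echo "keeping existing $conf_dir/ear.conf"
        return 0
    fi
    cp "$conf_dir/ear.conf.template" "$conf_dir/ear.conf" &&
        echo "created $conf_dir/ear.conf; edit it before starting the services"
}

# Example: prepare_ear_conf "$EAR_ETC"
```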
EAR module
Install and load the EAR module to enable the commands. It can be found at `$EAR_ETC/module`. If it is not in a standard path, you can add the ear module by running `module use $EAR_ETC/module` and then `module load ear`.
EAR Database
Create the EAR database with `edb_create`, installed at `$EAR_INSTALL_PATH/sbin`. The `edb_create -p` command will ask you for the DB root password. If you run into any problem here, first check whether the node where you are running the command can connect to the DB server. If problems persist, execute `edb_create -o` to report the specific SQL queries generated. In case of trouble, contact ear-support@bsc.es or open an issue.
Energy models
EAR uses a power and performance model based on system signatures. These system signatures are stored in coefficient files.
Before starting EARD, and just for testing, you need to create a dummy coefficient file and copy it to the coefficients path, located by default at `$EAR_ETC/coeffs`. Use the `coeffs_null` application from the tools section.
EAR version 4.1 does not require null coefficients.
EAR services
Create soft links or copy the EAR service files into the services folder to start/stop the services using system commands such as `systemctl`. EAR service files are generated at `$EAR_ETC/systemd` and they can usually be placed in `$(ETC)/systemd`.
Enable and start the EARDs and EARDBDs via services (e.g., `sudo systemctl start eard`, `sudo systemctl start eardbd`). EARDBD and EARD outputs can be found at `$EAR_TMP/eardbd.server.log` and `$EAR_TMP/eard.log`, respectively, when the DBDaemonUseLog and NodeUseLog options are set to 1 in the `ear.conf` file. Otherwise, their outputs are written to stderr and can be seen using the `journalctl` command (e.g., `journalctl -u eard`).
By default, a certain level of verbosity is set. Modifying it is not recommended, but you can change it by editing the value of the constants in the file `src/common/output/output_conf.h`.
Quick validation
Check that the EARDs are up and running correctly with `econtrol --status` (note that daemons take around a minute to correctly report energy and stop showing up as an error in econtrol). EARDs create a per-node text file, local to the compute nodes, with the values reported to the EARDBD. If there are problems when running econtrol, you can also find this file at `$EAR_TMP/nodename.pm_periodic_data.txt`.
Check that the EARDs are reporting metrics to the database with ereport. `ereport -n all` should report the total energy sent by each daemon since the setup.
It is recommended to create a soft link to the `$EAR_ETC/slurm/ear.plugstack.conf` file in the `/etc/slurm/plugstack.conf.d` directory to simplify the EAR plugin management. For a first test, it is recommended to set `default=off` in the `ear.plugstack.conf` (to disable the automatic loading of the EAR library).
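The soft link recommended above can be created with a small helper such as the one sketched here. `link_ear_plugstack` is a hypothetical name, not an EAR tool. Creating the `plugstack.conf.d` directory when it is missing is a choice made for this sketch, and your main SLURM `plugstack.conf` still has to include that directory for the link to take effect.

```shell
# Hypothetical helper: link EAR's plugstack file into SLURM's
# plugstack.conf.d directory.
# $1: EAR etc path ($EAR_ETC); $2: SLURM config directory (/etc/slurm in this guide).
link_ear_plugstack() {
    plug_src="$1/slurm/ear.plugstack.conf"
    plug_dir="$2/plugstack.conf.d"
    [ -f "$plug_src" ] || { echo "missing $plug_src" >&2; return 1; }
    # -f replaces a stale link from a previous installation.
    mkdir -p "$plug_dir" &&
        ln -sf "$plug_src" "$plug_dir/ear.plugstack.conf"
}

# Example (as root): link_ear_plugstack "$EAR_ETC" /etc/slurm
```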
EAR plugin validation
At this point you should be able to see the EAR options when running, for example, `srun --help`. The EAR options should appear as part of the output. The EAR plugin must be enabled on login and compute nodes.
- Submit an application and check that it is reported by the `eacct` command. Note that only privileged users can check other users’ applications.
- Submit an application with `--ear=on` and check that the output of `eacct` now includes the Library metrics.
- Set `default=on` in `ear.plugstack.conf` to load the EAR Library by default. If default is turned off, EARL can be explicitly loaded by setting the flag `--ear=on` at job submission.

At this point, you can use EAR for monitoring and accounting purposes, but you cannot yet use the power policies offered by EARL. To enable them, first perform a learning phase and compute the node coefficients. See the EAR learning phase wiki page. For the coefficients to become active, restart the daemons.
Important: Reloading the daemons will NOT make them load the coefficients; restarting the service is the only way.
As commented in the overview, the EAR Library is loaded next to the user MPI application by the EAR Loader. The Library uses MPI symbols, so it is compiled using the includes provided by your MPI distribution. The selection of the library version is automatic at runtime, but it is not automatic during the compilation and installation steps. Each compiled library version has its own file name, which has to be defined through the `MPI_VERSION` variable during `./configure` or by editing the root Makefile.
The name list per distribution is shown in the following table:
Distribution | Name | MPI_VERSION value |
---|---|---|
Intel MPI | libear.so (default) | not required |
MVAPICH | libear.so (default) | not required |
OpenMPI | libear.ompi.so | ompi |
If different MPI distributions share the same library name, their symbols are compatible with each other, so compiling and installing the library once is enough. However, if you provide users with MPI distributions that are not compatible with each other, you will have to compile and install the library multiple times.
EAR makefiles include a specific target for each EAR component, supporting full or partial updates:
Command | Description |
---|---|
make -f Makefile.make_extension install | Reinstall all the files except etc and doc. |
make -f Makefile.make_extension earl.install | Reinstall only the EARL. |
make -f Makefile.make_extension eard.install | Reinstall only the EARD. |
make -f Makefile.make_extension earplug.install | Reinstall only the EAR SLURM plugin. |
make -f Makefile.make_extension eardbd.install | Reinstall only the EARDBD. |
make -f Makefile.make_extension eargmd.install | Reinstall only the EARGMD. |
make -f Makefile.make_extension reports.install | Reinstall only report plugins. |
Before compiling new libraries, you have to install the current one by typing `make install`. Then you can run `./configure` again, changing the `MPICC`, `MPICC_FLAGS` and `MPI_VERSION` variables, or just open the root Makefile and edit the same variables together with `MPI_BASE`, which sets the MPI installation root path. Now type `make full` to perform a clean compilation and `make earl.install` to install only the new version of the library.
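As an illustration of the sequence just described, the sketch below prints (without running) the commands for adding one more library version. `earl_rebuild_plan` is a hypothetical helper, and the compiler wrapper name passed to it is an assumption about your MPI installation. A real `./configure` call will also need whatever other options your first configuration used.

```shell
# Hypothetical helper: print the commands for building an extra EARL version.
# $1: MPI_VERSION value (e.g. ompi); $2: MPI compiler wrapper (e.g. mpicc).
earl_rebuild_plan() {
    printf '%s\n' \
        "./configure MPICC=$2 MPI_VERSION=$1" \
        "make full" \
        "make earl.install"
}

# Example: earl_rebuild_plan ompi mpicc
```

Printing the plan first makes it easy to review the variables before touching an installation that is already in use.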
If your MPI version is not fully compatible, please contact ear-support@bsc.es.
See the User guide to check the use cases supported and how to submit jobs with EAR.
EAR includes the specification files to create an RPM from an already existing installation. The spec file is located at `etc/rpms`. A valid installation from source is needed to create the RPM. The RPM can be part of the system image. Visit the Requirements page for a quick overview of the requirements.
Execute the `rpmbuild.sh` script to create the EAR RPM file. Once created, it can be included in the compute node images. This is recommended only when no more changes are expected in the installation. Once you have the RPM file, execute the following steps:
- The temporary folder is referred to as `$(EAR_TMP)` in this guide for simplicity.
- The default installation paths are `/usr` and `/etc`.
- Install with `rpm -ivh --relocate /usr=/new/install/path --relocate /etc=/new/etc/path ear.version.rpm`. You can also use the `--nodeps` option if your dependency test fails.
- During the installation, the `*.in` configuration files are compiled into their ready-to-use versions, with tags replaced by the correct paths. You will find more information about those files in the following pages and in the next section.
- Run `rpm -e ear.version` to uninstall.

The `*.in` configuration files are compiled into `etc/ear/ear.conf.template` and `etc/ear/ear.full.conf.template`, `etc/module/ear`, `etc/slurm/ear.plugstack.conf` and various `etc/systemd/ear*.service`. You can find more information on the configuration page. The table below describes the complete hierarchy of the EAR installation:
Directory | Content / description |
---|---|
/usr/lib | Libraries and the scheduler plugin. |
/usr/lib/plugins | EAR plugins. |
/usr/bin | EAR commands. |
/usr/bin/tools | EAR tools for coefficients computation. |
/usr/sbin | Privileged components: EARD, EARDBD, EARGMD. |
/etc/ear | Configuration files templates. |
/etc/ear/coeffs | Folder to store coefficient files. |
/etc/module | EAR module. |
/etc/slurm | EAR SLURM plugin configuration file. |
/etc/systemd | EAR service files. |
EAR uses some third-party libraries. The EAR RPM will not ask for them when installing, but they must be available in `LD_LIBRARY_PATH` when you run an application with EAR. Depending on the RPM, different versions of these libraries may be required:
Library | Minimum version | References |
---|---|---|
MPI | - | - |
MySQL* | 15.1 | MySQL or MariaDB |
PostgreSQL* | 9.2 | PostgreSQL |
Autoconf | 2.69 | Website |
GSL | 1.4 | Website |
These libraries are not required, but can be used to get additional functionality or metrics:
Library | Minimum version | References |
---|---|---|
SLURM | 17.02.6 | Website |
PBS** | 2021 | PBSPro or OpenPBS |
CUDA/NVML | 7.5 | CUDA |
CUPTI** | 7.5 | CUDA |
Likwid | 5.2.1 | Likwid |
FreeIPMI | 1.6.8 | FreeIPMI |
OneAPI/L0** | 1.7.9 | OneAPI |
LibRedFish** | 1.3.6 | LibRedFish |
** These will be available in the next release.
Also, some drivers have to be present and loaded in the system when starting EAR:
Driver | File | Kernel version | References |
---|---|---|---|
CPUFreq | kernel/drivers/cpufreq/acpi-cpufreq.ko | 3.10 | Information |
Open IPMI | kernel/drivers/char/ipmi/*.ko | 3.10 | Information |
The best way to execute all the EAR daemon components (EARD, EARDBD, EARGM) is through unit services.
NOTE: EAR uses a MariaDB/MySQL server. The server must be started before the EAR services are executed.
The unit service files for the EAR Daemon, EAR Global Manager Daemon and EAR Database Daemon are generated and installed in `$(EAR_ETC)/systemd`. You have to copy those unit service files to your operating system's `systemd` folder and then use the `systemctl` command to run the daemons. Check the EARD, EARDBD and EARGMD pages to find the precise execution commands.
When using `systemctl` commands, you can check the messages reported to `stderr` using `journalctl`. For instance: `journalctl -u eard -f`. Note that if `NodeUseLog` is set to 1 in `ear.conf`, the messages will not be printed to `stderr` but to `$EAR_TMP/eard.log` instead. The `DBDaemonUseLog` and `GlobalmanagerUseLog` options in `ear.conf` specify the output for EARDBD and EARGM, respectively.
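When it is unclear where a node's EARD messages ended up, the rule above can be turned into a quick lookup. The sketch below is not an EAR tool: `eard_log_location` is a hypothetical helper, and it assumes the option appears in `ear.conf` as a plain `NodeUseLog=1` line, which you should confirm against your own file.

```shell
# Hypothetical helper: print where EARD messages are expected.
# $1: path to ear.conf; $2: the EAR temporary folder ($EAR_TMP).
# Assumption: the option is written as a "NodeUseLog=1" line.
eard_log_location() {
    if grep -Eq '^[[:space:]]*NodeUseLog[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"; then
        echo "$2/eard.log"
    else
        echo "journalctl -u eard"
    fi
}

# Example: eard_log_location "$EAR_ETC/ear/ear.conf" "$EAR_TMP"
```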
Additionally, services can be started, stopped or reloaded in parallel using parallel commands such as `pdsh`. As an example: `sudo pdsh -w nodelist systemctl start eard`.
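A small wrapper can reduce typing mistakes when building these parallel commands. The following sketch only prints the command, so it can be reviewed before being run. `ear_service_cmd` is a hypothetical helper name, and the node list in the example is a placeholder.

```shell
# Hypothetical helper: print the pdsh command for one systemctl action.
# $1: action (start, stop, reload or restart); $2: pdsh node list;
# $3: service name (defaults to eard).
ear_service_cmd() {
    case "$1" in
        start|stop|reload|restart) ;;
        *) echo "unsupported action: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "sudo pdsh -w $2 systemctl $1 ${3:-eard}"
}

# Example: ear_service_cmd restart 'node[01-04]' eard
```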
In some cases, it might be a good idea to create a new install instead of updating your current one, for example when trying new configurations or when a big update is released.
The steps to do so are:
- Copy the existing files (`ear.conf` and coefficients) into the new one and update `ear.conf` with the new ETC path and whatever changes may be needed.
- Update the service files in the `/etc/systemd/system` folder (or equivalent, depending on your OS). Service files include the ETC path and the absolute path for binaries.
- Update `/etc/slurm/plugstack.conf` with the new paths.

Once all that is done, you should have two complete EAR installs that can be switched by changing the binaries executed by the services and changing the path in `plugstack.conf`.
For a better overview of the installation process, return to the installation guide. To continue the installation, visit the configuration page to properly set up the EAR configuration file and the EAR SLURM plugin stack file.