EAR 4.2.1: EAR Reference Manual
The following tables contain information directly related to applications executed on the system while EAR was monitoring. The main key is the JOBID.STEPID combination generated by the scheduler.
These tables contain periodic information gathered from the nodes. There is a single-node information table and an aggregated one, which speeds up queries for cluster-wide information.

Node metrics are stored in Periodic_metrics and are reported periodically (the period is defined in `ear.conf`). The aggregated table eases accounting in the `ereport` command and EARGM, and also helps reduce the database size (Periodic_metrics of older periods where node-level precision is not needed can be deleted and the aggregations can be used instead).
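The idea of keeping an aggregated table next to the per-node one can be illustrated with a minimal sketch. It assumes that aggregation means summing per-node energy into fixed time buckets; the field names (`end_time`, `energy_j`) and the function `aggregate_periodic` are hypothetical and do not reflect EAR's actual schema or code.

```python
from collections import defaultdict


def aggregate_periodic(records, period_s):
    """Sum per-node records into cluster-wide buckets of period_s seconds.

    Each record is a dict with hypothetical keys 'end_time' (seconds)
    and 'energy_j' (joules); this is not EAR's real table layout.
    """
    buckets = defaultdict(lambda: {"energy_j": 0, "records": 0})
    for rec in records:
        # Align the record to the start of the bucket it falls into.
        start = rec["end_time"] - rec["end_time"] % period_s
        buckets[start]["energy_j"] += rec["energy_j"]
        buckets[start]["records"] += 1
    return [
        {"start_time": start, "end_time": start + period_s, **totals}
        for start, totals in sorted(buckets.items())
    ]


node_records = [
    {"node": "n1", "end_time": 10, "energy_j": 100},
    {"node": "n2", "end_time": 50, "energy_j": 200},
    {"node": "n1", "end_time": 70, "energy_j": 50},
]
aggregated = aggregate_periodic(node_records, period_s=60)
```

A cluster-wide query then only has to read one row per period instead of one row per node and period, which is why the older per-node rows can be dropped once node-level precision is no longer needed.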
One record is reported every T1 period (T1 is defined in `ear.conf`).

These tables are the same as their non-learning counterparts, but are specifically used to store the applications executed during a learning phase.
NOTE In order to have the GPU_signatures table created and Periodic_metrics containing GPU data, the database must be created (if you follow the `edb_create` approach, see the section below) with GPUs enabled at compilation time. See how to update from previous versions if you are updating EAR from a release that did not have GPU metrics.
To create the database, EAR provides a command (`edb_create`), which can either create the database directly or print the queries for the database creation so the administrator can use them or modify them at their discretion (any changes may alter the correct functioning of EAR's accounting).
Since a lot of data is reported by EAR to the database, EAR provides two commands to remove old data and free up space. These are intended to be used with a `cron` job or a similar tool, but they can also be run manually without any issues. The two tools are `edb_clean_pm`, which removes periodic accounting data from nodes, and `edb_clean_apps`, which removes all the data related to old jobs.

For more information on these commands, check the commands' page on the wiki.
When running `edb_create`, some tables might not be created, or may have some quirks, depending on some `ear.conf` settings. The settings and alterations are as follows:

- `DBReportNodeDetail`: if set to 1, `edb_create` will create two additional columns in the Periodic_metrics table for Temperature (in Celsius) and Frequency (in Hz) accounting.
- `DBReportSigDetail`: if set to 1, Signatures will have additional fields for cycles, instructions, and FLOPS1-8 counters (number of instructions by type).
- `DBMaxConnections`: restricts the maximum number of simultaneous command connections.

If any of the settings is set to 0, the table will have fewer details but the table's records will be smaller in stored size.
Any table with missing columns can be later altered by the admin to include said columns. For full details of each table's columns, run `edb_create -o` with the desired `ear.conf` settings.
There are various settings in `ear.conf` that restrict the data reported to the database, and some errors might occur if the database configuration is different from EARDB's.

- `DBReportNodeDetail`: if set to 1, node managers will report temperature, average frequency, DRAM and PCK energy to the database manager, which will try to insert them into Periodic_metrics. If Periodic_metrics does not have the columns for these metrics, an error will occur and nothing will be inserted. To solve the error, set `DBReportNodeDetail` to 0 or manually update Periodic_metrics so that it has the necessary columns.
- `DBReportSigDetail`: similarly to `DBReportNodeDetail`, an error will occur if the configuration differs from the one used when creating the database.
- `DBReportLoops`: if set to 1, application loops detected by EARL will be reported to the database, each with its corresponding Signature. Set to 0 to disable this feature. Regardless of the setting, no error should occur.

If Signatures and/or Periodic_metrics have additional columns but their respective settings are set to 0, a NULL will be set in those additional columns, which will make those rows smaller in size (but bigger than if the columns did not exist).
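The two situations described above (reporting details into a table that lacks the columns, and having the columns without reporting the details) can be reproduced with a small sketch. It uses Python's built-in `sqlite3` purely as a stand-in for the real database server, and the table and column names (`basic_metrics`, `detailed_metrics`, `temp`, `avg_f`) are hypothetical, not EAR's actual schema.

```python
import sqlite3


def try_insert(conn, table, row):
    """Insert a dict as one row; return False if the table lacks a column."""
    cols = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                tuple(row.values()),
            )
        return True
    except sqlite3.OperationalError:
        # Raised by sqlite3 when a referenced column does not exist.
        return False


conn = sqlite3.connect(":memory:")
# Table created with the detail setting at 0: no extra columns.
conn.execute("CREATE TABLE basic_metrics (node_id TEXT, energy INTEGER)")
# Table created with the detail setting at 1: extra columns exist.
conn.execute(
    "CREATE TABLE detailed_metrics "
    "(node_id TEXT, energy INTEGER, temp INTEGER, avg_f INTEGER)"
)

detailed_row = {"node_id": "n1", "energy": 100, "temp": 55, "avg_f": 2_400_000_000}
basic_row = {"node_id": "n1", "energy": 100}

# Detailed report into a table without the columns: fails, nothing inserted.
ok_missing = try_insert(conn, "basic_metrics", detailed_row)
rows_in_basic = conn.execute("SELECT COUNT(*) FROM basic_metrics").fetchone()[0]

# Basic report into a table with extra columns: succeeds, extras are NULL.
ok_null = try_insert(conn, "detailed_metrics", basic_row)
extras = conn.execute("SELECT temp, avg_f FROM detailed_metrics").fetchone()
```

Here `ok_missing` is false and the table stays empty, while `ok_null` is true and the extra columns come back as `None` (NULL). Error types and messages on the actual database server will differ from `sqlite3`'s.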
Additionally, if EAR was compiled in a system with GPUs (or with the GPU flag manually enabled), another table to store GPU data will be created.
NOTE The nomenclature is modified from MySQL's types. Any type starting with `u` is unsigned. `bigint` corresponds to an integer of 64 bits, `int` is 32 bits and `smallint` is 16 bits.

For a detailed description of each field in any of the database's tables, see [here](EAR-database-table-descriptions).
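As a quick aid for reading the type nomenclature in the note above, the following sketch computes the value range implied by a type name. The helper `type_range` is hypothetical and only covers the three types the note mentions.

```python
# Bit widths stated in the note above.
BITS = {"smallint": 16, "int": 32, "bigint": 64}


def type_range(name):
    """Return (min, max) for a type name such as 'int' or 'ubigint'."""
    unsigned = name.startswith("u")
    base = name[1:] if unsigned else name
    bits = BITS[base]  # raises KeyError for types not covered here
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


usmallint_range = type_range("usmallint")
int_range = type_range("int")
```

For example, `usmallint` spans 0 to 65535, while a signed `int` spans -2147483648 to 2147483647.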
NOTE This change only applies to the databases that have been created with the extended application signature (i.e. they have the FLOPS, instructions and cycles counters in their signatures).
Three new fields corresponding to L1, L2 and L3 cache misses have been added to the signatures.
A field in the Events table had its name changed to be more generic. One can do that with EITHER of the following commands:
Furthermore, some errors on big servers have been found due to the ids of a few fields being too small. To correct this, please run the following commands:
If GPUs are being used, also run:
Several fields have to be added in this update. To do so, run the following commands in the database's CLI client:
If no GPUs were used and none will be used, no changes are necessary.
If GPUs were being used, run the following commands in the database's CLI client:
If no GPUs were being used but they are now present, use the previous query plus the following one: