Monitor the System Event Log with ipmitool

ipmitool examples.

The Intelligent Platform Management Interface (IPMI) is a standardized protocol for managing and monitoring a system in-band or out-of-band. On Cray® CS™ blades/servers, the ipmitool utility provides a command line interface to the Baseboard Management Controller (BMC) on the motherboard. IPMI interfaces directly with the hardware, so it operates independently of the host OS. It allows a system administrator to monitor system health and manage computer events from a remote location, and it provides monitoring, logging, recovery, and inventory control through hardware and firmware, independent of the main CPU, BIOS, and OS.

IPMI functions can be performed in these different states:
  • Before the OS boots, to enable remote monitoring or BIOS setting changes.
  • When the system is powered down, but still connected to power.
  • After an OS or system failure, when in-band remote SSH login to the operating system is not available.

Functions of ipmitool

The ipmitool utility has three primary functions:
  • Read the System Event Log (SEL) generated by the BMC.
  • Interpret the Sensor Data Record (SDR) repository for system temperatures, voltages, etc.
  • Compare SDR readings to defined thresholds to see if components are operating within their specified ranges.

ipmitool service

Normally, the ipmi service is running on the host system. Even if it isn't running, the BMC is still capturing data from sensors and storing it in BMC memory.

This example shows that ipmitool cannot return any information, and the /dev/ipmi0 device file does not exist, until the ipmi service is started or restarted.

[root@prod-1 /]# ipmitool mc info
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Get Device ID command failed
[root@prod-1 /]# ls -la /dev/ipmi*
ls: cannot access /dev/ipmi*: No such file or directory
[root@prod-1 /]# chkconfig ipmi on
Note: Forwarding request to 'systemctl enable ipmi.service'.
Failed to issue method call: File exists
[root@prod-1 /]# service ipmi restart
Redirecting to /bin/systemctl start  ipmi.service
[root@prod-1 /]# ls -la /dev/ipmi*
crw------- 1 root root 246, 0 Jul 20 13:17 /dev/ipmi0
[root@prod-1 /]# ipmitool mc info
Device ID                 : 33
Device Revision           : 1
Firmware Revision         : 1.20
IPMI Version              : 2.0
Manufacturer ID           : 343
Manufacturer Name         : Intel Corporation
Product ID                : 77 (0x004d)
Product Name              : Unknown (0x4D)
Device Available          : yes
Provides Device SDRs      : no
...
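
The device check in the transcript above can also be scripted. The following is a minimal sketch in Python, assuming the three device paths named in the ipmitool error message are the ones to look for; the function name find_ipmi_device is hypothetical and not part of ipmitool.

```python
import os

# Device paths taken from the ipmitool error message shown above.
IPMI_DEVICE_PATHS = ("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")


def find_ipmi_device(paths=IPMI_DEVICE_PATHS):
    """Return the first existing IPMI device path, or None if none exists."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None


if __name__ == "__main__":
    device = find_ipmi_device()
    if device is None:
        print("No IPMI device file found; the ipmi service may not be running.")
    else:
        print(f"IPMI device file found: {device}")
```

A missing device file only suggests that the ipmi service is not running; the BMC itself keeps collecting sensor data either way.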

Remote system IP address

The -H option must be included on the command line in order to use the LAN interface (lanplus) option. The lanplus option provides communication to the BMC over an Ethernet LAN connection. The -H option takes the IP address assigned to the BMC on the remote system, which is not the same as the remote system's own IP address. The BMC shares an Ethernet port with the host, in most cases the eth0 port.

For most systems, the third octet of the eth0 address is replaced with 128 and that address is assigned as the BMC address. For example:
10.4.0.14   - eth0 (Management Network on remote system)
10.4.128.14 - BMC IP address on remote system
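
The third-octet convention can be expressed as a small helper. This is a sketch only, assuming the site follows the "replace the third octet with 128" rule described above; the function name bmc_address_from_eth0 is hypothetical, and the address reported by the BMC itself remains the authoritative value.

```python
import ipaddress


def bmc_address_from_eth0(eth0_address, bmc_octet=128):
    """Replace the third octet of an IPv4 address with bmc_octet."""
    # Validate the input; raises ValueError if it is not an IPv4 address.
    octets = str(ipaddress.IPv4Address(eth0_address)).split(".")
    octets[2] = str(bmc_octet)
    return ".".join(octets)


print(bmc_address_from_eth0("10.4.0.14"))  # prints 10.4.128.14
```
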
To find the BMC IP address, log in to the remote system and run this ipmitool command:
[root@mgmt1 ~]# ssh root@prod-0008 'ipmitool lan print 1'
Set in Progress         : Set Complete
Auth Type Support       : MD5 PASSWORD
Auth Type Enable        : Callback : MD5 PASSWORD
                        : User     : MD5 PASSWORD
                        : Operator : MD5 PASSWORD
                        : Admin    : MD5 PASSWORD
                        : OEM      :
IP Address Source       : Static Address
IP Address              : 10.10.128.14
Subnet Mask             : 255.255.0.0
MAC Address             : 00:1e:67:67:06:62
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
...

User access privileges

Channel privilege limits determine the maximum privileges that each user can have on a given channel. Privilege levels can be displayed for channel 1 (typically eth0 for BMC traffic), using the user list command:

[root@mgmt1 /]# ipmitool user list 1
ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
1                    true    false      true       ADMINISTRATOR
2   root             false   true       true       ADMINISTRATOR
3   test1            false   false      true       ADMINISTRATOR
4   ace              true    false      true       OPERATOR
5   test3            false   false      true       ADMINISTRATOR
[root@mgmt1 /]#

There are several privilege levels, but the two levels typically used are operator and administrator.

OPERATOR (level 3)
All BMC commands allowed, except for configuration commands. Operator privilege does not allow disabling of channels or changing user access privileges.
ADMINISTRATOR (level 4)
All BMC commands are allowed, including configuration commands.

The channel setaccess command is used to set/change privileges. Refer to the following example.

Create a new IPMI user

  1. First, list the current users:
    [root@prod-0001 ~]# ipmitool user list 1
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      true       ADMINISTRATOR
    2   root             false   true       true       ADMINISTRATOR
    3   test1            false   false      true       ADMINISTRATOR
    4   ace              true    false      true       OPERATOR
    [root@prod-0001 ~]#
    
  2. Set the name (admin) and password (admin) for the new user. The new admin user is assigned user ID 5.
    [root@prod-0001 ~]# ipmitool user set name 5 admin
    [root@prod-0001 ~]# ipmitool user set password 5 admin
    [root@prod-0001 ~]# ipmitool user list 1
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      true       ADMINISTRATOR
    2   root             false   true       true       ADMINISTRATOR
    3   test1            false   false      true       ADMINISTRATOR
    4   ace              true    false      true       OPERATOR
    5   admin            true    false      false      NO ACCESS
    
  3. Assign privileges to the new user (ADMINISTRATOR, level=4 for this example).

    IPMI Msg must be true (ipmi=on) to enable remote access to this system using this new admin user.

    channel setaccess <channel number> <userid> [<callin=on|off>] [<link=on|off>]  
                      [<ipmi=on|off>] [<privilege=level>] 
    [root@prod-0001 ~]# ipmitool channel setaccess 1 5 link=on ipmi=on privilege=4
    [root@prod-0001 ~]# ipmitool user list 1
    ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
    1                    true    false      true       ADMINISTRATOR
    2   root             false   true       true       ADMINISTRATOR
    3   test1            false   false      true       ADMINISTRATOR
    4   ace              true    false      true       OPERATOR
    5   admin            true    true       true       ADMINISTRATOR
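
The three steps above can be collected into one helper that builds the command lines without running them. This is a minimal sketch in Python; the function name new_user_commands is hypothetical, and the defaults (channel 1, privilege level 4) simply mirror the example above.

```python
def new_user_commands(user_id, name, password, channel=1, privilege=4):
    """Return the ipmitool argument lists that create and enable a user."""
    uid = str(user_id)
    return [
        # Step 2: set the user name and password for the chosen user ID.
        ["ipmitool", "user", "set", "name", uid, name],
        ["ipmitool", "user", "set", "password", uid, password],
        # Step 3: enable IPMI messaging and assign the privilege level.
        ["ipmitool", "channel", "setaccess", str(channel), uid,
         "link=on", "ipmi=on", f"privilege={privilege}"],
    ]


for argv in new_user_commands(5, "admin", "admin"):
    print(" ".join(argv))
```

Each argument list can be passed to subprocess.run() on the target system once the printed commands have been reviewed.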

Run ipmitool on a Remote System

The examples below all use remote access techniques. However, removing the -H, -I, -U, and -P parameters and their associated values from the command string enables the same command to be run on the local system. ipmitool can also be run through an SSH session to the remote system.

ipmitool -H <remote BMC IP> -I <LAN interface> -U <username> -P <password>
         -L <privilege> <commands/options>

-I lanplus: To connect to a remote BMC over the LAN interface using -I, a username (-U) and password (-P) must be supplied.

-L : The privilege option defaults to administrator privilege and can be omitted if the user (-U) on the remote system has administrator privileges.

-P : The password option is optional on the command line. If it is omitted, ipmitool prompts for the password before the command executes, as shown in some examples below.
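
The option rules above can be captured in a small command builder. This is a sketch under the assumptions stated in this section (lanplus interface, optional -P and -L); the function name remote_ipmitool_argv is hypothetical and only assembles the argument list.

```python
def remote_ipmitool_argv(bmc_ip, user, command, password=None,
                         privilege=None, interface="lanplus"):
    """Build an ipmitool argument list for a remote BMC."""
    argv = ["ipmitool", "-H", bmc_ip, "-I", interface, "-U", user]
    if password is not None:
        # Without -P, ipmitool prompts for the password interactively.
        argv += ["-P", password]
    if privilege is not None:
        # Without -L, ipmitool requests administrator privilege.
        argv += ["-L", privilege]
    return argv + list(command)


print(" ".join(remote_ipmitool_argv("10.4.128.2", "ace", ["power", "status"],
                                    password="ace", privilege="operator")))
```

The returned list can be handed to subprocess.run(argv, capture_output=True, text=True) when a password is supplied; with no password, run it in a terminal so the prompt can be answered.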

Display events in the SEL

[root@mgmt2 ~]# ipmitool -Ilanplus -H10.10.128.3 -Uadmin -Padmin sel elist
   1 | 03/11/2016 | 10:16:51 | Event Logging Disabled System Event Log | Log area reset | Asserted
   2 | 03/11/2016 | 10:25:02 | Power Unit Pwr Unit Status | Power off/down | Asserted
   3 | 03/11/2016 | 10:26:32 | Power Unit Pwr Unit Status | Power off/down | Deasserted
   4 | 03/11/2016 | 10:26:40 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   5 | 03/11/2016 | 10:26:40 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   6 | 03/11/2016 | 10:27:11 | System Event BIOS Evt Sensor | OEM System boot event | Asserted
   7 | 03/11/2016 | 10:30:48 | Power Unit Pwr Unit Status | Power off/down | Asserted
   8 | 03/11/2016 | 10:32:03 | Power Unit Pwr Unit Status | Power off/down | Deasserted
   9 | 03/11/2016 | 10:32:11 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   a | 03/11/2016 | 10:32:11 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   b | 03/11/2016 | 10:32:41 | Power Unit Pwr Unit Status | Power off/down | Asserted
   c | 03/11/2016 | 10:33:42 | Power Unit Pwr Unit Status | Power off/down | Deasserted
   d | 03/11/2016 | 10:33:50 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   e | 03/11/2016 | 10:33:51 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   f | 03/11/2016 | 10:34:22 | System Event BIOS Evt Sensor | OEM System boot event | Asserted
  10 | 04/06/2016 | 23:12:10 | Power Unit Pwr Unit Status | Power off/down | Asserted
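
For monitoring, the sel elist output can be split into fields and filtered. The following Python sketch assumes every event line has the six pipe-separated fields shown above and that record IDs are hexadecimal (the listing counts 9, a, b, ... f, 10); the function name parse_sel_elist and the dictionary keys are hypothetical. Lines with a different layout are skipped.

```python
SAMPLE = """\
   9 | 03/11/2016 | 10:32:11 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   a | 03/11/2016 | 10:32:11 | System Event BIOS Evt Sensor | Timestamp Clock Sync | Asserted
   b | 03/11/2016 | 10:32:41 | Power Unit Pwr Unit Status | Power off/down | Asserted
  10 | 04/06/2016 | 23:12:10 | Power Unit Pwr Unit Status | Power off/down | Asserted
"""


def parse_sel_elist(text):
    """Parse `ipmitool sel elist` output into a list of dictionaries."""
    events = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6:
            continue  # skip prompts, blank lines, and other layouts
        record_id, date, time, sensor, description, direction = fields
        try:
            number = int(record_id, 16)  # record IDs are printed in hex
        except ValueError:
            continue
        events.append({"id": number, "date": date, "time": time,
                       "sensor": sensor, "description": description,
                       "direction": direction})
    return events


events = parse_sel_elist(SAMPLE)
power_off = [e for e in events
             if e["description"] == "Power off/down"
             and e["direction"] == "Asserted"]
print(f"{len(power_off)} power-off events out of {len(events)} records")
```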

Check Remote Power Status

[root@mgmt1 /]# ipmitool -H 10.4.128.2 -Ilanplus -Uace -Pace -Loperator power status
Chassis Power is on
[root@mgmt1 /]#

Check Server/Blade Status

[root@mgmt1 /]# ipmitool -H 10.4.128.2 -Ilanplus -Uadmin -Padmin -a chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Sleep Button Disable : not allowed
Diag Button Disable  : allowed
Reset Button Disable : allowed
Power Button Disable : allowed
Sleep Button Disabled: false
Diag Button Disabled : false
Reset Button Disabled: false
Power Button Disabled: false
[root@mgmt1 /]#
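
The chassis status output is a list of "name : value" pairs, so it is easy to turn into a dictionary and scan for faults. This is a minimal sketch in Python; the function names parse_chassis_status and active_faults are hypothetical, and treating any field whose name ends in "Fault" with the value "true" as a fault is an assumption based on the output above.

```python
SAMPLE = """\
System Power         : on
Power Overload       : false
Main Power Fault     : false
Power Control Fault  : false
Last Power Event     :
Drive Fault          : false
Cooling/Fan Fault    : false
"""


def parse_chassis_status(text):
    """Parse `ipmitool chassis status` output into a dictionary."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name : value" line (for example, a prompt)
        status[key.strip()] = value.strip()
    return status


def active_faults(status):
    """Return the names of fault fields that are reported as true."""
    return [name for name, value in status.items()
            if name.endswith("Fault") and value == "true"]


status = parse_chassis_status(SAMPLE)
print("System power:", status["System Power"])
print("Active faults:", active_faults(status) or "none")
```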

Display complete sensor data

[root@mgmt2 ~]# ipmitool -Ilanplus -H10.10.128.3 -Uadmin -Padmin sensor list
Pwr Unit Status  | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
IPMI Watchdog    | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
SMI TimeOut      | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
System Event Log | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
System Event     | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
Button           | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
VR Watchdog      | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
SSB Therm Trip   | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
IO Mod Presence  | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
SAS Mod Presence | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
BMC Health       | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
BB Inlet Temp    | 19.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 60.000   | 65.000   | na
SSB Temp         | 44.000   | degrees C  | ok    | na    | 2.000 | 5.000    | 98.000   | 103.000  | na
BB BMC Temp      | 28.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 85.000   | 90.000   | na
P1 VR Temp       | 21.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 110.000  | 115.000  | na
IB Temp          | 28.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 110.000  | 115.000  | na
LAN NIC Temp     | 40.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 115.000  | 120.000  | na
P1 Status        | 0x0      | discrete   | 0x8000| na    | na    | na       | na       | na       | na
P2 Status        | 0x0      | discrete   | 0x8000| na    | na    | na       | na       | na       | na
P1 Therm Margin  | -77.000  | degrees C  | ok    | na    | na    | na       | na       | na       | na
P2 Therm Margin  | -78.000  | degrees C  | ok    | na    | na    | na       | na       | na       | na
P1 Therm Ctrl %  | 0.000    | percent    | ok    | na    | na    | na       | 30.000   | 50.000   | na
P2 Therm Ctrl %  | 0.000    | percent    | ok    | na    | na    | na       | 30.000   | 50.000   | na
...
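
Comparing readings to thresholds, the third ipmitool function listed earlier, can be sketched from this output. The code below assumes the ten columns are name, reading, unit, status, and then the lower non-recoverable, lower critical, lower non-critical, upper non-critical, upper critical, and upper non-recoverable thresholds; the function names and column keys are hypothetical. Discrete sensors and "na" thresholds are ignored.

```python
COLUMNS = ("name", "reading", "unit", "status",
           "lnr", "lcr", "lnc", "unc", "ucr", "unr")

SAMPLE = """\
Pwr Unit Status  | 0x0      | discrete   | 0x0000| na    | na    | na       | na       | na       | na
BB Inlet Temp    | 19.000   | degrees C  | ok    | na    | 5.000 | 10.000   | 60.000   | 65.000   | na
P1 Therm Ctrl %  | 0.000    | percent    | ok    | na    | na    | na       | 30.000   | 50.000   | na
"""


def parse_sensor_list(text):
    """Parse `ipmitool sensor list` output into a list of dictionaries."""
    sensors = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == len(COLUMNS):
            sensors.append(dict(zip(COLUMNS, fields)))
    return sensors


def to_float(value):
    """Return value as a float, or None for 'na' and discrete readings."""
    try:
        return float(value)
    except ValueError:
        return None


def outside_noncritical(sensor):
    """True if a numeric reading is beyond a non-critical threshold."""
    reading = to_float(sensor["reading"])
    if reading is None:
        return False
    low, high = to_float(sensor["lnc"]), to_float(sensor["unc"])
    return ((low is not None and reading < low)
            or (high is not None and reading > high))


flagged = [s["name"] for s in parse_sensor_list(SAMPLE)
           if outside_noncritical(s)]
print("Sensors outside non-critical thresholds:", flagged or "none")
```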

Display general information about all sensors (sensor data record)

If the elist parameter is used, the output also includes the sensor number (2nd column), the entity ID and instance (4th column), and the reading or asserted discrete states (5th column).

[root@mgmt1 /]# ipmitool -H 10.4.128.2 -I lanplus -U ace -L OPERATOR -a sdr elist
Password:
Pwr Unit Status  | 01h | ok  | 21.1 |
Pwr Unit Redund  | 02h | ok  | 21.1 | Fully Redundant
IPMI Watchdog    | 03h | ok  |  7.1 |
Physical Scrty   | 04h | ok  | 23.1 |
FP NMI Diag Int  | 05h | ok  | 12.1 |
SMI Timeout      | 06h | ok  |  7.1 |
System Event Log | 07h | ok  |  7.1 |
System Event     | 08h | ok  |  7.1 |
Button           | 09h | ok  |  7.1 |
VR Watchdog      | 0Bh | ok  |  7.1 |
Fan Redundancy   | 0Ch | ok  | 29.1 | Fully Redundant
BMC FW Health    | 10h | ok  |  7.1 |
System Airflow   | 11h | ok  | 23.1 | 30 CFM
BB P1 VR Temp    | 20h | ok  |  7.1 | 19 degrees C
Front Panel Temp | 21h | ok  | 12.1 | 13 degrees C
BB P2 VR Temp    | 23h | ok  |  7.1 | 19 degrees C
BB Vtt 2 Temp    | 24h | ok  |  7.1 | 25 degrees C
BB Vtt 1 Temp    | 25h | ok  |  7.1 | 18 degrees C
HSBP 1 Temp      | 29h | ok  |  7.1 | 26 degrees C
Exit Air Temp    | 2Eh | ok  |  7.1 | 22 degrees C
System Fan 1     | 30h | ok  | 29.1 | 3136 RPM
...

Display specific SDR type (type: temp/current/fan/voltage/button/power supply)

[root@mgmt1 /]# ipmitool -H 10.4.128.2 -I lanplus -Uace -Loperator -a sdr type fan
Password:
Fan Redundancy   | 0Ch | ok  | 29.1 | Fully Redundant
System Fan 1     | 30h | ok  | 29.1 | 3136 RPM
System Fan 2     | 32h | ok  | 29.2 | 3136 RPM
System Fan 3     | 34h | ok  | 29.3 | 3136 RPM
System Fan 4     | 36h | ok  | 29.4 | 3087 RPM
System Fan 5     | 38h | ok  | 29.5 | 3136 RPM
Fan 1 Present    | 40h | ok  | 29.1 | Device Present
Fan 2 Present    | 41h | ok  | 29.2 | Device Present
Fan 3 Present    | 42h | ok  | 29.3 | Device Present
Fan 4 Present    | 43h | ok  | 29.4 | Device Present
Fan 5 Present    | 44h | ok  | 29.5 | Device Present
PS1 Fan Fail     | A0h | ok  | 10.1 |
PS2 Fan Fail     | A4h | ok  | 10.2 |
[root@mgmt1 /]# 
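
The five-column sdr output lends itself to simple checks such as collecting fan speeds. This Python sketch assumes the column layout shown above (name, sensor number with an "h" suffix, status, entity, reading); the function names parse_sdr_elist and fan_speeds are hypothetical.

```python
SAMPLE = """\
Fan Redundancy   | 0Ch | ok  | 29.1 | Fully Redundant
System Fan 1     | 30h | ok  | 29.1 | 3136 RPM
System Fan 4     | 36h | ok  | 29.4 | 3087 RPM
PS1 Fan Fail     | A0h | ok  | 10.1 |
"""


def parse_sdr_elist(text):
    """Parse `ipmitool sdr elist` or `sdr type` output into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # skip prompts and lines in other layouts
        name, number, status, entity, reading = fields
        try:
            sensor_number = int(number.rstrip("h"), 16)  # e.g. "0Ch" is 12
        except ValueError:
            continue
        rows.append({"name": name, "number": sensor_number,
                     "status": status, "entity": entity, "reading": reading})
    return rows


def fan_speeds(rows):
    """Return {sensor name: speed in RPM} for rows that report RPM."""
    speeds = {}
    for row in rows:
        value, _, unit = row["reading"].partition(" ")
        if unit == "RPM":
            speeds[row["name"]] = float(value)
    return speeds


for name, rpm in fan_speeds(parse_sdr_elist(SAMPLE)).items():
    print(f"{name}: {rpm:.0f} RPM")
```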

Turn On Chassis ID LED

[root@mgmt1 /]# ipmitool chassis identify force    # Blink the blade/server ID LED indefinitely
Chassis identify interval: indefinite
[root@mgmt1 /]# ipmitool chassis identify 0    # Turn off the ID LED
Chassis identify interval: off

Display FRUs

[root@mgmt1 /]# ipmitool -H 10.4.128.2 -I lanplus -Uace -Pace -Loperator -a fru
FRU Device Description : Builtin FRU Device (ID 0)
 Chassis Type          : Rack Mount Chassis
 Chassis Part Number   : SERVER-2628X-MN
 Chassis Serial        : QSGR42103641
 Chassis Extra         : Cray Inc.
 Chassis Extra         : ................
 Board Mfg Date        : Mon May 26 14:36:00 2014
 Board Mfg             : Intel Corporation
 Board Product         : S2600GZ
 Board Serial          : QSGR42103641
 Board Part Number     : G11481-354
 Product Manufacturer  : Intel Corporation
 Product Name          : S2600GZ
 Product Part Number   : SERVER-2628X-MN
 Product Version       : ....................
 Product Serial        : 101149000
 Product Asset Tag     : ....................

FRU Device Description : Pwr Supply 1 FRU (ID 2)
 Product Manufacturer  : DELTA
 Product Name          : DPS-750XB A
 Product Part Number   : E98791-010
 Product Version       : 05
 Product Serial        : E98791D1417080061
...
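
For inventory purposes, the fru output can be grouped by device. The sketch below assumes each device section starts with a "FRU Device Description" line, as in the output above; the function name parse_fru is hypothetical. Values are stored in lists because a field such as Chassis Extra can appear more than once.

```python
SAMPLE = """\
FRU Device Description : Builtin FRU Device (ID 0)
 Chassis Type          : Rack Mount Chassis
 Chassis Extra         : Cray Inc.
 Board Mfg Date        : Mon May 26 14:36:00 2014
 Board Product         : S2600GZ

FRU Device Description : Pwr Supply 1 FRU (ID 2)
 Product Manufacturer  : DELTA
 Product Name          : DPS-750XB A
"""


def parse_fru(text):
    """Parse `ipmitool fru` output into {device: {field: [values]}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "FRU Device Description":
            current = devices.setdefault(value, {})
        elif current is not None:
            current.setdefault(key, []).append(value)
    return devices


for device, fields in parse_fru(SAMPLE).items():
    print(device, "->", sorted(fields))
```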