Mellanox InfiniBand Firmware
Identify and upgrade Mellanox IB firmware.
Mellanox Firmware ID Numbers (PSID)
Firmware for Mellanox products is listed in the following table. Identify the Mellanox product by its part number and PSID (Parameter-Set ID, the firmware identification number). Firmware images are built for specific PSIDs, so the PSID of the firmware image must match the PSID of the HCA. Information on finding the PSID is provided in the next section.
Links to firmware packages on the internal CrayPort site are provided in the table. Other firmware can be downloaded from the Mellanox Firmware Downloads page: http://www.mellanox.com/page/firmware_download
If the PSID is not listed in the table, create a General Inquiry case in the CrayPort portal. Include the PSID, current firmware version, and part number.
Mellanox Product | PSID | FW Version | Image Type | Description | Mfg Model | Cray Part |
IB Switch | MT_1010310021 | 9.3.6000 | FS2-SwitchX | 36 Port FDR | MSX6025F-1SFS | 101032600 |
IB Switch | MT_1010210021 | 9.3.6000 | FS2-SwitchX | 36 Port FDR | MSX6025F-1SFR | 167-00453A |
IB Switch | MT_1010210026 | 9.3.6000 | FS2-SwitchX | 36 Port FDR-10 | MSX6025T-1SFR | 101182700 |
IB Switch | MT_0F90110002 | 7.4.3000 | Infiniscale IV | 18 Port QDR | MIS5023Q | CS-XQB4-MLNXIS502318 |
IB Switch | MT_0D00110003 | 7.4.3000 | Infiniscale IV | 36 Port QDR | MIS5025Q-1BFC | 167-00347A |
IB Switch | MT_1880110032 | 11.204.0124 | SwitchIB | 36 Port EDR | MSB7790-ES2F | 101278500 |
IB Switch Managed | MT_1010310020 | 9.3.5080 | FS2-SwitchX | 36 Port FDR | MSX6036 | 100911600 |
IB Switch Managed | | 9.3.5080 | FS2-SwitchX | 36 Port FDR | MSX6710 | CEMLNX-MSX6710 |
IB Switch Managed | | 9.3.5080 | FS2-SwitchX | 36 Port DDR | MSX1710 | 101344000 |
IB Switch Managed | | 9.3.5080 | FS2-SwitchX | 216 Port FDR | MSX6512 | 100990400 |
IB Switch Managed | | 9.3.5080 | FS2-SwitchX | 324 Port FDR | MSX6518 | 100882100 |
HCA Card | MT_0D90110009 | 2.9.1000 | ConnectX2 | QDR | MHQH19B-XTR | 132-00121A |
HCA Card | MT_0FC0110009 | 2.9.1000 | ConnectX2 | QDR | MHQH29CXSR/XTR | 100818100 |
HCA Card | MT_1060110018 | 2.36.5000 | ConnectX3 | QDR | MCX353A-QCBT | 132-00137B |
HCA Card | MT_1100120019 | 2.36.5000 | ConnectX3 | FDR | MCX353A-FCBT | 132-00145A |
HCA Card | MT_1230110019 | 10.14.2036 | ConnectIB | FDR | MCB191A-FCAT | 101082500 |
HCA Card | MT_1240110019 | 10.14.2036 | ConnectIB | FDR | MCB192A-FCAT | 101320300 |
HCA Card | MT_1220110019 | 10.14.2036 | ConnectIB | FDR | MCB193A-FCAT | 132-00158A |
HCA Card | MT_1210110019 | 10.14.2036 | ConnectIB | FDR | MCB194A-FCAT | 132-00159A |
HCA Card | MT_1090120019 | 2.36.5000 | ConnectX3 | FDR | MCX354AFCB_A2-A5 | 100882000 |
HCA Card | MT_2180110032 | 12.12.1100 | ConnectX4 | EDR | MCX455A-ECAT | 101268500 |
HCA Card | MT_2190110032 | 12.12.1100 | ConnectX4 | EDR | MCX456A-ECAT | 101278400 |
Onboard | INCX-3I358C10551 | 2.36.5000 | ConnectX3 | FDR | S2600JFF (MT4099) | |
Onboard | INCX-3I358C10501 | 2.36.5000 | ConnectX3 | QDR | S2600JFQ (MT4099) | |
Onboard | INCX-3I358E10201 | 2.36.5000 | ConnectX3 | QDR | S2600WPQ (MT4099) | |
Onboard | INCX-3I358E10251 | 2.36.5000 | ConnectX3 | FDR | S2600WPF (MT4099) | |
Onboard | INCX-3I355920151 | 2.36.5000 | ConnectX3 | FDR-14 | S2600GZF (MT4099) | |
Onboard | INCX-3I355922151 | 2.36.5000 | ConnectX3 | FDR-14 | S2600GZF (MT4099) | |
Onboard | INT0030100001 | 10.12.0780 | ConnectX3 | FDR | S2600KPF (MT4113) | |
Onboard | INT0040100001 | 10.12.0780 | ConnectX3 | FDR | S2600TPF (MT4113) | |
IB Switch Managed | | 3.4.3002 | FS2-SwitchX | 108-Port FDR | MSX6506 | 100902700 |
IB Switch Managed | MT_0D00110012 | 7.4.2360 | Infiniscale IV | 36 Port QDR | MIS5030Q-1SFC | 167-0348A |
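The PSID lookup that the table supports can be sketched in Python. This is a minimal illustration under stated assumptions, not a Cray or Mellanox tool: the dictionary holds only a handful of rows copied from the table, and the names `TARGET_FW` and `target_firmware` are hypothetical.

```python
# A few PSID -> firmware version pairs copied from the table above.
# The table is not reproduced in full; extend the dictionary as needed.
TARGET_FW = {
    "MT_1010310021": "9.3.6000",      # MSX6025F-1SFS, 36 Port FDR switch
    "MT_1880110032": "11.204.0124",   # MSB7790-ES2F, 36 Port EDR switch
    "MT_1100120019": "2.36.5000",     # MCX353A-FCBT, ConnectX3 FDR HCA
    "MT_2180110032": "12.12.1100",    # MCX455A-ECAT, ConnectX4 EDR HCA
    "INCX-3I358C10551": "2.36.5000",  # S2600JFF onboard ConnectX3
}


def target_firmware(psid):
    """Return the table's firmware version for a PSID, or None if unlisted."""
    return TARGET_FW.get(psid.strip())


if __name__ == "__main__":
    # "MT_0000000000" is a made-up PSID used only to show the unlisted case.
    for psid in ("INCX-3I358C10551", "MT_0000000000"):
        version = target_firmware(psid)
        if version is None:
            print(f"{psid}: not listed, create a General Inquiry case")
        else:
            print(f"{psid}: table firmware version {version}")
```

Returning `None` for an unlisted PSID keeps the "not in the table" case explicit, which matches the instruction to open a case rather than guess at a firmware image.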
Mellanox Firmware for Switches and Cards
- Start the MST driver set and list mst devices:
[root@prod-7 ~]# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
[root@prod-7 ~]#
- Display device status:
[root@prod-7 /]# mst status -v
MST modules:
------------
MST PCI module loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE         MST                         PCI       RDMA     NET       NUMA
ConnectX3(rev:1)    /dev/mst/mt4099_pciconf0
ConnectX3(rev:1)    /dev/mst/mt4099_pci_cr0     01:00.0   mlx4_0   net-ib0   0
[root@prod-7 /]#
- Display firmware version and PSID (firmware identification number) for the device (q = query):
[root@prod-7 /]# flint -d 01:00.0 q    # or: flint -d /dev/mst/mt4099_pci_cr0 query
Image type:      FS2
FW Version:      2.11.1308
Device ID:       4099
Description:     Node             Port1            Port2            Sys image
GUIDs:           001e670300670c2c 001e670300670c2d 001e670300670c2e 001e670300670c2f
MACs:                             001e67670c2d     001e67670c2e
VSD:             n/a
PSID:            INCX-3I358C10551
[root@prod-7 /]#
- Compare the firmware version against the version listed for the PSID in the Mellanox PSID Firmware table.
If the PSID is not listed in the table, submit a General Inquiry case in the CrayPort portal.
- Download the firmware zip package to the /tmp directory on the management node. Then unzip the files.
If the compute nodes are not able to access the /tmp directory on the management node, the firmware files must be copied to the compute nodes.
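The comparison step above can be sketched in Python as follows. The sketch assumes the `flint` query output has already been captured as text; the sample string is abbreviated from the query shown earlier, and the function names are hypothetical.

```python
import re

# Abbreviated sample taken from the flint query output shown in the steps above.
SAMPLE_QUERY = """\
Image type:      FS2
FW Version:      2.11.1308
Device ID:       4099
VSD:             n/a
PSID:            INCX-3I358C10551
"""


def parse_flint_query(text):
    """Extract the 'FW Version' and 'PSID' fields from flint query output."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(FW Version|PSID):\s*(\S+)", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def version_tuple(version):
    """Turn '2.36.5000' into (2, 36, 5000) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(current, target):
    """True when the installed version is older than the version in the table."""
    return version_tuple(current) < version_tuple(target)


if __name__ == "__main__":
    info = parse_flint_query(SAMPLE_QUERY)
    # 2.36.5000 is the table entry for PSID INCX-3I358C10551.
    print(info["PSID"], info["FW Version"],
          needs_upgrade(info["FW Version"], "2.36.5000"))
```

The versions are compared as integer tuples because a plain string comparison would rank 2.9.1000 above 2.36.5000.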
Burn the Binary Firmware Image to Flash Memory
- flint and mstflint
- flint is included in the Mellanox Firmware Tools (MFT) package. It is installed by the mft RPM, which ships only with MLNX OFED.
- Use the flint utility to burn the binary firmware image to the Mellanox device:
The -y parameter forces the mode to noninteractive and presupposes a "yes" when prompted.
flint -y -d <MST_device_name|bus #:device.function> -i <firmware-binary> burn
[root@prod-7 tmp]# flint -y -d 01:00.0 -i fw-ConnectX3-rel-2_36_5000-ConnectX3-A1-JFP-FDR.bin burn
Current FW version on flash: 2.11.1308
New FW version: 2.36.5000
Burning FS2 FW image without signatures - OK
Restoring signature - OK
[root@prod-7 tmp]#
- Reboot the system.
- Verify the new firmware version.
[root@prod-7 ~]# flint -d 01:00.0 q
Image type:      FS2
FW Version:      2.36.5000
FW Release Date: 26.1.2016
Product Version: 02.36.50.00
Device ID:       4099
Description:     Node             Port1            Port2            Sys image
GUIDs:           001e670300670c2c 001e670300670c2d 001e670300670c2e 001e670300670c2f
MACs:                             001e67670c2d     001e67670c2e
VSD:             n/a
PSID:            INCX-3I358C10551
[root@prod-7 ~]#
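When the burn has to be repeated on several nodes, it can help to assemble the command lines in a script and review them before anything is run. The following Python sketch only builds and prints the `flint` commands shown above; it does not execute them, and the helper names are hypothetical.

```python
import shlex


def build_burn_command(device, image, assume_yes=True):
    """Build the argument list for 'flint ... burn' without running it."""
    args = ["flint"]
    if assume_yes:
        args.append("-y")  # noninteractive: presupposes "yes" when prompted
    args += ["-d", device, "-i", image, "burn"]
    return args


def build_query_command(device):
    """Build the argument list for the follow-up 'flint ... q' check."""
    return ["flint", "-d", device, "q"]


if __name__ == "__main__":
    burn = build_burn_command(
        "01:00.0", "fw-ConnectX3-rel-2_36_5000-ConnectX3-A1-JFP-FDR.bin")
    # Print the commands for review. Running them (for example with
    # subprocess.run(burn, check=True)) is left to the operator.
    print(shlex.join(burn))
    print(shlex.join(build_query_command("01:00.0")))
```

Keeping the arguments as a list rather than one string avoids shell quoting problems if a firmware file name contains unusual characters.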
Display Information About the Local HCA
nid00009:~ # ibv_devinfo |grep fw_ver
fw_ver: 2.11.500
nid00009:~ #
Identify All HCAs in the Fabric
[root@mgmt1 ~]# ibhosts
Ca : 0x001e6703003e35f7 ports 1 "snx11022n003 HCA-1"
Ca : 0x0050cc03007926d7 ports 2 "snx11022n005 HCA-1"
Ca : 0x001e6703003e41b7 ports 1 "snx11022n001 HCA-1"
Ca : 0x001e67030047ecba ports 1 "snx11022n000 HCA-1"
Ca : 0x001e6703003e1b17 ports 1 "snx11022n002 HCA-1"
Ca : 0x0050cc0300798200 ports 2 "snx11022n004 HCA-1"
Ca : 0x0002c90300047cf8 ports 2 "lake-cmc mlx4_0"
Ca : 0xf4521403003446c0 ports 1 "blue-0004 HCA-1"
Ca : 0x001e670300670c2c ports 1 "green-0004 HCA-1"
Ca : 0x001e670300670624 ports 1 "blue-0003 HCA-1"
Ca : 0x001e670300670824 ports 1 "green-0003 HCA-1"
Ca : 0x001e670300670bec ports 1 "blue-0002 HCA-1"
Ca : 0x001e67030066da44 ports 1 "green-0002 HCA-1"
Ca : 0x001e67030066d9d4 ports 1 "blue-0001 HCA-1"
Ca : 0x001e670300670664 ports 1 "green-0001 HCA-1"
Ca : 0xf452140300452400 ports 1 "leaf HCA-3"
Ca : 0xf4521403003446e0 ports 1 "mgmt2 HCA-1"
Ca : 0xf4521403004524c0 ports 1 "mgmt1 HCA-1"
[root@mgmt1 ~]#
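To build a list of nodes whose HCA firmware should be checked, the `ibhosts` listing can be turned into records. The Python sketch below assumes the line layout shown in the output above; the sample contains three of those lines, and `parse_ibhosts` is a hypothetical helper name.

```python
import re

# Three lines copied from the ibhosts output above.
SAMPLE_IBHOSTS = '''\
Ca : 0x001e6703003e35f7 ports 1 "snx11022n003 HCA-1"
Ca : 0x0050cc03007926d7 ports 2 "snx11022n005 HCA-1"
Ca : 0x0002c90300047cf8 ports 2 "lake-cmc mlx4_0"
'''

# Node GUID, port count, and the quoted node description.
CA_LINE = re.compile(r'^Ca\s*:\s*(0x[0-9a-fA-F]+)\s+ports\s+(\d+)\s+"(.*)"\s*$')


def parse_ibhosts(text):
    """Return a list of (node_guid, port_count, description) tuples."""
    hosts = []
    for line in text.splitlines():
        match = CA_LINE.match(line.strip())
        if match:
            guid, ports, description = match.groups()
            hosts.append((guid, int(ports), description))
    return hosts


if __name__ == "__main__":
    for guid, ports, description in parse_ibhosts(SAMPLE_IBHOSTS):
        print(f"{description:24s} {guid} ports={ports}")
```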
Identify All Switches in the Fabric
[root@mgmt1 ~]# ibswitches
Switch : 0x0002c90200430780 ports 36 "MF0;switch-11a11a:IS5035/U1" enhanced port 0 lid 17 lmc 0
Switch : 0x0002c9020042b380 ports 36 "MF0;switch-119ba2:IS5035/U1" enhanced port 0 lid 18 lmc 0
Switch : 0x0002c90200450318 ports 36 "MF0;ib-switch-1:IS5035/U1" enhanced port 0 lid 11 lmc 0
Switch : 0xf45214030089daa0 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 22 lmc 0
[root@mgmt1 ~]#
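The `ibswitches` listing also carries each switch's LID, which is convenient when a switch has to be addressed individually. The Python sketch below assumes the line layout shown in the output above and sorts the switches by LID; the `Switch` record and `parse_ibswitches` are hypothetical names.

```python
import re
from typing import NamedTuple

# Three lines copied from the ibswitches output above.
SAMPLE_IBSWITCHES = '''\
Switch : 0x0002c90200430780 ports 36 "MF0;switch-11a11a:IS5035/U1" enhanced port 0 lid 17 lmc 0
Switch : 0x0002c90200450318 ports 36 "MF0;ib-switch-1:IS5035/U1" enhanced port 0 lid 11 lmc 0
Switch : 0xf45214030089daa0 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 22 lmc 0
'''


class Switch(NamedTuple):
    guid: str
    ports: int
    description: str
    lid: int


SWITCH_LINE = re.compile(
    r'^Switch\s*:\s*(0x[0-9a-fA-F]+)\s+ports\s+(\d+)\s+"(.*)"'
    r'\s+(?:enhanced|base)\s+port\s+\d+\s+lid\s+(\d+)\s+lmc\s+\d+\s*$'
)


def parse_ibswitches(text):
    """Return Switch records parsed from ibswitches output, sorted by LID."""
    switches = []
    for line in text.splitlines():
        match = SWITCH_LINE.match(line.strip())
        if match:
            guid, ports, description, lid = match.groups()
            switches.append(Switch(guid, int(ports), description, int(lid)))
    return sorted(switches, key=lambda switch: switch.lid)


if __name__ == "__main__":
    for switch in parse_ibswitches(SAMPLE_IBSWITCHES):
        print(f"lid {switch.lid:3d}  {switch.guid}  {switch.description}")
```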
Identify Installed OpenFabrics Software Packages
[root@mgmt1 /]# ofed_info |head -1
MLNX_OFED_LINUX-2.4-1.0.4 (OFED-2.4-1.0.4):
[root@mgmt1 /]# rpm -qa ofed*
ofed-scripts-2.4-OFED.2.4.1.0.4.x86_64
[root@mgmt1 /]# ofed_info
MLNX_OFED_LINUX-2.4-1.0.4 (OFED-2.4-1.0.4):
ar_mgr:
ofed/MLNX_OFED_LINUX-2.4-1.0.1/SRPMS/ar_mgr-1.0-0.26.g89dd0f0.src.rpm
bupc:
ofed/MLNX_OFED_LINUX-2.4-1.0.1/SRPMS/bupc-2.18.0-423.src.rpm
cc_mgr:
ofed/MLNX_OFED_LINUX-2.4-1.0.1/SRPMS/cc_mgr-1.0-0.25.g89dd0f0.src.rpm
dapl:
ofed/MLNX_OFED_LINUX-2.4-1.0.1/SRPMS/dapl-2.1.3mlnx-OFED.2.4.37.gb00992f.src.rpm
dump_pr:
ofed/MLNX_OFED_LINUX-2.4-1.0.1/SRPMS/dump_pr-1.0-0.22.g7764b1e.src.rpm
...
Run Mellanox Self Test
The hca_self_test.ofed utility reports the following information:
- HCA firmware version
- Kernel architecture
- Driver version
- Number of active HCA ports along with their states
- Node GUID
[root@mgmt1 /]# hca_self_test.ofed
---- Performing Adapter Device Self Test ----
Number of CAs Detected ................. 2
PCI Device Check ....................... PASS
Kernel Arch ............................ x86_64
Host Driver Version .................... MLNX_OFED_LINUX-2.4-1.0.4 (OFED-2.4-1.0.4): 2.6.32-504.el6.x86_64
Host Driver RPM Check .................. PASS
Firmware on CA #0 VPI .................. v2.33.5000
Firmware Check on CA #0 (VPI) .......... PASS
Firmware on CA #1 NIC .................. v2.33.5000
Firmware Check on CA #1 (NIC) .......... PASS
Host Driver Initialization ............. PASS
Number of CA Ports Active .............. 2
Port State of Port #1 on CA #0 (VPI)..... UP 4X FDR (InfiniBand)
Port State of Port #1 on CA #1 (NIC)..... UP 1X QDR (Ethernet)
Port State of Port #2 on CA #1 (NIC)..... DOWN (Ethernet)
Error Counter Check on CA #0 (VPI)...... PASS
Error Counter Check on CA #1 (NIC)...... NA (Eth ports)
Kernel Syslog Check .................... PASS
Node GUID on CA #0 (VPI) ............... f4:52:14:03:00:45:24:c0
Node GUID on CA #1 (NIC) ............... f4:52:14:03:00:88:ef:a0
------------------ DONE ---------------------
[root@mgmt1 /]#
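On systems with many nodes, the self-test report can be scanned by a script instead of being read line by line. The Python sketch below assumes the "label ..... value" layout shown in the output above; the sample is a subset of that output, and the function names are hypothetical.

```python
import re

# A subset of the hca_self_test.ofed output shown above.
SAMPLE_SELF_TEST = """\
---- Performing Adapter Device Self Test ----
PCI Device Check ....................... PASS
Firmware on CA #0 VPI .................. v2.33.5000
Firmware Check on CA #0 (VPI) .......... PASS
Port State of Port #1 on CA #0 (VPI)..... UP 4X FDR (InfiniBand)
Port State of Port #2 on CA #1 (NIC)..... DOWN (Ethernet)
Error Counter Check on CA #1 (NIC)...... NA (Eth ports)
------------------ DONE ---------------------
"""

# A label, a run of at least three dots, then the reported value.
RESULT_LINE = re.compile(r"^(.*?)\s*\.{3,}\s*(.*)$")


def parse_self_test(text):
    """Map each report label to its value; banner lines are skipped."""
    results = {}
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            results[match.group(1)] = match.group(2)
    return results


def failed_checks(results):
    """Labels of 'Check' lines whose value is neither PASS nor NA."""
    return [label for label, value in results.items()
            if "Check" in label
            and value != "PASS" and not value.startswith("NA")]


def down_ports(results):
    """Labels of 'Port State' lines that report DOWN."""
    return [label for label, value in results.items()
            if label.startswith("Port State") and value.startswith("DOWN")]


if __name__ == "__main__":
    report = parse_self_test(SAMPLE_SELF_TEST)
    print("failed checks:", failed_checks(report))
    print("ports down:", down_ports(report))
```

Treating NA as acceptable follows the sample output, where the error counter check is reported as NA for Ethernet ports.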