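1. Install MFT
1) Download the MFT package that matches your OS version and hardware architecture from the NVIDIA website. The examples here use Linux.
2) Install the downloaded MFT package on the OS.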
[root@node1]# mkdir mft
[root@node1]# cd mft
[root@node1]# tar xzvf mft-4.26.1-3-x86_64-rpm.tgz
[root@node1]# cd mft-4.26.1-3
[root@node1]# ./install.sh
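The directory created by extracting the archive may not match the name typed by hand in the `cd` step. As a minimal sketch, the helper below reads the top-level directory name from the archive itself. The function name `archive_topdir` is hypothetical and not part of MFT.

```shell
# archive_topdir ARCHIVE
# Print the top-level directory name(s) stored in a .tgz archive.
# Hypothetical helper for illustration; it only lists the archive and extracts nothing.
archive_topdir() {
    tar tzf "$1" |
        sed 's|^\./||' |      # drop a leading "./" if the archive uses one
        cut -d/ -f1 |         # keep only the first path component
        grep -v '^$' |        # ignore empty entries
        sort -u               # one line per distinct top-level name
}

# Possible usage (commented out because it needs the real package):
# cd "$(archive_topdir mft-4.26.1-3-x86_64-rpm.tgz)" && ./install.sh
```

If the archive contains more than one top-level entry, the helper prints several lines, and the `cd` step needs to be done by hand.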
2. View the NICs
1) First, start MST.
[root@node1]# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
2) Run mst status to list the adapters. The output below shows two adapter cards in two PCI slots.
[root@node1]# mst status
MST modules:
------------
MST PCI module loaded
MST PCI configuration module loaded
MST devices:
------------
/dev/mst/mt4103_pciconf0 - PCI configuration cycles access.
domain:bus:dev.fn=0000:81:00.0 addr.reg=88 data.reg=92
Chip revision is: 00
/dev/mst/mt4103_pci_cr0 - PCI direct access.
domain:bus:dev.fn=0000:81:00.0 bar=0xc8000000 size=0x100000
Chip revision is: 00
/dev/mst/mt4115_pciconf0 - PCI configuration cycles access.
domain:bus:dev.fn=0000:05:00.0 addr.reg=88 data.reg=92
Chip revision is: 00
3) Use lspci to verify the adapter types (ConnectX-3 Pro and ConnectX-4).
[root@node1]# lspci | grep Mellanox
05:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
05:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
81:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
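The PCI addresses printed by lspci can be matched to MST device names by pairing each `/dev/mst/...` line of `mst status` with the `domain:bus:dev.fn=` line that follows it. The sketch below does that with awk. It assumes the output layout shown above, and `mst_device_map` is a hypothetical helper name.

```shell
# mst_device_map
# Read `mst status` text on stdin and print "<mst device> <domain:bus:dev.fn>" pairs.
# Hypothetical helper; it assumes the output layout shown in this article.
mst_device_map() {
    awk '
        # A device line starts with the device path: remember it.
        $1 ~ /^\/dev\/mst\// { dev = $1; next }
        # The next line carrying the PCI address completes the pair.
        $1 ~ /^domain:bus:dev\.fn=/ && dev != "" {
            split($1, kv, "=")
            print dev, kv[2]
            dev = ""
        }
    '
}

# Possible usage on a host with MFT installed (commented out here):
# mst status | mst_device_map
```

An address such as 0000:05:00.0 in the result corresponds to the 05:00.0 entry in the lspci listing.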
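3. Collect NIC dumps
Dumps for Mellanox NICs can be collected with either the mlxdump utility or the mstdump utility.
One of the two is usually enough. Otherwise, NVIDIA technical staff will decide which tool to use.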
3.1 mlxdump
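1) mlxdump dumps a device's internal configuration data and other internal data (such as counters and state machines), which can be used for hardware troubleshooting. The tool has three run modes: fast, normal, and full. The full mode is typically used. Because it dumps all available data, collection takes longer.
2) mlxdump command syntax.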
mlxdump OPTION <command> [COMMAND OPTIONS] [-d|--device MstDevice] [-h|--help] [-v|--version]
Option or command | Description |
---|---|
-d, --device MstDevice | mst device name |
-h, --help | Show help message and exit |
-v, --version | Show version and exit |
mstdump | Read mstdump information |
fsdump | Read Flow Steering information. Note: Reading flow steering information is supported in ConnectX-4 and above adapter cards. |
snapshot | Dump everything |
3) mlxdump examples.
Collect "mlxdump.udmp" in "fast" mode:
[root@node1]# mlxdump -d /dev/mst/mt4115_pciconf0 snapshot
Collect "mlxdump.udmp" in "full" mode:
[root@node1]# mlxdump -d /dev/mst/mt4115_pciconf0 snapshot -m full
Collect in "normal" mode and name the dump file "mlxdump_13_1_2023.udmp":
[root@node1]# mlxdump -d /dev/mst/mt4115_pciconf0 snapshot -m normal -o mlxdump_13_1_2023.udmp
Generate "flow steering" information:
[root@node1]# mlxdump -d /dev/mst/mt4115_pciconf0 fsdump --type=All --gvmi=0
Generate "mstdump" information:
[root@node1]# mlxdump -d /dev/mst/mt4115_pciconf0 mstdump
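When several adapters need a snapshot, it can help to build the command line per device instead of typing it each time. The sketch below only prints the command and does not run it. It uses the `-d`, `snapshot`, `-m`, and `-o` arguments shown in the examples above. The helper name `build_mlxdump_cmd` and the output naming scheme are assumptions made for illustration.

```shell
# build_mlxdump_cmd DEVICE [MODE]
# Print (do not run) an mlxdump snapshot command for one device.
# MODE defaults to "full". The output file name scheme is an assumption of this sketch.
build_mlxdump_cmd() {
    mlx_dev=$1
    mlx_mode=${2:-full}
    case "$mlx_mode" in
        fast|normal|full) ;;                          # the three modes named in this article
        *) echo "unknown mode: $mlx_mode" >&2; return 1 ;;
    esac
    mlx_name=$(basename "$mlx_dev")                   # e.g. mt4115_pciconf0
    printf 'mlxdump -d %s snapshot -m %s -o %s_%s.udmp\n' \
        "$mlx_dev" "$mlx_mode" "$mlx_name" "$mlx_mode"
}

# Demo: print the command for the ConnectX-4 device used in the examples.
build_mlxdump_cmd /dev/mst/mt4115_pciconf0 full
```

Printing the command first keeps the sketch safe to try on a machine without MFT. The printed line can then be reviewed and run by hand.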
3.2 mstdump
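1) mstdump dumps the device's internal configuration registers. The support team uses the dump file for hardware troubleshooting. It can be used on all NVIDIA devices.
2) mstdump command syntax.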
# mstdump [-full] <mst device> > <dump file>_FW_VERSION
Option | Description |
---|---|
-h, --help | Show help message and exit |
-v, -version, --version | Show program's version number and exit |
-full, --full | Dump more expanded list of addresses |
-ignore_fail, --ignore_fail | Continue dumping, even if some addresses fail |
-c CSV, -csv CSV, --csv CSV | Database path |
--cause address.offset | Specify address and offset |
--i2c_secondary I2C_SECONDARY | I2C secondary [0-127] |
3) Example.
[root@node1]# mstdump /dev/mst/mt4099_pci_cr0 > mt4099_12_16_2600.dmp
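The syntax above names the dump file after the device and the firmware version. As a rough sketch, the helper below builds such a file name from a device path and a firmware version string. The helper name `mstdump_outfile` is hypothetical, and the version 12.16.2600 is only inferred from the file name in the example. The firmware version is passed in by hand and not queried from the device.

```shell
# mstdump_outfile DEVICE FW_VERSION
# Build a dump file name such as mt4099_12_16_2600.dmp.
# Hypothetical helper; the naming scheme follows the example in this article.
mstdump_outfile() {
    md_chip=$(basename "$1")                    # mt4099_pci_cr0
    md_chip=${md_chip%%_*}                      # keep the part before the first "_": mt4099
    md_fw=$(printf '%s' "$2" | tr . _)          # 12.16.2600 -> 12_16_2600
    printf '%s_%s.dmp\n' "$md_chip" "$md_fw"
}

# Demo with the device from the example and an assumed firmware version.
mstdump_outfile /dev/mst/mt4099_pci_cr0 12.16.2600

# Possible usage on a host with MFT installed (commented out here):
# mstdump /dev/mst/mt4099_pci_cr0 > "$(mstdump_outfile /dev/mst/mt4099_pci_cr0 12.16.2600)"
```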
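4. Collect sysinfo-snapshot
1) The automated sysinfo-snapshot tool takes a snapshot of all configuration and related information on the server and its Mellanox adapters.
2) This requires the out-of-box (OOB) NIC driver provided by NVIDIA or the OEM server vendor to be installed on the system. Run sysinfo-snapshot.py, which ships with the driver. It generates a tgz archive in the /tmp directory.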
[root@node01]# sysinfo-snapshot.py
Sysinfo-snapshot is still in process...please wait till completed successfully
Gathering the information may take a while, especially in large networks
Your patience is appreciated
------------------------------------------------------------
Running sysinfo-snapshot has ended successfully!
Temporary destination directory is /tmp/
Out file name is /tmp/sysinfo-snapshot-v3.4.0-node01-20240316-066386.tgz
3) If the driver version is too old, or the system uses the in-box driver, download sysinfo-snapshot separately and run it.
[root@node01]# tar xf sysinfo-snapshot_version_3_7_7.tgz
[root@node01]# ls
sysinfo-snapshot_version_3_7_7.tgz sysinfo-snapshot_version_3_7_7
[root@node01]# cd sysinfo-snapshot_version_3_7_7
[root@node01 sysinfo-snapshot_version_3_7_7]# ls
config.csv sysinfo-snapshot_v3.7.7.py
[root@node01 sysinfo-snapshot_version_3_7_7]# ./sysinfo-snapshot_v3.7.7.py
Sysinfo-snapshot is still in process...please wait till completed successfully
Gathering the information may take a while, especially in large networks
Your patience is appreciated
------------------------------------------------------------
Running sysinfo-snapshot has ended successfully!
Temporary destination directory is /tmp/
Out file name is /tmp/sysinfo-snapshot-v3.7.7-node01-20240316-065527.tgz
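The archive path appears on the "Out file name is" line of the tool's output, so a script can pick it up from there. The sketch below assumes that line format stays as shown above. `snapshot_outfile` is a hypothetical helper name.

```shell
# snapshot_outfile
# Read sysinfo-snapshot output on stdin and print the generated archive path.
# Hypothetical helper; it relies on the "Out file name is ..." line shown in this article.
snapshot_outfile() {
    sed -n 's/^Out file name is //p' | tail -n 1
}

# Possible usage on a host with the tool installed (commented out here):
# archive=$(sysinfo-snapshot.py | snapshot_outfile)
# ls -l "$archive"
```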
5. Collect dumps for IB adapters
1) If the adapter is an IB card, you usually need to collect ibdiagnet logs by running the InfiniBand management tools on any node in the IB subnet.
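2) Options used in the example command.
Option | Description |
---|---|
--pc | Reset all port counters in the fabric that are compliant with the IB spec (PortCounters and PortCountersExtended), as well as the RN, AR and HBF counters. |
--pm_pause_time | Number of seconds to wait between the first and the second counter samples. If 0 is given, no second sample is taken (default = 1). |
-P all=1 | Print any listed counter whose value is greater than the threshold given for it. If all is used, every counter gets the same threshold (default 0). |
--get_cable_info | Query information for all QSFP cables. The cable information is stored in "ibdiagnet2.cables". |
-r | Provide a report on fabric quality. |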
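3) Run the following command. After waiting 30 seconds, the files are generated in the default directory /var/tmp/ibdiagnet2. To write the logs to a different directory, use the -o option. Then archive the whole directory.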
[root@node01]# ibdiagnet --pc --pm_pause_time 30 -P all=1 --get_cable_info -r
[root@node01]# tar czf node01-ibdiagnet.tar.gz /var/tmp/ibdiagnet2/
tar: Removing leading `/' from member names
[root@node01]# ls node01-ibdiagnet.tar.gz
node01-ibdiagnet.tar.gz
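The tar message about removing the leading `/` is harmless. It can be avoided by archiving the directory relative to its parent with tar's `-C` option. The sketch below shows one way to do that. `pack_dir` is a hypothetical helper name.

```shell
# pack_dir DIRECTORY ARCHIVE
# Archive DIRECTORY into ARCHIVE using a path relative to its parent directory,
# so that member names start with the directory name and not with "/".
# Hypothetical helper for illustration.
pack_dir() {
    pd_parent=$(dirname "$1")
    pd_base=$(basename "$1")
    tar czf "$2" -C "$pd_parent" "$pd_base"
}

# Possible usage after ibdiagnet has finished (commented out here):
# pack_dir /var/tmp/ibdiagnet2 node01-ibdiagnet.tar.gz
```

Extracting such an archive creates an ibdiagnet2 directory in the current directory.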
4) ibdiagnet generates the following dump files (depending on the ibdiagnet command-line options or configuration file settings).
Filename | Description |
---|---|
ibdiagnet2.log | log file |
ibdiagnet2.lst | Fabric links in LST format |
ibdiagnet2.net_dump | Fabric link dump including split cable mapping and FEC info |
ibdiagnet2.net_dump_ext | Extended fabric link dump with FEC, BER and additional phy data |
ibdiagnet2.sm | Subnet Managers |
ibdiagnet2.pm | IB spec compliant Ports Counters |
ibdiagnet2.mlnx_cntrs | Mellanox Diagnostic counters |
ibdiagnet2.fdbs | Unicast FDBs. Note: Dump disabled by default (*). |
ibdiagnet2.mcfdbs | Multicast FDBs |
ibdiagnet2.ar | Adaptive routing tables. Note: Dump disabled by default (*). |
ibdiagnet2.far | Adaptive routing tables including SHIELD settings |
ibdiagnet2.far_flid | Adaptive routing tables including only non-local FLIDs |
ibdiagnet2.rn | SHIELD configuration tables |
ibdiagnet2.rnc | SHIELD counters (Old file, disabled by default) |
ibdiagnet2.rnc2 | SHIELD, SHIELDv2, HBF counters |
ibdiagnet2.nodes_info | Nodes Information (FW version, etc) |
ibdiagnet2.db_csv | ibdiagnet internal database |
ibdiagnet2.pkey | Pkey tables |
ibdiagnet2.ppcc | Port Programmable Congestion Control file |
ibdiagnet2.vports | Virtualization: |
ibdiagnet2.vport_pkeys | virtualization pkey tables |
ibdiagnet2.aguid | alias GUIDs (ConnectX-3 only) |
ibdiagnet2.slvl | SLVL tables of the fabric switches |
ibdiagnet2.cables | Cable info |
ibdiagnet2.flid | FLIDs configuration details |
ibdiagnet2.rails | “rails optimized” validation tests details |
ibdiagnet2.sharp, ibdiagnet.sharp_pm | SHARP data |
ibdiagnet2.ibnetdiscover | Discovered network in “ibnetdiscover” format |
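Because the set of generated files depends on the options used, it can be useful to check which of the files listed above are present before archiving. The sketch below reports the state of each requested file. `check_ibdiagnet_files` is a hypothetical helper name, and an absent file is not necessarily an error.

```shell
# check_ibdiagnet_files DIRECTORY FILE...
# Print "present <file>" or "absent <file>" for each requested file name.
# Hypothetical helper; which files are expected depends on the ibdiagnet options used.
check_ibdiagnet_files() {
    ck_dir=$1
    shift
    for ck_file in "$@"; do
        if [ -e "$ck_dir/$ck_file" ]; then
            echo "present $ck_file"
        else
            echo "absent $ck_file"
        fi
    done
}

# Possible usage after a run with the options shown above (commented out here):
# check_ibdiagnet_files /var/tmp/ibdiagnet2 ibdiagnet2.log ibdiagnet2.pm ibdiagnet2.cables
```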