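ironic-inspector is an auxiliary service for hardware introspection. It introspects the bare metal nodes managed by ironic: by booting an in-memory system (a ramdisk) on a node, it discovers the node's hardware information, such as CPU count and model, memory capacity, disk count and model, and the various PCI devices, and finally records this information in the ironic database.
ironic-inspector extends ironic's ability to discover hardware information about bare metal nodes. Without ironic-inspector, ironic obtains node information from manual user input, which is both inefficient and error-prone. With ironic-inspector working together with IPA (Ironic Python Agent), hardware discovery is limited in principle only by what the agent running on the node can see.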
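The introspection sequence of ironic-inspector is shown in the following diagram:
sequenceDiagram
ironic->>ironic inspector: send introspection request, /v1/introspection/{node}
ironic inspector-->>ironic: HTTP 202, accepted
ironic inspector->>ironic inspector: check node state, configure PXE
ironic inspector->>bare metal node: reboot the node, wait for callback
bare metal node->>bare metal node: boot from ramdisk, collect hardware information
bare metal node->>ironic inspector: return collected data
ironic inspector->>ironic inspector: process data
ironic inspector->>ironic: update node properties and create missing ironic ports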
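The detailed introspection flow is as follows:
1. The bare metal node is enrolled and is in the manageable state.
2. The introspection interface of ironic-inspector is called through the API or the CLI.
3. ironic-inspector receives the introspection request and starts introspection.
4. The ramdisk collects the required information and returns it to ironic-inspector.
5. ironic-inspector receives the data from the ramdisk and starts processing it.
6. The node is set back to the manageable state and waits to be taken under management.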
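The request that starts this flow can be sketched in Python. This is a minimal illustration, not the official client: the `/v1/introspection/{node}` path comes from the sequence diagram, while the host name, port 5050, and the `X-Auth-Token` header are assumptions about a typical deployment. The request is only built, not sent.

```python
import urllib.request
from typing import Optional


def build_introspection_request(
    base_url: str, node: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) the POST that starts introspection for a node."""
    url = f"{base_url.rstrip('/')}/v1/introspection/{node}"
    # Assumption: the deployment authenticates with a Keystone token header.
    headers = {"X-Auth-Token": token} if token else {}
    return urllib.request.Request(url, method="POST", headers=headers)


# Hypothetical endpoint; the UUID is the one shown for node bm-10 in this article.
request = build_introspection_request(
    "http://inspector.example.com:5050",
    "a796c7e4-387c-47e0-bab7-e1f56621d4d0",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
# According to the sequence diagram, the service answers with HTTP 202 (accepted).
```

Keeping the construction separate from the network call makes it easy to inspect the URL and headers before anything reaches a real service.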
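Node states are, from ironic's point of view, the states of a bare metal node that determine which operations can be performed on it. For an introduction to the bare metal state machine, see: Bare Metal State Machine — ironic 18.1.1.dev3 documentation (openstack.org).
ironic-inspector requires a bare metal node to be in the manageable state before introspection can start, and the node returns to manageable once introspection completes. The node states during introspection are shown in the following diagram.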
stateDiagram-v2
[*]-->enroll
enroll-->verifying: manage (via API)
verifying-->manageable: done
verifying-->enroll: fail
manageable-->inspecting: inspect (via API)
inspecting-->inspect_wait: wait
inspect_wait-->manageable: done
inspecting-->manageable: done
inspecting-->inspect_failed: fail
inspect_wait-->inspect_failed: fail
inspect_wait-->inspect_failed: abort (via API)
inspect_failed-->manageable: manage (via API)
inspect_failed-->inspecting: inspect (via API)
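The diagram can also be read as a transition table. The following Python sketch encodes exactly the edges drawn above and nothing more; the function names are made up for illustration, and the real ironic state machine has many more states and events.

```python
# (current state, event) -> next state, copied from the diagram above.
TRANSITIONS = {
    ("enroll", "manage"): "verifying",
    ("verifying", "done"): "manageable",
    ("verifying", "fail"): "enroll",
    ("manageable", "inspect"): "inspecting",
    ("inspecting", "wait"): "inspect_wait",
    ("inspecting", "done"): "manageable",
    ("inspecting", "fail"): "inspect_failed",
    ("inspect_wait", "done"): "manageable",
    ("inspect_wait", "fail"): "inspect_failed",
    ("inspect_wait", "abort"): "inspect_failed",
    ("inspect_failed", "manage"): "manageable",
    ("inspect_failed", "inspect"): "inspecting",
}


def next_state(state: str, event: str) -> str:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state!r}") from None


def run(events, state: str = "enroll") -> str:
    """Apply a sequence of events, starting from the initial enroll state."""
    for event in events:
        state = next_state(state, event)
    return state


# A successful introspection ends where it started: in manageable.
print(run(["manage", "done", "inspect", "wait", "done"]))
```

The table makes the precondition visible: `inspect` is only accepted from manageable or inspect_failed, so a node still in enroll has to be moved to manageable first.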
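ironic-inspector supports running simple rules during introspection. The rules are written in JSON and managed through a dedicated API; they run after all processing hooks have finished.
A rule consists of two parts: conditions and actions. If the introspection data satisfies the conditions, the actions are run on the node.
Rule example: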
{
"description": "...",
"actions": [...],
"conditions": [...],
"scope": "SCOPE"
}
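How conditions gate a rule can be sketched as follows. This is a simplified model rather than ironic-inspector's own code: it supports only the `eq` operator, treats a rule as matching when every condition holds, and resolves `data://` and `node://` field prefixes against two plain dictionaries. The helper names are hypothetical.

```python
def resolve_field(field: str, data: dict, node: dict):
    """Look up a dotted path; the prefix picks introspection data or the node."""
    scheme, sep, path = field.partition("://")
    if not sep:  # no prefix given: fall back to introspection data
        scheme, path = "data", field
    value = {"data": data, "node": node}[scheme]
    for key in path.split("."):
        value = value[key]
    return value


# Only "eq" is modeled here; a real deployment offers more operators.
OPERATORS = {"eq": lambda actual, expected: actual == expected}


def rule_matches(rule: dict, data: dict, node: dict) -> bool:
    """A rule matches when all of its conditions evaluate to true."""
    return all(
        OPERATORS[cond["op"]](resolve_field(cond["field"], data, node), cond["value"])
        for cond in rule["conditions"]
    )


rule = {
    "description": "x86_64 nodes with 16 CPUs",
    "conditions": [
        {"field": "data://inventory.cpu.architecture", "op": "eq", "value": "x86_64"},
        {"field": "node://properties.cpus", "op": "eq", "value": "16"},
    ],
    "actions": [],
}
data = {"inventory": {"cpu": {"architecture": "x86_64"}}}
node = {"properties": {"cpus": "16"}}
print(rule_matches(rule, data, node))
```

In this sketch a missing key raises `KeyError`; treating a missing value as a failed condition instead would be an equally reasonable choice.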
Condition examples:
{"field": "data://inventory.cpu.architecture", "op": "eq", "value": "x86_64"}
{"field": "node://properties.cpus", "op": "eq", "value": "16"}
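The two lines above are two separate conditions. A condition consists of the following fields:
field: the path of the value to compare. The data:// or node:// prefix indicates whether the value comes from the introspection data or from the node's attributes; if the prefix is omitted, introspection data is assumed
op: the comparison operator (eq in both examples)
value: the value to compare against
Action example: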
{"action": "set-attribute", "path": "/driver_info/ipmi_address", "value": "{data[inventory][bmc_address]}"}
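The `value` in this example looks like a Python format string that indexes into the introspection data. Assuming it is expanded with `str.format` (an assumption about the mechanism; `render_value` is a hypothetical helper), the substitution can be reproduced like this:

```python
def render_value(template: str, data: dict) -> str:
    """Expand {data[...]} placeholders against the introspection data."""
    return template.format(data=data)


action = {
    "action": "set-attribute",
    "path": "/driver_info/ipmi_address",
    "value": "{data[inventory][bmc_address]}",
}
# Minimal stand-in for introspection data; the address is the one shown for bm-10.
data = {"inventory": {"bmc_address": "10.33.45.10"}}

print(render_value(action["value"], data))
```

`str.format` treats `[inventory]` and `[bmc_address]` as string keys, which is why the placeholder carries no quotes inside the brackets.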
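An action consists of the following fields:
action: the action to perform. The available options are:
set-attribute: used together with the path and value fields
set-capability: used together with the name and value fields
extend-attribute: appends to an existing list value; if the unique field is True, a value that is already in the list is not appended again
add-trait: used together with the name field
remove-trait: used together with the name field
fail: used together with the message field, which carries the failure message
By default, introspection rules apply to every introspected bare metal node. To make a rule apply only to specific nodes, use the scope field to limit where the rule is used; the field must be set on both the rule and the node to take effect.
Set the inspection_scope property on the node: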
baremetal node set --property inspection_scope="SCOPE" <node>
The scope field is rarely needed, and its use cases overlap somewhat with those of conditions.
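The scope check can be pictured as a filter applied before a rule's conditions are evaluated. The sketch below is an interpretation of the text above, not ironic-inspector's implementation: a rule without `scope` applies everywhere, and a scoped rule applies only to nodes whose `inspection_scope` property has the same value. `rule_in_scope` is a hypothetical name.

```python
def rule_in_scope(rule: dict, node: dict) -> bool:
    """Decide whether a rule should be considered for a given node."""
    scope = rule.get("scope")
    if scope is None:
        return True  # unscoped rules apply to every introspected node
    return node.get("properties", {}).get("inspection_scope") == scope


scoped_rule = {"description": "...", "scope": "SCOPE", "conditions": [], "actions": []}
tagged_node = {"properties": {"inspection_scope": "SCOPE"}}
plain_node = {"properties": {}}

print(rule_in_scope(scoped_rule, tagged_node), rule_in_scope(scoped_rule, plain_node))
```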
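Plugins are an important part of ironic-inspector: it uses them to process the introspection data and to update the node with the results.
Every plugin provides two hooks, before_processing and before_update.
The plugins shipped with ironic-inspector by default are: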
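RamdiskErrorHook: reports errors coming from the ramdisk
RootDiskSelectionHook: selects the root disk according to ironic's root_device field; this hook must run before SchedulerHook, otherwise the root_disk field is not updated
SchedulerHook: checks and updates the basic properties used for node scheduling, such as CPU count and architecture, memory capacity, and disk capacity
ValidateInterfacesHook: validates network interface information, creates new ironic ports, deletes ironic ports that are absent from the introspection data, and sets the pxe_enabled flag on these ports as appropriate
before_processing
before_update: creates/deletes ironic ports so that they match the actual hardware
CapabilitiesHook: detects the capabilities of the bare metal machine, including boot mode, CPU flags, and so on
PciDevicesHook: identifies the model and count of the PCI devices on the bare metal machine
local_link_connection: processes the mandatory fields of LLDP packets, used to write the port_id and switch_id fields of local_link_connection to ironic ports
LLDPBasicProcessingHook: processes the LLDP data in the introspection data; purpose to be determined
RaidDeviceDetection: handles the root device after a RAID has been created
AccelDevicesHook: distinguishes between different accelerator devices
ExampleProcessingHook: logs the input/output of the introspection data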
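The two-hook shape can be illustrated with a self-contained sketch. It is modeled on the before_processing/before_update pair described above but does not import ironic-inspector: the class, the `run_hooks` driver, and the use of a plain dictionary for `node_info` are all assumptions made for illustration.

```python
class CpuArchHook:
    """Toy plugin: validate the data first, then copy one value to the node."""

    def before_processing(self, introspection_data, **kwargs):
        # Runs on the raw ramdisk data, before the node is touched.
        cpu = introspection_data.get("inventory", {}).get("cpu", {})
        if "architecture" not in cpu:
            raise ValueError("ramdisk did not report a CPU architecture")

    def before_update(self, introspection_data, node_info, **kwargs):
        # Runs once the node is known; node_info is a plain dict in this sketch.
        arch = introspection_data["inventory"]["cpu"]["architecture"]
        node_info.setdefault("properties", {})["cpu_arch"] = arch


def run_hooks(hooks, introspection_data, node_info):
    """Call every before_processing hook, then every before_update hook."""
    for hook in hooks:
        hook.before_processing(introspection_data)
    for hook in hooks:
        hook.before_update(introspection_data, node_info)
    return node_info


data = {"inventory": {"cpu": {"architecture": "x86_64"}}}
print(run_hooks([CpuArchHook()], data, {}))
```

Running the hooks in list order mirrors the ordering constraint mentioned for RootDiskSelectionHook and SchedulerHook: a hook that depends on another one's result has to come later in the list.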
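ironic-inspector can also enroll new bare metal nodes in ironic. When it receives introspection data from a node that it cannot identify, ironic-inspector calls the enroll_node_not_found_hook function to enroll the node in ironic.
To enable discovery, set node_not_found_hook to enroll in the ironic-inspector configuration file, and also set the enroll_node_driver and enroll_node_fields options.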
[processing]
node_not_found_hook = enroll
[discovery]
enroll_node_driver = ipmi
# Fields to add to the node when it is enrolled
enroll_node_fields = management_interface:ipmitool,resource_class:baremetal
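To show how the `enroll_node_fields` value breaks down into individual node fields, the fragment above can be parsed with Python's standard `configparser`. ironic-inspector itself reads its configuration through oslo.config, so this is only a reading aid; `parse_enroll_fields` is a hypothetical helper.

```python
import configparser

CONFIG_TEXT = """
[processing]
node_not_found_hook = enroll
[discovery]
enroll_node_driver = ipmi
enroll_node_fields = management_interface:ipmitool,resource_class:baremetal
"""


def parse_enroll_fields(raw: str) -> dict:
    """Split 'key:value,key:value' into a dictionary of node fields."""
    pairs = (item.split(":", 1) for item in raw.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

discovery_enabled = parser["processing"]["node_not_found_hook"] == "enroll"
driver = parser["discovery"]["enroll_node_driver"]
fields = parse_enroll_fields(parser["discovery"]["enroll_node_fields"])
print(discovery_enabled, driver, fields)
```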
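After enroll_node_not_found_hook has been called, ironic-inspector processes the newly enrolled node the same way as any other node, so some introspection rules may also be needed to set fields such as ipmi_username and deploy_kernel on the new node.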
[{
"description": "Set IPMI driver_info if no credentials",
"actions": [
{"action": "set-attribute", "path": "driver", "value": "ipmi"},
{"action": "set-attribute", "path": "driver_info/ipmi_username",
"value": "username"},
{"action": "set-attribute", "path": "driver_info/ipmi_password",
"value": "password"}
],
"conditions": [
{"op": "is-empty", "field": "node://driver_info.ipmi_password"},
{"op": "is-empty", "field": "node://driver_info.ipmi_username"}
]
},{
"description": "Set deploy info if not already set on node",
"actions": [
{"action": "set-attribute", "path": "driver_info/deploy_kernel",
"value": "<glance uuid>"},
{"action": "set-attribute", "path": "driver_info/deploy_ramdisk",
"value": "<glance uuid>"}
],
"conditions": [
{"op": "is-empty", "field": "node://driver_info.deploy_ramdisk"},
{"op": "is-empty", "field": "node://driver_info.deploy_kernel"}
]
}]
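The enroll_node_not_found_hook function also adds an auto_discovered flag to the introspection data. The flag distinguishes manually enrolled nodes from automatically discovered ones, so rules can use it to perform specific operations. The following rule sets the node's driver to ipmi when the flag is present.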
{
"description": "Enroll auto-discovered nodes with ipmi hardware type",
"actions": [
{"action": "set-attribute", "path": "driver", "value": "ipmi"}
],
"conditions": [
{"op": "eq", "field": "data://auto_discovered", "value": true}
]
}
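The discovery path as a whole (lookup fails, the node is enrolled, the data is flagged) can be sketched in a few lines. Everything here is a stand-in: the in-memory `nodes_by_mac` registry, the `macs` key, and the `lookup_or_enroll` function are hypothetical and only mimic the behavior described above.

```python
def lookup_or_enroll(introspection_data, nodes_by_mac, driver, extra_fields):
    """Return the matching node, enrolling a new one if none is found."""
    macs = introspection_data.get("macs", [])
    for mac in macs:
        if mac in nodes_by_mac:
            return nodes_by_mac[mac]  # known node: the data is left untouched
    # Unknown node: enroll it with the configured driver and fields.
    node = {"driver": driver, **extra_fields}
    for mac in macs:
        nodes_by_mac[mac] = node
    # Mark the data so that rules can tell discovered nodes apart.
    introspection_data["auto_discovered"] = True
    return node


registry = {}
payload = {"macs": ["0c:c4:7a:e2:27:a2"]}
new_node = lookup_or_enroll(payload, registry, "ipmi", {"resource_class": "baremetal"})
print(new_node, payload["auto_discovered"])
```

A rule with the condition `data://auto_discovered` equal to `true`, like the one above, would then match only the payloads that went through the enrollment branch.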
Code analysis of how the introspection request is sent:
After the ramdisk callback is received:
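There are two ways to interact with ironic-inspector: directly, or indirectly through ironic. Both are available through a CLI and through an HTTP API.
The following uses the bare metal node bm-10 as an example to show how the CLI is used.
Before introspection, confirm that the bare metal node is in the manageable state.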
# openstack baremetal node list
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+
| a796c7e4-387c-47e0-bab7-e1f56621d4d0 | bm-10 | None | power off | manageable | False |
+--------------------------------------+-------+---------------+-------------+--------------------+-------------+
Run introspection through ironic. During introspection the node's state changes as follows: manageable --> inspecting --> inspect wait --> manageable.
# openstack baremetal node inspect bm-10
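Alternatively, start introspection directly against ironic-inspector; in this case the node's state does not change during introspection.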
# openstack baremetal introspection start bm-10
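After introspection completes, the node's extra and properties fields have both been updated with a lot of new content, and the ironic ports have been updated to match the node's actual number of network interfaces.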
# openstack baremetal node show bm-10
+------------------------+-----------------------------------------------------------------------------------------+
| Field | Value |
+------------------------+-----------------------------------------------------------------------------------------+
| chassis_uuid | None |
| console_enabled | False |
| created_at | 2021-07-26T08:09:41+00:00 |
| driver | ipmi |
| driver_info | {u'ipmi_port': 623, u'ipmi_username': u'admin', u'deploy_kernel': u'3203927a-04c3-488c- |
| | bf60-bbcc42be8c86', u'ipmi_address': u'10.33.45.10', u'deploy_ramdisk': |
| | u'0372afc2-65bf-4462-9b81-5f1ab4d63fa2', u'ipmi_password': u'******'} |
| driver_internal_info | {} |
| extra | {u'disks': u'[{"rotational": true, "vendor": "ATA", "name": "/dev/sda", |
| | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff84e", "model": |
| | "HGST HUS726040AL", "wwn": "0x5000cca25dcff84e", "serial": "K4H442HB", "size": |
| | 4000787030016}, {"rotational": true, "vendor": "ATA", "name": "/dev/sdb", |
| | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff039", "model": |
| | "HGST HUS726040AL", "wwn": "0x5000cca25dcff039", "serial": "K4H41XSB", "size": |
| | 4000787030016}, {"rotational": true, "vendor": "ATA", "name": "/dev/sdc", |
| | "wwn_vendor_extension": null, "wwn_with_extension": "0x5000cca25dcff84d", "model": |
| | "HGST HUS726040AL", "wwn": "0x5000cca25dcff84d", "serial": "K4H442GB", "size": |
| | 4000787030016}]', u'system_vendor': u'{"serial_number": "HIK096396264-B", |
| | "product_name": "DS-VH2203X4-EBE/2", "manufacturer": "OEM"}', u'block_devices': |
| | {u'serials': [u'K4H442HB', u'K4H41XSB', u'K4H442GB']}, u'last_inspect_status': |
| | u'success', u'mac_address': u'0c:c4:7a:e2:27:a2', u'cpu': u'{"count": 24, "socket": 2, |
| | "frequency": "3200.0000", "flags": ["fpu", "vme", "de", "pse", "tsc", "msr", "pae", |
| | "mce", "cx8", "apic", "sep", "mtrr", "pge", "mca", "cmov", "pat", "pse36", "clflush", |
| | "dts", "acpi", "mmx", "fxsr", "sse", "sse2", "ss", "ht", "tm", "pbe", "syscall", "nx", |
| | "pdpe1gb", "rdtscp", "lm", "constant_tsc", "arch_perfmon", "pebs", "bts", "rep_good", |
| | "nopl", "xtopology", "nonstop_tsc", "aperfmperf", "eagerfpu", "pni", "pclmulqdq", |
| | "dtes64", "monitor", "ds_cpl", "vmx", "smx", "est", "tm2", "ssse3", "sdbg", "fma", |
| | "cx16", "xtpr", "pdcm", "pcid", "dca", "sse4_1", "sse4_2", "x2apic", "movbe", "popcnt", |
| | "aes", "xsave", "avx", "f16c", "rdrand", "lahf_lm", "abm", "epb", "invpcid_single", |
| | "intel_ppin", "ssbd", "ibrs", "ibpb", "tpr_shadow", "vnmi", "flexpriority", "ept", |
| | "vpid", "fsgsbase", "tsc_adjust", "bmi1", "avx2", "smep", "bmi2", "erms", "invpcid", |
| | "cqm", "xsaveopt", "cqm_llc", "cqm_occup_llc", "dtherm", "ida", "arat", "pln", "pts", |
| | "md_clear"], "architecture": "x86_64", "model_name": "Intel(R) Xeon(R) CPU E5-2620 v3 @ |
| | 2.40GHz"}'} |
| inspection_finished_at | None |
| inspection_started_at | 2021-07-26T08:34:45+00:00 |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | False |
| maintenance_reason | None |
| name | bm-10 |
| ports | [{u'href': u'http://ironic.openstack.svc.cluster.local:10080//v1/nodes/a796c7e4-387c- |
| | 47e0-bab7-e1f56621d4d0/ports', u'rel': u'self'}, {u'href': |
| | u'http://ironic.openstack.svc.cluster.local:10080//nodes/a796c7e4-387c- |
| | 47e0-bab7-e1f56621d4d0/ports', u'rel': u'bookmark'}] |
| power_state | power off |
| properties | {u'cpu_arch': u'x86_64', u'vendor': u'intel', u'cpus': u'24', u'capabilities': u'cpu_hu |
| | gepages:true,cpu_txt:true,accelerator_has_gpu:false,cpu_vt:true,cpu_aes:true,cpu_hugepa |
| | ges_1g:true', u'memory_mb': u'65536', u'local_gb': u'3725'} |
| provision_state | manageable |
| provision_updated_at | 2021-07-26T08:40:13+00:00 |
| reservation | None |
| target_power_state | None |
| target_provision_state | None |
| updated_at | 2021-07-26T08:40:13+00:00 |
| uuid | a796c7e4-387c-47e0-bab7-e1f56621d4d0 |
+------------------------+-----------------------------------------------------------------------------------------+
# openstack baremetal port list --node bm-10
+--------------------------------------+-------------------+
| UUID | Address |
+--------------------------------------+-------------------+
| 552673a2-1c96-4f5f-8b65-25d45d3a4325 | 0c:c4:7a:e2:27:a2 |
| 30ecf7bb-6ce2-412d-8e45-91973edb22ea | a0:36:9f:d8:18:77 |
| bbbb0e47-2aa4-4f4c-bd34-ad7c177c49d3 | a0:36:9f:d8:18:76 |
| 2358f8ed-d16a-4576-8d6d-c89c9eadc7a6 | 0c:c4:7a:e2:27:a3 |
+--------------------------------------+-------------------+
# openstack baremetal introspection list
+--------------------------------------+---------------------+---------------------+-------+
| UUID | Started at | Finished at | Error |
+--------------------------------------+---------------------+---------------------+-------+
| a796c7e4-387c-47e0-bab7-e1f56621d4d0 | 2021-07-26T08:34:46 | 2021-07-26T08:39:22 | None |
+--------------------------------------+---------------------+---------------------+-------+
# openstack baremetal introspection status bm-10
+-------------+--------------------------------------+
| Field | Value |
+-------------+--------------------------------------+
| error | None |
| finished | True |
| finished_at | 2021-07-26T08:39:22 |
| started_at | 2021-07-26T08:34:46 |
| state | finished |
| uuid | a796c7e4-387c-47e0-bab7-e1f56621d4d0 |
+-------------+--------------------------------------+
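A script that waits for introspection typically polls this status until `finished` becomes true. The sketch below assumes a caller-supplied `get_status` function that returns a dictionary with the `finished` and `error` fields shown in the table; how that function talks to the service is deliberately left open, and the default interval and timeout are arbitrary.

```python
import time


def wait_for_introspection(get_status, interval=10.0, timeout=1800.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until introspection finishes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status["finished"]:
            if status["error"]:
                raise RuntimeError(f"introspection failed: {status['error']}")
            return status
        if clock() >= deadline:
            raise TimeoutError("introspection did not finish in time")
        sleep(interval)


# Demonstration with canned responses instead of a real service.
responses = iter([
    {"finished": False, "error": None, "state": "waiting"},
    {"finished": True, "error": None, "state": "finished"},
])
final = wait_for_introspection(lambda: next(responses), sleep=lambda _seconds: None)
print(final["state"])
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.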
The introspection data interface retrieves the introspection data returned by IPA.
# openstack baremetal introspection data save bm-10 --file /tmp/inspector-data.json
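Once saved, the file is ordinary JSON and can be inspected with a few lines of Python. The sketch below reads only the two paths that appear in this article's rule examples (`inventory.cpu.architecture` and `inventory.bmc_address`); the `summarize` function is hypothetical, and any other layout detail of the file should be checked against real output.

```python
import json


def summarize(path: str) -> dict:
    """Pull a couple of well-known values out of saved introspection data."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    inventory = data.get("inventory", {})
    return {
        "cpu_architecture": inventory.get("cpu", {}).get("architecture"),
        "bmc_address": inventory.get("bmc_address"),
    }


# Usage, with the file name from the command above:
# print(summarize("/tmp/inspector-data.json"))
```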
The abort interface interrupts a running introspection and immediately returns the bare metal node to the manageable state.
# openstack baremetal introspection abort bm-10
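In addition, ironic-inspector provides some less commonly used interfaces.
ironic-inspector is highly extensible. Its introspection rules let users customize introspection behavior to fit their environment, and all introspection processing functions are provided as plugins, so when the built-in functions fall short, adding or changing functionality takes only a small amount of code.
Introspection rules come with a set of operators and actions. These built-in options live in plugins/rules.py and can be extended there if they do not meet your needs.
All plugins and their processing functions live in the plugins directory and can be extended there if they do not meet your needs.
In addition, ironic port creation behavior, the enabled processing hooks, node discovery settings, and so on are all defined in the configuration file and can be changed as needed.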
Hardware introspection for OpenStack Bare Metal — ironic-inspector 10.7.0.dev19 documentation