@stevekeay stevekeay commented Oct 17, 2025

UnderStack operation requires that every baremetal node has a baremetal port for each NIC we want to use.

Upstream ironic includes inspection hooks which can create baremetal ports and set local_link_connection based on LLDP information, but UnderStack requires that baremetal ports have specific attributes populated:

  1. Exactly ONE of the ports on a node must have the pxe_enabled flag set. We put this port into the provisioning/cleaning network to boot the IPA image. (Putting multiple ports into the provisioning/cleaning VLAN causes DHCP and ARP issues, so we nominate a single port to boot from. This should be the same port that the actual server will attempt to use to make PXE requests.)
  2. The local_link_connection field must be populated with the upstream connected switch HOSTNAME and INTERFACE name. We use these to drive switch automation.
  3. physical_network field must be populated with the appropriate VLAN Group name. We use this to drive VLAN number assignment and switch configuration.
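The three requirements above can be checked mechanically. A minimal sketch, assuming ports are plain dictionaries that use the ironic port field names (`pxe_enabled`, `local_link_connection`, `physical_network`). The choice of `switch_info` for the switch hostname and `port_id` for the interface name, and the `check_node_ports` helper itself, are assumptions made for illustration, not the hook's actual code:

```python
def check_node_ports(ports):
    """Return a list of problems found with a node's baremetal ports."""
    problems = []

    # Requirement 1: exactly one port may be PXE-enabled.
    pxe_ports = [p for p in ports if p.get("pxe_enabled")]
    if len(pxe_ports) != 1:
        problems.append(f"expected exactly 1 PXE port, found {len(pxe_ports)}")

    for port in ports:
        mac = port.get("address", "<unknown>")
        link = port.get("local_link_connection") or {}
        # Requirement 2 (assumed keys): switch_info holds the switch
        # hostname and port_id holds the switch interface name.
        if not link.get("switch_info") or not link.get("port_id"):
            problems.append(f"{mac}: local_link_connection is incomplete")
        # Requirement 3: physical_network carries the VLAN Group name.
        if not port.get("physical_network"):
            problems.append(f"{mac}: physical_network is not set")

    return problems
```

An empty list means the node's ports satisfy all three requirements.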

In addition, the baremetal node should have traits indicating the networks (VLAN Groups) to which it is connected. For example, if it has NICs connected to the STORAGE switch, we add the CUSTOM_STORAGE_SWITCH trait.
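As a rough illustration of how such traits could be derived, here is a sketch that turns a set of VLAN Group names into trait names. The `CUSTOM_<NAME>_SWITCH` pattern is inferred from the single example above, and `traits_for_vlan_groups` is a hypothetical helper, not the hook's code:

```python
import re


def traits_for_vlan_groups(vlan_groups):
    """Map VLAN Group names to CUSTOM_*_SWITCH trait names (assumed format)."""
    traits = set()
    for group in vlan_groups:
        # Custom traits may only contain A-Z, 0-9 and underscores, so
        # collapse any other run of characters into a single underscore.
        token = re.sub(r"[^A-Z0-9]+", "_", group.upper()).strip("_")
        if token:
            traits.add(f"CUSTOM_{token}_SWITCH")
    return sorted(traits)
```

Returning a sorted, de-duplicated list keeps the result stable when several NICs connect to the same VLAN Group.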

Switch uplink connections for the node are identified during node inspection using LLDP. The "agent" inspector does this today, and we are enhancing the out-of-band inspection to provide the same data (where the hardware permits).

This inspection hook is UnderStack-specific in that it assumes our switch hostnames follow a certain naming convention. However, the details of that convention are supplied as configuration.
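To make the idea of a configurable naming convention concrete, here is a sketch in which the text after the last hyphen of the switch's short hostname selects the VLAN Group. Both the convention and the mapping values are hypothetical; the real convention is whatever the deployment's configuration supplies:

```python
def vlan_group_for_switch(hostname, suffix_to_group):
    """Look up the VLAN Group for a switch from its hostname suffix.

    Hypothetical convention: the text after the last "-" in the short
    hostname identifies the switch role.
    """
    short_name = hostname.split(".", 1)[0].lower()
    suffix = short_name.rsplit("-", 1)[-1]
    try:
        return suffix_to_group[suffix]
    except KeyError:
        raise ValueError(
            f"switch {hostname!r} has unknown suffix {suffix!r}"
        ) from None


# Illustrative mapping only; these values are not taken from the PR.
EXAMPLE_MAPPING = {"1": "network", "1f": "storage"}
```

Raising on an unknown suffix, rather than guessing, surfaces cabling or naming mistakes during inspection instead of at provisioning time.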

We previously performed these activities as part of the "enrol" process, but doing them inside Ironic gives operators more visibility and lets them drive remediation or updates via the OpenStack API. For example, if a physical node had cabling issues during enrol, these can be resolved and the node can be "inspected" again to straighten out the baremetal ports without needing to trigger an external workflow or process.

Once this PR is done, we can remove those steps from the enrol process - see #1416

PREREQUISITES

Today when a node undergoes cleaning, provisioning or agent inspection, all but one of its ports are shut down. This defeats LLDP and prevents the inspection from seeing the link on the other ports. We need to change our network design/template so that during inspection all ports are UP (and talk LLDP) but don't have any other traffic. This includes ports already documented in ironic (being re-inspected) as well as ports that are currently unknown (ports being discovered and created in ironic for the first time).

NOTE: when these baremetal port updates occur, ironic emits events which should trigger a workflow to make corresponding changes in Nautobot.

@stevekeay stevekeay force-pushed the ironic-inspection-hook branch from 9ff91f9 to 21d33c6 Compare October 17, 2025 14:59
@cardoe cardoe changed the title Add update_baremetal_port ironic inspection hook feat(ironic): Add update_baremetal_port ironic inspection hook Oct 20, 2025
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 3 times, most recently from 30e1717 to 6038946 Compare October 21, 2025 09:41
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 5 times, most recently from 9949abc to 3b3481f Compare November 4, 2025 12:58
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 3 times, most recently from 1d0d819 to 68ce0f1 Compare November 24, 2025 12:52
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 15 times, most recently from 9006136 to 3ade42b Compare November 26, 2025 14:57
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 6 times, most recently from ed99ae7 to 6dc2467 Compare December 2, 2025 16:21

@cardoe cardoe left a comment

Two suggestions that I'm fine if they don't happen.

Comment on lines +34 to +47
extra = baremetal_port.extra
current_bios_name = extra.get("bios_name")

if current_bios_name != required_bios_name:
    LOG.info(
        "Port %(mac)s updating bios_name from %(old)s to %(new)s",
        {"mac": mac, "old": current_bios_name, "new": required_bios_name},
    )

    if required_bios_name:
        extra["bios_name"] = required_bios_name
    else:
        extra.pop("bios_name", None)

    baremetal_port.extra = extra
    baremetal_port.save()
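
The same decision can be expressed as a pure function, which makes it testable without an ironic port object. This is a sketch only; `updated_extra` is a hypothetical helper and not part of the PR:

```python
def updated_extra(extra, required_bios_name):
    """Return (new_extra, changed) for the desired bios_name value."""
    current = dict(extra or {})
    new_extra = dict(current)
    if required_bios_name:
        # Record the BMC-reported interface name for machine consumption.
        new_extra["bios_name"] = required_bios_name
    else:
        # No name is known any more, so drop a stale one if present.
        new_extra.pop("bios_name", None)
    return new_extra, new_extra != current
```

The caller would save the port only when `changed` is true, matching the guard in the reviewed lines.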
Contributor

We could also unconditionally set baremetal_port.description

Contributor Author

Description is cool; however, the name "description" suggests "human readable" to me, whereas this field is intended for a machine to consume. Parsing a description seems fragile because people might be tempted to add other "helpful" information in there.

Also, our openstack today doesn't seem to have a description field:

openstack --os-cloud=prod-infra baremetal port show 57951a4b-ae40-4e1a-9ff6-524e36ec976c
+-----------------------+---------------------------------------------------------+
| Field                 | Value                                                   |
+-----------------------+---------------------------------------------------------+
| address               | c4:cb:e1:d5:92:54                                       |
| created_at            | 2025-03-31T14:33:26+00:00                               |
| extra                 | {}                                                      |
| internal_info         | {}                                                      |
| is_smartnic           | False                                                   |
| local_link_connection | {}                                                      |
| name                  | 57951a4b-ae40-4e1a-9ff6-524e36ec976c NIC.Embedded.1-1-1 |
| node_uuid             | 11fe6307-3c25-47eb-911c-a470e6094913                    |
| physical_network      | None                                                    |
| portgroup_uuid        | None                                                    |
| pxe_enabled           | False                                                   |
| updated_at            | None                                                    |
| uuid                  | 57951a4b-ae40-4e1a-9ff6-524e36ec976c                    |
+-----------------------+---------------------------------------------------------+

Contributor

If it's intended for machines then I agree with the field in extra. As far as the field not showing up, upgrade your client.

@stevekeay stevekeay force-pushed the ironic-inspection-hook branch 2 times, most recently from 9e7dfa3 to 40c284e Compare December 3, 2025 13:06
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch from 40c284e to 089cacc Compare December 3, 2025 13:09
After out-of-band inspection we run port-bios-name.

The interface names returned by the BMC are generally more meaningful to
the data center than the Linux names we get in the agent inspection data.

After Agent Inspection we run update-baremetal-port.

This consumes the LLDP information, so it must run after agent
inspection.  Out-of-band inspection doesn't give us the full LLDP data.

The default ironic local-link-connection is removed because the new hook
updates the same fields.

We add "validate-interfaces" because that populates data we consume.
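
The ordering constraints described in this commit message can be expressed as a small check over a comma-separated hook list. This is a sketch under assumptions: `check_hook_order` is a hypothetical helper, and the requirement that "validate-interfaces" appears before "update-baremetal-port" is inferred from the statement that it populates data the new hook consumes:

```python
def check_hook_order(hooks):
    """Return a list of problems with a comma-separated hook string."""
    hook_list = [h.strip() for h in hooks.split(",") if h.strip()]
    problems = []

    if "update-baremetal-port" not in hook_list:
        return problems  # nothing to check without the new hook

    # Both hooks update the same port fields, so only one should run.
    if "local-link-connection" in hook_list:
        problems.append(
            "local-link-connection conflicts with update-baremetal-port"
        )

    # validate-interfaces populates data the new hook consumes
    # (assumed to mean it must appear earlier in the list).
    new_hook_index = hook_list.index("update-baremetal-port")
    if "validate-interfaces" not in hook_list[:new_hook_index]:
        problems.append(
            "validate-interfaces must run before update-baremetal-port"
        )

    return problems
```

For example, `check_hook_order("validate-interfaces,ports,update-baremetal-port")` reports no problems, while a list that places "update-baremetal-port" first reports the missing prerequisite.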
@stevekeay stevekeay force-pushed the ironic-inspection-hook branch from 089cacc to bb4bd7a Compare December 3, 2025 13:12
@stevekeay stevekeay added this pull request to the merge queue Dec 3, 2025
Merged via the queue into main with commit fe70029 Dec 3, 2025
46 checks passed
@stevekeay stevekeay deleted the ironic-inspection-hook branch December 3, 2025 13:13