Bifrost

From TeriaHowto
Revision as of 27 August 2023, 13:37

"The mission of Bifrost is to provide an easy path to deploy ironic in a stand-alone fashion".
One use case for Bifrost is provisioning bare-metal nodes for a new OpenStack cluster.

Installation

There are different ways to deploy Bifrost (cf https://docs.openstack.org/bifrost/latest/install/index.html), but the easiest (I think) is to use a dedicated (pre-built) container.

docker pull quay.io/openstack.kolla/bifrost-deploy:zed-rocky-9

docker run -it --net=host -v /dev:/dev -d \
--privileged --name bifrost_deploy \
quay.io/openstack.kolla/bifrost-deploy:zed-rocky-9

docker exec -it bifrost_deploy bash

Within the container :

mkdir -p /etc/bifrost
cat > /etc/bifrost/bifrost.yml << EOF
ansible_python_interpreter: /var/lib/kolla/venv/bin/python
enabled_hardware_types: "ipmi,redfish"
enabled_deploy_interfaces: "direct,ramdisk,anaconda"
cleaning: false
network_interface: "ens3"
mysql_username: "root"
mysql_password:
create_image_via_dib: false
dib_image_type: "vm"
create_ipa_image: false
dhcp_provider: "none"
dnsmasq_router: "<@IP_router>"
dnsmasq_dns_servers: "<@IP_nameserver>"
dnsmasq_ntp_servers: "<@IP_ntp_server>"
use_firewalld: false
default_boot_mode: "uefi"
dhcp_pool_start: "<@IP_dhcp_pool_start>"
dhcp_pool_end: "<@IP_dhcp_pool_end>"
dhcp_lease_time: "infinite"
dhcp_static_mask: "<netmask>"
EOF

cd /bifrost/playbooks
ansible-playbook -vvvv \
-i /bifrost/playbooks/inventory/target \
/bifrost/playbooks/install.yaml \
-e @/etc/bifrost/bifrost.yml \
-e skip_package_install=true

A few points of attention:

  • network_interface is the network interface of the host running the container
  • create_ipa_image is set to false in order to use the pre-built IPA (Ironic Python Agent) kernel / initramfs
  • use_firewalld is set to false here because, by default, enabling firewalld blocks SSH access to the host
  • dhcp_provider is set to none here because a static dnsmasq configuration is requested (cf bifrost-ironic-install README)

One last important point: all the machines (Bifrost host, bare metal nodes) must have synchronized clocks and use the same time zone (no mix of UTC and CEST, for example).
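The configuration above contains several <...> placeholders that must be replaced with real values before the install playbook runs. As a minimal sketch (the helper name check_placeholders is hypothetical, not part of Bifrost), the file can be scanned for leftovers:

```shell
# Hypothetical helper, not part of Bifrost: print every line of a file
# that still contains an unfilled <placeholder> such as <@IP_router>.
# Returns non-zero when at least one placeholder is left.
check_placeholders() {
  if grep -nE '<[^<>]+>' "$1"; then
    echo "unfilled placeholders found in $1" >&2
    return 1
  fi
  return 0
}

# Usage (inside the container):
#   check_placeholders /etc/bifrost/bifrost.yml
```

The check is purely textual: it flags any <...> sequence, so a legitimate value containing angle brackets would also be reported.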

Enroll node(s)

To enroll one or several nodes, an inventory is used.

cd /bifrost/playbooks
export OS_CLOUD=bifrost
export BIFROST_INVENTORY_SOURCE=/tmp/baremetal.json
ansible-playbook -vvvv -i inventory/ enroll-dynamic.yaml

Some examples of /tmp/baremetal.json are given below.
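JSON does not allow a comma after the last member of an object or array, and this is an easy mistake to make when editing an inventory by hand. The following is a rough, line-based sketch (the function name find_trailing_commas is an assumption, and it is a heuristic, not a JSON parser) that can be run on the inventory before enrolling:

```shell
# Hypothetical lint: report commas that are directly followed by a closing
# } or ], either on the same line or on the next non-blank line.
# Line-based heuristic only; a string value containing ",}" would be
# reported as well.
find_trailing_commas() {
  awk '
    /^[[:space:]]*$/ { next }                      # skip blank lines
    /,[[:space:]]*[]}]/ {                          # ",}" on one line
      printf "%s:%d: trailing comma\n", FILENAME, NR; bad = 1
    }
    prev_comma && /^[[:space:]]*[]}]/ {            # "," then "}" on next line
      printf "%s:%d: trailing comma\n", FILENAME, prev_nr; bad = 1
    }
    { prev_comma = ($0 ~ /,[[:space:]]*$/); prev_nr = NR }
    END { exit bad + 0 }
  ' "$1"
}

# Usage:
#   find_trailing_commas /tmp/baremetal.json && echo "no trailing comma found"
```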

The IPMI way

With a cloud image (AlmaLinux 8.7) and cloud-init

Create a JSON file (e.g. /tmp/baremetal.json) :

{
    "baremetal1": {
      "name": "baremetal1",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "<@IP_IPMI_BMC>",
        "ipmi_port": "<PORT_IPMI_BMC>",
        "ipmi_username": "<USER_IPMI_BMC>",
        "ipmi_password": "<PASSWORD_IPMI_BMC>"
      },
      "ipv4_address": "<@IP_node>",
      "ipv4_subnet_mask": "<netmask_node>",
      "ipv4_gateway": "<@IP_router>",
      "ipv4_nameserver": "<@IP_nameserver>",
      "inventory_dhcp": true,
      "nics": [
        {
          "mac": "<@MAC>"
        }
      ],
      "properties": {
        "cpu_arch": "x86_64"
      },
      "instance_info": {
        "image_source": "https://repo.almalinux.org/almalinux/8/cloud/x86_64/images/AlmaLinux-8-GenericCloud-8.7-20221111.x86_64.qcow2",
        "image_checksum": "b2b8c7fd3b6869362f3f8ed47549c804",
        "configdrive": {
          "meta_data": {
            "public_keys": {"0": "<SSH_PUBLIC_KEY_CONTENT>"},
            "hostname": "baremetal1.domain.ld"
          },
          "user_data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackages:\n  - git\n  - httpd\n"
        }
      }
    }
}
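The image_checksum value in this example is 32 hexadecimal characters long, which is the length of an MD5 digest. When another image is used, the checksum has to be recomputed. A minimal sketch, assuming the image has been downloaded locally and that md5sum (GNU coreutils) is available; the helper name image_md5 is hypothetical:

```shell
# Hypothetical helper: print only the MD5 hex digest of a file,
# ready to be pasted into "image_checksum".
image_md5() {
  md5sum "$1" | awk '{ print $1 }'
}

# Usage:
#   image_md5 AlmaLinux-8-GenericCloud-8.7-20221111.x86_64.qcow2
```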

The user_data string can be generated from a cloud-config file, for example:

cat > /tmp/cloud << EOF 
#cloud-config
package_update: true
package_upgrade: true
packages:
  - git
  - httpd
EOF
jq -Rs '.' /tmp/cloud
rm -f /tmp/cloud
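If jq is not installed on the host, the same kind of string can be produced with standard tools. This is only a sketch under simplifying assumptions: the function name json_escape_file is hypothetical, and only backslashes, double quotes and newlines are escaped (tabs and other control characters are not handled).

```shell
# Hypothetical fallback for "jq -Rs": print the content of a text file as
# one JSON string literal. Every line is terminated by "\n", including a
# last line that has no trailing newline in the file.
json_escape_file() {
  printf '"'
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' "$1" |
    while IFS= read -r line || [ -n "$line" ]; do
      printf '%s\\n' "$line"
    done
  printf '"\n'
}

# Usage:
#   json_escape_file /tmp/cloud
```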

ipv4_address, ipv4_subnet_mask, ipv4_gateway, ipv4_nameserver and inventory_dhcp are only needed if a preset IP configuration is required.

With anaconda (and kickstart)

Create a JSON file (e.g. /tmp/baremetal.json) :

{
    "baremetal1": {
      "name": "baremetal1",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "<@IP_IPMI_BMC>",
        "ipmi_port": "<PORT_IPMI_BMC>",
        "ipmi_username": "<USER_IPMI_BMC>",
        "ipmi_password": "<PASSWORD_IPMI_BMC>"
      },
      "ipv4_address": "<@IP_node>",
      "ipv4_subnet_mask": "<netmask_node>",
      "ipv4_gateway": "<@IP_router>",
      "ipv4_nameserver": "<@IP_nameserver>",
      "inventory_dhcp": true,
      "nics": [
        {
          "mac": "<@MAC>"
        }
      ],
      "properties": {
        "cpu_arch": "x86_64"
      },
      "instance_info": {
        "image_source": "http://mirror.rackspeed.de/almalinux/8/BaseOS/x86_64/os/",
        "kernel": "http://mirror.rackspeed.de/almalinux/8/BaseOS/x86_64/os/images/pxeboot/vmlinuz",
        "ramdisk": "http://mirror.rackspeed.de/almalinux/8/BaseOS/x86_64/os/images/pxeboot/initrd.img",
        "ks_template": "<kickstart_URL>" 
      }
    }
}

ks_template is a URL pointing to a kickstart file, which must contain the mandatory sections of the Ironic template (cf https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/ks.cfg.template)

The Redfish way

Create a JSON file (e.g. /tmp/baremetal.json) :

{
    "baremetal1": {
      "name": "baremetal1",
      "driver": "redfish",
      "driver_info": {
        "redfish_address": "http(s)://<@IP>:<PORT>",
        "redfish_system_id": "/redfish/v1/Systems/<UUID>",
        "redfish_username": "<USERNAME>",
        "redfish_password": "<PASSWORD>"
      },
      "ipv4_address": "<@IP_node>",
      "ipv4_subnet_mask": "<netmask_node>",
      "ipv4_gateway": "<@IP_router>",
      "ipv4_nameserver": "<@IP_nameserver>",
      "inventory_dhcp": true,
      "nics": [
        {
          "mac": "<@MAC>"
        }
      ],
      "properties": {
        "cpu_arch": "x86_64"
      },
      "instance_info": {
        "image_source": "https://repo.almalinux.org/almalinux/8/cloud/x86_64/images/AlmaLinux-8-GenericCloud-8.7-20221111.x86_64.qcow2",
        "image_checksum": "b2b8c7fd3b6869362f3f8ed47549c804",
        "configdrive": {
          "meta_data": {
            "public_keys": {"0": "<SSH_PUBLIC_KEY_CONTENT>"},
            "hostname": "baremetal1.domain.ld"
          },
          "user_data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackages:\n  - git\n  - httpd\n"
        }
      }
    }
}

Software RAID

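If software RAID is to be used, additional steps are required:

  • Set the RAID configuration (more information is available on the CERN tech blog, https://techblog.web.cern.ch/techblog/post/ironic_software_raid/, and in the official documentation, https://docs.openstack.org/ironic/latest/admin/raid.html):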
baremetal node set --raid-interface agent <NODE>
baremetal node set <NODE> --target-raid-config '{ "logical_disks": [ { "raid_level": "1", "size_gb": "MAX", "controller": "software", "is_root_volume": true } ]}'
  • Clean up and build the software RAID configuration of the node :
baremetal node manage <NODE>
baremetal node clean <NODE> --clean-steps '[{"interface": "raid", "step": "delete_configuration"}, {"interface": "deploy", "step": "erase_devices_metadata"}, {"interface": "raid", "step": "create_configuration"}]'
baremetal node provide <NODE>

Be careful: software RAID installation does not work with all generic cloud images. For example, the AlmaLinux generic cloud image does not support software RAID, whereas the Ubuntu generic cloud image does. Also, do not configure software RAID with Ironic when using the anaconda deployment: it is difficult to reuse an existing RAID configuration in a kickstart.
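The clean command starts a cleaning operation that takes some time, so it can be useful to wait until the node is back in the manageable state before running the provide command. The sketch below polls the provision state; the function name wait_for_state and its default retry values are assumptions, while the baremetal node show call with -f value -c provision_state is the regular client output formatting.

```shell
# Hypothetical helper: poll a node until it reaches the wanted provision
# state. Arguments: node, wanted state, number of tries (default 60),
# delay in seconds between tries (default 10).
wait_for_state() {
  wfs_node=$1
  wfs_wanted=$2
  wfs_tries=${3:-60}
  wfs_delay=${4:-10}
  while [ "$wfs_tries" -gt 0 ]; do
    wfs_state=$(baremetal node show "$wfs_node" -f value -c provision_state)
    if [ "$wfs_state" = "$wfs_wanted" ]; then
      return 0
    fi
    case $wfs_state in
      *failed*|error)
        # e.g. "clean failed" or "deploy failed": no point in waiting
        echo "node $wfs_node is in state: $wfs_state" >&2
        return 1
        ;;
    esac
    wfs_tries=$((wfs_tries - 1))
    sleep "$wfs_delay"
  done
  echo "timeout waiting for $wfs_node to become $wfs_wanted" >&2
  return 1
}

# Usage, between the clean and provide commands:
#   wait_for_state <NODE> manageable && baremetal node provide <NODE>
```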

Deploy node(s)

The direct way

The direct way means that a cloud image is used and written by the IPA to the first available disk of the node.

cd /bifrost/playbooks
export OS_CLOUD=bifrost
export BIFROST_INVENTORY_SOURCE=/tmp/baremetal.json
ansible-playbook -vvvv -i inventory/bifrost_inventory.py deploy-dynamic.yaml

The anaconda way

The anaconda way is an option for highly customized deployments, thanks to a custom kickstart. It works only with Red Hat based Linux distributions.

cd /bifrost/playbooks
export OS_CLOUD=bifrost
export BIFROST_INVENTORY_SOURCE=/tmp/baremetal.json
baremetal node set <NODE_NAME> --deploy-interface anaconda
ansible-playbook -vvvv -i inventory/bifrost_inventory.py deploy-dynamic.yaml \
-e network_interface=<BIFROST_HOST_NETWORK_INTERFACE> -e ssh_public_key_path=<SSH_PUBLIC_KEY_PATH>

network_interface and ssh_public_key_path are required by the playbook in the anaconda case in order to build and provide the configdrive (which may or may not be used, depending on the kickstart content).

Bonus: customize the AlmaLinux 8 Generic Cloud Image

As mentioned earlier, the stock AlmaLinux Generic Cloud Image does not support software RAID. But it is open source, so everything is possible ^^

Build environment

It is assumed that an AlmaLinux (or Rocky, or Ubuntu, ...) host is available. The following commands come from an AlmaLinux host.

yum -y install ansible qemu-kvm.x86_64
yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
yum install -y packer
cd /usr/src/ && git clone https://github.com/AlmaLinux/cloud-images.git

A host with access to VT-x (Intel) or AMD-V (AMD) CPU extensions (a bare metal node, or a VM on a hypervisor with nested virtualization enabled) is highly recommended to speed up the build process.
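On Linux, these extensions show up as the vmx (Intel) or svm (AMD) CPU flags in /proc/cpuinfo. A minimal sketch to check for them (the function name has_hw_virt is hypothetical; the optional argument only exists so that another file can be inspected):

```shell
# Hypothetical helper: succeed if the vmx (Intel VT-x) or svm (AMD-V)
# CPU flag is listed. Reads /proc/cpuinfo unless another file is given.
has_hw_virt() {
  grep -Eq '(^|[[:space:]])(vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Usage:
#   has_hw_virt && echo "hardware virtualization available"
```

Inside a VM, the flag is only visible when nested virtualization is enabled on the hypervisor.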

Customization

  • (optional) Edit variables.pkr.hcl to use a fast AlmaLinux mirror
sed -i 's#https://repo.almalinux.org#http://mirror.rackspeed.de#g' variables.pkr.hcl
  • Patch http/almalinux-8.gencloud-x86_64.ks
--- a/http/almalinux-8.gencloud-x86_64.ks
+++ b/http/almalinux-8.gencloud-x86_64.ks
@@ -16,14 +16,24 @@ timezone UTC --isUtc
 network --bootproto=dhcp
 firewall --enabled --service=ssh
 services --disabled="kdump" --enabled="chronyd,rsyslog,sshd"
 selinux --enforcing
 
-# TODO: remove "console=tty0" from here
-bootloader --append="console=ttyS0,115200n8 console=tty0 crashkernel=auto net.ifnames=0 no_timer_check" --location=mbr --timeout=1
-zerombr
-clearpart --all --initlabel
-reqpart
-part / --fstype="xfs" --size=8000
+bootloader --append="console=ttyS0,115200n8 console=tty0 crashkernel=auto net.ifnames=0 no_timer_check rd.auto=1 rd.md=1 rd.md.conf=1" --location=mbr --timeout=1
+
+%pre --erroronfail
+
+parted -s -a optimal /dev/sda -- mklabel gpt
+parted -s -a optimal /dev/sda -- mkpart biosboot 1MiB 2MiB set 1 bios_grub on
+parted -s -a optimal /dev/sda -- mkpart '"EFI System Partition"' fat32 2MiB 202MiB set 2 esp on
+parted -s -a optimal /dev/sda -- mkpart boot xfs 202MiB 714MiB
+parted -s -a optimal /dev/sda -- mkpart root xfs 714MiB 100%
+
+%end
+
+part biosboot --fstype=biosboot --onpart=sda1
+part /boot/efi --fstype=efi --onpart=sda2
+part /boot --fstype=xfs --onpart=sda3
+part / --fstype=xfs --onpart=sda4
 
 rootpw --plaintext almalinux
 
@@ -32,6 +42,7 @@ reboot --eject
 
 %packages
 @core
+mdadm
 -biosdevname
 -open-vm-tools
 -plymouth

A few points of attention:

  • The rd.auto=1 rd.md=1 rd.md.conf=1 parameters on the bootloader line tell the initramfs (dracut) to look for and assemble software RAID arrays during the boot process.
  • The %pre section partitions the disk explicitly (GPT label, BIOS boot partition, EFI System Partition, /boot and /) so that the image supports UEFI.
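Before launching the build, a quick sanity check can confirm that the patch was applied to the kickstart. The function below is only a sketch with a hypothetical name (check_raid_kickstart); it looks for the three boot arguments on the bootloader line and for an mdadm line anywhere in the file, without verifying that this line really sits in the %packages section.

```shell
# Hypothetical check: verify that a kickstart carries the software RAID
# boot arguments and lists the mdadm package.
check_raid_kickstart() {
  crk_file=$1
  crk_rc=0
  for crk_arg in rd.auto=1 rd.md=1 rd.md.conf=1; do
    if ! grep -E '^bootloader ' "$crk_file" | grep -qF -- "$crk_arg"; then
      echo "missing boot argument: $crk_arg" >&2
      crk_rc=1
    fi
  done
  if ! grep -qx 'mdadm' "$crk_file"; then
    echo "mdadm package line not found" >&2
    crk_rc=1
  fi
  return "$crk_rc"
}

# Usage (from the cloud-images checkout):
#   check_raid_kickstart http/almalinux-8.gencloud-x86_64.ks
```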

Build !

/usr/bin/packer init -upgrade .
/usr/bin/packer build -var qemu_binary="/usr/libexec/qemu-kvm" -only=qemu.almalinux-8-gencloud-uefi-x86_64 .