Automated kolla-ansible deployment of OpenStack + GPU passthrough
Original article date: April 25, 2021

You are welcome to join QQ group 1026880196 to discuss and learn together.

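1. Configuring GPU passthrough for virtual machines on CentOS 7.x-8.x

1. Edit /etc/modules (vim /etc/modules) and add the following modules. kvm_intel applies to Intel hosts; on AMD hosts the counterpart is kvm_amd.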
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

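2. Enable IOMMU on the KVM host by adding the parameter for your CPU to the kernel command line in /etc/default/grub. The example file that follows appends the Intel parameter to GRUB_CMDLINE_LINUX.

# For Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

# For AMD CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

    vim /etc/default/grub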

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet intel_iommu=on"
GRUB_DISABLE_RECOVERY="true"
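
The edit to /etc/default/grub shown above can also be scripted. The following is a minimal sketch, not the author's method: add_kernel_param is a hypothetical helper name, and it assumes the variable is written on one line with a double-quoted value, as in the example file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a kernel parameter to a GRUB variable
# (for example GRUB_CMDLINE_LINUX) unless it is already there.
add_kernel_param() {
    local file="$1" var="$2" param="$3"
    # Skip the edit when the parameter already appears in the variable's value
    # (plain substring match, which is enough for this sketch).
    if grep -q "^${var}=\"[^\"]*${param}" "$file"; then
        return 0
    fi
    # Insert " <param>" just before the closing quote; keep a .bak copy.
    sed -i.bak "s|^\(${var}=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}
```

A possible call on an Intel host would be add_kernel_param /etc/default/grub GRUB_CMDLINE_LINUX intel_iommu=on. Running it twice leaves the line unchanged, so it is safe to repeat.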

3. Regenerate the GRUB configuration.
   On EFI systems:
    grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg

   On non-EFI (BIOS) systems:
    grub2-mkconfig -o /boot/grub2/grub.cfg
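
If it is unclear which of the two paths applies, the boot mode can be detected: a system booted through EFI exposes the directory /sys/firmware/efi. The sketch below picks the CentOS path from the step above on that basis. grub_cfg_path is a hypothetical name, and its optional argument (an alternative sysfs root) exists only so the function can be tried without touching the real system.

```shell
#!/usr/bin/env bash
# Print the grub.cfg path that matches the current boot mode (CentOS paths).
# $1 is an optional, hypothetical override of the sysfs root for testing.
grub_cfg_path() {
    local sysfs="${1:-/sys}"
    if [ -d "$sysfs/firmware/efi" ]; then
        echo /boot/efi/EFI/centos/grub.cfg   # booted via EFI
    else
        echo /boot/grub2/grub.cfg            # legacy BIOS boot
    fi
}
```

It could then be used as grub2-mkconfig -o "$(grub_cfg_path)".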

4. Blacklist the following modules so that the host does not claim the GPU. Edit the file:
    vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia
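
A blacklist entry only takes effect the next time the module would be loaded, so it can help to check whether any listed module is still loaded on the running host. The following is a rough sketch under that assumption; loaded_blacklisted is a hypothetical helper that compares the blacklist file with lsmod output supplied on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every module that is blacklisted in the given
# file but still appears in the lsmod output read from stdin.
loaded_blacklisted() {
    local conf="$1" lsmod_out mod
    lsmod_out=$(cat)
    # The second field of each "blacklist <module>" line is the module name.
    awk '$1 == "blacklist" { print $2 }' "$conf" | while read -r mod; do
        # lsmod prints the module name in its first column.
        if printf '%s\n' "$lsmod_out" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            echo "$mod"
        fi
    done
}
```

Example use: lsmod | loaded_blacklisted /etc/modprobe.d/blacklist.conf. No output means none of the blacklisted modules is currently loaded.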

5. Find the GPU's vendor ID and product (device) ID:
    yum install pciutils -y
    lspci -nn | grep NVIDIA

Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Create /etc/modprobe.d/vfio.conf:
    vim /etc/modprobe.d/vfio.conf

In the new file, set ids= to the [vendor-ID:device-ID] pairs reported by lspci, separated by commas:

options vfio-pci ids=10de:1bb1,10de:10f0
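
Copying the IDs by hand is error-prone, so the ids= value can be derived from the lspci -nn output instead. This is a small sketch; vfio_ids is a hypothetical name, and it relies on lspci printing the IDs as lowercase hex in square brackets, as in the output shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lspci -nn` lines on stdin and print the
# vendor:device pairs as one comma-separated list.
vfio_ids() {
    # Keep only bracketed pairs such as [10de:1bb1]; class codes like [0300]
    # contain no colon and are therefore skipped.
    grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' |
        sed 's/^\[//; s/\]$//' |   # drop the surrounding brackets
        awk '!seen[$0]++' |        # remove duplicates, keep first-seen order
        paste -sd, -               # join the remaining lines with commas
}
```

For example, echo "options vfio-pci ids=$(lspci -nn | grep NVIDIA | vfio_ids)" would print a line in the same form as the one above.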

7. Load vfio-pci at boot:
    echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs, keeping a backup of the current image:
    mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
    dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

9. Reboot the system:
    reboot

10. Verify that the GPU is bound to vfio-pci:
    lspci -nnk -d 10de:1bb1
    dmesg | grep -i vfio

[root@stein-a ~]# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:11a3]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
[root@stein-a ~]# dmesg | grep -i vfio
[ 2.503115] VFIO - User Level meta-driver version: 0.3
[ 2.515645] vfio_pci: add [10de:1bb1[ffff:ffff]] class 0x000000/00000000
[ 2.515752] vfio_pci: add [10de:10f0[ffff:ffff]] class 0x000000/00000000
[root@stein-a ~]#
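
The manual check above comes down to reading the "Kernel driver in use" line. For repeated checks across several hosts it can be scripted; the sketch below uses two hypothetical helper names and assumes the lspci -nnk output format shown above.

```shell
#!/usr/bin/env bash
# Print the driver name from the "Kernel driver in use:" line(s) of
# `lspci -nnk` output read from stdin.
kernel_driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Succeed only if the first device in the input is bound to vfio-pci.
is_bound_to_vfio() {
    [ "$(kernel_driver_in_use | head -n 1)" = "vfio-pci" ]
}
```

Example use: lspci -nnk -d 10de:1bb1 | is_bound_to_vfio && echo "GPU is bound to vfio-pci".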

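2. Configuring GPU passthrough for virtual machines on Ubuntu 18.04

1. Edit /etc/modules (vim /etc/modules) and add the following modules. kvm_intel applies to Intel hosts; on AMD hosts the counterpart is kvm_amd.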
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel

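2. Enable IOMMU on the KVM host by adding the parameter for your CPU to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The example file that follows is for an Intel host.

# For Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"

# For AMD CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt iommu=1"

    vim /etc/default/grub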

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
GRUB_CMDLINE_LINUX=""

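3. Regenerate the GRUB configuration. On Ubuntu the command is update-grub, a wrapper around grub-mkconfig -o /boot/grub/grub.cfg, and it is the same for EFI and non-EFI systems. The grub2-mkconfig command and the /boot/efi/EFI/centos path belong to CentOS and do not apply here.
    update-grub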

4. Blacklist the following modules so that the host does not claim the GPU. Edit the file:
    vim /etc/modprobe.d/blacklist.conf
blacklist snd_hda_intel
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nvidia

5. Find the GPU's vendor ID and product (device) ID:
    apt install pciutils -y
    lspci -nn | grep NVIDIA

Example output:
[root@stein-a ~]# lspci -nn | grep NVIDIA
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
03:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

6. Create /etc/modprobe.d/vfio.conf:
    vim /etc/modprobe.d/vfio.conf

In the new file, set ids= to the [vendor-ID:device-ID] pairs reported by lspci, separated by commas:

options vfio-pci ids=10de:1bb1,10de:10f0

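7. Load vfio-pci at boot:
    echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf

8. Regenerate the initramfs. Ubuntu uses initramfs-tools by default, so use update-initramfs rather than dracut:
    update-initramfs -u

9. Reboot the system:
    reboot

10. Verify that the GPU is bound to vfio-pci:
    lspci -nnk -d 10de:1bb1
    dmesg | grep -i vfio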

root@kvm:~# lspci -nnk -d 10de:1bb1
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104GL [Quadro P4000] [10de:1bb1] (rev a1)
Subsystem: NVIDIA Corporation GP104GL [Quadro P4000] [10de:11a3]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
root@kvm:~# dmesg | grep -i vfio
[ 3.838714] VFIO - User Level meta-driver version: 0.3
[ 3.846238] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 3.866370] vfio_pci: add [10de:1bb1[ffffffff:ffffffff]] class 0x000000/00000000
[ 3.886375] vfio_pci: add [10de:10f0[ffffffff:ffffffff]] class 0x000000/00000000
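
Besides the driver binding, it is worth confirming that IOMMU groups exist at all, because the kernel only creates entries under /sys/kernel/iommu_groups when the IOMMU is active, and devices that share a group generally have to be passed through together. The sketch below lists each group with its devices. list_iommu_groups is a hypothetical name, and the optional argument (an alternative base directory) is there only for dry runs.

```shell
#!/usr/bin/env bash
# Print "group <n>: <pci-address>" for every device in every IOMMU group.
# $1 is an optional, hypothetical override of the sysfs directory.
list_iommu_groups() {
    local base="${1:-/sys/kernel/iommu_groups}" dev group
    for dev in "$base"/*/devices/*; do
        [ -e "$dev" ] || continue   # the glob matched nothing: no groups found
        group=${dev#"$base"/}       # strip the base directory
        group=${group%%/*}          # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

Running list_iommu_groups | grep 0000:03:00 on the host above should show the GPU and its audio function. Empty output suggests the IOMMU is not enabled, in which case the GRUB parameter and firmware settings are the first things to recheck.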

3. Installing the NVIDIA GPU driver on CentOS 7.x

1. Check whether the machine has an NVIDIA GPU:
    lspci | grep -i NVIDIA

# The output below shows one NVIDIA GPU
[root@train-all ~]# lspci | grep -i NVIDIA
04:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Quadro P4000] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
[root@train-all ~]#

2. Import the ELRepo GPG key:
    rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

3. Install the ELRepo repository:
    rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

4. Install nvidia-detect:
    yum install nvidia-detect -y

5. Run nvidia-detect:
    nvidia-detect -v

6. Search for the driver package:
    yum search kmod-nvidia

7. Install the driver:
    yum install kmod-nvidia.x86_64 -y

8. Check whether Nouveau is disabled:
    lsmod | grep nouveau
    # No output means Nouveau is disabled; otherwise carry out the next step

9. Create the file /etc/modprobe.d/blacklist-nouveau.conf:
    vi /etc/modprobe.d/blacklist-nouveau.conf

Add the following lines:
blacklist nouveau
options nouveau modeset=0
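
The Nouveau check and the blacklist file can be combined so that the file is written only once. This is a minimal sketch; ensure_nouveau_blacklisted is a hypothetical helper, and the path it defaults to is the one used in the step above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the two Nouveau lines to the blacklist file
# unless "blacklist nouveau" is already present in it.
ensure_nouveau_blacklisted() {
    local conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    if ! grep -qx 'blacklist nouveau' "$conf" 2>/dev/null; then
        printf 'blacklist nouveau\noptions nouveau modeset=0\n' >> "$conf"
    fi
}
```

One way to call it only when Nouveau is actually loaded is lsmod | grep -q '^nouveau ' && ensure_nouveau_blacklisted. The initramfs still has to be regenerated and the system rebooted afterwards, as in the steps that follow.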

10. Regenerate the kernel initramfs:
    dracut --force

11. Reboot the system:
    reboot

12. Test the driver:
    nvidia-smi
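
For scripted checks, nvidia-smi can print selected fields as CSV through its --query-gpu and --format options. The sketch below turns that output into one summary line per GPU. format_gpu_report is a hypothetical helper, and it assumes the two-field query shown in the usage note.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name, driver_version" CSV lines on stdin and
# print one readable line per GPU.
format_gpu_report() {
    local name driver
    while IFS=, read -r name driver; do
        [ -n "$name" ] || continue   # skip empty lines
        # nvidia-smi separates fields with ", ", so drop the leading space.
        echo "GPU: ${name} (driver ${driver# })"
    done
}
```

Example use: nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | format_gpu_report.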
