[kubernetes]二进制部署k8s集群-基于containerd
阅读原文时间:2023年08月18日阅读:13

k8s从1.24版本开始不再直接支持docker,但可以自行调整相关配置,实现1.24版本后的k8s还能调用docker。其实docker自身也是调用containerd,与其k8s通过docker再调用containerd,不如k8s直接调用containerd,以减少性能损耗。

除了containerd,比较流行的容器运行时还有podman,但是podman官方安装文档要么用包管理器在线安装,要么用包管理器下载一堆依赖再编译安装,内网离线环境下安装可能会比较麻烦,而containerd的安装包是静态二进制文件,解压后就能直接使用,离线环境下相对方便一点。

本文将在内网离线环境下用二进制文件部署一个三节点集群+harbor镜像仓库。集群中部署了三个apiserver,并配置nginx反向代理,提升master的高可用性(如对高可用有进一步要求,可以再加个keepalive)。

相关软件信息:

名称

版本

说明

containerd

cri-containerd-cni-1.7.2-linux-amd64

容器运行时

harbor

2.8.2

容器镜像仓库

etcd

3.4.24

键值对数据库

kubernetes

1.26.6

容器编排系统

nginx

1.25.1

负载均衡,反向代理apiserver

服务器信息:

IP

操作系统

硬件配置

Hostname

说明

192.168.3.31

Debian 11.6 amd64

4C4G

k8s31

nginx+etcd+master+node

192.168.3.32

Debian 11.6 amd64

4C4G

k8s32

etcd+master+node

192.168.3.33

Debian 11.6 amd64

4C4G

k8s33

etcd+master+node

192.168.3.43

Debian 11.6 amd64

4C4G

harbor,内网域名registry.atlas.cn

初始化部分需要三台k8s节点主机都执行, 根据实际情况修改参数。

  1. 修改主机名

    3.31服务器

    hostnamectl set-hostname k8s31

    3.32服务器

    hostnamectl set-hostname k8s32

    3.33服务器

    hostnamectl set-hostname k8s33

  2. 修改/etc/hosts文件,增加以下配置。

    192.168.3.31 k8s31
    192.168.3.32 k8s32
    192.168.3.33 k8s33

  3. 配置时间同步服务

    1. 安装chrony时间同步应用

    apt install -y chrony

    2. 添加内网的ntp服务器地址。如果在公网,可配置阿里云的ntp服务器地址:ntp.aliyun.com

    echo 'server 192.168.3.41 iburst' > /etc/chrony/sources.d/custom-ntp-server.sources

    3. 启动

    systemctl start chrony

    如果已经启动过了, 可以热加载配置: chronyc reload sources

  4. 关闭swap。如果安装系统时创建了swap,则需要关闭。

    swapoff -a

  5. 装载内核模块

    添加配置

    cat << EOF > /etc/modules-load.d/containerd.conf
    overlay
    br_netfilter
    EOF

    立即装载

    modprobe overlay
    modprobe br_netfilter

    检查装载。如果没有输出结果说明没有装载。

    lsmod | grep br_netfilter

  6. 配置系统参数

    1. 添加配置文件

    cat << EOF > /etc/sysctl.d/k8s-sysctl.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    user.max_user_namespaces=28633
    vm.swappiness = 0
    EOF

    2. 配置生效

    sysctl -p /etc/sysctl.d/k8s-sysctl.conf

  7. 启用ipvs。编写system配置文件,实现开机自动装载到内核。

    1. 安装依赖

    apt install -y ipset ipvsadm

    2. 新建文件并添加配置, 文件路径可任意

    tee /root/scripts/k8s.sh <<EOF
    modprobe -- ip_vs
    modprobe -- ip_vs_rr
    modprobe -- ip_vs_wrr
    modprobe -- ip_vs_sh
    modprobe -- nf_conntrack
    EOF

    3. 新建system文件, 实现开机执行脚本

    cat << EOF > /etc/systemd/system/myscripts.service
    [Unit]
    Description=Run a Custom Script at Startup
    After=default.target

    [Service]
    ExecStart=sh /root/scripts/k8s.sh

    [Install]
    WantedBy=default.target
    EOF

    4. 加载并启用system

    systemctl daemon-reload
    systemctl enable myscripts

由于部署集群时候需要先拉取一些镜像,所以harbor在集群外的一个服务器单独部署。官方安装脚本使用了docker,所以harbor节点需要先安装docker,安装步骤可参考 博客园 - linux离线安装docker与compose

harbor的安装步骤可参考 博客园 - centos离线安装harbor,这里大致写一下。

harbor的github release GitHub - goharbor/harbor/releases 提供离线安装包,要先下载好,然后解压。

  1. 创建ssl证书

    mkdir certs

    创建服务器证书密钥文件harbor.key

    openssl genrsa -des3 -out harbor.key 2048

    输入密码,确认密码,自己随便定义,但是要记住,后面会用到。

    创建服务器证书的申请文件harbor.csr

    openssl req -new -key harbor.key -out harbor.csr

    输入密钥文件的密码, 然后一路回车

    备份一份服务器密钥文件

    cp harbor.key harbor.key.org

    去除文件口令

    openssl rsa -in harbor.key.org -out harbor.key

    输入密钥文件的密码

    创建一个自当前日期起为期十年的证书 harbor.crt

    openssl x509 -req -days 3650 -in harbor.csr -signkey harbor.key -out harbor.crt

  2. 修改配置文件 harbor.yml,仅列出自修改项。数据存储目录和日志目录自定义了。

    hostname: 192.168.3.43

    certificate: /home/atlas/apps/harbor/certs/harbor.crt
    private_key: /home/atlas/apps/harbor/certs/harbor.key

    admin用户登录密码

    harbor_admin_password: Harbor2023

    数据卷目录

    data_volume: /home/atlas/apps/harbor/data

    日志目录

    location: /home/atlas/apps/harbor/logs/

  3. 执行安装脚本

    ./install.sh

  4. 浏览器访问 https://192.168.3.43 ,测试能否正常登录访问harbor。

  5. 从GitHub https://github.com/containerd/containerd/releases 下载二进制包

  6. 解压压缩包

    tar xf cri-containerd-cni-1.7.2-linux-amd64.tar.gz -C /

  7. 生成 containerd 配置文件

    mkdir /etc/containerd
    containerd config default > /etc/containerd/config.toml

  8. 编辑 /etc/containerd/config.toml ,修改以下内容

    修改数据存储目录

    root = "/home/apps/containerd"

    对于使用systemd作为init system的linux发行版,官方建议用systemd作为容器cgroup driver

    false改成true

    SystemdCgroup = true

    修改pause镜像下载地址,这里用的是内网域名地址

    sandbox_image = "registry.atlas.cn/public/pause:3.9"

    私有harbor的连接信息

    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn"]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn".tls]
    insecure_skip_verify = true
    [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn".auth]
    username = "admin"
    password = "Harbor2023"

  9. 重加载systemd配置,启动containerd

    systemctl daemon-reload
    systemctl start containerd

后面的k8s和etcd集群都会用到ca证书。如果组织能提供统一的CA认证中心,则直接使用组织颁发的CA证书即可。如果没有统一的CA认证中心,则可以通过颁发自签名的CA证书来完成安全配置。这里自行生成一个ca证书。

# 生成私钥文件ca.key
openssl genrsa -out ca.key 2048
# 根据私钥文件生成根证书文件ca.crt
# /CN为master的主机名或IP地址
# days为证书的有效期
openssl req -x509 -new -nodes -key ca.key -subj "/CN=192.168.3.31" -days 36500 -out ca.crt

# 拷贝ca证书到/etc/kubernetes/pki
mkdir -p /etc/kubernetes/pki
cp ca.crt ca.key /etc/kubernetes/pki/

部署一个三节点etcd集群,集群间使用https协议加密通信。etcd的安装包可以从官网下载,下载后解压,将压缩包中的etcdetcdctl放到/usr/local/bin目录。

  1. 编辑文件etcd_ssl.cnf。IP地址为etcd节点。

    [ req ]
    req_extensions = v3_req
    distinguished_name = req_distinguished_name

    [ req_distinguished_name ]

    [ v3_req ]

    basicConstraints = CA:FALSE
    keyUsage = nonRepudiation, digitalSignature, keyEncipherment
    subjectAltName = @alt_names

    [ alt_names ]
    IP.1 = 192.168.3.31
    IP.2 = 192.168.3.32
    IP.3 = 192.168.3.33

  2. 创建etcd服务端证书

    openssl genrsa -out etcd_server.key 2048
    openssl req -new -key etcd_server.key -config etcd_ssl.cnf -subj "/CN=etcd-server" -out etcd_server.csr
    openssl x509 -req -in etcd_server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 3650 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_server.crt

  3. 创建etcd客户端证书

    openssl genrsa -out etcd_client.key 2048
    openssl req -new -key etcd_client.key -config etcd_ssl.cnf -subj "/CN=etcd-client" -out etcd_client.csr
    openssl x509 -req -in etcd_client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 3650 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_client.crt

  4. 编辑etcd的配置文件。注意,各节点的ETCD_NAME和监听地址不一样,ip和证书文件路径要根据实际来修改。以下示例为192.168.3.31的etcd配置

    ETCD_NAME=etcd1
    ETCD_DATA_DIR=/home/atlas/apps/etcd/data

    ETCD_CERT_FILE=/home/atlas/apps/etcd/certs/etcd_server.crt
    ETCD_KEY_FILE=/home/atlas/apps/etcd/certs/etcd_server.key
    ETCD_TRUSTED_CA_FILE=/home/atlas/apps/kubernetes/certs/ca.crt
    ETCD_CLIENT_CERT_AUTH=true
    ETCD_LISTEN_CLIENT_URLS=https://192.168.3.31:2379
    ETCD_ADVERTISE_CLIENT_URLS=https://192.168.3.31:2379

    ETCD_PEER_CERT_FILE=/home/atlas/apps/etcd/certs/etcd_server.crt
    ETCD_PEER_KEY_FILE=/home/atlas/apps/etcd/certs/etcd_server.key
    ETCD_PEER_TRUSTED_CA_FILE=/home/atlas/apps/kubernetes/certs/ca.crt
    ETCD_LISTEN_PEER_URLS=https://192.168.3.31:2380
    ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.3.31:2380

    ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
    ETCD_INITIAL_CLUSTER="etcd1=https://192.168.3.31:2380,etcd2=https://192.168.3.32:2380,etcd3=https://192.168.3.33:2380"
    ETCD_INITIAL_CLUSTER_STATE=new

  5. 编辑/etc/systemd/system/etcd.service,注意根据实际修改配置文件和etcd二进制文件的路径

    [Unit]
    Description=etcd key-value store
    Documentation=https://github.com/etcd-io/etcd
    After=network.target

    [Service]
    EnvironmentFile=/home/atlas/apps/etcd/conf/etcd.conf
    ExecStart=/usr/local/bin/etcd
    Restart=always

    [Install]
    WantedBy=multi-user.target

  6. 加载systemd配置,启动etcd

    systemctl daemon-reload
    systemctl start etcd

  7. 验证集群是否部署成功。注意根据实际修改证书文件路径和etcd节点的IP与端口

    etcdctl --cacert=/etc/kubernetes/pki/ca.crt --cert=/home/atlas/apps/etcd/certs/etcd_client.crt --key=/home/atlas/apps/etcd/certs/etcd_client.key --endpoints=https://192.168.3.31:2379,https://192.168.3.32:2379,https://192.168.3.33:2379 endpoint health

如果集群部署成功,应该有如下类似输出

https://192.168.3.33:2379 is healthy: successfully committed proposal: took = 27.841376ms
https://192.168.3.32:2379 is healthy: successfully committed proposal: took = 29.489289ms
https://192.168.3.31:2379 is healthy: successfully committed proposal: took = 35.703538ms

k8s的二进制文件安装包可以从github下载:https://github.com/kubernetes/kubernetes/releases

在changelog中找到二进制包的下载链接,下载server binary即可,里面包含了master和node的二进制文件。

解压后将其中的二进制文件挪到 /usr/local/bin

6.1 安装apiserver

  1. 编辑master_ssl.cnf。DNS.5 ~ DNS.7为三台服务器的主机名,另行设置/etc/hosts。IP.1为Master Service虚拟服务的Cluster IP地址,IP.2 ~ IP.4为apiserver的服务器IP

    [req]
    req_extensions = v3_req
    distinguished_name = req_distinguished_name
    [req_distinguished_name]

    [ v3_req ]
    basicConstraints = CA:FALSE
    keyUsage = nonRepudiation, digitalSignature, keyEncipherment
    subjectAltName = @alt_names

    [alt_names]
    DNS.1 = kubernetes
    DNS.2 = kubernetes.default
    DNS.3 = kubernetes.default.svc
    DNS.4 = kubernetes.default.svc.cluster.local
    DNS.5 = k8s31
    DNS.6 = k8s32
    DNS.7 = k8s33
    IP.1 = 169.169.0.1
    IP.2 = 192.168.3.31
    IP.3 = 192.168.3.32
    IP.4 = 192.168.3.33

  2. 生成证书文件

    openssl genrsa -out apiserver.key 2048
    openssl req -new -key apiserver.key -config master_ssl.cnf -subj "/CN=192.168.3.31" -out apiserver.csr

    ca.crt和ca.key是 "2. openssl生成证书"中的两个证书文件

    openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt

  3. 使用cfssl创建sa.pub和sa-key.pem。cfssl和cfssljson可以从github下载

    cat< sa-csr.json
    {
    "CN":"sa",
    "key":{
    "algo":"rsa",
    "size":2048
    },
    "names":[
    {
    "C":"CN",
    "L":"BeiJing",
    "ST":"BeiJing",
    "O":"k8s",
    "OU":"System"
    }
    ]
    }
    EOF

    cfssl和cfssljson可自行在GitHub搜索下载

    cfssl gencert -initca sa-csr.json | cfssljson -bare sa -

    openssl x509 -in sa.pem -pubkey -noout > sa.pub

  4. 编辑kube-apiserver的配置文件,注意根据实际情况修改文件路径和etcd地址

    KUBE_API_ARGS="--secure-port=6443 <br /> --tls-cert-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.crt <br /> --tls-private-key-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.key <br /> --client-ca-file=/home/atlas/apps/kubernetes/certs/ca.crt <br /> --service-account-issuer=https://kubernetes.default.svc.cluster.local <br /> --service-account-key-file=/home/atlas/apps/kubernetes/certs/sa.pub <br /> --service-account-signing-key-file=/home/atlas/apps/kubernetes/certs/sa-key.pem <br /> --apiserver-count=3 --endpoint-reconciler-type=master-count <br /> --etcd-servers=https://192.168.3.31:2379,https://192.168.3.32:2379,https://192.168.3.33:2379 <br /> --etcd-cafile=/home/atlas/apps/kubernetes/certs/ca.crt <br /> --etcd-certfile=/home/atlas/apps/etcd/certs/etcd_client.crt <br /> --etcd-keyfile=/home/atlas/apps/etcd/certs/etcd_client.key <br /> --service-cluster-ip-range=169.169.0.0/16 <br /> --service-node-port-range=30000-32767 <br /> --allow-privileged=true <br /> --audit-log-maxsize=100 <br /> --audit-log-maxage=15 <br /> --audit-log-path=/home/atlas/apps/kubernetes/apiserver/logs/apiserver.log --v=2"

  5. 编辑service文件。/etc/systemd/system/kube-apiserver.service

    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=/home/atlas/apps/kubernetes/apiserver/conf/apiserver
    ExecStart=/usr/local/bin/kube-apiserver $KUBE_API_ARGS
    Restart=always

    [Install]
    WantedBy=multi-user.target

  6. 加载service文件,启动kube-apiserver

    systemctl daemon-reload
    systemctl start kube-apiserver

  7. 生成客户端证书

    openssl genrsa -out client.key 2048

    /CN的名称用于标识连接apiserver的客户端用户名称

    openssl req -new -key client.key -subj "/CN=admin" -out client.csr
    openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -days 36500

  8. 创建客户端连接apiserver所需的kubeconfig配置文件。其中server为nginx监听地址。注意根据实际修改配置

    apiVersion: v1
    kind: Config
    clusters:

    • name: default
      cluster:
      server: https://192.168.3.31:9443
      certificate-authority: /home/atlas/apps/kubernetes/certs/ca.crt
      users:
    • name: admin
      user:
      client-certificate: /home/atlas/apps/kubernetes/apiserver/certs/client.crt
      client-key: /home/atlas/apps/kubernetes/apiserver/certs/client.key
      contexts:
    • context:
      cluster: default
      user: admin
      name: default
      current-context: default

6.2 安装kube-controller-manager

  1. 编辑配置文件 /home/atlas/apps/kubernetes/controller-manager/conf/env

    KUBE_CONTROLLER_MANAGER_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig <br /> --leader-elect=true <br /> --service-cluster-ip-range=169.169.0.0/16 <br /> --service-account-private-key-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.key <br /> --root-ca-file=/home/atlas/apps/kubernetes/certs/ca.crt <br /> --v=0"

  2. 编辑service文件/etc/systemd/system/kube-controller-manager.service

    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=/home/atlas/apps/kubernetes/controller-manager/conf/env
    ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_ARGS
    Restart=always

    [Install]
    WantedBy=multi-user.target

  3. 加载配置文件并启动

    systemctl daemon-reload
    systemctl start kube-controller-manager

6.3 安装kube-scheduler

  1. 编辑配置文件

    KUBE_SCHEDULER_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig <br /> --leader-elect=true <br /> --v=0"

  2. 编辑service文件 /etc/systemd/system/kube-scheduler.service

    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=//home/atlas/apps/kubernetes/scheduler/conf/env
    ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_ARGS
    Restart=always

    [Install]
    WantedBy=multi-user.target

  3. 启动

    systemctl daemon-reload
    systemctl start kube-scheduler

6.4 安装nginx

这里用nginx对apiserver进行tcp反向代理,也可以使用haproxy。nginx编译安装可参考 博客园 - linux编译安装nginx,docker安装nginx更加简单,本文略过。以下为示例配置:

worker_processes  auto;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/nginx.pid;

events {
    worker_connections  65536;
}

stream{
    log_format json2 '$remote_addr [$time_local] '
        '$protocol $status $bytes_sent $bytes_received '
        '$session_time "$upstream_addr" '
        '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"'; 

    access_log logs/stream.log json2;
    upstream apiservers {
        server 192.168.3.31:6443;
        server 192.168.3.32:6443;
        server 192.168.3.33:6443;
    }
    server {
        listen 9443;
        proxy_pass apiservers;
    }
}

6.5 安装kubelet

  1. 编辑文件 /home/atlas/apps/kubernetes/kubelet/conf/env。注意修改hostname-override中的IP为Node节点自己的IP。如果修改了containerd的socket地址,则配置中也要按实际修改。

    KUBELET_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig <br /> --config=/home/atlas/apps/kubernetes/kubelet/conf/kubelet.config <br /> --hostname-override=192.168.3.31 <br /> --v=0 <br /> --container-runtime-endpoint="unix:///run/containerd/containerd.sock"

主要参数说明

参数

说明

--kubeconfig

设置与 apiserver 连接的配置,可以与 controller-manager 的 kubeconfig 相同。新的Node节点注意拷贝客户端相关证书文件,比如ca.crt, client.key, client.crt

--config

kubelet 配置文件,设置可以让多个Node共享的配置参数。

--hostname-override

本Node在集群中的名称,默认值为主机名

--network-plugin

网络插件类型,推荐使用CNI网络插件

  1. 编辑文件 /home/atlas/apps/kubernetes/kubelet/conf/kubelet.config

    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    address: 0.0.0.0
    port: 10250
    cgroupDriver: systemd
    clusterDNS: ["169.169.0.100"]
    clusterDomain: cluster.local
    authentication:
    anonymous:
    enabled: true

主要参数说明

参数

说明

address

服务监听IP地址

port

服务监听端口号,默认值为10250

cgroupDriver

cgroupDriver驱动,默认值为cgroupfs,可选 systemd

clusterDNS

集群DNS服务的IP地址

clusterDomain

服务DNS域名后缀

authentication

是否允许匿名访问或者是否使用webhook鉴权

  1. 编辑service文件 /etc/systemd/system/kubelet.service

    [Unit]
    Description=Kubernetes Kubelet Server
    Documentation=https://github.com/kubernetes/kubernetes
    After=docker.target

    [Service]
    EnvironmentFile=/home/atlas/apps/kubernetes/kubelet/conf/env
    ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
    Restart=always

    [Install]
    WantedBy=multi-user.target

  2. 加载service并启动kubelet

    systemctl daemon-reload && systemctl start kubelet

6.6 安装kube-proxy

  1. 编辑文件 /home/atlas/apps/kubernetes/proxy/conf/env。注意修改hostname-override中的IP为Node节点自己的IP。

    KUBE_PROXY_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig <br /> --hostname-override=192.168.3.31 <br /> --proxy-mode=ipvs <br /> --v=0"

  2. 编辑service文件 /etc/systemd/system/kube-proxy.service

    [Unit]
    Description=Kubernetes Kube-Proxy Server
    Documentation=https://github.com/kubernetes/kubernetes
    After=network.target

    [Service]
    EnvironmentFile=/home/atlas/apps/kubernetes/proxy/conf/env
    ExecStart=/usr/local/bin/kube-proxy $KUBE_PROXY_ARGS
    Restart=always

    [Install]
    WantedBy=multi-user.target

  3. 加载service并启动

    systemctl daemon-reload && systemctl start kube-proxy

6.7 安装calico

  1. 在master节点通过kubectl查询自动注册到 k8s 的 node 信息。由于 Master 开启了 https 认证,所以 kubectl 也需要使用客户端 CA证书连接Master,可以直接使用 apiserver 的 kubeconfig 文件。

    kubectl --kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig get nodes

若是不想每次敲命令都要指定kubeconfig文件,可以编辑~/.bashrc,增加如下内容后source ~/.bashrc

alias kubectl='/usr/local/bin/kubectl --kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig'

如果操作步骤和以上保持一致,命令执行应有类似如下输出。

NAME           STATUS   ROLES    AGE   VERSION
192.168.3.31   Ready    <none>   18m   v1.26.6
192.168.3.32   Ready    <none>   16m   v1.26.6
192.168.3.33   Ready    <none>   16m   v1.26.6

由于安装containerd时,安装包里已经包含了cni插件,所以节点状态都是ready。自测节点之间通信还是有问题,所以换成相对来说熟悉点的calico。

  1. 下载calico文件。

    wget https://docs.projectcalico.org/manifests/calico.yaml

  2. 编辑calico.yaml文件。因为内网离线部署,以提前在公网下载好calico的镜像并推送到内网的harbor镜像仓库,所以配置文件中的镜像修改成了内网的镜像。

    image: registry.atlas.cn/calico/cni:v3.26.1
    image: registry.atlas.cn/calico/node:v3.26.1
    image: registry.atlas.cn/calico/kube-controllers:v3.26.1

  3. 部署calico。PS:此处的kubectl命令以alias kubeconfig

    kubectl apply -f calico.yaml

  4. 查看calico的pod是否正常运行。如果正常,状态应该都是running;若不正常,则需要describe pod的信息查看什么问题

    kubectl get pods -A

6.8 集群内安装CoreDNS

  1. 编辑部署文件 coredns.yaml。其中corefile里面的forward地址是内网的dns服务器地址,若内网没有dns,可修改为/etc/resolv.conf。coredns的镜像地址也是提前推送的harbor的镜像。


    apiVersion: v1
    kind: ConfigMap
    metadata:
    name: coredns
    namespace: kube-system
    labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    data:
    Corefile: |
    cluster.local {
    errors
    health {
    lameduck 5s
    }
    ready
    kubernetes cluster.local 169.169.0.0/16 {
    fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . 192.168.3.41
    cache 30
    loop
    reload
    loadbalance
    }
    . {
    cache 30
    loadbalance
    forward . 192.168.3.41
    }


    apiVersion: apps/v1
    kind: Deployment
    metadata:
    name: coredns
    namespace: kube-system
    labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
    spec:
    replicas: 1
    strategy:
    type: RollingUpdate
    rollingUpdate:
    maxUnavailable: 1
    selector:
    matchLabels:
    k8s-app: kube-dns
    template:
    metadata:
    labels:
    k8s-app: kube-dns
    spec:
    priorityClassName: system-cluster-critical
    tolerations:
    - key: "CriticalAddonsOnly"
    operator: "Exists"
    nodeSelector:
    kubernetes.io/os: linux
    affinity:
    podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
    podAffinityTerm:
    labelSelector:
    matchExpressions:
    - key: k8s-app
    operator: In
    values: ["kube-dns"]
    topologyKey: kubernetes.io/hostname
    imagePullSecrets:
    - name: registry-harbor
    containers:
    - name: coredns
    image: registry.atlas.cn/public/coredns:1.11.1
    imagePullPolicy: IfNotPresent
    resources:
    limits:
    memory: 170Mi
    requests:
    cpu: 100m
    memory: 70Mi
    args: [ "-conf", "/etc/coredns/Corefile" ]
    volumeMounts:
    - name: config-volume
    mountPath: /etc/coredns
    readOnly: true
    ports:
    - containerPort: 53
    name: dns
    protocol: UDP
    - containerPort: 53
    name: dns-tcp
    protocol: TCP
    - containerPort: 9153
    name: metrics
    protocol: TCP
    securityContext:
    allowPrivilegeEscalation: false
    capabilities:
    add:
    - NET_BIND_SERVICE
    drop:
    - all
    readOnlyRootFilesystem: true
    livenessProbe:
    httpGet:
    path: /health
    port: 8080
    scheme: HTTP
    initialDelaySeconds: 60
    timeoutSeconds: 5
    successThreshold: 1
    failureThreshold: 5
    readinessProbe:
    httpGet:
    path: /ready
    port: 8181
    scheme: HTTP
    dnsPolicy: Default
    volumes:
    - name: config-volume
    configMap:
    name: coredns
    items:
    - key: Corefile
    path: Corefile


    apiVersion: v1
    kind: Service
    metadata:
    name: kube-dns
    namespace: kube-system
    annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
    labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
    spec:
    selector:
    k8s-app: kube-dns
    clusterIP: 169.169.0.100
    ports:

    • name: dns
      port: 53
      protocol: UDP
    • name: dns-tcp
      port: 53
      protocol: TCP
    • name: metrics
      port: 9153
      protocol: TCP
  2. 部署一个nginx用于测试。注意按实际修改镜像地址


    apiVersion: apps/v1
    kind: Deployment
    metadata:
    name: deploy-nginx
    spec:
    replicas: 2
    selector:
    matchLabels:
    app: nginx
    env: dev
    template:
    metadata:
    labels:
    app: nginx
    env: dev
    spec:
    containers:
    - name: nginx
    image: registry.atlas.cn/public/nginx:1.25.1
    ports:
    - containerPort: 80


    apiVersion: v1
    kind: Service
    metadata:
    name: svc-nginx
    spec:
    ports:

    • protocol: TCP
      port: 80
      targetPort: 80
      selector:
      app: nginx
      env: dev

发布nginx服务

kubectl create -f nginx.yaml
  1. 运行一个ubuntu的pod。镜像基于原版的ubuntu:22.04修改,提前安装了dnsutils再封装推送到内网harbor,Dockerfile内容如下:

    FROM ubuntu:22.04
    RUN apt update -y && apt install -y dnsutils iputils-ping curl
    RUN apt clean && rm -rf /var/lib/apt/lists/*

使用文件声明pod。注意按实际修改镜像地址

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
  namespace: default
spec:
  containers:
  - name: ubuntu
    image: registry.atlas.cn/public/ubuntu:22.04.1
    command:
      - tail
      - -f
      - /dev/null

发布pod

kubectl create -f ubuntu.yaml

使用exec选项进入pod内

kubectl exec -it ubuntu -- bash

在pod内测试能否连通nginx。若一切响应正常,说明集群已基本搭建成功

# 测试能否解析出svc-nginx的ip
nslookup svc-nginx
# 测试能否调通svc-nginx:80
curl http://svc-nginx

以上步骤只是部署了一个能正常发布服务的基础k8s集群,生产环境中还要考虑存储、网络、安全等问题,相关内容比较多,本文不再赘述,可参考其它文档。

sysctl加载配置时报错

sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory

sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

处理:装载内核模块

modprobe br_netfilter