分布式存储系统之Ceph集群CephFS基础使用
阅读原文时间:2023年07月08日阅读:2

  前文我们了解了ceph之上的RBD接口使用相关话题,回顾请参考https://www.cnblogs.com/qiuhom-1874/p/16753098.html;今天我们来聊一聊ceph之上的另一个客户端接口cephfs使用相关话题;

  CephFS概述

  文件系统是至今在计算机领域中用到的存储访问中最通用也是最普遍的接口;即便是我们前面聊到的RDB块设备,绝大多数都是格式化分区挂载至文件系统之上使用;使用纯裸设备的场景其实不多;为此,ceph在向外提供客户端接口中也提供了文件系统接口cephfs;不同于rbd的架构,cephfs需要在rados存储集群上启动一个mds的进程来帮忙管理文件系统的元数据信息;我们知道对于rados存储系统来说,不管什么方式的客户端,存储到rados之上的数据都会经由存储池,然后存储到对应的osd之上;对于mds(metadata server )来说,它需要工作为一个守护进程,为其客户端提供文件系统服务;客户端的每一次存取操作,都会先联系mds,找对应的元数据信息;但是mds它自身不存储任何元数据信息,文件系统的元数据信息都会存储到rados的一个存储池当中,而文件本身的数据存储到另一个存储池当中;这也意味着msd是一个无状态服务,有点类似k8s里的apiserver,自身不存储数据,而是将数据存储至etcd中,使得apiserver 成为一个无状态服务;mds为一个无状态服务,也就意味着可以有多个mds同时提供服务,相比传统文件存储系统来讲metadata server成为瓶颈的可能也就不复存在;

  提示:CephFS依赖于专用的MDS(MetaData Server)组件管理元数据信息并向客户端输出一个倒置的树状层级结构;将元数据缓存于MDS的内存中, 把元数据的更新日志于流式化后存储在RADOS集群上, 将层级结构中的的每个名称空间对应地实例化成一个目录且存储为一个专有的RADOS对象;

  CephFS架构

  提示:cephfs是建构在libcephfs之上,libcephfs建构在librados之上,即cephfs工作在librados的顶端,向外提供文件系统服务;它支持两种方式的使用,一种是基于内核空间模块(ceph)挂载使用,一种是基于用户空间FUSE来挂载使用;

  创建CephFS

  通过上述描述,cephfs的工作逻辑首先需要两个存储池来分别存放元数据和数据;这个我们在前边的ceph访问接口启用一文中有聊到过,回顾请参考https://www.cnblogs.com/qiuhom-1874/p/16727620.html;我这里不做过多说明;

  查看CephFS状态

[root@ceph-admin ~]# ceph fs status cephfs

cephfs - 0 clients

+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | ceph-mon02 | Reqs: 0 /s | 10 | 13 |
+------+--------+------------+---------------+-------+-------+
+---------------------+----------+-------+-------+
| Pool | type | used | avail |
+---------------------+----------+-------+-------+
| cephfs-metadatapool | metadata | 2286 | 280G |
| cephfs-datapool | data | 0 | 280G |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
[root@ceph-admin ~]#

  CephFS客户端账号

  启用CephX认证的集群上,CephFS的客户端完成认证后方可挂载访问文件系统;

[root@ceph-admin ~]# ceph auth get-or-create client.fsclient mon 'allow r' mds 'allow rw' osd 'allow rw pool=cephfs-datapool'
[client.fsclient]
key = AQDx2z5jgeqiIRAAIxQFz09BF99kcAYxiFwOWg==
[root@ceph-admin ~]# ceph auth get client.fsclient
exported keyring for client.fsclient
[client.fsclient]
key = AQDx2z5jgeqiIRAAIxQFz09BF99kcAYxiFwOWg==
caps mds = "allow rw"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs-datapool"
[root@ceph-admin ~]#

  提示:这里需要注意,对于元数据存储池来说,它的客户端是mds,对应数据的读写都是有mds来完成操作,对于cephfs的客户端来说,他不需要任何操作元数据存储池的权限,我们这里只需要授权用户对数据存储池有读写权限即可;对于mon节点来说,用户只需要有读的权限就好,对mds有读写权限就好;

  保存用户账号的密钥信息于secret文件,用于客户端挂载操作认证之用

[root@ceph-admin ~]# ceph auth print-key client.fsclient
AQDx2z5jgeqiIRAAIxQFz09BF99kcAYxiFwOWg==[root@ceph-admin ~]# ceph auth print-key client.fsclient -o fsclient.key
[root@ceph-admin ~]# cat fsclient.key
AQDx2z5jgeqiIRAAIxQFz09BF99kcAYxiFwOWg==[root@ceph-admin ~]#

  提示:这里只需要导出key的信息就好,对于权限信息,客户端用不到,客户端拿着key去ceph上认证,对应权限ceph是知道的;

  将密钥文件需要保存于挂载CephFS的客户端主机上,我们可以使用scp的方式推到客户端主机之上;客户端主机除了要有这个key文件之外,还需要有ceph集群的配置文件

  提示:我这里以admin host作为客户端使用,将对应key文件复制到/etc/ceph/目录下,对应使用内核模块挂载,指定mount到该目录读取对应key的信息即可;

  内核客户端安装必要工具和模块

  1、内核模块ceph.ko

  2、安装ceph-common程序包

  3、提供ceph.conf配置文件和用于认证的密钥文件

[root@ceph-admin ~]# ls /lib/modules/3.10.0-1160.76.1.el7.x86_64/kernel/fs/ceph/
ceph.ko.xz
[root@ceph-admin ~]# modinfo ceph
filename: /lib/modules/3.10.0-1160.76.1.el7.x86_64/kernel/fs/ceph/ceph.ko.xz
license: GPL
description: Ceph filesystem for Linux
author: Patience Warnick patience@newdream.net
author: Yehuda Sadeh yehuda@hq.newdream.net
author: Sage Weil sage@newdream.net
alias: fs-ceph
retpoline: Y
rhelversion: 7.9
srcversion: B1FF0EC5E9EF413CE8D9D1C
depends: libceph
intree: Y
vermagic: 3.10.0-1160.76.1.el7.x86_64 SMP mod_unload modversions
signer: CentOS Linux kernel signing key
sig_key: C6:93:65:52:C5:A1:E9:97:0B:A2:4C:98:1A:C4:51:A6:BC:11:09:B9
sig_hashalgo: sha256
[root@ceph-admin ~]# yum info ceph-common
Loaded plugins: fastestmirror
Repository epel is listed more than once in the configuration
Repository epel-debuginfo is listed more than once in the configuration
Repository epel-source is listed more than once in the configuration
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
Installed Packages
Name : ceph-common
Arch : x86_64
Epoch : 2
Version : 13.2.10
Release : 0.el7
Size : 44 M
Repo : installed
From repo : Ceph
Summary : Ceph Common
URL : http://ceph.com/
License : LGPL-2.1 and CC-BY-SA-3.0 and GPL-2.0 and BSL-1.0 and BSD-3-Clause and MIT
Description : Common utilities to mount and interact with a ceph storage cluster.
: Comprised of files that are common to Ceph clients and servers.

[root@ceph-admin ~]# ls /etc/ceph/
ceph.client.admin.keyring ceph.client.test.keyring ceph.conf fsclient.key rbdmap tmpJ434zL
[root@ceph-admin ~]#

  mount挂载CephFS

[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
[root@ceph-admin ~]# mount -t ceph ceph-mon01:6789,ceph-mon02:6789,ceph-mon03:6789:/ /mnt -o name=fsclient,secretfile=/etc/ceph/fsclient.key
[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
192.168.0.71:6789,172.16.30.71:6789,192.168.0.72:6789,172.16.30.72:6789,192.168.0.73:6789,172.16.30.73:6789:/ 281G 0 281G 0% /mnt
[root@ceph-admin ~]# mount |tail -1
192.168.0.71:6789,172.16.30.71:6789,192.168.0.72:6789,172.16.30.72:6789,192.168.0.73:6789,172.16.30.73:6789:/ on /mnt type ceph (rw,relatime,name=fsclient,secret=,acl,wsize=16777216)
[root@ceph-admin ~]#

  查看挂载状态

[root@ceph-admin ~]# stat -f /mnt
File: "/mnt"
ID: a0de3ae372c48f48 Namelen: 255 Type: ceph
Block size: 4194304 Fundamental block size: 4194304
Blocks: Total: 71706 Free: 71706 Available: 71706
Inodes: Total: 0 Free: -1
[root@ceph-admin ~]#

  在/mnt上存储数据,看看对应是否可以正常存储?

[root@ceph-admin ~]# find /usr/share/ -type f -name '*.jpg' -exec cp {} /mnt \;
[root@ceph-admin ~]# ll /mnt
total 3392
-rw-r--r-- 1 root root 961243 Oct 6 22:17 day.jpg
-rw-r--r-- 1 root root 961243 Oct 6 22:17 default.jpg
-rw-r--r-- 1 root root 980265 Oct 6 22:17 morning.jpg
-rw-r--r-- 1 root root 569714 Oct 6 22:17 night.jpg
[root@ceph-admin ~]#

  提示:可以看到我们可以正常向/mnt存储文件;

  将挂载信息写入/etc/fstab配置文件

  提示:这里需要注意,写入fstab文件中,如果是网络文件系统建议加上_netdev选项来指定该文件系统为网络文件系统,如果当系统启动,如果挂载不上,超时后自动放弃挂载,否则系统会一直尝试挂载,导致系统可能启动不起来;

  测试,取消/mnt的挂载,然后使用fstab配置文件挂载,看看是否可以正常挂载?

[root@ceph-admin ~]# umount /mnt
[root@ceph-admin ~]# ls /mnt
[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
[root@ceph-admin ~]# mount -a
[root@ceph-admin ~]# ls /mnt
day.jpg default.jpg morning.jpg night.jpg
[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
192.168.0.71:6789,172.16.30.71:6789,192.168.0.72:6789,172.16.30.72:6789,192.168.0.73:6789,172.16.30.73:6789:/ 281G 0 281G 0% /mnt
[root@ceph-admin ~]#

  提示:可以看到我们使用mount -a是可以直接将cephfs挂载至/mnt之上,说明我们配置文件内容没有问题;ok,到此基于内核空间模块(ceph.ko)挂载cephfs文件系统的测试就完成了,接下来我们来说说使用用户空间fuse挂载cephfs;

  FUSE挂载CephFS

  FUSE,全称Filesystem in Userspace,用于非特权用户能够无需操作内核而创建文件系统;客户端主机环境准备,安装ceph-fuse程序包,获取到客户端账号的keyring文件和ceph.conf配置文件即可;这里就不需要ceph-common;

[root@ceph-admin ~]# yum install -y ceph-fuse
Loaded plugins: fastestmirror
Repository epel is listed more than once in the configuration
Repository epel-debuginfo is listed more than once in the configuration
Repository epel-source is listed more than once in the configuration
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
Ceph | 1.5 kB 00:00:00
Ceph-noarch | 1.5 kB 00:00:00
base | 3.6 kB 00:00:00
ceph-source | 1.5 kB 00:00:00
epel | 4.7 kB 00:00:00
extras | 2.9 kB 00:00:00
updates | 2.9 kB 00:00:00
(1/4): extras/7/x86_64/primary_db | 249 kB 00:00:00
(2/4): epel/x86_64/updateinfo | 1.1 MB 00:00:00
(3/4): epel/x86_64/primary_db | 7.0 MB 00:00:01
(4/4): updates/7/x86_64/primary_db | 17 MB 00:00:02
Resolving Dependencies
--> Running transaction check
---> Package ceph-fuse.x86_64 2:13.2.10-0.el7 will be installed
--> Processing Dependency: fuse for package: 2:ceph-fuse-13.2.10-0.el7.x86_64
--> Running transaction check
---> Package fuse.x86_64 0:2.9.2-11.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==================================================================================================================================

Package Arch Version Repository Size

Installing:
ceph-fuse x86_64 2:13.2.10-0.el7 Ceph 490 k
Installing for dependencies:
fuse x86_64 2.9.2-11.el7 base 86 k

Transaction Summary

Install 1 Package (+1 Dependent package)

Total download size: 576 k
Installed size: 1.6 M
Downloading packages:
(1/2): fuse-2.9.2-11.el7.x86_64.rpm | 86 kB 00:00:00

(2/2): ceph-fuse-13.2.10-0.el7.x86_64.rpm | 490 kB 00:00:15

Total 37 kB/s | 576 kB 00:00:15
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : fuse-2.9.2-11.el7.x86_64 1/2
Installing : 2:ceph-fuse-13.2.10-0.el7.x86_64 2/2
Verifying : 2:ceph-fuse-13.2.10-0.el7.x86_64 1/2
Verifying : fuse-2.9.2-11.el7.x86_64 2/2

Installed:
ceph-fuse.x86_64 2:13.2.10-0.el7

Dependency Installed:
fuse.x86_64 0:2.9.2-11.el7

Complete!
[root@ceph-admin ~]#

  挂载CephFS

[root@ceph-admin ~]# ceph-fuse -n client.fsclient -m ceph-mon01:6789,ceph-mon02:6789,ceph-mon03:6789 /mnt
2022-10-06 23:13:17.185 7fae97fbec00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.fsclient.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2022-10-06 23:13:17.185 7fae97fbec00 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication
failed to fetch mon config (--no-mon-config to skip)
[root@ceph-admin ~]#

  提示:这里提示我们在/etc/ceph/目录下没有找到对应用户的keyring文件;

  导出client.fsclient用户密钥信息,并存放在/etc/ceph下取名为ceph.client.fsclient.keyring;

[root@ceph-admin ~]# ceph auth get client.fsclient -o /etc/ceph/ceph.client.fsclient.keyring
exported keyring for client.fsclient
[root@ceph-admin ~]# cat /etc/ceph/ceph.client.fsclient.keyring
[client.fsclient]
key = AQDx2z5jgeqiIRAAIxQFz09BF99kcAYxiFwOWg==
caps mds = "allow rw"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs-datapool"
[root@ceph-admin ~]#

  再次使用ceph-fuse挂载cephfs

[root@ceph-admin ~]# ceph-fuse -n client.fsclient -m ceph-mon01:6789,ceph-mon02:6789,ceph-mon03:6789 /mnt
2022-10-06 23:16:43.066 7fd51d9c0c00 -1 init, newargv = 0x55f0016ebd40 newargc=7
ceph-fuse[8096]: starting ceph client
ceph-fuse[8096]: starting fuse
[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
ceph-fuse 281G 4.0M 281G 1% /mnt
[root@ceph-admin ~]# ll /mnt
total 3392
-rw-r--r-- 1 root root 961243 Oct 6 22:17 day.jpg
-rw-r--r-- 1 root root 961243 Oct 6 22:17 default.jpg
-rw-r--r-- 1 root root 980265 Oct 6 22:17 morning.jpg
-rw-r--r-- 1 root root 569714 Oct 6 22:17 night.jpg
[root@ceph-admin ~]#

  将挂载信息写入/etc/fstab文件中

  提示:使用fuse方式挂载cephfs,对应文件系统设备为none,类型为fuse.ceph;挂载选项里只需要指定用ceph.id(ceph授权的用户名,不要前缀client),ceph配置文件路径;这里不需要指定密钥文件,因为ceph.id指定的用户名,ceph-fuse会自动到/etc/ceph/目录下找对应文件名的keyring文件来当作对应用户名的密钥文件;

  测试,使用fstab配置文件,看看是否可正常挂载?

[root@ceph-admin ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 919632 0 919632 0% /dev
tmpfs 931496 0 931496 0% /dev/shm
tmpfs 931496 9744 921752 2% /run
tmpfs 931496 0 931496 0% /sys/fs/cgroup
/dev/mapper/centos-root 50827012 3718492 47108520 8% /
/dev/sda1 520868 179572 341296 35% /boot
tmpfs 186300 0 186300 0% /run/user/0
[root@ceph-admin ~]# ll /mnt
total 0
[root@ceph-admin ~]# mount -a
ceph-fuse[8770]: starting ceph client
2022-10-06 23:25:57.230 7ff21e3f7c00 -1 init, newargv = 0x5614f1bad9d0 newargc=9
ceph-fuse[8770]: starting fuse
[root@ceph-admin ~]# mount |tail -2
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
ceph-fuse on /mnt type fuse.ceph-fuse (rw,relatime,user_id=0,group_id=0,allow_other)
[root@ceph-admin ~]# ll /mnt
total 3392
-rw-r--r-- 1 root root 961243 Oct 6 22:17 day.jpg
-rw-r--r-- 1 root root 961243 Oct 6 22:17 default.jpg
-rw-r--r-- 1 root root 980265 Oct 6 22:17 morning.jpg
-rw-r--r-- 1 root root 569714 Oct 6 22:17 night.jpg
[root@ceph-admin ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 899M 0 899M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 9.6M 901M 2% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 49G 3.6G 45G 8% /
/dev/sda1 509M 176M 334M 35% /boot
tmpfs 182M 0 182M 0% /run/user/0
ceph-fuse 281G 4.0M 281G 1% /mnt
[root@ceph-admin ~]#

  提示:可以看到使用mount -a 读取配置文件也是可以正常挂载,说明我们配置文件中的内容没有问题;

  卸载文件系统的方式

  第一种我们可以使用umount 挂载点来实现卸载;

[root@ceph-admin ~]# mount |tail -1
ceph-fuse on /mnt type fuse.ceph-fuse (rw,relatime,user_id=0,group_id=0,allow_other)
[root@ceph-admin ~]# ll /mnt
total 3392
-rw-r--r-- 1 root root 961243 Oct 6 22:17 day.jpg
-rw-r--r-- 1 root root 961243 Oct 6 22:17 default.jpg
-rw-r--r-- 1 root root 980265 Oct 6 22:17 morning.jpg
-rw-r--r-- 1 root root 569714 Oct 6 22:17 night.jpg
[root@ceph-admin ~]# umount /mnt
[root@ceph-admin ~]# ll /mnt
total 0
[root@ceph-admin ~]#

  第二种我们使用fusermount -u 挂载点来卸载

[root@ceph-admin ~]# mount -a
ceph-fuse[9717]: starting ceph client
2022-10-06 23:40:55.540 7f169dbc4c00 -1 init, newargv = 0x55859177fa40 newargc=9
ceph-fuse[9717]: starting fuse
[root@ceph-admin ~]# ll /mnt
total 3392
-rw-r--r-- 1 root root 961243 Oct 6 22:17 day.jpg
-rw-r--r-- 1 root root 961243 Oct 6 22:17 default.jpg
-rw-r--r-- 1 root root 980265 Oct 6 22:17 morning.jpg
-rw-r--r-- 1 root root 569714 Oct 6 22:17 night.jpg
[root@ceph-admin ~]# fusermount -u /mnt
[root@ceph-admin ~]# ll /mnt
total 0
[root@ceph-admin ~]#

  ok,到此基于用户空间fuse方式挂载cephfs的测试就完成了;