Deploying Ceph Luminous 12.2.4 with ceph-deploy 2.0.0

Background

Ceph Luminous has now reached 12.2.4. Hardened over the previous point releases, Luminous has become increasingly stable, and its BlueStore backend and the multi-active MDS support in CephFS are the features we care about most.

Following the official ceph and ceph-deploy documentation, this article walks through deploying the latest Ceph Luminous 12.2.4 with ceph-deploy, along with the problems encountered on the way.

Software versions

ceph-deploy version

# ceph-deploy --version
2.0.0

The changelog for ceph-deploy 2.0.0:

2.0.0
16-Jan-2018

- Backward incompatible API changes for OSD creation - will use ceph-volume and no longer consume ceph-disk.
- Remove python-distribute dependency
- Use /etc/os-release as a fallback when linux_distribution() doesn’t work
- Drop dmcrypt support (unsupported by ceph-volume for now)
- Allow debug modes for ceph-volume

Reference: http://docs.ceph.com/ceph-deploy/docs/changelog.html#id1

OS version

# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.3.1611 (Core)
Release: 7.3.1611
Codename: Core

Ceph version

Configure the yum repository, selecting the latest Luminous release:

# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.163.com/ceph/rpm-luminous/el7/$basearch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.163.com/ceph/rpm-luminous/el7/noarch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.163.com/ceph/rpm-luminous/el7/SRPMS
enabled=0
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
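
Optionally, the release key can be imported and the yum metadata cache refreshed before installing; a small sketch using standard rpm/yum steps:

# rpm --import https://download.ceph.com/keys/release.asc
# yum clean all && yum makecache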

Installing Ceph

# yum install -y ceph ceph-radosgw
or
# ceph-deploy install [hosts]

# ceph -v
ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
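
If you take the ceph-deploy install route, the release can be pinned explicitly; a sketch using the hosts from this setup (the --release flag selects which release is installed on the targets):

# ceph-deploy install --release luminous ceph0 ceph1 ceph2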

Preparation

Before starting the deployment with ceph-deploy, take care of the following:

  1. Install ceph-deploy on the deploy node

    # yum install -y ceph-deploy
  2. Set up passwordless SSH from the deploy node to the Ceph nodes

    # ssh-keygen
    # cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  3. Set the hostname on each Ceph node

    # vim /etc/hostname
  4. On the deploy node, map each Ceph node's hostname to its IP in /etc/hosts

    # vim /etc/hosts
    100.60.0.20 ceph0
    100.60.0.21 ceph1
    100.60.0.22 ceph2
  5. Configure NTP on the Ceph nodes

    # yum install -y ntp ntpdate ntp-doc
    # systemctl start ntpd
  6. Install an SSH server on the Ceph nodes

    # yum install -y openssh-server
  7. Disable the firewall on the Ceph nodes, or configure it to allow Ceph traffic (see the sketch after this list)

    # systemctl stop firewalld
    # systemctl disable firewalld
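
If the firewall has to stay enabled, the default Ceph ports can be opened instead; a sketch assuming firewalld and the standard port assignments (6789 for ceph-mon, 6800-7300 for ceph-osd/ceph-mgr):

# firewall-cmd --zone=public --add-port=6789/tcp --permanent
# firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
# firewall-cmd --reload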

Deploying the Ceph Monitors

Start deploying the Ceph cluster by creating three monitors:

# ceph-deploy new ceph0 ceph1 ceph2
# ceph-deploy mon create ceph0 ceph1 ceph2

Gather the ceph keys:

# ceph-deploy gatherkeys ceph0 ceph1 ceph2

Cephx authentication must be enabled, otherwise this command fails.

Allow some time after ceph-deploy mon create finishes; run gatherkeys only once the monitor quorum is healthy.
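
For reference, the ceph.conf generated by ceph-deploy new should already enable cephx; roughly like this (fsid and addresses taken from this cluster, exact layout may differ):

# cat ceph.conf
[global]
fsid = 5b2192ff-299c-4024-9f10-93af008e66d3
mon_initial_members = ceph0, ceph1, ceph2
mon_host = 100.60.0.20,100.60.0.21,100.60.0.22
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx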

Distribute the ceph configuration and admin keyring to the cluster nodes:

# ceph-deploy admin ceph0 ceph1 ceph2

Check the current cluster status:

# ceph -s
  cluster:
    id:     5b2192ff-299c-4024-9f10-93af008e66d3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph0,ceph2,ceph1
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:

Reference: http://docs.ceph.com/ceph-deploy/docs/index.html#creating-a-new-configuration

Deploying the Ceph Manager

Deploy the ceph-mgr component:

# ceph-deploy mgr create ceph0 ceph1 ceph2
# ceph -s
  cluster:
    id:     5b2192ff-299c-4024-9f10-93af008e66d3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph0,ceph2,ceph1
    mgr: ceph0(active), standbys: ceph2, ceph1
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:

Reference: http://docs.ceph.com/docs/master/mgr/

ceph-mgr is the cluster-management component promoted by Ceph Luminous; here one ceph-mgr daemon is started on every node that also runs a ceph-mon.
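
Most ceph-mgr functionality ships as modules; for example, the Luminous dashboard can be enabled and the exposed endpoints listed (a quick sketch, output omitted):

# ceph mgr module enable dashboard
# ceph mgr services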

Deploying the Ceph OSDs

Use the ceph-deploy tool to deploy the Ceph OSDs:

# ceph-deploy osd create -h
usage: ceph-deploy osd create [-h] [--data DATA] [--journal JOURNAL]
                              [--zap-disk] [--fs-type FS_TYPE] [--dmcrypt]
                              [--dmcrypt-key-dir KEYDIR] [--filestore]
                              [--bluestore] [--block-db BLOCK_DB]
                              [--block-wal BLOCK_WAL] [--debug]
                              [HOST]

positional arguments:
  HOST                  Remote host to connect

optional arguments:
  -h, --help            show this help message and exit
  --data DATA           The OSD data logical volume (vg/lv) or absolute path
                        to device
  --journal JOURNAL     Logical Volume (vg/lv) or path to GPT partition
  --zap-disk            DEPRECATED - cannot zap when creating an OSD
  --fs-type FS_TYPE     filesystem to use to format DEVICE (xfs, btrfs)
  --dmcrypt             use dm-crypt on DEVICE
  --dmcrypt-key-dir KEYDIR
                        directory where dm-crypt keys are stored
  --filestore           filestore objectstore
  --bluestore           bluestore objectstore
  --block-db BLOCK_DB   bluestore block.db path
  --block-wal BLOCK_WAL
                        bluestore block.wal path
  --debug               Enable debug mode on remote ceph-volume calls

Note: ceph-disk has been deprecated since Ceph Luminous 12.2.2 in favor of the new ceph-volume tool.

Reference: http://docs.ceph.com/docs/master/ceph-volume/
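
On an OSD node, the LVM-based OSDs created by ceph-volume can be listed directly, which is handy for checking what ceph-deploy actually did (output omitted):

# ceph-volume lvm list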

To create a bluestore OSD, the following device combinations are available:

  • A block device, a block.wal, and a block.db device
  • A block device and a block.wal device
  • A block device and a block.db device
  • A single block device

Reference: http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#bluestore

The block device itself can be one of three things:

  • A whole disk
  • A disk partition
  • An LVM logical volume

When a whole disk is configured, ceph-volume automatically creates a logical volume on it and uses that.

Creating an OSD from a single block device

Whole disk

Command format:

# ceph-deploy osd create [host] --data [/path/to/device]

First wipe the partition information on the disk:

[root@ceph0 ceph-deploy]# ceph-deploy disk zap ceph0 /dev/sdb
...
[ceph_deploy][ERROR ] Traceback (most recent call last):
[ceph_deploy][ERROR ] File "/usr/lib/python2.7/site-packages/ceph_deploy/util/decorators.py", line 69, in newfunc
[ceph_deploy][ERROR ] return f(*a, **kw)
[ceph_deploy][ERROR ] File "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 164, in _main
[ceph_deploy][ERROR ] return args.func(args)
[ceph_deploy][ERROR ] File "/usr/lib/python2.7/site-packages/ceph_deploy/osd.py", line 438, in disk
[ceph_deploy][ERROR ] disk_zap(args)
[ceph_deploy][ERROR ] File "/usr/lib/python2.7/site-packages/ceph_deploy/osd.py", line 336, in disk_zap
[ceph_deploy][ERROR ] if args.debug:
[ceph_deploy][ERROR ] AttributeError: 'Namespace' object has no attribute 'debug'
[ceph_deploy][ERROR ]

As a workaround, patch the failing args.debug check in osd.py:

[root@ceph0 ceph-deploy]# vim /usr/lib/python2.7/site-packages/ceph_deploy/osd.py
#if args.debug:
if False:

Retry the disk zap, which now succeeds:

[root@ceph0 ceph-deploy]# ceph-deploy disk zap ceph0 /dev/sdb
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdb
[ceph0][DEBUG ] --> Zapping: /dev/sdb
[ceph0][DEBUG ] Running command: cryptsetup status /dev/mapper/
[ceph0][DEBUG ] stdout: /dev/mapper/ is inactive.
[ceph0][DEBUG ] Running command: wipefs --all /dev/sdb
[ceph0][DEBUG ] Running command: dd if=/dev/zero of=/dev/sdb bs=1M count=10
[ceph0][DEBUG ] stderr: 10+0 records in
[ceph0][DEBUG ] 10+0 records out
[ceph0][DEBUG ] 10485760 bytes (10 MB) copied
[ceph0][DEBUG ] stderr: , 0.0110867 s, 946 MB/s
[ceph0][DEBUG ] --> Zapping successful for: /dev/sdb

Judging from the output, ceph-deploy calls ceph-volume lvm zap, which wipes the filesystem signatures and finally runs dd to write zeros over the first 10 MB of the disk.
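
The same zap can be done by hand when ceph-deploy is not available, mirroring the commands in the log above:

# wipefs --all /dev/sdb
# dd if=/dev/zero of=/dev/sdb bs=1M count=10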

Create the OSD:

# ceph-deploy osd create ceph0 --data /dev/sdb
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
...
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

Check the OSD:

# df -h
Filesystem Size Used Avail Use% Mounted on
...
tmpfs 16G 48K 16G 1% /var/lib/ceph/osd/ceph-0

# ll /var/lib/ceph/osd/ceph-0/
total 48
-rw-r--r-- 1 ceph ceph 393 Apr 2 18:11 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Apr 2 18:11 block -> /dev/ceph-5b2192ff-299c-4024-9f10-93af008e66d3/osd-block-fe7cd3b1-8513-4be5-b9a8-88fd47dca679
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:11 bluefs
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:11 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:11 fsid
-rw------- 1 ceph ceph 55 Apr 2 18:11 keyring
-rw-r--r-- 1 ceph ceph 8 Apr 2 18:11 kv_backend
-rw-r--r-- 1 ceph ceph 21 Apr 2 18:11 magic
-rw-r--r-- 1 ceph ceph 4 Apr 2 18:11 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Apr 2 18:11 osd_key
-rw-r--r-- 1 ceph ceph 6 Apr 2 18:11 ready
-rw-r--r-- 1 ceph ceph 10 Apr 2 18:11 type
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:11 whoami

Inspect the devices behind the OSD's block symlink:

# lsblk /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 3.7T 0 disk
└─ceph--5b2192ff--299c--4024--9f10--93af008e66d3-osd--block--fe7cd3b1--8513--4be5--b9a8--88fd47dca679 253:3 0 3.7T 0 lvm

# pvs /dev/sdb
PV VG Fmt Attr PSize PFree
/dev/sdb ceph-5b2192ff-299c-4024-9f10-93af008e66d3 lvm2 a-- 3.64t 0

# lvs ceph-5b2192ff-299c-4024-9f10-93af008e66d3
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-fe7cd3b1-8513-4be5-b9a8-88fd47dca679 ceph-5b2192ff-299c-4024-9f10-93af008e66d3 -wi-ao---- 3.64t

Conclusions:

  • The OSD's mount point is backed by tmpfs, Linux's in-memory filesystem; there is no dedicated block device behind it
  • The whole disk becomes one PV, on which a VG and a single LV are created for the OSD's block device

If no separate block device backs the tmpfs mount, where is the data in it actually stored?

Answer: it is stored in the OSD's metadata!

File: osd/OSD.cc
// every OSD metadata item (magic, whoami, ceph_fsid, osd_key, ready) goes through store->write_meta()
int OSD::write_meta(CephContext *cct, ObjectStore *store, uuid_d& cluster_fsid, uuid_d& osd_fsid, int whoami)
{
  ...
  snprintf(val, sizeof(val), "%s", CEPH_OSD_ONDISK_MAGIC);
  r = store->write_meta("magic", val);
  if (r < 0)
    return r;

  snprintf(val, sizeof(val), "%d", whoami);
  r = store->write_meta("whoami", val);
  if (r < 0)
    return r;

  cluster_fsid.print(val);
  r = store->write_meta("ceph_fsid", val);
  if (r < 0)
    return r;

  string key = cct->_conf->get_val<string>("key");
  if (key.size()) {
    r = store->write_meta("osd_key", key);
    if (r < 0)
      return r;
  } else {
    ...
    r = store->write_meta("ready", "ready");
    ...
  }

File: os/bluestore/BlueStore.cc
// bluestore persists the key/value metadata in the label of the block device (<osd dir>/block)
int BlueStore::write_meta(const std::string& key, const std::string& value)
{
  bluestore_bdev_label_t label;
  string p = path + "/block";
  int r = _read_bdev_label(cct, p, &label);
  if (r < 0) {
    return ObjectStore::write_meta(key, value);
  }
  label.meta[key] = value;
  r = _write_bdev_label(cct, p, label);
  assert(r == 0);
  return ObjectStore::write_meta(key, value);
}
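
These label entries can be read back from the block device with ceph-bluestore-tool, which is a convenient way to verify where the tmpfs contents really come from (a quick check, output omitted):

# ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block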

Disk partition

Command format:

# ceph-deploy osd create [host] --data [/path/to/device-partition]

First create a disk partition:

# fdisk -l /dev/sdc
...
Disk /dev/sdc: 4000.8 GB, 4000787030016 bytes, 7814037168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
Disk identifier: 84B34A2B-5F0D-4C4F-ADCD-1974B8FD5851


# Start End Size Type Name
1 2048 7814037134 3.7T Linux filesyste

# lsblk /dev/sdc
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdc 8:32 0 3.7T 0 disk
└─sdc1 8:33 0 3.7T 0 part

Use the GPT partition format; other partition table types cause an error.
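
If the partition does not exist yet, a GPT label and a full-size partition can be created with parted, for example (this would produce /dev/sdc1; the partition name "primary" is arbitrary):

# parted -s /dev/sdc mklabel gpt
# parted -s /dev/sdc mkpart primary 2048s 100%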

Create the OSD:

# ceph-deploy osd create ceph0 --data /dev/sdc1
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdc1
...
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 1
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: None
[ceph0][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdc1
[ceph0][INFO ] checking OSD status...
[ceph0][DEBUG ] find the location of an executable
[ceph0][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

Check the OSD:

# ll /var/lib/ceph/osd/ceph-1/
total 48
-rw-r--r-- 1 ceph ceph 393 Apr 2 18:48 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Apr 2 18:48 block -> /dev/ceph-589101eb-51c3-42b9-adad-915bfccfc4f2/osd-block-8553492d-cb56-4b69-ab9f-d0cfcf0d0970
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:48 bluefs
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:48 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:48 fsid
-rw------- 1 ceph ceph 55 Apr 2 18:48 keyring
-rw-r--r-- 1 ceph ceph 8 Apr 2 18:48 kv_backend
-rw-r--r-- 1 ceph ceph 21 Apr 2 18:48 magic
-rw-r--r-- 1 ceph ceph 4 Apr 2 18:48 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Apr 2 18:48 osd_key
-rw-r--r-- 1 ceph ceph 6 Apr 2 18:48 ready
-rw-r--r-- 1 ceph ceph 10 Apr 2 18:48 type
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:48 whoami

# pvs /dev/sdc1
PV VG Fmt Attr PSize PFree
/dev/sdc1 ceph-589101eb-51c3-42b9-adad-915bfccfc4f2 lvm2 a-- 3.64t 0

# lvs ceph-589101eb-51c3-42b9-adad-915bfccfc4f2
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-8553492d-cb56-4b69-ab9f-d0cfcf0d0970 ceph-589101eb-51c3-42b9-adad-915bfccfc4f2 -wi-ao---- 3.64t

Conclusion:

  • Essentially the same as using a whole disk; the only difference is that the PV is created from a disk partition

Logical volume

Command format:

# ceph-deploy osd create [host] --data [vg/lv]

First create a logical volume:

# pvcreate /dev/sdd
Physical volume "/dev/sdd" successfully created.

# vgcreate sddvg /dev/sdd
Volume group "sddvg" successfully created

# lvcreate -n sddlv -l 100%FREE sddvg
Logical volume "sddlv" created.

Create the OSD:

# ceph-deploy osd create ceph0 --data sddvg/sddlv
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data sddvg/sddlv
...
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 2
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: None
[ceph0][DEBUG ] --> ceph-volume lvm create successful for: sddvg/sddlv
[ceph0][INFO ] checking OSD status...
[ceph0][DEBUG ] find the location of an executable
[ceph0][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

If --data /dev/sddvg/sddlv is given instead, the command fails because the path is treated as a block device.

Check the OSD:

# ll /var/lib/ceph/osd/ceph-2/
total 48
-rw-r--r-- 1 ceph ceph 393 Apr 2 18:55 activate.monmap
lrwxrwxrwx 1 ceph ceph 16 Apr 2 18:55 block -> /dev/sddvg/sddlv
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:55 bluefs
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:55 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Apr 2 18:55 fsid
-rw------- 1 ceph ceph 55 Apr 2 18:55 keyring
-rw-r--r-- 1 ceph ceph 8 Apr 2 18:55 kv_backend
-rw-r--r-- 1 ceph ceph 21 Apr 2 18:55 magic
-rw-r--r-- 1 ceph ceph 4 Apr 2 18:55 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Apr 2 18:55 osd_key
-rw-r--r-- 1 ceph ceph 6 Apr 2 18:55 ready
-rw-r--r-- 1 ceph ceph 10 Apr 2 18:55 type
-rw-r--r-- 1 ceph ceph 2 Apr 2 18:55 whoami

Conclusion:

  • Consistent with the previous two cases; the only difference is that the PV, VG, and LV are created manually beforehand

Creating an OSD with dedicated block.wal and block.db devices

When block.wal and block.db are specified, each can be backed by one of two device types:

  1. A physical disk, which must be supplied as a disk partition
  2. An LVM logical volume

Below, only the block.wal and block.db devices are varied; the data device is always a whole disk.

Sizing bluestore's block.db and block.wal

The defaults are shown below: db size = 0 and wal size = 100663296 bytes (96 MiB), both fairly small.

# ceph-conf --show-config | grep bluestore_block
bluestore_block_create = true
bluestore_block_db_create = false
bluestore_block_db_path =
bluestore_block_db_size = 0
bluestore_block_path =
bluestore_block_preallocate_file = false
bluestore_block_size = 10737418240
bluestore_block_wal_create = false
bluestore_block_wal_path =
bluestore_block_wal_size = 100663296

Reference: https://marc.info/?l=ceph-devel&m=149978799900866&w=2

The size requirements for block.db and block.wal are also modest; for the tests below we use 10G for each.
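
ceph-volume simply uses whatever partition or LV it is handed, so the 10G sizing just means creating devices of that size up front; for example with LVM (hypothetical LV names on the ssdvg volume group that appears later):

# lvcreate -n db-10g -L 10G ssdvg
# lvcreate -n wal-10g -L 10G ssdvg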

Disk partitions

Command format:

# ceph-deploy osd create [host] --data [/path/to/device] --block-db [/path/to/device-partition] --block-wal [/path/to/device-partition]

First create two disk partitions for block.wal and block.db:

# parted -s /dev/sdf print
Model: ATA INTEL SSDSC2BB48 (scsi)
Disk /dev/sdf: 480GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 1049kB 10.7GB 10.7GB
2 10.7GB 21.5GB 10.7GB

Create the OSD:

# ceph-deploy disk zap ceph0 /dev/sde
# ceph-deploy osd create ceph0 --data /dev/sde --block-db /dev/sdf1 --block-wal /dev/sdf2
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sde --block.wal /dev/sdf2 --block.db /dev/sdf1
...
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 3
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: None
[ceph0][DEBUG ] --> ceph-volume lvm create successful for: /dev/sde
[ceph0][INFO ] checking OSD status...
[ceph0][DEBUG ] find the location of an executable
[ceph0][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

Check the OSD:

# df -h
Filesystem Size Used Avail Use% Mounted on
...
tmpfs 16G 56K 16G 1% /var/lib/ceph/osd/ceph-3

# ll /var/lib/ceph/osd/ceph-3
total 56
-rw-r--r-- 1 ceph ceph 393 Apr 3 09:20 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Apr 3 09:20 block -> /dev/ceph-7777d5e4-9b81-4c17-916d-7a1e48f6268e/osd-block-3ed97688-03b4-4ca6-a497-bbd30e865852
lrwxrwxrwx 1 root root 9 Apr 3 09:20 block.db -> /dev/sdf1
lrwxrwxrwx 1 root root 9 Apr 3 09:20 block.wal -> /dev/sdf2
-rw-r--r-- 1 ceph ceph 2 Apr 3 09:20 bluefs
-rw-r--r-- 1 ceph ceph 37 Apr 3 09:20 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Apr 3 09:20 fsid
-rw------- 1 ceph ceph 55 Apr 3 09:20 keyring
-rw-r--r-- 1 ceph ceph 8 Apr 3 09:20 kv_backend
-rw-r--r-- 1 ceph ceph 21 Apr 3 09:20 magic
-rw-r--r-- 1 ceph ceph 4 Apr 3 09:20 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Apr 3 09:20 osd_key
-rw-r--r-- 1 ceph ceph 10 Apr 3 09:20 path_block.db
-rw-r--r-- 1 ceph ceph 10 Apr 3 09:20 path_block.wal
-rw-r--r-- 1 ceph ceph 6 Apr 3 09:20 ready
-rw-r--r-- 1 ceph ceph 10 Apr 3 09:20 type
-rw-r--r-- 1 ceph ceph 2 Apr 3 09:20 whoami

Conclusion:

  • Essentially the same as the single-device case; the only difference is that block.db -> /dev/sdf1 and block.wal -> /dev/sdf2 are specified

Logical volumes

Command format:

# ceph-deploy osd create [host] --data [/path/to/device] --block-db [vg/lv]  --block-wal [vg/lv]

First create two logical volumes for block.wal and block.db:

# pvcreate /dev/sdm
# vgcreate ssdvg /dev/sdm
# lvcreate -n db-lv-0 -L 4G ssdvg
# lvcreate -n wal-lv-0 -L 8G ssdvg

Create the OSD:

# ceph-deploy disk zap ceph0 /dev/sdg
# ceph-deploy osd create ceph0 --data /dev/sdg --block-db ssdvg/db-lv-0 --block-wal ssdvg/wal-lv-0
...
[ceph0][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdg --block.wal ssdvg/wal-lv-0 --block.db ssdvg/db-lv-0
...
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 4
[ceph0][DEBUG ] --> ceph-volume lvm activate successful for osd ID: None
[ceph0][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdg
[ceph0][INFO ] checking OSD status...
[ceph0][DEBUG ] find the location of an executable
[ceph0][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph0 is now ready for osd use.

If --block-db /dev/ssdvg/db-lv-0 --block-wal /dev/ssdvg/wal-lv-0 is given instead, the command fails.

Check the OSD:

# df -h
Filesystem Size Used Avail Use% Mounted on
...
tmpfs 16G 56K 16G 1% /var/lib/ceph/osd/ceph-4

# ll /var/lib/ceph/osd/ceph-4
total 56
-rw-r--r-- 1 ceph ceph 393 Apr 3 09:30 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Apr 3 09:30 block -> /dev/ceph-017b646a-0332-4677-967b-95033a3a33ab/osd-block-24228535-1fb3-4bcd-bb87-f6d5f49ed24d
lrwxrwxrwx 1 root root 18 Apr 3 09:30 block.db -> /dev/ssdvg/db-lv-0
lrwxrwxrwx 1 root root 19 Apr 3 09:30 block.wal -> /dev/ssdvg/wal-lv-0
-rw-r--r-- 1 ceph ceph 2 Apr 3 09:30 bluefs
-rw-r--r-- 1 ceph ceph 37 Apr 3 09:30 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Apr 3 09:30 fsid
-rw------- 1 ceph ceph 55 Apr 3 09:30 keyring
-rw-r--r-- 1 ceph ceph 8 Apr 3 09:30 kv_backend
-rw-r--r-- 1 ceph ceph 21 Apr 3 09:30 magic
-rw-r--r-- 1 ceph ceph 4 Apr 3 09:30 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Apr 3 09:30 osd_key
-rw-r--r-- 1 ceph ceph 19 Apr 3 09:30 path_block.db
-rw-r--r-- 1 ceph ceph 20 Apr 3 09:30 path_block.wal
-rw-r--r-- 1 ceph ceph 6 Apr 3 09:30 ready
-rw-r--r-- 1 ceph ceph 10 Apr 3 09:30 type
-rw-r--r-- 1 ceph ceph 2 Apr 3 09:30 whoami

Conclusion:

  • Essentially the same as the single-device case; the only difference is that block.db -> /dev/ssdvg/db-lv-0 and block.wal -> /dev/ssdvg/wal-lv-0 are specified