Using the CephFS client evict Subcommand

Overview

When using CephFS, each client establishes a connection to an MDS in order to obtain CephFS metadata. If there are multiple active MDS daemons, a single client may establish connections to several of them.

Ceph provides the client/session subcommands to query and manage these connections. Among them is a command for manually disconnecting clients when a CephFS client is misbehaving. For example, running # ceph tell mds.2 client evict disconnects all clients connected to MDS rank 2.

So what is the impact of running client evict, and can the client be recovered afterwards? This article focuses on these questions.

Command Usage

Reference: http://docs.ceph.com/docs/master/cephfs/eviction/

Test environment: Ceph Mimic 13.2.1

1. List all clients/sessions

All clients that have established a connection with MDS rank [id] can be listed with the client ls / session ls command:

# ceph tell mds.0 client ls
2018-09-05 10:00:15.986 7f97f0ff9700 0 client.25196 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 10:00:16.002 7f97f1ffb700 0 client.25199 ms_handle_reset on 192.168.0.26:6800/1856812761
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 5,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]

The important fields are (see the jq sketch after this list):

  • id: unique id of the client
  • num_caps: number of caps held by the client
  • inst: the client's IP address and port
  • ceph_version: the client's ceph-fuse version (kernel_version instead, if the kernel client is used)
  • hostname: hostname of the client machine
  • mount_point: the mount point on the client machine
  • pid: pid of the ceph-fuse process on the client

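For quick filtering of this output, a small sketch using jq (assuming jq is available on the admin node; mds rank 0 is the one used throughout this article):

# keep only the fields discussed above; stderr is redirected to drop the ms_handle_reset log lines
ceph tell mds.0 client ls 2>/dev/null | jq '.[] | {id, inst, hostname: .client_metadata.hostname, mount_point: .client_metadata.mount_point}'
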
2. Evict a specific client

A specific client connection can be evicted by specifying its id.

If there are multiple active MDS daemons, an evict issued against a single MDS rank is also propagated to the other active MDS daemons.

# ceph tell mds.0 client evict id=25085

After evicting the client, check on the corresponding host: the client's mount point can no longer be accessed:

root@mimic3:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown


root@mimic3:~# vim /var/log/ceph/ceph-client.admin.log
...
2018-09-05 10:02:54.829 7fbe732d7700 -1 client.25085 I was blacklisted at osd epoch 519

3. Check the ceph osd blacklist

After a client is evicted, it is added to the osd blacklist (see the code analysis below):

root@mimic1:~# ceph osd blacklist ls
listed 1 entries
192.168.0.26:0/265326503 2018-09-05 11:02:54.696345

Adding the evicted client to the osd blacklist prevents its in-flight writes from reaching the OSDs and compromising data consistency; the entry is valid for one hour.
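
The one-hour window is the monitors' default blacklist expiry; a custom expiry (in seconds) can also be given when blacklisting an address manually. A hedged sketch, with a hypothetical monitor daemon name and the client address from the example above:

# the default expiry comes from mon_osd_blacklist_default_expire (one hour); hypothetical mon daemon name
ceph daemon mon.mimic1 config get mon_osd_blacklist_default_expire
# blacklist an address manually with a 600-second expiry instead of the default
ceph osd blacklist add 192.168.0.26:0/265326503 600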

4. Attempt to recover an evicted client

Remove the entry for the just-evicted client from the ceph osd blacklist:

root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/265326503
un-blacklisting 192.168.0.26:0/265326503

Then check the client on its host: it is working again!

root@mimic3:~# cd /mnt/cephfuse
root@mimic3:/mnt/cephfuse# ls
perftest

When testing Ceph Luminous 12.2.7, however, the client could not be recovered immediately after eviction; it only recovered after waiting for a while.

("mds_session_autoclose": "300.000000")

root@luminous2:~# ceph osd blacklist rm 192.168.213.25:0/1534097905
un-blacklisting 192.168.213.25:0/1534097905

root@luminous2:~# ceph osd blacklist ls
listed 0 entries

root@luminous2:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown

After waiting for a while (300s), the session becomes usable again:

root@luminous2:/mnt/cephfuse# ls
perftest
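
The 300s above matches the mds_session_autoclose value shown earlier; a hedged sketch of confirming it through the Luminous MDS admin socket, with a hypothetical daemon name:

# hypothetical daemon name; run on the host where the MDS runs
ceph daemon mds.luminous2 config get mds_session_autoclose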

When testing eviction of a CephFS kernel client, the client could not be recovered this way at all:

root@mimic3:~# cd /mnt/cephfs
-bash: cd: /mnt/cephfs: Permission denied
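
For a kernel client that stays broken after eviction, the usual way out is to force-unmount and remount. A minimal sketch, assuming the mount point from this test plus a hypothetical monitor address and secret file:

# force-unmount the dead mount point, then remount the kernel client
umount -f /mnt/cephfs
# hypothetical mon address and secret file path
mount -t ceph 192.168.0.26:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret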

5. Evict all clients

If no specific client id is given after the evict command, all clients connected to that MDS rank are evicted.

If there are multiple active MDS daemons, an evict issued against a single MDS rank is also propagated to the other active MDS daemons.

# ceph tell mds.0 client evict

Use this command with great caution and never run it by accident; its impact is significant.
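
If a mass evict really is needed, it may help to snapshot the session list first so the affected clients can be identified and checked afterwards; a small sketch (the output path is arbitrary):

# capture the current session list before evicting everything
ceph tell mds.0 client ls 2>/dev/null > /tmp/mds0-clients-$(date +%s).json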

6. The session kill command

The session subcommands also include kill, which is even more drastic than evict:

root@mimic1:~# ceph tell mds.0 session kill 104704
2018-09-05 15:57:45.897 7ff2157fa700 0 client.25742 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 15:57:45.917 7ff2167fc700 0 client.25745 ms_handle_reset on 192.168.0.26:6800/1856812761


root@mimic1:~# ceph tell mds.0 session ls
2018-09-05 15:57:50.709 7f44eeffd700 0 client.95370 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 15:57:50.725 7f44effff700 0 client.95376 ms_handle_reset on 192.168.0.26:6800/1856812761
[]


root@mimic1:~# ceph osd blacklist ls
listed 1 entries
192.168.0.26:0/1613295381 2018-09-05 16:57:45.920138

Remove the osd blacklist entry:

root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/1613295381
un-blacklisting 192.168.0.26:0/1613295381
root@mimic1:~# ceph osd blacklist ls
listed 0 entries

After this, the client connection does not recover:

root@mimic3:~# cd /mnt/cephfuse
root@mimic3:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown

After session kill, the session can no longer be recovered, so use this command with caution as well.
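
To bring such a ceph-fuse client back, remounting is the usual approach. A hedged sketch, assuming the mount point used in this test and the default client.admin credentials:

# unmount the dead ceph-fuse mount and mount it again
fusermount -u /mnt/cephfuse   # or: umount -f /mnt/cephfuse
ceph-fuse /mnt/cephfuse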

Code Analysis

Based on the Ceph Mimic 13.2.1 code.

The code that implements client evict is shown below; note that it adds an osd blacklist entry:

bool MDSRank::evict_client(int64_t session_id,
    bool wait, bool blacklist, std::stringstream& err_ss,
    Context *on_killed)
{
  ...
  // look up the session with the given id
  Session *session = sessionmap.get_session(
      entity_name_t(CEPH_ENTITY_TYPE_CLIENT, session_id));

  // lambda that kills the MDS session
  auto kill_mds_session = [this, session_id, on_killed]() {
    assert(mds_lock.is_locked_by_me());
    Session *session = sessionmap.get_session(
        entity_name_t(CEPH_ENTITY_TYPE_CLIENT, session_id));
    if (session) {
      if (on_killed) {
        server->kill_session(session, on_killed);
      } else {
        C_SaferCond on_safe;
        server->kill_session(session, &on_safe);
        mds_lock.Unlock();
        on_safe.wait();
        mds_lock.Lock();
      }
    }
    ...
  };

  // lambda that adds the OSD blacklist entry in the background
  auto background_blacklist = [this, session_id, cmd](std::function<void ()> fn) {
    ...
    Context *on_blacklist_done = new FunctionContext([this, session_id, fn](int r) {
      objecter->wait_for_latest_osdmap(
          new C_OnFinisher(
              new FunctionContext(...), finisher)
      );
    });
    ...
    monc->start_mon_command(cmd, {}, nullptr, nullptr, on_blacklist_done);
  };


  auto blocking_blacklist = [this, cmd, &err_ss, background_blacklist]() {
    C_SaferCond inline_ctx;
    background_blacklist([&inline_ctx]() {
      inline_ctx.complete(0);
    });
    mds_lock.Unlock();
    inline_ctx.wait();
    mds_lock.Lock();
  };

  // depending on the arguments, kill the MDS session and add the OSD blacklist entry
  if (wait) {
    if (blacklist) {
      blocking_blacklist();
    }

    // We dropped mds_lock, so check that session still exists
    session = sessionmap.get_session(entity_name_t(CEPH_ENTITY_TYPE_CLIENT,
                                                   session_id));
    ...
    kill_mds_session();
  } else {
    if (blacklist) {
      background_blacklist(kill_mds_session);
    } else {
      kill_mds_session();
    }
  }
  ...
}

This function is called from the following places:

Cscope tag: evict_client
# line filename / context / line
1 1965 mds/MDSRank.cc <<handle_asok_command>>
bool evicted = evict_client(strtol(client_id.c_str(), 0, 10), true,
2 2120 mds/MDSRank.cc <<evict_clients>>
evict_client(s->info.inst.name.num(), false,
3 782 mds/Server.cc <<find_idle_sessions>>
mds->evict_client(session->info.inst.name.num(), false, true,
4 1058 mds/Server.cc <<reconnect_tick>>
mds->evict_client(session->info.inst.name.num(), false, true, ss,

1. handle_asok_command: handles client evict from the command line

2. evict_clients: evicts clients in bulk

3. find_idle_sessions: evicts clients whose sessions have gone stale

4. reconnect_tick: after an MDS restarts it waits for clients to reconnect; clients that have not reconnected within the 45s timeout are evicted (see the config sketch after this list)
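
The 45s reconnect window mentioned in item 4 should correspond to the mds_reconnect_timeout option (45s by default); a hedged sketch of checking it through the MDS admin socket, with a hypothetical daemon name:

# hypothetical daemon name; run on the host where the MDS runs
ceph daemon mds.mimic3 config get mds_reconnect_timeout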

Related Parameters

The configuration parameters related to MDS sessions are:

# ceph daemon mgr.luminous2 config show | grep mds_session_
"mds_session_autoclose": "300.000000",
"mds_session_blacklist_on_evict": "true",
"mds_session_blacklist_on_timeout": "true",
"mds_session_timeout": "60.000000",

There are also some client-related ones:

"client_reconnect_stale": "false",
"client_tick_interval": "1.000000",
"mon_client_ping_interval": "10.000000",
"mon_client_ping_timeout": "30.000000",

Handling after client evict

As the tests above show, after a client is evicted it is added to the osd blacklist with a one-hour expiry, and during that period the client cannot access CephFS.

However, once the entry is removed with ceph osd blacklist rm <entry>, the client can immediately access CephFS again, exactly as before.

Method 1: rm the blacklist entry

root@mimic1:~# ceph tell mds.0 client evict id=25085
2018-09-05 11:07:43.580 7f80d37fe700 0 client.25364 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:07:44.292 7f80e8ff9700 0 client.25370 ms_handle_reset on 192.168.0.26:6800/1856812761


root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:05:23.527 7f5005ffb700 0 client.25301 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:05:23.539 7f5006ffd700 0 client.94941 ms_handle_reset on 192.168.0.26:6800/1856812761
[]


root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/265326503
un-blacklisting 192.168.0.26:0/265326503


root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:07:57.884 7fe07b7f6700 0 client.95022 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:07:57.900 7fe07c7f8700 0 client.25400 ms_handle_reset on 192.168.0.26:6800/1856812761
[]

Then, after accessing the mount point directory again on the client host, the session becomes normal:

root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:06:31.484 7f6c6bfff700 0 client.94971 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:06:31.496 7f6c717fa700 0 client.94977 ms_handle_reset on 192.168.0.26:6800/1856812761
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 4,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]

Method 2: wait one hour

By default the osd blacklist entry added on eviction expires after one hour; after that hour has passed, verify that the session can become normal again:

root@mimic1:~# ceph osd blacklist ls
listed 0 entries

Then, after accessing the mount point directory again on the client host, the session becomes normal:

root@mimic3:~# cd /mnt/cephfuse/
root@mimic3:/mnt/cephfuse# ls
perftest

Check the MDS sessions:

root@mimic1:~# ceph tell mds.0 session ls
2018-09-05 13:56:26.630 7fae7f7fe700 0 client.95118 ms_handle_reset on 192.168.0.26:6801/1541744746
2018-09-05 13:56:26.642 7fae94ff9700 0 client.25496 ms_handle_reset on 192.168.0.26:6801/1541744746
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 1,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]