maybe it just can't replay the journal in time, or something?
right before the osd crashes there are a lot of messages like these:
-5> 2020-05-24 21:16:11.305079 7fc6ab7fb700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fc67afe0700' had timed out after 15
-4> 2020-05-24 21:16:11.305080 7fc6ab7fb700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fc67b7e1700' had timed out after 15
-3> 2020-05-24 21:16:11.305082 7fc6ab7fb700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fc67e7e7700' had timed out after 15
-2> 2020-05-24 21:16:11.305084 7fc6ab7fb700 1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fc6a2e3a700' had timed out after 60
-1> 2020-05-24 21:16:11.305088 7fc6ab7fb700 1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fc6a2e3a700' had suicide timed out after 180
2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x259) [0x560d9bafe7d9]
1/ 5 heartbeatmap