etcd space full (V3 API)

📅 2026/6/15 17:36:57
Environment

System platform: N/A
Version: 4.5.8

Symptoms

Checking the etcd cluster returns the results below. The output comes from my own test environment. The commands need the V3 API environment variables, which can be set by following the steps in the Solution section.

    # etcdctl endpoint health --user root:Highgo123
    http://192.168.56.12:2379 is unhealthy: failed to commit proposal: Active Alarm(s): NOSPACE
    http://192.168.56.10:2379 is unhealthy: failed to commit proposal: Active Alarm(s): NOSPACE
    http://192.168.56.11:2379 is unhealthy: failed to commit proposal: Active Alarm(s): NOSPACE
    Error: unhealthy cluster

    # etcdctl endpoint status -w table
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-------------------------------+
    | ENDPOINT                  | ID               | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS                        |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-------------------------------+
    | http://192.168.56.10:2379 | 4dc3ceeac3266e5  | 3.5.9   | 209 MB  | false     | false      | 95        | 37282      | 37282              | memberID:17186684763751740414 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE                 |
    | http://192.168.56.11:2379 | d609d7b6f0361765 | 3.5.9   | 209 MB  | false     | false      | 95        | 37282      | 37282              | memberID:17186684763751740414 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE                 |
    | http://192.168.56.12:2379 | ee835ebbd20a73fe | 3.5.9   | 209 MB  | true      | false      | 95        | 37282      | 37282              | memberID:17186684763751740414 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE                 |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-------------------------------+

Check the alarm information:

    # etcdctl alarm list
    memberID:17186684763751740414 alarm:NOSPACE

Cause

The quota-backend-bytes parameter in the etcd configuration file etcd.yml is the storage quota, that is, the limit on the etcd database size. The default limit is 2 GB, and the recommended maximum is 8 GB. When writes exhaust the quota, etcd raises a cluster-wide alarm that switches the cluster into maintenance mode. In maintenance mode the cluster accepts only key reads and deletes; writes are rejected.

Solution

All command output below comes from my own test environment, where the environment variables are already set. If they are not set in the customer environment, set them to match the actual environment, as follows:

    export ETCDCTL_ENDPOINTS=http://192.168.56.10:2379,http://192.168.56.11:2379,http://192.168.56.12:2379
    export ETCDCTL_API=3
    export PATH=$PATH:/usr/local/hghac/etcd:/usr/local/hghac/hac/hghactl

etcd is operated here with the v3 commands. If authentication is not enabled, the --user parameter is not needed. If it is enabled, the credentials can be looked up in the hghac.yml file.
Check etcd.yml to confirm whether quota-backend-bytes (the storage quota), auto-compaction-retention, and auto-compaction-mode are set. If they are not, set all three, with quota-backend-bytes at 8 GB. If quota-backend-bytes is already 8 GB, use Method 1 or Method 2. If it has not yet reached 8 GB, use Method 3 to adjust the parameters. Methods 1 and 2 may affect HGHAC, so it is recommended to shut down HGHAC before performing them, which means stopping the database and the application workload.

The locations of hghac.yml and etcd.yml mentioned above can be found with the following commands:

    # systemctl status etcd
    ● etcd.service - Etcd
       Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2025-02-06 14:47:39 CST; 1h 5min ago
     Main PID: 5042 (etcd)
        Tasks: 9
       Memory: 172.2M
       CGroup: /system.slice/etcd.service
               └─5042 /usr/local/hghac/etcd/etcd --config-file=/usr/local/hghac/etcd/etcd.yml
    ......

    # systemctl status hghac
    ● hghac.service - hghac
       Loaded: loaded (/etc/systemd/system/hghac.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2025-02-06 10:48:17 CST; 5h 5min ago
     Main PID: 1233 (hghac)
        Tasks: 19
       Memory: 32.8M
       CGroup: /system.slice/hghac.service
               ├─1233 /usr/local/hghac/hac/hghac/hghac /usr/local/hghac/hac/hghac.yml
    ......

The status output shows where each file is located. If it does not, look in the service unit files hghac.service and etcd.service.

The three methods follow. After completing any of them, insert a key to confirm that etcd is working normally.

Method 1: Compact old data and defragment

Check the etcd database size:

    # etcdctl endpoint status -w table --user root:Highgo123
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------------+
    | ENDPOINT                  | ID               | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS                      |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------------+
    | http://192.168.56.10:2379 | 4dc3ceeac3266e5  | 3.5.9   | 211 MB  | false     | false      | 96        | 38756      | 38756              | memberID:350221866816923365 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE               |
    | http://192.168.56.11:2379 | d609d7b6f0361765 | 3.5.9   | 204 MB  | true      | false      | 96        | 38757      | 38757              | memberID:350221866816923365 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE               |
    | http://192.168.56.12:2379 | ee835ebbd20a73fe | 3.5.9   | 207 MB  | false     | false      | 96        | 38758      | 38758              | memberID:350221866816923365 |
    |                           |                  |         |         |           |            |           |            |                    | alarm:NOSPACE               |
    +---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------------+

Get the current revision:

    # etcdctl endpoint status --write-out=json --user root:Highgo123 | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*'
    4011
    4011
    4011

Compact away all old revisions:

    # etcdctl compact 4011 --user root:Highgo123

Defragment to release the freed space:

    # etcdctl --command-timeout=30s defrag --user root:Highgo123

command-timeout defaults to 5s. If the command fails with "context deadline exceeded", increase the timeout.

Disarm the alarm:

    # etcdctl alarm disarm

When this is done, check the etcd database size again to confirm that the cleanup succeeded.

Method 2: Rebuild the etcd cluster

See support 017213101.

Method 3: Modify the etcd.yml parameters

Edit etcd.yml: change the parameters that already exist and add the ones that are missing. The parameters are:

    vi /usr/local/hghac/etcd/etcd.yml
    quota-backend-bytes: 8589934592
    auto-compaction-retention: 24h
    auto-compaction-mode: periodic

After editing, restart etcd on each node in turn:

    systemctl restart etcd

Check that etcd is working:

    # etcdctl put newkey 123 --user root:Highgo123
    OK

If a node cannot be restarted, it can be removed from the cluster and added back, as follows.

Query the member information:

    # etcdctl member list --user root:Highgo123
    4dc3ceeac3266e5, started, etcd_01, http://192.168.56.10:2380, http://192.168.56.10:2379, false
    d609d7b6f0361765, started, etcd_02, http://192.168.56.11:2380, http://192.168.56.11:2379, false
    ee835ebbd20a73fe, started, etcd_03, http://192.168.56.12:2380, http://192.168.56.12:2379, false

Remove the node that cannot be restarted:

    etcdctl member remove 4dc3ceeac3266e5 --user root:Highgo123

Add the node back:

    etcdctl member add etcd_01 --peer-urls=http://192.168.56.10:2380 --user root:Highgo123

Set the initial-cluster-state parameter in etcd.yml:

    initial-cluster-state: existing

Delete the data directory named by the data-dir parameter in etcd.yml (here data-dir: /usr/local/hghac/etcd/etcd01):

    rm -rf /usr/local/hghac/etcd/etcd01

Then start etcd:

    systemctl start etcd

Note: how to simulate etcd running out of space

Run the following on any one node:

    while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | ETCDCTL_API=3 etcdctl put key --user root:Highgo123 || break; done
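The Method 1 steps (read the revision, compact, defragment, disarm the alarm, re-check the size) could be collected into a small script. The following is only a sketch under assumptions, not part of the original procedure: the function names `ec`, `latest_revision`, and `reclaim_space` and the `ETCD_AUTH` variable are hypothetical names introduced here, and the script assumes `ETCDCTL_ENDPOINTS` and `ETCDCTL_API=3` are already exported as described in the Solution section.

```shell
#!/bin/sh
# Sketch of Method 1 as shell functions. Nothing touches etcd until
# reclaim_space is called explicitly.
# ETCD_AUTH is a hypothetical variable for this sketch: set it to
# "user:password" when authentication is enabled, leave it unset otherwise.

# ec: run etcdctl, adding --user only when ETCD_AUTH is set.
ec() {
    if [ -n "${ETCD_AUTH:-}" ]; then
        etcdctl --user "$ETCD_AUTH" "$@"
    else
        etcdctl "$@"
    fi
}

# latest_revision: read the JSON printed by
# `etcdctl endpoint status --write-out=json` on stdin and print the
# highest revision reported by any endpoint.
latest_revision() {
    grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | sort -n | tail -n 1
}

# reclaim_space: compact up to the current revision, defragment,
# disarm the NOSPACE alarm, then show the sizes again.
reclaim_space() {
    rev=$(ec endpoint status --write-out=json | latest_revision)
    if [ -z "$rev" ]; then
        echo "could not determine the current revision" >&2
        return 1
    fi
    echo "compacting up to revision $rev"
    ec compact "$rev" || return 1
    # 30s instead of the 5s default, as suggested for slow defrags
    ec --command-timeout=30s defrag || return 1
    ec alarm disarm || return 1
    ec endpoint status -w table
}
```

To try it, source the file in a shell where the environment variables are set and run `ETCD_AUTH=root:Highgo123 reclaim_space` (or plain `reclaim_space` without authentication). Stopping at the first failed step is a deliberate choice in this sketch, so that the alarm is not disarmed when compaction or defragmentation did not complete.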