Querying Elasticsearch Memory and Disk Usage

Contents

- Elasticsearch Memory and Disk Usage
  - Checking Memory Usage
  - Checking Disk Usage
  - Checking Whether Storage Limits Are Exceeded

Elasticsearch Memory and Disk Usage

Checking Memory Usage

JVM memory usage:

Endpoint:
```
GET /_nodes/stats/jvm
```

Example:

```
curl -X GET "http://localhost:9200/_nodes/stats/jvm?pretty"
```

Sample output:

```
{
  "nodes": {
    "node_id": {
      "name": "node_name",
      "transport_address": "127.0.0.1:9300",
      "host": "127.0.0.1",
      "ip": "127.0.0.1:9300",
      "version": "7.10.0",
      "build_flavor": "default",
      "build_type": "tar",
      "build_hash": "unknown",
      "roles": ["master", "data", "ingest"],
      "jvm": {
        "timestamp": 1609459200000,
        "uptime_in_millis": 123456789,
        "mem": {
          "heap_used_in_bytes": 123456789,
          "heap_used_percent": 45,
          "heap_committed_in_bytes": 268435456,
          "heap_max_in_bytes": 536870912,
          "non_heap_used_in_bytes": 12345678,
          "non_heap_committed_in_bytes": 23456789,
          "pools": {
            "young": {"used_in_bytes": 12345678, "max_in_bytes": 536870912, "peak_used_in_bytes": 12345678, "peak_max_in_bytes": 536870912},
            "survivor": {"used_in_bytes": 1234567, "max_in_bytes": 536870912, "peak_used_in_bytes": 1234567, "peak_max_in_bytes": 536870912},
            "old": {"used_in_bytes": 123456789, "max_in_bytes": 536870912, "peak_used_in_bytes": 123456789, "peak_max_in_bytes": 536870912}
          }
        },
        "threads": {"count": 100, "peak_count": 120},
        "gc": {
          "collectors": {
            "young": {"collection_count": 100, "collection_time_in_millis": 12345},
            "old": {"collection_count": 10, "collection_time_in_millis": 1234}
          }
        }
      }
    }
  }
}
```

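Explanation:
- heap_used_in_bytes: Heap memory currently in use, in bytes.
- heap_used_percent: Heap memory currently in use, as a percentage of the maximum heap.
- heap_committed_in_bytes: Heap memory the JVM has committed (already reserved from the operating system), in bytes.
- heap_max_in_bytes: Maximum heap memory available to the JVM, in bytes.
- non_heap_used_in_bytes: Non-heap memory currently in use, in bytes.
- non_heap_committed_in_bytes: Non-heap memory the JVM has committed, in bytes.
- pools: Per-pool details for the JVM memory pools: the young generation (young), the survivor space (survivor), and the old generation (old).
- threads: Current and peak number of JVM threads.
- gc: Garbage collector statistics: collection counts and cumulative collection time for the young and old generations.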
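
The heap fields above can be turned into a simple per-node check. The following is a minimal sketch, not part of any Elasticsearch client: the function name `heap_report` and the 75% threshold are assumptions chosen for illustration, and the payload is a trimmed, made-up response in the shape of `GET /_nodes/stats/jvm`.

```python
import json

# Assumed alert threshold (percent of max heap); not taken from the article.
HEAP_WARN_PERCENT = 75


def heap_report(stats: dict, warn_percent: int = HEAP_WARN_PERCENT) -> list:
    """Return one (node_name, heap_used_percent, over_threshold) tuple per node."""
    report = []
    for node in stats["nodes"].values():
        mem = node["jvm"]["mem"]
        # Recompute the percentage from the byte counters so the function
        # also works on payloads that omit heap_used_percent.
        percent = round(100 * mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"])
        report.append((node["name"], percent, percent >= warn_percent))
    return report


# Trimmed, made-up payload in the shape of GET /_nodes/stats/jvm.
sample = json.loads("""
{"nodes": {"node_id": {"name": "node_name", "jvm": {"mem": {
  "heap_used_in_bytes": 402653184, "heap_max_in_bytes": 536870912}}}}}
""")

for name, percent, over in heap_report(sample):
    print(f"{name}: heap {percent}% used{' (above threshold)' if over else ''}")
```

Against a live cluster, the same function could be fed the parsed JSON body of the `curl` request shown earlier, for example by loading it with `json.load`.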

Cluster health (no memory figures, but a quick check of the overall cluster state):
- Endpoint:
```
GET /_cluster/health
```
- Example:
```
curl -X GET "http://localhost:9200/_cluster/health?pretty"
```
- Sample output:
```
{
  "cluster_name": "my_cluster",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 10,
  "active_shards": 20,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}
```
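- Explanation:
  - cluster_name: Name of the cluster.
  - status: Cluster health status (green, yellow, or red).
  - timed_out: Whether the health request timed out.
  - number_of_nodes: Number of nodes in the cluster.
  - number_of_data_nodes: Number of data nodes in the cluster.
  - active_primary_shards: Number of active primary shards.
  - active_shards: Total number of active shards (primaries and replicas).
  - relocating_shards: Number of shards currently being relocated.
  - initializing_shards: Number of shards currently initializing.
  - unassigned_shards: Number of shards not yet assigned to a node.
  - delayed_unassigned_shards: Number of unassigned shards whose allocation is being delayed.
  - number_of_pending_tasks: Number of cluster-level tasks waiting to be processed.
  - number_of_in_flight_fetch: Number of shard fetches still in progress.
  - task_max_waiting_in_queue_millis: Longest time any task has been waiting in the queue, in milliseconds.
  - active_shards_percent_as_number: Percentage of shards that are active.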
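
A few of these fields are enough for a quick automated check. The sketch below is only an illustration: `health_issues` is a hypothetical helper, the choice of which fields count as "problems" is an assumption, and both dictionaries are made-up responses that reuse field names from the sample output above.

```python
def health_issues(health: dict) -> list:
    """List human-readable problems found in a /_cluster/health response."""
    issues = []
    if health["status"] != "green":
        issues.append(f"status is {health['status']}")
    if health["unassigned_shards"] > 0:
        issues.append(f"{health['unassigned_shards']} unassigned shard(s)")
    if health["number_of_pending_tasks"] > 0:
        issues.append(f"{health['number_of_pending_tasks']} pending task(s)")
    if health["timed_out"]:
        issues.append("health request timed out")
    return issues


# Made-up responses; only the fields the helper reads are included.
healthy = {"status": "green", "timed_out": False,
           "unassigned_shards": 0, "number_of_pending_tasks": 0}
degraded = {"status": "yellow", "timed_out": False,
            "unassigned_shards": 5, "number_of_pending_tasks": 0}

print(health_issues(healthy))
print(health_issues(degraded))
```

An empty list means none of the checked fields looked unusual; it does not by itself prove the cluster is healthy.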

Checking Disk Usage

Node storage usage:

Endpoint:
```
GET /_nodes/stats/fs
```

Example:

```
curl -X GET "http://localhost:9200/_nodes/stats/fs?pretty"
```

Sample output:

```
{
  "nodes": {
    "node_id": {
      "name": "node_name",
      "transport_address": "127.0.0.1:9300",
      "host": "127.0.0.1",
      "ip": "127.0.0.1:9300",
      "version": "7.10.0",
      "build_flavor": "default",
      "build_type": "tar",
      "build_hash": "unknown",
      "roles": ["master", "data", "ingest"],
      "fs": {
        "timestamp": 1609459200000,
        "total": {
          "total_in_bytes": 500000000000,
          "free_in_bytes": 100000000000,
          "available_in_bytes": 90000000000
        },
        "data": [
          {
            "path": "/var/lib/elasticsearch/nodes/0",
            "total_in_bytes": 500000000000,
            "free_in_bytes": 100000000000,
            "available_in_bytes": 90000000000
          }
        ]
      }
    }
  }
}
```

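Explanation:
- total_in_bytes: Total size of the file system, in bytes.
- free_in_bytes: Unallocated space on the file system, in bytes.
- available_in_bytes: Space the Elasticsearch process can actually use, in bytes. Operating-system or process-level restrictions can make this smaller than free_in_bytes.
- data: Details for each data path: the path itself plus its total, free, and available sizes.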
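
Disk usage as a percentage follows directly from these fields. The sketch below assumes usage is measured as total minus available space; `disk_usage` is a hypothetical helper, and the payload is a trimmed copy of the sample output above.

```python
def disk_usage(stats: dict) -> dict:
    """Map node name -> percent of disk used, from a /_nodes/stats/fs response."""
    usage = {}
    for node in stats["nodes"].values():
        total = node["fs"]["total"]
        # Assumption: count space as "used" unless the process can write to it,
        # so available_in_bytes is used instead of free_in_bytes.
        used = total["total_in_bytes"] - total["available_in_bytes"]
        usage[node["name"]] = round(100 * used / total["total_in_bytes"], 1)
    return usage


# Trimmed copy of the sample response shown above.
sample = {"nodes": {"node_id": {"name": "node_name", "fs": {"total": {
    "total_in_bytes": 500000000000,
    "free_in_bytes": 100000000000,
    "available_in_bytes": 90000000000}}}}}

print(disk_usage(sample))  # {'node_name': 82.0}
```

With the sample figures, 410 GB of 500 GB counts as used, which is where the 82.0 comes from.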

Index storage usage:

Endpoint:
```
GET /_cat/indices?v&h=index,store.size
```

Example:

```
curl -X GET "http://localhost:9200/_cat/indices?v&h=index,store.size"
```

Sample output:

```
index          store.size
my_index_1     1.2gb
my_index_2     500mb
my_index_3     2.3gb
```

Explanation:
- index: Name of the index.
- store.size: Storage size of the index, counting both primary and replica shards.
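
To rank indices by size, the human-readable values need to be converted to bytes first. The following sketch parses the sample output above; `size_to_bytes` and `largest_indices` are hypothetical helpers, the conversion assumes units are powers of 1024, and rows with an empty size column (for example closed indices) are not handled.

```python
# Assumption: size units are powers of 1024.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "pb": 1024**5}


def size_to_bytes(text: str) -> int:
    """Convert a _cat size such as '1.2gb' to an approximate byte count."""
    text = text.strip().lower()
    # Check two-letter suffixes before the bare 'b' suffix.
    for suffix in ("kb", "mb", "gb", "tb", "pb", "b"):
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * UNITS[suffix])
    raise ValueError(f"unrecognized size: {text!r}")


def largest_indices(cat_output: str) -> list:
    """Parse 'index store.size' rows and return (index, bytes), largest first."""
    lines = cat_output.strip().splitlines()[1:]  # skip the header row
    sizes = [(name, size_to_bytes(size)) for name, size in (line.split() for line in lines)]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


sample = """index          store.size
my_index_1     1.2gb
my_index_2     500mb
my_index_3     2.3gb
"""

for name, size in largest_indices(sample):
    print(name, size)
```

Because the displayed values are rounded, the byte counts are approximate. If your version accepts it, adding `bytes=b` to the `_cat/indices` request returns exact byte counts and makes the unit parsing unnecessary.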

Cluster health:

Endpoint:
```
GET /_cluster/health
```

Example:

```
curl -X GET "http://localhost:9200/_cluster/health?pretty"
```

Explanation:
- The response and its fields are the same as in the cluster health section above.

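Checking Whether Storage Limits Are Exceeded

Disk watermarks:
- Elasticsearch uses disk watermarks to manage disk usage. Three watermarks apply, each with a default value:
  - Low watermark: 85% by default. Once disk usage on a node exceeds this threshold, Elasticsearch stops allocating new shards to that node (primary shards of newly created indices are exempt).
  - High watermark: 90% by default. Once disk usage exceeds this threshold, Elasticsearch tries to relocate shards from that node to other nodes.
  - Flood-stage watermark: 95% by default. Once disk usage exceeds this threshold, Elasticsearch marks every index that has a shard on that node as read-only (the index.blocks.read_only_allow_delete block) to keep the disk from filling up completely.
- These watermarks can be changed in the elasticsearch.yml configuration file (they are also dynamic settings, so the cluster settings API can update them at runtime):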
```
cluster.routing.allocation.disk.watermark.low: 85%
cluster.routing.allocation.disk.watermark.high: 90%
cluster.routing.allocation.disk.watermark.flood_stage: 95%
```
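
To decide which watermark a node has crossed, a disk usage percentage can be compared against the three thresholds. The sketch below hard-codes the default percentages quoted above; `watermark_level` is a hypothetical helper, and it does not handle watermarks configured as absolute byte values.

```python
# Default thresholds quoted in the text above, as percent of disk used.
# Ordered from the strictest watermark to the mildest.
WATERMARKS = [("flood_stage", 95.0), ("high", 90.0), ("low", 85.0)]


def watermark_level(used_percent: float) -> str:
    """Return the highest watermark that used_percent exceeds, or 'ok'."""
    for name, threshold in WATERMARKS:
        if used_percent > threshold:
            return name
    return "ok"


for percent in (82.0, 87.5, 91.0, 96.0):
    print(percent, watermark_level(percent))
```

If the cluster uses non-default watermarks, the values in `WATERMARKS` would need to match the configured settings.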
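
Monitoring disk usage:
- Use the APIs above, together with your monitoring tools, to check node and index storage usage regularly.
- Set up alerts so that administrators are notified promptly when disk usage approaches or exceeds a watermark.

With these APIs and settings, you can monitor and manage Elasticsearch memory and disk usage and keep the cluster running stably.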