Number of Nodes | METRIC_RECORD (MB) | METRIC_RECORD_MINUTE (MB) | METRIC_RECORD_HOURLY (MB) | METRIC_RECORD_DAILY (MB) | METRIC_AGGREGATE (MB) | METRIC_AGGREGATE_MINUTE (MB) | METRIC_AGGREGATE_HOURLY (MB) | METRIC_AGGREGATE_DAILY (MB) | TOTAL (GB) |
50 | 5120 | 2700 | 245 | 10 | 1433.6 | 305 | 28 | 1 | 9.6 |
100 | 10240 | 5400 | 490 | 20 | 1433.6 | 305 | 28 | 1 | 17.5 |
300 | 30720 | 16200 | 1470 | 60 | 1433.6 | 305 | 28 | 1 | 49 |
500 | 51200 | 27000 | 2450 | 100 | 1433.6 | 305 | 28 | 1 | 80.6 |
800 | 81920 | 43200 | 3920 | 160 | 1433.6 | 305 | 28 | 1 | 127.9 |
NOTE
- The guidance above is derived from AMS disk utilization observed in actual clusters.
- The actual numbers were obtained from a cluster running the basic services (HDFS, YARN, HBase) along with Storm, Kafka and Flume.
- Kafka and Flume generate metrics only while a job is running. If those services are being used, additional disk space is recommended.
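The guidance table follows a simple pattern: the four METRIC_RECORD tables grow linearly with the number of nodes (for example, 5120 MB / 50 nodes = 102.4 MB per node), while the four METRIC_AGGREGATE tables have the same size in every row. The sketch below reproduces the TOTAL (GB) column from that pattern. The names `PER_NODE_MB`, `AGGREGATE_MB` and `estimate_total_gb` are hypothetical, the per-node figures are read off the table rather than taken from AMS itself, and the conversion of 1024 MB per GB is an assumption that happens to match the published totals. Node counts outside the 50 to 800 range are extrapolation, not guidance.

```python
# Per-node size (MB) of each METRIC_RECORD table, derived from the guidance
# table (value in the 50-node row divided by 50). Assumed to stay linear.
PER_NODE_MB = {
    "METRIC_RECORD": 102.4,
    "METRIC_RECORD_MINUTE": 54.0,
    "METRIC_RECORD_HOURLY": 4.9,
    "METRIC_RECORD_DAILY": 0.2,
}

# Size (MB) of each METRIC_AGGREGATE table; identical in every row of the
# guidance table, so treated here as independent of node count.
AGGREGATE_MB = {
    "METRIC_AGGREGATE": 1433.6,
    "METRIC_AGGREGATE_MINUTE": 305.0,
    "METRIC_AGGREGATE_HOURLY": 28.0,
    "METRIC_AGGREGATE_DAILY": 1.0,
}


def estimate_total_gb(num_nodes: int) -> float:
    """Estimate total AMS disk usage in GB for a cluster of num_nodes hosts."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    record_mb = num_nodes * sum(PER_NODE_MB.values())
    total_mb = record_mb + sum(AGGREGATE_MB.values())
    return total_mb / 1024  # assumption: 1 GB = 1024 MB


if __name__ == "__main__":
    # Print the estimate for each cluster size listed in the guidance table.
    for nodes in (50, 100, 300, 500, 800):
        print(f"{nodes} nodes: {estimate_total_gb(nodes):.1f} GB")
```

Because Kafka and Flume emit metrics only while jobs run, an estimate like this is better treated as a floor than as a ceiling for clusters that use those services.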
Actual disk utilization data
Number of Nodes | METRIC_RECORD (MB) | METRIC_RECORD_MINUTE (MB) | METRIC_RECORD_HOURLY (MB) | METRIC_RECORD_DAILY (MB) | METRIC_AGGREGATE (MB) | METRIC_AGGREGATE_MINUTE (MB) | METRIC_AGGREGATE_HOURLY (MB) | METRIC_AGGREGATE_DAILY (MB) | TOTAL (GB) |
2 | 120 | 175 | 17 | 1 | 545 | 136 | 16 | 1 | 1 |
10 | 1024 | 540 | 49 | 2 | 1433.6 | 305 | 28 | 1 | 3.3 |
3 | 294 | 51 | 3.4 | 1 | 104 | 26 | 1.8 | 1 | 0.5 |