ID | IEP-13 |
Author | Anton Vinogradov |
Sponsor | Aleksey Plekhanov |
Created | 24 Jan 2017 |
Status | DRAFT |
For Ignite diagnostic usually it’s helpful to get some Ignite internals information. But currently, in my opinion, there are no convenient tools for this purpose:
Sometimes we are forced to use more than one tool and join data by hands (for example, current thread dump and data from logs).
Often RDBMS for diagnostic purposes provides system views (for example, DBA_% and V$% in Oracle), which can be queried by SQL. This solution makes all internal diagnostic information available in a readable form (with all possible filters and projections) without using any other internal or external tools. My proposal is to create similar system views in Ignite.
Aleksey Plekhanov implemented working prototype (PR:[1]). It contains views:
IGNITE_SYSTEM_VIEWS | Registered system views |
IGNITE_INSTANCE | Ignite instance |
IGNITE_JVM_THREADS | JVM threads |
IGNITE_JVM_RUNTIME | JVM runtime |
IGNITE_JVM_OS | JVM operating system |
IGNITE_CACHES | Ignite caches |
IGNITE_CACHE_CLUSTER_METRICS | Ignite cache cluster metrics |
IGNITE_CACHE_NODE_METRICS | Ignite cache node metrics |
IGNITE_CACHE_GROUPS | Cache groups |
IGNITE_NODES | Nodes in topology |
IGNITE_NODE_HOSTS | Node hosts |
IGNITE_NODE_ADDRESSES | Node addresses |
IGNITE_NODE_ATTRIBUTES | Node attributes |
IGNITE_NODE_METRICS | Node metrics |
IGNITE_TRANSACTIONS | Active transactions |
IGNITE_TRANSACTION_ENTRIES | Cache entries used by transaction |
IGNITE_TASKS | Active tasks |
IGNITE_PART_ASSIGNMENT | Partition assignment map |
IGNITE_PART_ALLOCATION | Partition allocation map |
There are much more useful views can be implemented (executors diagnostic, SPIs diagnostic).
Some usage examples:
Cache groups and their partitions, which used by transaction more than 5 minutes long:
SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS ENTITIES_CNT
FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
Average CPU load on server nodes grouped by operating system:
SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
FROM INFORMATION_SCHEMA.IGNITE_NODES n
JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID AND na.NAME = 'os.name'
JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
WHERE n.IS_CLIENT = false
GROUP BY na.VALUE
Top 5 nodes by puts to cache ‘cache’:
SELECT cm.NODE_ID, cm.CACHE_PUTS FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
WHERE cm.CACHE_NAME = 'cache'
ORDER BY cm.CACHE_PUTS DESC
LIMIT 5
// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.
// Links to various reference documents, if applicable.