ID: IEP-96
Author
Sponsor
Created

  

Status: DRAFT


Motivation

...

It's proposed to create 3 levels of memory trackers:

  1. Global memory tracker - controls total memory usage by SQL queries on a cluster node.
  2. Per-query memory tracker - controls memory usage by a single SQL query/fragment. To simplify the implementation, we could start with a per-fragment tracker instead of a per-query tracker, since ExecutionContext is currently bound to a fragment.
  3. Per-execution-node memory tracker - tracks memory usage by a query execution node.

The first and second trackers are configurable; the third tracker is for internal use only.

The tracker on each level stores the amount of memory allocated by the tracked element and passes this information to the upper-level tracker. When the tracked element releases rows (one by one or all at once), the corresponding changes should also be reflected in the upper-level tracker.

...

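MemoryTracker:

public interface MemoryTracker {
    /** Called when the tracked element allocates the given amount of memory. */
    void onMemoryAllocated(long size);
    /** Called when the tracked element releases the given amount of memory. */
    void onMemoryReleased(long size);
    /** Resets the tracker state. */
    void reset();
}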
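The propagation between levels described above can be illustrated with a minimal sketch. The class name QuotaMemoryTracker, the quota semantics (a non-positive quota means "unlimited") and the choice of IllegalStateException are assumptions made for this example only and are not part of the proposal. The sketch is single-threaded, so it models the per-query and per-execution-node levels rather than a real global tracker.

```java
/** Same shape as the MemoryTracker interface above, repeated so the sketch compiles on its own. */
interface MemoryTracker {
    void onMemoryAllocated(long size);
    void onMemoryReleased(long size);
    void reset();
}

/** Hypothetical single-threaded tracker: enforces a quota and mirrors every change to its parent. */
final class QuotaMemoryTracker implements MemoryTracker {
    private final MemoryTracker parent; // upper-level tracker, null on the top level
    private final long quota;           // assumption: a value <= 0 means "no limit"
    private long allocated;

    QuotaMemoryTracker(MemoryTracker parent, long quota) {
        this.parent = parent;
        this.quota = quota;
    }

    @Override public void onMemoryAllocated(long size) {
        if (quota > 0 && allocated + size > quota)
            throw new IllegalStateException("Memory quota exceeded: " + (allocated + size) + " > " + quota);

        // Report to the parent first: if an upper level rejects the allocation, this level stays unchanged.
        if (parent != null)
            parent.onMemoryAllocated(size);

        allocated += size;
    }

    @Override public void onMemoryReleased(long size) {
        long released = Math.min(size, allocated); // never release more than was tracked here

        allocated -= released;

        if (parent != null)
            parent.onMemoryReleased(released);
    }

    @Override public void reset() {
        // Give back everything still accounted on this level so the parent stays consistent.
        if (parent != null && allocated > 0)
            parent.onMemoryReleased(allocated);

        allocated = 0;
    }

    long allocated() {
        return allocated;
    }
}
```

Chaining three instances (node -> query -> top level) makes an allocation on the execution node level visible on every upper level, and a rejected allocation on the query level leaves the lower levels untouched.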

For the execution node memory tracker:

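RowTracker:

public interface RowTracker<Row> {
    /** Called when the execution node starts holding the given row. */
    void onRowAdded(Row row);
    /** Called when the execution node releases the given row. */
    void onRowRemoved(Row row);
    /** Resets the tracker state. */
    void reset();
}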
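One possible way to connect a RowTracker to the query-level MemoryTracker is sketched below. The class name ExecutionNodeRowTracker and the pluggable size estimator (a ToLongFunction passed to the constructor) are hypothetical and chosen for illustration; the proposal does not prescribe how the row size is obtained at this point.

```java
import java.util.function.ToLongFunction;

/** Interfaces repeated from the text above so the sketch compiles on its own. */
interface MemoryTracker {
    void onMemoryAllocated(long size);
    void onMemoryReleased(long size);
    void reset();
}

interface RowTracker<Row> {
    void onRowAdded(Row row);
    void onRowRemoved(Row row);
    void reset();
}

/** Hypothetical execution node tracker: converts row events into memory events for the upper level. */
final class ExecutionNodeRowTracker<Row> implements RowTracker<Row> {
    private final MemoryTracker queryTracker;   // upper-level (per-query or per-fragment) tracker
    private final ToLongFunction<Row> rowSize;  // assumption: some row size estimator is supplied by the caller
    private long allocated;                     // memory currently accounted for this execution node

    ExecutionNodeRowTracker(MemoryTracker queryTracker, ToLongFunction<Row> rowSize) {
        this.queryTracker = queryTracker;
        this.rowSize = rowSize;
    }

    @Override public void onRowAdded(Row row) {
        long size = rowSize.applyAsLong(row);

        queryTracker.onMemoryAllocated(size); // may throw if an upper-level quota is exceeded
        allocated += size;
    }

    @Override public void onRowRemoved(Row row) {
        long size = Math.min(rowSize.applyAsLong(row), allocated);

        allocated -= size;
        queryTracker.onMemoryReleased(size);
    }

    @Override public void reset() {
        // All rows are released at once: return everything this node still accounts for.
        queryTracker.onMemoryReleased(allocated);
        allocated = 0;
    }

    long allocated() {
        return allocated;
    }
}
```

The sketch assumes the estimator returns the same value for a row when it is added and when it is removed; if rows can be mutated in between, the size would have to be remembered per row instead.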

The query memory tracker and the execution node trackers are single-threaded, while the global memory tracker can be called from different threads. To reduce contention on upper-level trackers, tracking events can be batched on lower-level trackers.
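A minimal sketch of such batching is shown below, under the assumption that batching is done by reserving memory from the upper level in fixed-size chunks. The class names GlobalMemoryTracker and BatchingMemoryTracker, the chunk-based reservation scheme and the use of AtomicLong are illustrative choices for this example, not something the proposal specifies.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Interface repeated from the text above so the sketch compiles on its own. */
interface MemoryTracker {
    void onMemoryAllocated(long size);
    void onMemoryReleased(long size);
    void reset();
}

/** Hypothetical thread-safe top-level tracker. */
final class GlobalMemoryTracker implements MemoryTracker {
    private final AtomicLong allocated = new AtomicLong();
    private final long quota;

    GlobalMemoryTracker(long quota) {
        this.quota = quota;
    }

    @Override public void onMemoryAllocated(long size) {
        if (allocated.addAndGet(size) > quota) {
            allocated.addAndGet(-size); // roll back the failed reservation
            throw new IllegalStateException("Global memory quota exceeded");
        }
    }

    @Override public void onMemoryReleased(long size) {
        allocated.addAndGet(-size);
    }

    @Override public void reset() {
        allocated.set(0);
    }

    long allocated() {
        return allocated.get();
    }
}

/** Hypothetical single-threaded tracker that talks to its parent only in multiples of batchSize. */
final class BatchingMemoryTracker implements MemoryTracker {
    private final MemoryTracker parent;
    private final long batchSize;
    private long reserved; // amount already reported to the parent
    private long used;     // amount actually allocated on this level

    BatchingMemoryTracker(MemoryTracker parent, long batchSize) {
        this.parent = parent;
        this.batchSize = batchSize;
    }

    @Override public void onMemoryAllocated(long size) {
        long newUsed = used + size;

        if (newUsed > reserved) {
            // Round the missing amount up to a whole number of batches.
            long delta = (newUsed - reserved + batchSize - 1) / batchSize * batchSize;

            parent.onMemoryAllocated(delta);
            reserved += delta;
        }

        used = newUsed;
    }

    @Override public void onMemoryReleased(long size) {
        used = Math.max(0, used - size);

        if (reserved - used >= batchSize) {
            // Return only whole unused batches to the parent.
            long delta = (reserved - used) / batchSize * batchSize;

            parent.onMemoryReleased(delta);
            reserved -= delta;
        }
    }

    @Override public void reset() {
        parent.onMemoryReleased(reserved);
        reserved = 0;
        used = 0;
    }
}
```

The trade-off of this scheme is that the upper level sees usage rounded up to the batch size, so a larger batch means less contention but coarser accounting.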

...

Object size can also be estimated without a Java agent by making assumptions about JVM internals (object header size, alignment, pointer size). For most popular JVMs this approach gives precise results, but for some JVMs or environments the result may be less accurate. Examples of tools that use this approach: [3], [4], [5].
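To illustrate the idea, here is a rough sketch of a shallow size estimate built only on such assumptions. The constants below (12-byte object header, 16-byte array header, 4-byte references, 8-byte alignment) correspond to a typical 64-bit HotSpot JVM with compressed references and are assumptions of this example; other JVMs or settings would need other values. The class name ShallowSizeEstimator is hypothetical, and the estimate ignores any padding the JVM may insert between fields.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical shallow size estimator based on assumed JVM layout constants. */
final class ShallowSizeEstimator {
    // Assumptions: 64-bit HotSpot with compressed references enabled.
    static final int OBJECT_HEADER = 12;
    static final int ARRAY_HEADER = 16;
    static final int REFERENCE = 4;
    static final int ALIGNMENT = 8;

    /** Estimates the size of the object itself, without the objects it references. */
    static long shallowSize(Object obj) {
        Class<?> cls = obj.getClass();

        if (cls.isArray()) {
            long len = Array.getLength(obj);

            return align(ARRAY_HEADER + len * fieldSize(cls.getComponentType()));
        }

        long size = OBJECT_HEADER;

        // Sum the instance fields declared in the class and all of its superclasses.
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers()))
                    size += fieldSize(f.getType());
            }
        }

        return align(size);
    }

    /** Size of a single field or array element of the given declared type. */
    static int fieldSize(Class<?> type) {
        if (!type.isPrimitive())
            return REFERENCE;
        if (type == long.class || type == double.class)
            return 8;
        if (type == int.class || type == float.class)
            return 4;
        if (type == short.class || type == char.class)
            return 2;

        return 1; // byte and boolean
    }

    /** Rounds the size up to the assumed object alignment. */
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }
}
```

Under these assumptions a java.lang.Integer is estimated at 16 bytes (12-byte header plus one int field) and a long[4] at 48 bytes (16-byte header plus 32 bytes of data).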

In the general case, calculating the object graph size (deep object size) requires reflection and a recursive traversal of reference fields that keeps track of already visited objects.
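A generic traversal of this kind could look roughly like the sketch below. It uses an explicit stack instead of recursion (to avoid stack overflow on deep graphs) and an identity-based set for visited objects. The class name DeepSizeEstimator is hypothetical, and the shallow size function is passed in as a parameter rather than fixed. Fields that cannot be made accessible (for example, JDK internals under strong encapsulation on recent Java versions) are simply skipped in this sketch, so the result can be an underestimate.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.ToLongFunction;

/** Hypothetical deep size estimator: sums shallow sizes of all objects reachable from the root. */
final class DeepSizeEstimator {
    static long deepSize(Object root, ToLongFunction<Object> shallowSize) {
        if (root == null)
            return 0;

        // Identity semantics: two equal but distinct objects must both be counted.
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> stack = new ArrayDeque<>();

        visited.add(root);
        stack.push(root);

        long total = 0;

        while (!stack.isEmpty()) {
            Object obj = stack.pop();

            total += shallowSize.applyAsLong(obj);

            Class<?> cls = obj.getClass();

            if (cls.isArray()) {
                // Primitive arrays have no references to follow.
                if (!cls.getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(obj); i++)
                        visit(Array.get(obj, i), visited, stack);
                }

                continue;
            }

            for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive())
                        continue;

                    try {
                        f.setAccessible(true);
                        visit(f.get(obj), visited, stack);
                    }
                    catch (RuntimeException | IllegalAccessException e) {
                        // Field is not accessible: skip it, accepting an underestimate.
                    }
                }
            }
        }

        return total;
    }

    /** Schedules the referenced object for processing unless it is null or was already seen. */
    private static void visit(Object ref, Set<Object> visited, Deque<Object> stack) {
        if (ref != null && visited.add(ref))
            stack.push(ref);
    }
}
```

Because every object is visited at most once, shared references and cycles are counted a single time.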

In our case we need a fast tool rather than a precise one. We can use cached shallow sizes for the most frequently used classes (classes natively supported by the Calcite type system and perhaps classes supported by the marshaller, which can appear in _KEY or _VAL fields) and shortcuts that calculate the full deep object size without reflection (but still with recursive traversal for some classes, such as collections). For other classes (a rare case, for example when an object marshalled with OptimizedMarshaller appears in a _KEY or _VAL column), recursive traversal through reflection is required. Alternatively, we can skip the calculation for such classes and use a constant value.
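The fast path described above might be sketched as follows, taking the "constant value" alternative for unknown classes. Everything concrete here is an assumption for illustration: the class name FastRowSizeEstimator, the list of known classes, the byte values (which assume a 64-bit HotSpot JVM with compressed references and Latin-1 compact strings), the per-collection overhead and the fallback constant.

```java
import java.util.Collection;
import java.util.Map;

/** Hypothetical fast size estimator: precomputed sizes, shortcuts, and a constant fallback. */
final class FastRowSizeEstimator {
    /** Assumption: constant used for classes without a shortcut. */
    static final long UNKNOWN_OBJECT_SIZE = 64;

    /** Assumption: rough fixed overhead of a collection object itself. */
    static final long COLLECTION_OVERHEAD = 48;

    /** Precomputed shallow sizes of fixed-size classes (assumed JVM layout). */
    private static final Map<Class<?>, Long> FIXED = Map.<Class<?>, Long>of(
        Boolean.class, 16L, Byte.class, 16L, Short.class, 16L, Character.class, 16L,
        Integer.class, 16L, Float.class, 16L, Long.class, 24L, Double.class, 24L);

    /** Estimates the size of a row represented as an array of column values. */
    static long sizeOfRow(Object[] row) {
        long size = align(16 + 4L * row.length); // array header + references

        for (Object val : row)
            size += sizeOf(val);

        return size;
    }

    /** Estimates the deep size of a single value without using reflection. */
    static long sizeOf(Object val) {
        if (val == null)
            return 0;

        Long fixed = FIXED.get(val.getClass());

        if (fixed != null)
            return fixed;

        if (val instanceof String) // String object + backing byte[], one byte per char assumed
            return 24 + align(16 + ((String) val).length());

        if (val instanceof byte[])
            return align(16 + ((byte[]) val).length);

        if (val instanceof Collection) {
            long size = COLLECTION_OVERHEAD;

            // Recursive traversal of elements; a collection containing itself is not handled here.
            for (Object e : (Collection<?>) val)
                size += 4 + sizeOf(e);

            return size;
        }

        return UNKNOWN_OBJECT_SIZE; // rare case: skip the calculation and use a constant
    }

    private static long align(long size) {
        return (size + 7) / 8 * 8;
    }
}
```

A reflection-based traversal could be plugged in instead of the UNKNOWN_OBJECT_SIZE branch if more precision is needed for the rare classes.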

...