As part of the refactoring and integration of the Data Lake REST API into the Data Explorer, the return type of Data Lake queries needs to be harmonized.

Hence, after a brief overview of the status quo, a draft for the harmonization of return types is presented below.

Status Quo

Location of the type definitions: org.apache.streampipes.model.datalake | ui/src/app/core-model/datalake

Currently, there are three different return types, depending on the query type (simple query, query with grouping, query with paging).

Data Result

The return type for simple queries without grouping or paging is a DataResult object.

Data Result
DataResult {
    measureName: string;
    total: number;
    headers: string[];
    rows: any[];
    labels: string[];
}

Note: "labels" is not used anywhere in the project and is therefore obsolete.

Grouped Data Result

The return type for queries with grouping is a GroupedDataResult object.

Grouped Data Result
GroupedDataResult {
    total: number;
    dataResults: Map<string, DataResult>;
}

Page Result

The return type for queries with paging is a PageResult object, which extends the DataResult class with two additional properties.

PageResult
PageResult {
    measureName: string;
    total: number;
    headers: string[];
    rows: any[];
    labels: string[];

    page: number;
    pageSum: number;
}

Note: "labels" (inherited from DataResult) and "pageSum" are not used anywhere in the project and are therefore obsolete.
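
The relationship between the three current types can be sketched in TypeScript as follows. The field lists follow the listings above; the sample values (measure name, headers, rows) and the helper `rowCount` are made up for illustration and are not taken from the project.

```typescript
// Current return types, modelled as interfaces (fields as listed above).
interface DataResult {
  measureName: string;
  total: number;
  headers: string[];
  rows: any[];
  labels: string[]; // not used anywhere in the project
}

interface GroupedDataResult {
  total: number;
  dataResults: Map<string, DataResult>;
}

// PageResult adds two paging properties on top of DataResult.
interface PageResult extends DataResult {
  page: number;
  pageSum: number; // not used anywhere in the project
}

// Hypothetical sample values, for illustration only.
const simple: DataResult = {
  measureName: "flowrate",
  total: 2,
  headers: ["time", "value"],
  rows: [[1, 10.5], [2, 11.0]],
  labels: [],
};

const paged: PageResult = { ...simple, page: 0, pageSum: 1 };

// Hypothetical helper: accepts any of the DataResult-shaped types.
function rowCount(result: DataResult): number {
  return result.rows.length;
}
```

Because PageResult only adds fields, it can be passed wherever a DataResult is expected, whereas GroupedDataResult has a different shape and needs separate handling by callers.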

Harmonization Drafts

The objective of harmonizing the return types is to define a single, flexible Data Lake Query Result that can represent all features of the redesigned Data Lake REST API. Properties that are no longer needed (i.e., "labels" and the paging-specific properties) will not be carried over into the future Data Lake Query Result.

So far, two drafts of an aligned return type have been worked out, which differ only slightly from each other.

Draft 1

Data Lake Result - Draft 1
DataLakeResult {
    measureName: string;
    total: number;
    headers: string[];
    groupingTags: string[];
    data: Map<string, any[]>;
}

Property | Description
measureName | Index of the measurement series in the Data Lake
total | Number of entries in the data map (corresponds to the number of groups)
headers | Column names contained in the query result
groupingTags | Column names by which the result was grouped
data | Actual query results, as a map of:

  • key: matches the form "groupingTag = groupingValue"
  • value: row-by-row query result (each row corresponds to one measurement)

Note: for queries without grouping, "groupingTags" is omitted and a unique default value (to be specified) is used as key for data mapping.
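
To illustrate Draft 1, the following sketch groups flat rows into a DataLakeResult. The function `toDraft1`, the placeholder key `"default"`, and the sample data are assumptions for illustration; the draft leaves the default key unspecified. For simplicity, the sketch groups by a single tag and treats "groupingTags" as optional, following the note above.

```typescript
interface DataLakeResult {
  measureName: string;
  total: number;           // number of groups in `data`
  headers: string[];
  groupingTags?: string[]; // omitted for queries without grouping
  data: Map<string, any[]>;
}

// Placeholder only: the draft leaves the default key to be specified.
const DEFAULT_KEY = "default";

// Hypothetical helper that builds a Draft 1 result from flat rows.
function toDraft1(
  measureName: string,
  headers: string[],
  rows: any[][],
  groupingTag?: string
): DataLakeResult {
  let tagIndex = -1;
  if (groupingTag !== undefined) {
    tagIndex = headers.indexOf(groupingTag);
    if (tagIndex < 0) {
      throw new Error(`Unknown grouping tag: ${groupingTag}`);
    }
  }

  const data = new Map<string, any[]>();
  for (const row of rows) {
    // Key format from the draft: "groupingTag = groupingValue".
    const key = tagIndex < 0 ? DEFAULT_KEY : `${groupingTag} = ${row[tagIndex]}`;
    const group = data.get(key) ?? [];
    group.push(row);
    data.set(key, group);
  }

  const result: DataLakeResult = { measureName, total: data.size, headers, data };
  if (groupingTag !== undefined) {
    result.groupingTags = [groupingTag];
  }
  return result;
}

// Hypothetical sample data.
const sampleHeaders = ["time", "sensorId", "value"];
const sampleRows = [[1, "a", 10], [2, "b", 20], [3, "a", 30]];

const grouped = toDraft1("flowrate", sampleHeaders, sampleRows, "sensorId");
const ungrouped = toDraft1("flowrate", sampleHeaders, sampleRows);
```

Here `grouped` contains the two keys "sensorId = a" and "sensorId = b", while `ungrouped` holds all rows under the single placeholder key.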

Draft 2

Instead of a plain array, each value in the data map is a dedicated DataResult object that contains the number of items within the group in addition to the related query results.

Data Lake Result - Draft 2
DataLakeResult {
    measureName: string;
    total: number;
    headers: string[];
    groupingTags: string[];
    data: Map<string, DataResult>;
}

DataResult {
    total: number;
    rows: any[];
}

Property | Description
measureName | Index of the measurement series in the Data Lake
total | Number of entries in the data map (corresponds to the number of groups)
headers | Column names contained in the query result
groupingTags | Column names by which the result was grouped
data | Actual query results, as a map of:

  • key: matches the form "groupingTag = groupingValue"
  • value: data result belonging to the group
    • total: number of items in the group.
    • rows: row-by-row query result (each row corresponds to one measurement)

Note: for queries without grouping, "groupingTags" is omitted and a unique default value (to be specified) is used as key for data mapping.
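
The difference to Draft 1 can be sketched as a small wrapping step: each group's rows are placed in a DataResult together with their count. The helpers `wrapGroups` and `totalRows` and the sample data are hypothetical and only serve as illustration; "groupingTags" is again treated as optional, following the note above.

```typescript
interface DataResult {
  total: number; // number of items in the group
  rows: any[];
}

interface DataLakeResult {
  measureName: string;
  total: number;           // number of groups in `data`
  headers: string[];
  groupingTags?: string[]; // omitted for queries without grouping
  data: Map<string, DataResult>;
}

// Hypothetical helper: wraps each group's rows together with its item count.
function wrapGroups(groups: Map<string, any[]>): Map<string, DataResult> {
  const wrapped = new Map<string, DataResult>();
  groups.forEach((rows, key) => {
    wrapped.set(key, { total: rows.length, rows });
  });
  return wrapped;
}

// Hypothetical sample data, keyed as "groupingTag = groupingValue".
const groups = new Map<string, any[]>();
groups.set("sensorId = a", [[1, "a", 10], [3, "a", 30]]);
groups.set("sensorId = b", [[2, "b", 20]]);

const result: DataLakeResult = {
  measureName: "flowrate",
  total: groups.size,
  headers: ["time", "sensorId", "value"],
  groupingTags: ["sensorId"],
  data: wrapGroups(groups),
};

// Hypothetical helper: sums the per-group totals.
function totalRows(r: DataLakeResult): number {
  let sum = 0;
  r.data.forEach((group) => {
    sum += group.total;
  });
  return sum;
}
```

With this shape, the top-level "total" still counts groups, while the overall number of rows can be derived by summing the per-group totals, as `totalRows` does.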
