Comment: update endpoints, structure of result identifier

...

  • Currently we have various REST endpoints (/ddl, /query, /update, /aql, /query/service), which do similar things (parsing parameters out of the request, assembling the response), but each endpoint does it in a slightly different manner. All of that should be refactored to use a common base endpoint. At the same time, the old endpoints could be kept for backwards compatibility if that is really needed.

  • The current Web interface generates HTML on the server side and does not use REST calls at all. It would be better to eat our own dog food here and switch to assembling the result on the client side using JavaScript, especially since we have a potential GSoC project proposal to build such a JS-based interface.

  • As we are moving towards having more query languages under the AsterixDB umbrella (AQL, SQL++, XQuery/JSONiq as a part of VXQuery), it would be nice to design a generic, language-agnostic REST API that could later be reused by VXQuery, since it also lacks a proper API as of now.

...

The design is inspired by the N1QL REST API (http://developer.couchbase.com/documentation/server/4.5/n1ql/n1ql-rest-api/executen1ql.html), since it is an example of a well-thought-out API in a similar system. I believe we don't need to be 100% compatible, although it would be nice to be able to reuse the same clients.

...

The proposed API consists of 3 endpoints: Query, Status and Result. The latter two are needed only in the case of asynchronous data delivery, while the first is the main endpoint and serves as the entry point for all client requests.

Query Endpoint (/query/service)

The following table compares the parameters and HTTP headers used in the N1QL API and in the current AsterixDB API, and proposes the parameters to be used in the new API.

...

 

| N1QL API Parameter | Value | Old Asterix API Parameter | Proposed API Parameter | Comment |
|---|---|---|---|---|
| statement | string | query | statement | A semicolon-separated sequence of AQL/SQL++ statements (DDL, update/load statement, FLWOR query) to be executed. The result of the last statement is returned. |
| format | enum | Accept HTTP header | format | [Optional] Desired format for the query results. Possible values are ADM, JSON, CSV. (default: ADM) |
| signature | boolean | header | signature | [Optional] Defines whether to include a header for the results schema in the response. (default: false) For the CSV result format the header is included directly in the result. |
|  |  |  | include-results | [Optional] Defines whether to include results directly in the response, or return a handle to retrieve them. (default: true) Used only with synchronous result delivery. |
| - | enum | mode | mode | [Optional] Result delivery mode. Possible values are immediate, deferred, async. (default: immediate) |
| - | boolean | lossless | lossless | [Optional] Defines whether to use lossless-JSON output for JSON-encoded output or keep clean-JSON instead. (default: false) |
| - | boolean | wrapper-array | wrapper-array | [Optional] Defines whether to wrap ADM-encoded output in array brackets. (default: false) Used only when format=ADM and include-results=false. |
| - | boolean | print-expr-tree | expr-tree | [Optional] Defines whether to include a query expression AST in the result. (default: false) |
| - | boolean | print-rewritten-expr-tree | rewritten-expr-tree | [Optional] Defines whether to include a rewritten query expression AST in the result. (default: false) |
| - | boolean | print-logical-plan | logical-plan | [Optional] Defines whether to include a logical plan in the result. (default: false) |
| - | boolean | print-optimized-logical-plan | optimized-plan | [Optional] Defines whether to include an optimized logical plan in the result. (default: false) |
| - | boolean | print-job | hyracks-job | [Optional] Defines whether to include a Hyracks job in the result. (default: false) |
| - | boolean | execute-query | execute-statement | [Optional] Defines whether to execute a statement. (default: true) |
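As a rough illustration of how a client might assemble these parameters, the sketch below encodes a statement and a few optional flags as a form body, the same shape that curl -d sends. The helper name build_query_body is hypothetical, and it assumes that booleans are sent as the strings "true"/"false" and that omitted parameters fall back to the server defaults listed above.

```python
from urllib.parse import urlencode


def build_query_body(statement, **options):
    """Encode a statement plus optional parameters as a form body.

    Hypothetical helper: booleans become "true"/"false", and options
    set to None are left out so that the server default applies.
    """
    params = {"statement": statement}
    for name, value in options.items():
        if value is None:
            continue
        # Python keyword names cannot contain "-", so logical_plan -> logical-plan.
        key = name.replace("_", "-")
        if isinstance(value, bool):
            params[key] = str(value).lower()
        else:
            params[key] = str(value)
    # urlencode percent-encodes "$" and ";" so the statement survives transport.
    return urlencode(params)


body = build_query_body(
    "for $x in dataset Tweets return $x",
    format="CSV",
    signature=True,
    logical_plan=False,
)
print(body)
```

Encoding the statement this way also avoids shell quoting problems with characters such as $ that appear in AQL variables.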

 

HTTP Response format:

...

 

| N1QL API | Value | Old Asterix API | Proposed API Parameter | Comment |
|---|---|---|---|---|
| results | JSONList/URI | HTTP response body | results | One of 2 possible values depending on result delivery: 1) A list of all results returned by the query (mode=synchronous and include-results=true). 2) No value (statement is DDL/update/load). |
|  | String |  | type | MIME type of result. |
| - | URI | - | handle | A URI to the Status endpoint (mode=asynchronous) or the Result endpoint (mode=synchronous and include-results=false). |
| signature | JSONObject | - | signature | The schema of the results (if signature=true & format != CSV). |
| status | string | HTTP response status | status | The status of the request; possible values are: success, running (async result), error (parse/optimizer error), fatal (execution error). |
| errors | JSONList | HTTP response body | errors | A list containing objects containing details of the errors. |
| error.code | int | error-code | error.code | A code that identifies the error. |
| error.msg | string | summary | error.msg | A message describing the error in detail. |
| error.name | string | - | error.name | A unique name for an error (1:1 mapping to code). |
| error.sev | int | - | error.sev | Severity of an error. |
| error.temp | boolean |  | error.temp | 'true' if the error condition is temporary, i.e. the unchanged request could be successfully executed. |
|  | string | stacktrace | error.stacktrace | A stack trace of the error. |
| warnings | JSONList | - | warnings | A list containing objects containing details of the warnings. The structure of the objects is the same as for errors. |
| metrics | JSONObject | - | metrics | An object containing details of the execution metrics. |
| metrics.executionTime | string | - | metrics.executionTime | The time taken for the execution of the request, i.e. the time from when query execution started until the results were returned (if mode=synchronous and execute-query=true). |
| metrics.resultCount | unsigned int | - | metrics.resultCount | The total number of objects in the results (if mode=synchronous and execute-query=true). |
| - | JSONObject | - | plans | (Optional) An object containing details of the execution plan at different stages. |
| - | JSONObject/string | - | plans.exprTree | Serialized query expression tree (if expr-tree=true). |
| - | JSONObject/string | - | plans.rewrittenExprTree | Serialized rewritten query expression tree (if rewritten-expr-tree=true). |
| - | JSONObject/string | - | plans.logicalPlan | Serialized query logical plan (if logical-plan=true). |
| - | JSONObject/string | - | plans.optimizedPlan | Serialized query optimized logical plan (if optimized-plan=true). |
| - | JSONObject/string | - | plans.hyracksJob | Serialized Hyracks job (if hyracks-job=true). |
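To make the response structure more concrete, here is a minimal sketch of how a client might branch on the fields described above. The function name summarize_response and the labels it returns are hypothetical. Since this page shows both an "errors" list and a single "error" object, the sketch accepts either shape rather than assuming one of them is final.

```python
import json


def summarize_response(body_text):
    """Return a (kind, payload) pair for a /query/service response body.

    Hypothetical helper; the kinds "failed", "handle", "results" and
    "no-results" are labels chosen for this sketch only.
    """
    doc = json.loads(body_text)
    status = doc.get("status")
    if status in ("error", "fatal"):
        # Accept an "errors" list as well as a single "error" object.
        errors = doc.get("errors")
        if errors is None and "error" in doc:
            errors = [doc["error"]]
        return ("failed", errors or [])
    if "handle" in doc:
        # Results are not inline; the handle points to Status or Result.
        return ("handle", doc["handle"])
    if "results" in doc:
        return ("results", doc["results"])
    # DDL/update/load statements carry no results at all.
    return ("no-results", None)


ddl = '{"status": "success", "metrics": {"executionTime": "1ms"}}'
failed = '{"status": "error", "error": {"code": 1, "msg": "bad field"}}'
print(summarize_response(ddl))
print(summarize_response(failed))
```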

Examples:

  1. DDL request

    Request:
    curl -v http://localhost:19002/query/service -X POST \
    -d "statement=create dataverse test;"
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "success",
      "metrics": {
        "executionTime": "1ms"
      }
    }
  2. Query which is not executed, but returns logical plan

    Request:
    curl -v http://localhost:19002/query/service -X POST \
    --data-urlencode 'statement=for $x in dataset testDS return $x' \
    -d 'lossless=true' -d 'logical-plan=true' -d 'execute-query=false'
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "success",
      "metrics": {
        "executionTime": "5ms"
      },
      "plans": {
        "logicalPlan": [
          {"operator": "distribute_result", "args": [{"exp": "var_ref", "var": "$$0"}]},
          {"operator": "datascan", "output_vars": ["$$0", "$$1"]}
        ]
      }
    }
  3. Query which synchronously returns CSV (with header) result inside JSON response

    Request:
    curl -v http://localhost:19002/query/service -X POST -H "Accept: text/csv" \
    --data-urlencode 'statement=for $x in dataset Tweets return $x' \
    -d 'signature=true'
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "success",
      "results": "'id','screen_name','message_text'\n'1','BarackObama','Four more years'",
      "metrics": {
        "executionTime": "10ms",
        "resultCount": 1
      }
    }
  4. Query which returns optimizer error

    Request:
    curl -v http://localhost:19002/query/service -X POST \
    -d "statement=create dataset Tweets(TweetType) primary key facebook_id"
    Response:
    < HTTP/1.1 400 Bad Request
    Content-Type: application/json
    {
      "status": "error",
      "error": {
        "code": 1,
        "msg": "Field 'facebook_id' cannot be found in datatype 'TweetType'"
      },
      "metrics": {
        "executionTime": "1ms"
      }
    }
  5. Query which returns runtime error

    Request:
    curl -v http://localhost:19002/query/service -X POST \
    -d "statement=create dataset Tweets(TweetType) primary key facebook_id"
    Response:
    < HTTP/1.1 500 Internal Server Error
    Content-Type: application/json
    {
      "status": "fatal",
      "error": {
        "code": 99,
        "msg": "Something happened during query execution",
        "stacktrace": "java.lang.RuntimeException: \n\r at java.lang.Thread.run(Thread.java:745)"
      },
      "metrics": {
        "executionTime": "1ms"
      }
    }
  6. Query which runs synchronously but does not include the results in the response; instead it provides a handle to retrieve them. The request also asks for the ADM type (signature) to be included.

    Request:
    curl -v http://localhost:19002/query/service -X POST \
    --data-urlencode 'statement=for $x in dataset Tweets return $x' \
    -d 'include-results=false' -d 'signature=true'
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "success",
      "handle": "http://localhost:19002/query/service/result/27-2",
      "signature": {
        "id": "int64",
        "screen_name": "string",
        "message_text": "string"
      },
      "metrics": {
        "executionTime": "10ms",
        "resultCount": 2
      }
    }
  7. Query which returns lossless-JSON results asynchronously and does not include them in the response, but provides a handle to retrieve them.

    Request:
    curl -v http://localhost:19002/query/service -X POST -H "Accept: application/json" \
    --data-urlencode 'statement=for $x in dataset Tweets return $x' \
    -d 'lossless=true' -d 'mode=asynchronous' -d 'include-results=false'
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "running",
      "handle": "http://localhost:19002/query/service/status/27-2",
      "metrics": {
      }
    }

Status Endpoint (/query/service/status)

This endpoint is supposed to be used only when results are delivered asynchronously. Its sole purpose is to report the status of a submitted query and, possibly, to include a URI to the results of its execution.

HTTP Request(GET) format:

http://localhost:19002/query/service/status/ID

Where ID is an identifier generated and returned by the /query/service endpoint.

HTTP Response parameters:

The response is a small subset of the response of the /query/service endpoint.

 

| Parameter | Value | Comment |
|---|---|---|
| status | enum | The status of the request; possible values are: success (query is completed), running (query is still running), fatal (execution error). |
| results | JSONObject | The URI to the /query/service/result endpoint, if the query was completed (status=success and include-results=true). |
| handle | URI | The URI to the /query/service/result endpoint, if the query was completed (status=success and include-results=false). |
| error | JSONObject | An object containing details of the error. |
| error.code | int | A code that identifies the error. |
| error.msg | string | A message describing the error in detail. |
| error.stacktrace | string | A stack trace of the error. |
| metrics | JSONObject | An object containing details of the execution metrics (only when status=success). |
| metrics.executionTime | string | The time it took to execute the request. |
| metrics.resultCount | int | The total number of objects in the results. |

...

  1. Query which returns runtime error

    Request:
    curl -v http://localhost:19002/query/service/status/27-2 -X GET
    Response:
    < HTTP/1.1 500 Internal Server Error
    Content-Type: application/json
    {
      "status": "fatal",
      "error": {
        "code": 99,
        "msg": "Something happened during query execution",
        "stacktrace": "java.lang.RuntimeException: \n\r at java.lang.Thread.run(Thread.java:745)"
      }
    }
  2. Query which is still executing

    Request:
    curl -v http://localhost:19002/query/service/status/27-2 -X GET
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "running"
    }
  3. Query which successfully completes and returns a URI to its results. Refer to Query endpoint Example 7 to see the original request.

    Request:
    curl -v http://localhost:19002/query/service/status/27-2 -X GET
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    {
      "status": "success",
      "handle": "http://localhost:19002/query/service/result/27-2",
      "metrics": {
        "executionTime": "100ms",
        "resultCount": 10
      }
    }
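The status exchange above can be sketched as a simple polling loop. This is only an illustration under stated assumptions: wait_for_handle and its parameters are hypothetical names, and the HTTP call is passed in as a function (fetch_status) that returns the decoded JSON response, so the sketch runs without a server.

```python
import time


def wait_for_handle(fetch_status, status_uri, interval=0.5, max_polls=20,
                    sleep=time.sleep):
    """Poll a status URI until the query completes and return its handle.

    fetch_status is any callable that takes the status URI and returns the
    decoded JSON response as a dict (hypothetical; supplied by the caller).
    """
    for _ in range(max_polls):
        doc = fetch_status(status_uri)
        status = doc.get("status")
        if status == "success":
            # The handle points to the Result endpoint.
            return doc.get("handle")
        if status == "fatal":
            message = doc.get("error", {}).get("msg", "execution error")
            raise RuntimeError(message)
        # Still running: wait before asking again.
        sleep(interval)
    raise TimeoutError("query still running after %d polls" % max_polls)


# Canned replies stand in for a real server in this example.
replies = iter([
    {"status": "running"},
    {"status": "success",
     "handle": "http://localhost:19002/query/service/result/27-2"},
])
handle = wait_for_handle(
    lambda uri: next(replies),
    "http://localhost:19002/query/service/status/27-2",
    sleep=lambda seconds: None,
)
print(handle)
```

Passing the fetch function in keeps the loop independent of any particular HTTP library; a real client would wrap its own GET request there.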

Result Endpoint (/query/service/result)

This endpoint is used to retrieve the results of asynchronous query execution, or synchronous results when the client opted out of having results included in the /query/service response (include-results=false).

...

HTTP Request(GET) format:

http://localhost:19002/query/service/result/ID

Where ID is an identifier generated and returned by the /query/service endpoint (include-results=false) or by the /query/service/status endpoint once the asynchronous result has been computed.

...

  1. Synchronous query results. Refer to Query endpoint Example 6 to see the original request.

    Request:
    curl -v http://localhost:19002/query/service/result/27-2 -X GET
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/x-adm
    {
      "id": 1,
      "screen_name": "BarackObama",
      "message_text": "Four more years"
    }
    {
      "id": 2,
      "screen_name": "ElonMusk",
      "message_text": "I Would Like to Die on Mars, Just Not on Impact"
    }
  2. Asynchronous query results. Refer to Query endpoint Example 7 & Status endpoint Example 3 to see the original requests.

    Request:
    curl -v http://localhost:19002/query/service/result/27-2 -X GET
    Response:
    < HTTP/1.1 200 OK
    Content-Type: application/json
    [ {
      "id": 1,
      "screen_name": "BarackObama",
      "message_text": "Four more years"
    }, {
      "id": 2,
      "screen_name": "ElonMusk",
      "message_text": "I Would Like to Die on Mars, Just Not on Impact"
    } ]
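Because the Result endpoint can answer with different content types, as the two examples above show, a client probably needs to look at the Content-Type header before decoding the body. The sketch below is one possible way to do that. The function name decode_result is hypothetical, and no ADM parser is assumed, so anything other than JSON is handed back as raw text.

```python
import json


def decode_result(content_type, body_text):
    """Decode a Result endpoint body according to its Content-Type.

    Hypothetical helper: JSON bodies are parsed, everything else
    (for example application/x-adm or text/csv) is returned unchanged.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body_text)
    return body_text


rows = decode_result(
    "application/json; charset=utf-8",
    '[{"id": 1, "screen_name": "BarackObama"}]',
)
raw = decode_result("application/x-adm", '{ "id": 1 }\n{ "id": 2 }')
print(rows[0]["screen_name"])
print(raw)
```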

...