...
The reason for HTTP API refactoring is threefold:
- Currently we have various REST endpoints (/ddl, /query, /update, /aql, /queryService) which do similar things (parsing parameters out of the request, assembling the response), but each endpoint does it in a slightly different manner. All of that should be refactored to use a common base endpoint. At the same time, the old endpoints could be kept for backwards compatibility if that is really needed.
- The current Web interface generates HTML on the server side and does not use REST calls at all. It would be better to eat our own dog food here and switch to assembling the result on the client side using JavaScript, especially since we have a potential GSoC project proposal to build such a JS-based interface.
- As we are moving towards having more query languages under the AsterixDB umbrella (AQL, SQL++, XQuery/JSONiq as a part of VXQuery), it would be nice to design a generic, language-agnostic REST API, which could later be reused by VXQuery, since it also lacks a proper API as of now.
...
The design is inspired by the N1QL REST API (http://developer.couchbase.com/documentation/server/4.1/n1ql/n1ql-rest-api/index.html), since it is an example of a well-thought-out API in a similar system. I believe we don't need to be 100% compatible, although it would be nice to be able to reuse the same clients.
We might also consider using Swagger (http://swagger.io/) to describe the API. It will allow users to seamlessly generate client SDKs in their favorite language, which is especially useful since we don't provide any client drivers. Here is the complete set of features which Swagger will give us:
...
N1QL API Parameter | Value | Old Asterix API Parameter | Proposed API Parameter | Comment |
statement | string | query | statement | Any valid AQL statement (DDL, update/load statement, FLWOR query) which should be executed |
format | enum | Accept HTTP header | Accept HTTP header | [Optional] Desired format for the query results. Possible values are ADM, JSON, CSV. (default: ADM) |
signature | boolean | header | signature | [Optional] Defines whether to include a header for the results schema in the response. (default: false) With the CSV result format, the header is included directly in the result. |
- | boolean | - | include-results | [Optional] Defines whether to include the results directly in the response or to return a handle to retrieve them. (default: true) Used only with synchronous result delivery. |
- | enum | mode | mode | [Optional] Result delivery mode. Possible values are asynchronous, asynchronous-deferred, synchronous. (default: synchronous) |
- | boolean | lossless | lossless | [Optional] Defines whether to use lossless-JSON output for JSON-encoded output or keep clean-JSON instead. (default: false) |
- | boolean | wrapper-array | wrapper-array | [Optional] Defines whether to wrap ADM-encoded output into array-braces. (default: false) Used only when format=ADM and include-results=false. |
- | boolean | print-expr-tree | expr-tree | [Optional] Defines whether to include a query expression AST into the result (default: false) |
- | boolean | print-rewritten-expr-tree | rewritten-expr-tree | [Optional] Defines whether to include a rewritten query expression AST into the result (default: false) |
- | boolean | print-logical-plan | logical-plan | [Optional] Defines whether to include a logical plan into the result (default: false) |
- | boolean | print-optimized-logical-plan | optimized-plan | [Optional] Defines whether to include an optimized logical plan into the result (default: false) |
- | boolean | print-job | hyracks-job | [Optional] Defines whether to include a Hyracks job into the result (default: false) |
- | boolean | execute-query | execute-query | [Optional] Defines whether to execute a query (default: true) |
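The renaming in the table above could be handled in one place, so that the old endpoints stay usable on top of the common one. The following is only a sketch of that idea in Python: the names LEGACY_TO_PROPOSED, DEFAULTS and normalize_params are hypothetical, the mapping and defaults are read off the table, and treating statement as required is an assumption (it is the only row not marked [Optional]).

```python
# Hypothetical mapping from old Asterix parameter names to the proposed ones,
# read off the request-parameter table.
LEGACY_TO_PROPOSED = {
    "query": "statement",
    "header": "signature",
    "print-expr-tree": "expr-tree",
    "print-rewritten-expr-tree": "rewritten-expr-tree",
    "print-logical-plan": "logical-plan",
    "print-optimized-logical-plan": "optimized-plan",
    "print-job": "hyracks-job",
}

# Defaults as listed in the Comment column of the table.
DEFAULTS = {
    "signature": "false",
    "include-results": "true",
    "mode": "synchronous",
    "lossless": "false",
    "wrapper-array": "false",
    "expr-tree": "false",
    "rewritten-expr-tree": "false",
    "logical-plan": "false",
    "optimized-plan": "false",
    "hyracks-job": "false",
    "execute-query": "true",
}


def normalize_params(raw):
    """Rename legacy parameter names and fill in the documented defaults."""
    params = dict(DEFAULTS)
    for name, value in raw.items():
        # Names that were not renamed (mode, lossless, ...) pass through as-is.
        params[LEGACY_TO_PROPOSED.get(name, name)] = value
    # Assumption: statement is mandatory, since it is not marked [Optional].
    if "statement" not in params:
        raise ValueError("missing required parameter: statement")
    return params
```

A request sent with the old names, for example query and print-logical-plan, would come out with statement and logical-plan set and every other option at its default.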
...
N1QL API | Value | Old Asterix API | Proposed API Parameter | Comment |
results | JSONList/URI | HTTP response body | results | One of three possible values depending on result delivery: 1) a list of all results returned by the query (mode=synchronous); 2) a URI to the Status endpoint (mode=asynchronous); 3) no value (statement is DDL/update/load). |
signature | JSONObject | - | signature | The schema of the results (if signature=true and format != CSV) |
status | string | HTTP response status | status | The status of the request; possible values are: success, running (async result), error (parse/optimizer error), fatal (execution error). |
error | JSONObject | HTTP response body | error | An object containing details of the error |
error.code | int | error-code | error.code | A code that identifies the error. |
error.msg | string | summary | error.msg | A message describing the error in detail. |
- | string | stacktrace | error.stacktrace | A stack trace of the error. |
metrics | JSONObject | - | metrics | An object containing the execution metrics. |
metrics.executionTime | string | - | metrics.executionTime | The time taken for the execution of the request, i.e. time from when query execution started until the results were returned. (if mode=synchronous and execute-query=true) |
metrics.resultCount | unsigned int | - | metrics.resultCount | The total number of objects in the results (if mode=synchronous and execute-query=true) |
- | JSONObject | - | metrics.expr_tree | Serialized query expression tree (if expr-tree=true) |
- | JSONObject | - | metrics.rewritten_expr_tree | Serialized rewritten query expression tree (if rewritten-expr-tree=true) |
- | JSONObject | - | metrics.logical_plan | Serialized query logical plan (if logical-plan=true) |
- | JSONObject | - | metrics.optimized_plan | Serialized query optimized logical plan (if optimized-plan=true) |
- | JSONObject | - | metrics.hyracks_job | Serialized Hyracks job (if hyracks-job=true) |
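To illustrate how a client might consume the response fields listed above, here is a minimal Python sketch that dispatches on the status field. The function name interpret_response and the labels done, pending and failed are hypothetical and not part of the proposal; only the field names and status values come from the table.

```python
import json


def interpret_response(body):
    """Classify a response body by its "status" field.

    Returns a (kind, payload) tuple; the kind labels are illustrative only.
    """
    doc = json.loads(body)
    status = doc.get("status")
    if status == "success":
        # "results" is absent for DDL/update/load statements.
        return ("done", doc.get("results"))
    if status == "running":
        # Asynchronous mode: "results" holds a URI to the Status endpoint.
        return ("pending", doc["results"])
    if status in ("error", "fatal"):
        return ("failed", doc.get("error", {}))
    raise ValueError("unexpected status: %r" % (status,))
```

Keeping the dispatch on a single field is what makes the envelope language-agnostic: the same client logic applies whichever query language produced the statement.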
Examples:
DDL request
Request:

    curl -v http://localhost:19002/query -X POST \
         --data-urlencode 'statement=create dataverse test;'

Response:

    < HTTP/1.1 200 OK
    < Content-Type: application/json

    {
      "status": "success",
      "metrics": {
        "executionTime": "1ms"
      }
    }
Query which is not executed, but returns logical plan
Request:

    curl -v http://localhost:19002/query -X POST \
         --data-urlencode 'statement=for $x in dataset testDS return $x' \
         -d 'lossless=true' -d 'logical-plan=true' -d 'execute-query=false'

Response:

    < HTTP/1.1 200 OK
    < Content-Type: application/json

    {
      "status": "success",
      "metrics": {
        "executionTime": "5ms",
        "logical_plan": [
          {"operator": "distribute_result", "args": [{"exp": "var_ref", "var": "$$0"}]},
          {"operator": "datascan", "output_vars": ["$$0", "$$1"]}
        ]
      }
    }
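The request above passes a statement together with several options in one form-encoded body, and the statement contains characters ($ and spaces) that have to be percent-encoded. As a rough sketch of how a client could assemble such a body, the hypothetical helper build_body below uses Python's urllib.parse.urlencode; turning underscores into hyphens in option names is a convenience chosen for this example, not part of the proposal.

```python
from urllib.parse import urlencode


def build_body(statement, **options):
    """Form-encode a statement plus optional parameters.

    Keyword names use underscores and are converted to the hyphenated
    parameter names from the table (logical_plan -> logical-plan).
    """
    fields = {"statement": statement}
    for name, value in options.items():
        if isinstance(value, bool):
            # The API examples spell booleans as lowercase true/false.
            value = "true" if value else "false"
        fields[name.replace("_", "-")] = value
    return urlencode(fields)


body = build_body(
    "for $x in dataset testDS return $x",
    lossless=True,
    logical_plan=True,
    execute_query=False,
)
print(body)
```

The resulting string can be sent as the POST body with Content-Type application/x-www-form-urlencoded.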
Query which synchronously returns CSV (with header) result inside JSON response
Request:

    curl -v http://localhost:19002/query -X POST -H "Accept: text/csv" \
         --data-urlencode 'statement=for $x in dataset Tweets return $x' \
         -d 'signature=true'

Response:

    < HTTP/1.1 200 OK
    < Content-Type: application/json

    {
      "status": "success",
      "results": "'id','screen_name','message_text'\n'1','BarackObama','Four more years'",
      "metrics": {
        "executionTime": "10ms",
        "resultCount": 1
      }
    }
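Since the CSV text arrives as a single JSON string, a client has to decode the envelope first and parse the CSV second. The sketch below shows one way to do that in Python; csv_rows is a hypothetical name, and the single-quote quoting character is assumed from the example response above rather than specified anywhere in the proposal.

```python
import csv
import io
import json


def csv_rows(response_body):
    """Extract CSV rows from the "results" string of a JSON response."""
    doc = json.loads(response_body)
    # Assumption: fields are quoted with single quotes, as in the example.
    reader = csv.reader(io.StringIO(doc["results"]), quotechar="'")
    return list(reader)
```

With signature=true the first row returned is the header, so callers that want only data rows would skip it.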
Query which returns optimizer error
Request:

    curl -v http://localhost:19002/query -X POST \
         --data-urlencode 'statement=create dataset Tweets(TweetType) primary key facebook_id'

Response:

    < HTTP/1.1 400 Bad Request
    < Content-Type: application/json

    {
      "status": "error",
      "error": {
        "code": 1,
        "msg": "Field 'facebook_id' cannot be found in datatype 'TweetType'"
      },
      "metrics": {
        "executionTime": "1ms"
      }
    }
Query which returns runtime error
Request:

    curl -v http://localhost:19002/query -X POST \
         --data-urlencode 'statement=create dataset Tweets(TweetType) primary key facebook_id'

Response:

    < HTTP/1.1 500 Internal Server Error
    < Content-Type: application/json

    {
      "status": "fatal",
      "error": {
        "code": 99,
        "msg": "Something happened during query execution",
        "stacktrace": "java.lang.RuntimeException: \n\r at java.lang.Thread.run(Thread.java:745)"
      },
      "metrics": {
        "executionTime": "1ms"
      }
    }
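The examples in this section pair each value of the status field with an HTTP status code: 200 for success and running, 400 for error, 500 for fatal. A server-side handler could keep that pairing in a single lookup, roughly as sketched below. This is inferred from the examples only; the proposal does not state the mapping as a rule, and the names HTTP_STATUS_BY_RESULT and http_status_for are hypothetical.

```python
from http import HTTPStatus

# Pairing inferred from the example responses, not a normative part of the API.
HTTP_STATUS_BY_RESULT = {
    "success": HTTPStatus.OK,
    "running": HTTPStatus.OK,
    "error": HTTPStatus.BAD_REQUEST,  # parse/optimizer error
    "fatal": HTTPStatus.INTERNAL_SERVER_ERROR,  # execution error
}


def http_status_for(status):
    """Return the HTTP status code to send for a given "status" value."""
    return HTTP_STATUS_BY_RESULT[status]
```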
Query which runs synchronously but does not include the results in the response, providing a handle to retrieve them instead. The request also asks to include the ADM type (signature).
Request:

    curl -v http://localhost:19002/query -X POST \
         --data-urlencode 'statement=for $x in dataset Tweets return $x' \
         -d 'include-results=false' -d 'signature=true'

Response:

    < HTTP/1.1 200 OK
    < Content-Type: application/json

    {
      "status": "success",
      "results": "http://localhost:19002/results/071cde2e-0277-11e6-b512-3e1d05defe78",
      "signature": {
        "id": "int64",
        "screen_name": "string",
        "message_text": "string"
      },
      "metrics": {
        "executionTime": "10ms",
        "resultCount": 2
      }
    }
Query which returns lossless-JSON results asynchronously
Request:

    curl -v http://localhost:19002/query -X POST -H "Accept: application/json" \
         --data-urlencode 'statement=for $x in dataset Tweets return $x' \
         -d 'lossless=true' -d 'mode=asynchronous'

Response:

    < HTTP/1.1 200 OK
    < Content-Type: application/json

    {
      "status": "running",
      "results": "http://localhost:19002/status/bd7a2b0e-0277-11e6-b512-3e1d05defe78",
      "metrics": { }
    }
...