...

Use the "CLEAR" command to clear the screen.
Use "EXIT" or "QUIT" to quit the shell.

Accessing Samza

The Samza SQL shell runs jobs in Samza standalone mode. The Samza executor can run multiple non-query jobs simultaneously as backend jobs, and runs a single query job as a front-end job. Job results are stored in a data buffer.

Configuration

The configuration file is located at "conf/shell-defaults.conf". Most of the variables set there can also be set in the shell via the "SET" command. Some of the important configurations are listed below.

...

Assume a user executes a SQL statement that selects the profiles of those who have visited the user's LinkedIn page in the past 5 minutes. How should the result be displayed?

When the data volume is huge and arrives very fast, a table view is probably best. However large the data volume, the screen always displays a single page out of perhaps 5,000 pages. A page may contain 50 lines, each showing brief information about one record. The user can navigate through the pages, select a row of interest, and see the details of that record.
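The paging arithmetic behind such a table view can be sketched in a few lines. This is only an illustration: the class TablePagerSketch and its methods are hypothetical names and are not part of the shell, and it assumes the rows are already held in an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that cuts a list of result rows into fixed-size pages. */
class TablePagerSketch {

    /** Number of pages needed for totalRows rows, rounding up. */
    static int pageCount(int totalRows, int pageSize) {
        if (pageSize <= 0 || totalRows < 0) {
            throw new IllegalArgumentException("pageSize must be > 0 and totalRows >= 0");
        }
        return (totalRows + pageSize - 1) / pageSize;
    }

    /** Returns a copy of the rows on the given zero-based page; empty if past the end. */
    static <T> List<T> page(List<T> rows, int pageIndex, int pageSize) {
        if (pageSize <= 0 || pageIndex < 0) {
            throw new IllegalArgumentException("pageSize must be > 0 and pageIndex >= 0");
        }
        // Use long arithmetic so a large page index cannot overflow.
        long from = Math.min((long) pageIndex * pageSize, rows.size());
        long to = Math.min(from + pageSize, rows.size());
        return new ArrayList<>(rows.subList((int) from, (int) to));
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 120; i++) {
            rows.add("record-" + i);
        }
        // Show the last, partially filled page.
        for (String row : page(rows, 2, 50)) {
            System.out.println(row);
        }
    }
}
```

With 50 lines per page, 250,000 buffered records would give the 5,000 pages mentioned above; the screen only ever needs the one slice returned by page().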

When there is much less data, however, the table view is less convenient. A logging view, which continuously displays newly arriving data and scrolls the screen up, suits the user better. The user can still pause and resume the screen to examine a specific record of interest.

The shell supports the logging view at the moment. Support for the table view is left as future work.
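The pause and resume behavior of a logging view can be sketched as follows. This is a minimal illustration, not the shell's actual implementation: the class LoggingViewSketch and its methods are hypothetical, and it assumes rows arriving during a pause are held back and shown in arrival order on resume.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical logging view: shows rows as they arrive and holds them back while paused. */
class LoggingViewSketch {
    private final Deque<String> pending = new ArrayDeque<>();  // rows received while paused
    private final List<String> displayed = new ArrayList<>();  // rows already written to the screen
    private boolean paused = false;

    /** Called for every newly arrived result row. */
    void onRow(String row) {
        if (paused) {
            pending.addLast(row);
        } else {
            show(row);
        }
    }

    /** Freezes the screen so the user can examine a record. */
    void pause() {
        paused = true;
    }

    /** Resumes scrolling and flushes the held-back rows in arrival order. */
    void resume() {
        paused = false;
        while (!pending.isEmpty()) {
            show(pending.removeFirst());
        }
    }

    /** Rows shown so far, oldest first. */
    List<String> displayed() {
        return displayed;
    }

    private void show(String row) {
        displayed.add(row);
        System.out.println(row);
    }

    public static void main(String[] args) {
        LoggingViewSketch view = new LoggingViewSketch();
        view.onRow("r1");
        view.pause();
        view.onRow("r2"); // held back until resume()
        view.resume();
    }
}
```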

The SamzaExecutor

SamzaExecutor is the default implementation of SqlExecutor. SamzaExecutor writes job results into a data buffer, from which the Shell retrieves them and shows them in the terminal in the format chosen by the user.

List tables. Currently the Shell can only talk to the Kafka system, but with SAMZA-1902 we will use a general way to connect to different systems.

Get table schema. SamzaExecutor uses AvroSqlSchemaConverter to convert an Avro schema to a Samza SQL schema. Currently the Shell works only for systems that have Avro schemas.

List functions. Currently the Shell only shows some UDFs that Samza supports internally. We may need to require UDFs to provide a function that returns their "SamzaSqlUdfDisplayInfo"; we could then get the UDF information from SamzaSqlApplicationConfig.udfResolver or SamzaSqlApplicationConfig.udfMetadata (please refer to SAMZA-1957).

Execute non-query jobs. SamzaExecutor can run multiple non-query jobs simultaneously, and users can manage (ls / stop / rm) each job easily.
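The ls / stop / rm operations can be pictured as a small job registry. The sketch below is an assumption-laden illustration, not SamzaExecutor's code: the class JobRegistrySketch, the integer job ids, the two-state status, and the rule that only stopped jobs can be removed are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry of non-query jobs, keyed by a numeric job id. */
class JobRegistrySketch {
    enum Status { RUNNING, STOPPED }

    private final Map<Integer, Status> jobs = new LinkedHashMap<>();
    private int nextId = 0;

    /** Registers a new running job and returns its id. */
    int submit() {
        int id = nextId++;
        jobs.put(id, Status.RUNNING);
        return id;
    }

    /** ls: one "id STATUS" line per known job, in submission order. */
    List<String> ls() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, Status> e : jobs.entrySet()) {
            lines.add(e.getKey() + " " + e.getValue());
        }
        return lines;
    }

    /** stop: marks a known job as stopped. */
    void stop(int id) {
        requireKnown(id);
        jobs.put(id, Status.STOPPED);
    }

    /** rm: forgets a job; assumed here to be allowed only once the job is stopped. */
    void rm(int id) {
        requireKnown(id);
        if (jobs.get(id) == Status.RUNNING) {
            throw new IllegalStateException("job " + id + " is still running");
        }
        jobs.remove(id);
    }

    private void requireKnown(int id) {
        if (!jobs.containsKey(id)) {
            throw new IllegalArgumentException("unknown job " + id);
        }
    }

    public static void main(String[] args) {
        JobRegistrySketch registry = new JobRegistrySketch();
        int first = registry.submit();
        registry.submit();
        registry.stop(first);
        registry.ls().forEach(System.out::println);
    }
}
```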

Execute query jobs. SamzaExecutor runs a single query job as a front-end job. It stores the query results in a data buffer, from which the Shell retrieves them.
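The hand-off between the executor and the Shell can be sketched as a bounded buffer that one side fills and the other drains. This is only an illustration under stated assumptions: the class ResultBufferSketch is hypothetical, rows are modeled as plain strings, and dropping the oldest row when the buffer is full is an assumed policy, not something the design above specifies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical bounded buffer between a query job (writer) and the shell (reader). */
class ResultBufferSketch {
    private final Deque<String> rows = new ArrayDeque<>();
    private final int capacity;

    ResultBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Called by the job for each result row; drops the oldest row when full (assumed policy). */
    synchronized void add(String row) {
        if (rows.size() == capacity) {
            rows.removeFirst();
        }
        rows.addLast(row);
    }

    /** Called by the shell: returns all buffered rows, oldest first, and empties the buffer. */
    synchronized List<String> drain() {
        List<String> out = new ArrayList<>(rows);
        rows.clear();
        return out;
    }

    public static void main(String[] args) {
        ResultBufferSketch buffer = new ResultBufferSketch(2);
        buffer.add("row-1");
        buffer.add("row-2");
        buffer.add("row-3"); // evicts row-1 under the assumed drop-oldest policy
        buffer.drain().forEach(System.out::println);
    }
}
```

Both methods are synchronized so that the job thread and the shell thread can safely share one buffer; a blocking queue would be an alternative if the writer should wait instead of dropping rows.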