Versions Compared

Comment: refix javadoc links (except AbstractLazySimpleRecordWriter, still not found)

...

Concurrency Note: I/O can be performed on multiple TransactionBatches concurrently. However, the transactions within a single transaction batch must be consumed sequentially.
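The rule in the concurrency note can be sketched with one worker task per batch. This is only an illustration: FakeBatch and ConcurrentBatchesSketch are hypothetical stand-ins written for this example, not Hive streaming classes. Each batch is owned by exactly one task, so batches run side by side while the transactions inside a batch are consumed in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrentBatchesSketch {

    /** Hypothetical stand-in for a transaction batch; NOT the Hive TransactionBatch class. */
    static final class FakeBatch {
        private final int batchId;
        private final int txnCount;
        // Only the single thread that owns this batch ever touches this list.
        private final List<String> consumed = new ArrayList<>();

        FakeBatch(int batchId, int txnCount) {
            this.batchId = batchId;
            this.txnCount = txnCount;
        }

        /** Consumes the transactions of this batch one after another on the calling thread. */
        List<String> consumeAll() {
            for (int txn = 0; txn < txnCount; txn++) {
                consumed.add("batch" + batchId + "-txn" + txn);
            }
            return consumed;
        }
    }

    /** Submits one task per batch: batches overlap, transactions within a batch do not. */
    static List<List<String>> runConcurrently(int batches, int txnsPerBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(batches);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                FakeBatch batch = new FakeBatch(b, txnsPerBatch);
                Callable<List<String>> task = batch::consumeAll;
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("batch task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(3, 2));
    }
}
```

The design point is ownership: a batch is never shared between threads, which is the simplest way to guarantee sequential consumption inside it.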

See HiveEndPoint in the Javadoc for more information. Generally, a user will describe the end point with a HiveEndPoint object and then call newConnection on it to get a StreamingConnection.

...

The StreamingConnection class is used to acquire batches of transactions. Once the connection has been provided by HiveEndPoint, the application will generally enter a loop in which it calls fetchTransactionBatch and writes a series of transactions. When shutting down, the application should call close. See StreamingConnection in the Javadoc for more information.

TransactionBatch

TransactionBatch is used to write a series of transactions. For each transaction, the application calls beginNextTransaction, write, and then commit or abort as appropriate. See TransactionBatch in the Javadoc for details.
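The call order described above can be illustrated with a small in-memory model. InMemoryBatch is a hypothetical stand-in that only mirrors the method names mentioned in the text; its capacity, state checks, and String records are assumptions made for the example and do not describe how the real TransactionBatch is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class TxnBatchLifecycleSketch {

    /** Hypothetical in-memory stand-in; NOT the Hive TransactionBatch class. */
    static final class InMemoryBatch {
        private final int capacity;          // number of transactions in this batch (assumed)
        private int started = 0;             // transactions begun so far
        private boolean open = false;        // true between begin and commit/abort
        private final List<String> pending = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();

        InMemoryBatch(int capacity) {
            this.capacity = capacity;
        }

        void beginNextTransaction() {
            if (open) {
                throw new IllegalStateException("previous transaction still open");
            }
            if (started == capacity) {
                throw new IllegalStateException("batch exhausted");
            }
            started++;
            open = true;
        }

        void write(String record) {
            requireOpen();
            pending.add(record);
        }

        /** Makes the records written in the current transaction visible. */
        void commit() {
            requireOpen();
            committed.addAll(pending);
            pending.clear();
            open = false;
        }

        /** Discards the records written in the current transaction. */
        void abort() {
            requireOpen();
            pending.clear();
            open = false;
        }

        List<String> committedRecords() {
            return committed;
        }

        private void requireOpen() {
            if (!open) {
                throw new IllegalStateException("no open transaction");
            }
        }
    }

    /** Runs two transactions: the first is committed, the second aborted. */
    static List<String> run() {
        InMemoryBatch batch = new InMemoryBatch(2);

        batch.beginNextTransaction();
        batch.write("1,Hello streaming");
        batch.commit();

        batch.beginNextTransaction();
        batch.write("2,bad record");
        batch.abort();

        return batch.committedRecords();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the record from the committed transaction survives; the aborted transaction leaves no trace in the committed list.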

I/O – Writing Data

...

RecordWriter is the base interface implemented by all Writers. A Writer is responsible for taking a record in the form of a byte[] containing data in a known format (such as CSV) and writing it out in the format supported by Hive streaming. A RecordWriter may reorder or drop fields from the incoming record if necessary to map them to the corresponding columns in the Hive table. A streaming client instantiates an appropriate RecordWriter type and passes it to TransactionBatch. The streaming client does not interact with the RecordWriter directly after that; the TransactionBatch uses and manages the RecordWriter instance to perform I/O. See RecordWriter in the Javadoc for details.
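To make the "reorder or drop fields" step concrete, here is a minimal sketch of mapping an incoming CSV record onto table columns by name. FieldMappingSketch and mapToColumns are hypothetical names introduced for illustration; this is not the code of any Hive writer, and it assumes a plain comma split with no quoting or escaping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class FieldMappingSketch {

    /**
     * Reorders incoming CSV fields into table-column order (illustrative only).
     * Input fields with no matching column are dropped; columns with no
     * matching input field are left as null.
     */
    static String[] mapToColumns(byte[] record, List<String> inputFields, List<String> tableColumns) {
        // Naive split: assumes no quoted or escaped commas in the data.
        String[] values = new String(record, StandardCharsets.UTF_8).split(",", -1);
        String[] row = new String[tableColumns.size()];
        for (int i = 0; i < inputFields.size() && i < values.length; i++) {
            int col = tableColumns.indexOf(inputFields.get(i));
            if (col >= 0) {
                row[col] = values[i];
            }
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] record = "7,ignored,hello".getBytes(StandardCharsets.UTF_8);
        String[] row = mapToColumns(record, List.of("id", "extra", "msg"), List.of("msg", "id"));
        System.out.println(Arrays.toString(row));
    }
}
```

In the usage shown in main, the input field "extra" has no matching column and is dropped, while "id" and "msg" are swapped into the table's column order.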

A RecordWriter has two primary functions.

...

Class DelimitedInputWriter provides support for writing out input data that is in delimited formats (such as CSV). DelimitedInputWriter itself implements only the input-format-specific task of modifying the input record. See DelimitedInputWriter in the Javadoc for details.

AbstractLazySimpleRecordWriter

...