...
For the system to function properly, clients of this API must handle errors correctly. Once a TransactionBatch is obtained, any exception thrown from TransactionBatch (except SerializationError) should cause the client to call TransactionBatch.abort() to abort the current transaction, then TransactionBatch.close(), and then start a new batch to write more data and/or redo the work of the last transaction, during which the failure occurred. Not following this may, in rare cases, cause file corruption. Furthermore, a StreamingException should ideally cause the client to perform exponential backoff before starting a new batch. This helps the cluster stabilize, since the most likely reason for these failures is HDFS overload.
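The recovery sequence described above (abort, close, back off, start a new batch) can be sketched roughly as follows. This is a minimal illustration, not the library's own code: the Batch and Work interfaces, the method names backoffMillis and runWithRecovery, and all delay values are hypothetical stand-ins introduced here so the example compiles without Hive on the classpath. It also simplifies by backing off on every exception, whereas the text recommends backoff specifically for StreamingException.

```java
import java.util.function.Supplier;

final class BatchRecoverySketch {

    // Hypothetical stand-in exposing only the two calls named in the text.
    interface Batch {
        void abort() throws Exception;
        void close() throws Exception;
    }

    // Hypothetical unit of work that writes one transaction's data to a batch.
    interface Work {
        void run(Batch batch) throws Exception;
    }

    // Delay for a 0-based attempt: baseMillis doubled per attempt, capped at maxMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = Math.min(baseMillis, maxMillis);
        for (int i = 0; i < attempt; i++) {
            if (delay > maxMillis / 2) {
                return maxMillis; // doubling again would exceed the cap
            }
            delay *= 2;
        }
        return delay;
    }

    // Runs the work on a fresh batch. On failure: abort, close, back off, then
    // retry on a new batch. Returns the number of attempts that were needed.
    static int runWithRecovery(Supplier<Batch> newBatch, Work work, int maxAttempts,
                               long baseMillis, long maxMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Batch batch = newBatch.get();
            boolean succeeded = false;
            try {
                work.run(batch);
                succeeded = true;
            } catch (Exception e) {
                last = e;
                try {
                    batch.abort(); // abort the current transaction first
                } catch (Exception abortFailure) {
                    e.addSuppressed(abortFailure);
                }
            } finally {
                batch.close(); // the batch is closed on success and on failure
            }
            if (succeeded) {
                return attempt + 1;
            }
            if (attempt + 1 < maxAttempts) {
                Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] failuresLeft = {2}; // simulate two failed attempts, then success
        Supplier<Batch> batches = () -> new Batch() {
            public void abort() { System.out.println("abort"); }
            public void close() { System.out.println("close"); }
        };
        int attempts = runWithRecovery(batches, batch -> {
            if (failuresLeft[0]-- > 0) {
                throw new IllegalStateException("simulated write failure");
            }
        }, 5, 10, 1000);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

In a real client, the supplier would obtain a new TransactionBatch and the work would redo the writes of the failed transaction. Capping the delay keeps a long outage from producing unbounded waits.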
SerializationError indicates that a given tuple could not be parsed. The client may choose to throw away such tuples or send them to a dead letter queue. After seeing this exception, more data can be written to the current transaction and to further transactions in the same TransactionBatch.
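One way a client might divert unparseable tuples while continuing to write is sketched below. The SerializationError class and TupleWriter interface here are hypothetical stand-ins defined only so the example is self-contained, and the in-memory list stands in for whatever dead letter queue the client actually uses.

```java
import java.util.ArrayList;
import java.util.List;

final class DeadLetterSketch {

    // Hypothetical stand-in for the SerializationError named in the text.
    static class SerializationError extends Exception {
        SerializationError(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for writing one tuple to the current transaction.
    interface TupleWriter {
        void write(String tuple) throws SerializationError;
    }

    // Writes every tuple; tuples that fail to parse are collected instead of
    // ending the transaction, and writing continues with the next tuple.
    static List<String> writeAll(List<String> tuples, TupleWriter writer) {
        List<String> deadLetters = new ArrayList<>();
        for (String tuple : tuples) {
            try {
                writer.write(tuple);
            } catch (SerializationError e) {
                deadLetters.add(tuple); // keep going: the transaction stays usable
            }
        }
        return deadLetters;
    }
}
```

Only the parse failure is caught here. Any other exception should propagate so that the abort-and-close handling described earlier still applies.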
Example – Non-secure Mode
...