...

The (key, value) pairs emitted by the mapper are passed on to the Loader for the TO part of the job. How they are written is governed by the OutputFormat. SqoopNullOutputFormat extends the OutputFormat class. It follows the same idea as Hadoop's NullOutputFormat, whose purpose is to generate no output files on HDFS, since HDFS may not always be the destination. In our case too, HDFS is not always the destination, so we use SqoopNullOutputFormat, a custom class, to delegate writing to the Loader specified in the Sqoop job. It relies on the SqoopOutputFormatLoadExecutor to pass the data to the Loader via the SqoopRecordWriter. Much like how the SqoopInputFormat reads individual records through the SqoopRecordReader implementation, the SqoopNullOutputFormat class is a factory for SqoopRecordWriter objects; these are used to write the individual records to the final destination (in our case the Loader, i.e. the TO part of the Sqoop job). Notice that the key type of SqoopNullOutputFormat is SqoopWritable, which is the type the SqoopRecordWriter consumes.

 

Code Block
public class SqoopNullOutputFormat extends OutputFormat<SqoopWritable, NullWritable> {
  ...
}

private class SqoopRecordWriter extends RecordWriter<SqoopWritable, NullWritable> {
  @Override
  public void write(SqoopWritable key, NullWritable value) throws InterruptedException {
    // Wait until the shared buffer is free to accept the next record
    free.acquire();
    // Stop early if the consumer (Loader) side has already failed
    checkIfConsumerThrew();
    // NOTE: this is the place where the data written by the SqoopMapper as a
    // SqoopWritable becomes available to the SqoopOutputFormat
    toDataFormat.setCSVTextData(key.toString());
    // Signal the consumer that a record is ready to be read
    filled.release();
  }
}
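The write() method above hands each record to the Loader side through a pair of semaphores. A minimal stand-alone sketch of that handoff pattern follows. It is not the Sqoop implementation: the class HandoffSketch, its single String slot, and the close()/read()/transfer() helpers are hypothetical names introduced for illustration, and only the free/filled naming is taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a two-semaphore, single-slot handoff (not Sqoop code).
public class HandoffSketch {
  private final Semaphore free = new Semaphore(1);   // one permit: the slot starts empty
  private final Semaphore filled = new Semaphore(0); // no permit: nothing to read yet
  private String slot;                               // single-record buffer shared by both threads
  private boolean done;                              // set by the producer at end of input

  // Producer side, analogous to write(): wait for an empty slot, fill it, signal the consumer.
  void write(String csvRecord) throws InterruptedException {
    free.acquire();
    slot = csvRecord;
    filled.release();
  }

  // Producer side: wait until the last record was consumed, then signal end of input.
  void close() throws InterruptedException {
    free.acquire();
    done = true;
    filled.release();
  }

  // Consumer side (stand-in for the Loader): returns null once the producer has closed.
  String read() throws InterruptedException {
    filled.acquire();
    if (done) {
      return null;
    }
    String record = slot;
    free.release(); // the slot may be reused by the producer
    return record;
  }

  // Runs a consumer thread and pushes all records through the handoff.
  static List<String> transfer(List<String> records) throws InterruptedException {
    HandoffSketch handoff = new HandoffSketch();
    List<String> loaded = new ArrayList<>();
    Thread consumer = new Thread(() -> {
      try {
        String record;
        while ((record = handoff.read()) != null) {
          loaded.add(record);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    consumer.start();
    for (String record : records) {
      handoff.write(record);
    }
    handoff.close();
    consumer.join(); // join() makes the consumer's additions visible to this thread
    return loaded;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(transfer(List.of("1,'alice'", "2,'bob'")));
  }
}
```

Because the slot holds one record at a time, the producer blocks until the consumer has taken the previous record, which keeps the two sides in lockstep without a queue.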

SqoopDestroyerOutputCommitter is a custom OutputCommitter that provides hooks for final cleanup or, in some cases, for one-time operations that we want to invoke when the Sqoop job finishes, i.e. whether it fails or succeeds.
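The idea of running the same cleanup hook on both outcomes can be sketched as follows. This is a simplified, hypothetical model and does not use the Hadoop API: the class DestroyerCommitterSketch and the Destroyer interface are invented for illustration, and the method names commitJob and abortJob only mirror the success and failure callbacks of Hadoop's OutputCommitter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a committer that runs one cleanup hook on success and on failure.
public class DestroyerCommitterSketch {
  // Stand-in for connector cleanup logic; receives the job outcome.
  interface Destroyer {
    void destroy(boolean success);
  }

  private final Destroyer destroyer;

  DestroyerCommitterSketch(Destroyer destroyer) {
    this.destroyer = destroyer;
  }

  // Called once when the job succeeds.
  void commitJob() {
    destroyer.destroy(true);
  }

  // Called once when the job fails or is killed.
  void abortJob() {
    destroyer.destroy(false);
  }

  // Drives the committer for one job outcome and returns what the hook recorded.
  static List<String> demo(boolean jobSucceeded) {
    List<String> log = new ArrayList<>();
    Destroyer hook = success -> log.add("cleanup, success=" + success);
    DestroyerCommitterSketch committer = new DestroyerCommitterSketch(hook);
    if (jobSucceeded) {
      committer.commitJob();
    } else {
      committer.abortJob();
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(demo(true));
    System.out.println(demo(false));
  }
}
```

Passing the outcome as a flag lets a single hook decide, for example, whether to keep or discard a staging table, instead of duplicating the cleanup in two places.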

...