...

Code Block
/**
 * All {@link Column} types supported by Sqoop.
 */
public enum ColumnType {
  ARRAY,
  BINARY,
  BIT,
  DATE,
  DATE_TIME,
  DECIMAL,
  ENUM,
  FIXED_POINT,
  FLOATING_POINT,
  MAP,
  SET,
  TEXT,
  TIME,
  UNKNOWN,
  ;
}
Warning

The following is the spec as of 1.99.5. Please do not edit it directly in the future. If the format spec changes in a future release, add a new section highlighting what changed.

1.99.5 SQOOP CSV Format

Column Type | CSV Format | Notes
NULL value in the field

  public static final String NULL_FIELD = "NULL";

 
ARRAY
  • Encoded as a String (and hence enclosed in single quotes). Inside the quotes is the JSON encoding of the top-level array elements (hence the entire value is enclosed in a [ ] pair). Nested values are not JSON encoded.
  • A few examples:
    • Array of FixedPoint: '[1,2,3]'
    • Array of Text: '["A","B","C"]'
    • Array of Objects of type FixedPoint: '["[11, 12]","[14, 15]"]'
    • Array of Objects of type Text: '["[A, B]","[X, Y]"]'

 

BINARY
byte array enclosed in quotes and encoded with ISO-8859-1 charset 
BIT

true, TRUE, 1

false, FALSE, 0

( not encoded in quotes )

Unsupported values should throw an exception
DATE
YYYY-MM-DD ( no time zone) 
DATE_TIME
YYYY-MM-DD HH:MM:SS[.ZZZ][+/-XX] (fraction and time zone are optional)
DECIMAL
BigDecimal (not encoded in quotes)
ENUM
Same as TEXT 
FIXED_POINT
integer or long, ( not encoded in quotes ) 
FLOATING_POINT
float or double ( not encoded in quotes ) 
MAP
  • Encoded as a String (and hence enclosed in single quotes). Inside the quotes is the JSON encoding of the map (hence the entire value is enclosed in a { } pair).
  • Map<Number, Number>: '{1:20}'
  • Map<String, String>: '{"testKey":"testValue"}'
 
SET
same as ARRAY 
TEXT

The entire string is enclosed in single quotes, and all bytes are printed as they are, with the exception of the following bytes:

Byte | Encoded as
0x5C | \\
0x27 | \'
0x22 | \"
0x1A | \Z
0x0D | \r
0x0A | \n
0x00 | \0

 
TIME
HH:MM:SS[.ZZZ] (fraction is optional). Only 3-digit millisecond precision is supported for TIME.
UNKNOWN
same as BINARY 
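
The TEXT escaping rules and the BIT literals above can be illustrated with a short Java sketch. This is not the Sqoop implementation: the class name SqoopCsvSketch and the methods encodeText and parseBit are hypothetical names chosen for this example, and writing a null TEXT value as the unquoted NULL_FIELD constant is an assumption based on the "NULL value in the field" row.

```java
// Illustrative sketch only; class and method names are hypothetical, not Sqoop API.
final class SqoopCsvSketch {

    // Same value as the NULL_FIELD constant shown in the spec table.
    static final String NULL_FIELD = "NULL";

    // Encodes a TEXT value: escape the seven listed bytes, then wrap in single quotes.
    // Assumption: a null value is written as the unquoted NULL_FIELD marker.
    static String encodeText(String value) {
        if (value == null) {
            return NULL_FIELD;
        }
        StringBuilder out = new StringBuilder(value.length() + 2);
        out.append('\'');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case 0x5C: out.append("\\\\"); break; // backslash
                case 0x27: out.append("\\'");  break; // single quote
                case 0x22: out.append("\\\""); break; // double quote
                case 0x1A: out.append("\\Z");  break; // substitute (Ctrl-Z)
                case 0x0D: out.append("\\r");  break; // carriage return
                case 0x0A: out.append("\\n");  break; // line feed
                case 0x00: out.append("\\0");  break; // NUL byte
                default:   out.append(c);             // everything else is printed as is
            }
        }
        return out.append('\'').toString();
    }

    // Parses a BIT field. Only the six literals from the spec are accepted;
    // anything else throws, as the spec requires for unsupported values.
    static boolean parseBit(String field) {
        if (field == null) {
            throw new IllegalArgumentException("Unsupported BIT value: null");
        }
        switch (field) {
            case "true":
            case "TRUE":
            case "1":
                return true;
            case "false":
            case "FALSE":
            case "0":
                return false;
            default:
                throw new IllegalArgumentException("Unsupported BIT value: " + field);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeText("it's")); // prints 'it\'s'
        System.out.println(parseBit("TRUE"));   // prints true
    }
}
```

The escape switch mirrors the byte table one entry per case, so a future change to the spec would map to a single added or removed case.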

 

1.99.5  SQOOP Object Format

SqoopDataUtils exposes a few utility methods for converting values into the object format that Sqoop expects.

...