
...

Column Type | CSV Format | Object Format
NULL value in the field

  public static final String NULL_FIELD = "NULL";

java null
ARRAY
  • Encoded as a String, and hence enclosed in single quotes. Inside the quotes, the top-level array elements are JSON encoded, so the entire value is enclosed in a [ ] pair. Nested values are not JSON encoded.
  • A few examples:
    • Array of FixedPoint: '[1,2,3]'
    • Array of Text: '["A","B","C"]'
    • Array of Objects of type FixedPoint: '["[11, 12]","[14, 15]"]'
    • Array of Objects of type Text: '["[A, B]","[X, Y]"]'

Refer to https://issues.apache.org/jira/browse/SQOOP-1771 for more details.

java Object[]
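
The ARRAY examples above can be reproduced with a small sketch. This is not the Sqoop implementation: the class name ArrayCsvSketch and its methods are hypothetical, and the rule used here (numbers printed bare, nested arrays flattened with Arrays.toString and quoted, everything else quoted) is only an assumption inferred from the four examples.

```java
import java.util.Arrays;
import java.util.StringJoiner;

// Hypothetical helper, not part of Sqoop: reproduces the ARRAY examples only.
class ArrayCsvSketch {

    // Encodes the top-level elements as JSON and wraps the result in single quotes.
    static String encode(Object[] values) {
        StringJoiner json = new StringJoiner(",", "'[", "]'");
        for (Object value : values) {
            if (value instanceof Number) {
                // Numbers are written without quotes.
                json.add(value.toString());
            } else if (value instanceof Object[]) {
                // Nested arrays are not JSON encoded: they become one quoted string.
                json.add(quote(Arrays.toString((Object[]) value)));
            } else {
                json.add(quote(String.valueOf(value)));
            }
        }
        return json.toString();
    }

    // Minimal JSON string quoting (backslash and double quote only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

For instance, ArrayCsvSketch.encode(new Object[] {1, 2, 3}) yields '[1,2,3]', and an array holding the nested arrays {11, 12} and {14, 15} yields '["[11, 12]","[14, 15]"]'.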
BINARY
Byte array enclosed in quotes and encoded with the ISO-8859-1 charset
java byte[]
BIT

true, TRUE, 1

false, FALSE, 0

(not enclosed in quotes)

Unsupported values should throw an exception

java boolean
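
The BIT rule can be sketched as a strict parser. The class name BitCsvSketch and the choice of IllegalArgumentException are assumptions made for illustration; the text above only says that unsupported values should throw an exception.

```java
// Hypothetical helper, not part of Sqoop: strict parsing of the BIT CSV value.
class BitCsvSketch {

    // Accepts exactly the six unquoted literals listed for BIT.
    static boolean parseBit(String csvValue) {
        switch (csvValue) {
            case "true":
            case "TRUE":
            case "1":
                return true;
            case "false":
            case "FALSE":
            case "0":
                return false;
            default:
                // Anything else (for example "yes" or "True") is unsupported.
                throw new IllegalArgumentException("Unsupported BIT value: " + csvValue);
        }
    }
}
```

Note that this sketch is case sensitive on purpose: mixed-case values such as "True" are not in the list above, so they are rejected.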

DATE
YYYY-MM-DD (no time)
org.joda.time.LocalDate
DATE_TIME

YYYY-MM-DD HH:MM:SS[.ZZZ][+/-XX] (the fraction and timezone are optional)

Refer to https://issues.apache.org/jira/browse/SQOOP-1846 for more details.

org.joda.time.DateTime or org.joda.time.LocalDateTime (depends on the timezone attribute)
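
To illustrate the optional fraction and timezone parts, here is a minimal sketch that parses the unquoted DATE_TIME value. It deliberately uses the JDK's java.time classes instead of Joda-Time so that it is self-contained; the class name DateTimeCsvSketch is hypothetical, and the sketch assumes a 3-digit fraction and an hour-only offset as written in the pattern above.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper, not part of Sqoop: java.time stands in for Joda-Time here.
class DateTimeCsvSketch {

    // Square brackets mark optional sections: the fraction and the offset.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss[.SSS][X]");

    // Returns an OffsetDateTime when an offset is present, otherwise a LocalDateTime.
    static TemporalAccessor parse(String value) {
        return FORMAT.parseBest(value, OffsetDateTime::from, LocalDateTime::from);
    }
}
```

The two result types mirror the split between a timezone-aware and a local date-time type described above.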

DECIMAL

BigDecimal (not enclosed in quotes)

java BigDecimal

The scale and precision fields are handled via https://issues.apache.org/jira/browse/SQOOP-2027

ENUM
Same as TEXT
java String
FIXED_POINT

Integer or Long (not enclosed in quotes)

java Integer or java Long (depends on the byteSize and signed attributes)

https://issues.apache.org/jira/browse/SQOOP-2022

FLOATING_POINT
Float or Double (not enclosed in quotes)

java Float or java Double (depends on the byteSize attribute)

https://issues.apache.org/jira/browse/SQOOP-2022

MAP
  • Encoded as a String, and hence enclosed in single quotes. Inside the quotes, the map is JSON encoded, so the entire value is enclosed in a { } pair. Nested values are also encoded as JSON.
  • Map<Number, Number>: '{1:20}'
  • Map<String, String>: '{"testKey":"testValue"}'

Refer to https://issues.apache.org/jira/browse/SQOOP-1771 for more details.
java.util.Map<Object, Object>
SET
same as ARRAY

java Object[]

TEXT

The entire string is enclosed in single quotes, and all bytes are printed as they are, with the exception of the following bytes:

Byte | Encoded as
0x5C | \\
0x27 | \'
0x22 | \"
0x1A | \Z
0x0D | \r
0x0A | \n
0x00 | \0

java String
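
The TEXT escaping rules can be sketched as a single pass over the string. The class name TextCsvSketch is hypothetical, and the sketch works on Java chars rather than raw bytes, which is an assumption that holds for the seven single-byte values listed above.

```java
// Hypothetical helper, not part of Sqoop: applies the TEXT escaping rules.
class TextCsvSketch {

    // Wraps the text in single quotes and escapes the seven special characters.
    static String encode(String text) {
        StringBuilder out = new StringBuilder("'");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // 0x5C
                case '\'': out.append("\\'");  break; // 0x27
                case '"':  out.append("\\\""); break; // 0x22
                case 0x1A: out.append("\\Z");  break;
                case '\r': out.append("\\r");  break; // 0x0D
                case '\n': out.append("\\n");  break; // 0x0A
                case 0x00: out.append("\\0");  break;
                default:   out.append(c);             // printed as is
            }
        }
        return out.append('\'').toString();
    }
}
```

With these rules the text It's is written as 'It\'s', and a line feed inside the text is written as the two characters \n.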
TIME

HH:MM:SS[.ZZZ] (the fraction is optional)

Only 3-digit millisecond precision is supported for time.

org.joda.time.LocalTime (no timezone)
UNKNOWN
Same as BINARY
Same as java byte[]

...

CSVIntermediateDataFormat

Relevant JIRA: SQOOP-555 and SQOOP-1350

...

NOTE: It may not be obvious, but the current IDF design expects every new implementation to expose the CSV and Object Array formats in addition to its native format.

JSONIntermediateDataFormat

Relevant JIRA: SQOOP-1901

Avro Intermediate Data Format

SqoopIDFUtils 

SqoopIDFUtils is a utility class in Sqoop that helps connectors encode data into the expected CSV and object formats, and parse a CSV string back into the prescribed object format.

...