...
URI format
hdfs://hostname[:port][/path][?options]
You can append query options to the URI in the following format: ?option=value&option=value&...
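As a rough illustration of the URI format, the sketch below assembles an endpoint string from its parts. The class `HdfsUriSketch`, its `build` method, the port `8020`, and the option names `optionA` and `optionB` are all hypothetical placeholders, not part of the component. Option values are not URL-encoded here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Camel: assembles a string of the form
// hdfs://hostname[:port][/path][?options].
final class HdfsUriSketch {

    static String build(String host, Integer port, String path, Map<String, String> options) {
        StringBuilder uri = new StringBuilder("hdfs://").append(host);
        if (port != null) {
            uri.append(':').append(port);
        }
        if (path != null && !path.isEmpty()) {
            // Make sure the path component starts with a slash.
            uri.append(path.startsWith("/") ? path : "/" + path);
        }
        if (options != null && !options.isEmpty()) {
            // Options are joined as ?option=value&option=value&...
            StringJoiner query = new StringJoiner("&", "?", "");
            options.forEach((name, value) -> query.add(name + "=" + value));
            uri.append(query.toString());
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("optionA", "1"); // placeholder option names, for illustration only
        options.put("optionB", "x");
        System.out.println(build("localhost", 8020, "/tmp/data", options));
    }
}
```

A `LinkedHashMap` is used so that the options appear in the URI in insertion order.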
The path is treated in the following way:
- As a consumer: if the path is a file, the component simply reads that file; if it is a directory, the component scans all the files under the path that match the configured pattern. All the files under that directory must be of the same type.
- As a producer: if at least one split strategy is defined, the path is treated as a directory, and under that directory the producer creates a separate file per split, named seg0, seg1, seg2, and so on.
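The two rules above can be sketched on the local filesystem with plain `java.nio.file` calls. This is only an illustration of the described behaviour, not the component's implementation: the class `PathHandlingSketch` and its methods are hypothetical, the pattern is assumed to be a glob, and the directory scan is assumed to be non-recursive.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; it uses neither Camel nor Hadoop APIs.
final class PathHandlingSketch {

    // Consumer side: a file is read on its own; a directory is scanned for
    // regular files whose names match the glob pattern (non-recursive).
    static List<Path> filesToRead(Path path, String globPattern) throws IOException {
        if (Files.isRegularFile(path)) {
            return Collections.singletonList(path);
        }
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(path, globPattern)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    matches.add(entry);
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    // Producer side: with a split strategy, the path is a directory that
    // holds one file per split, named seg0, seg1, seg2, ...
    static List<String> segmentPaths(String directory, int segmentCount) {
        String base = directory.endsWith("/") ? directory : directory + "/";
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < segmentCount; i++) {
            paths.add(base + "seg" + i);
        }
        return paths;
    }

    public static void main(String[] args) {
        // "/tmp/out" is a placeholder directory name.
        System.out.println(segmentPaths("/tmp/out", 3));
    }
}
```

Calling `segmentPaths("/tmp/out", 3)` yields the three paths `/tmp/out/seg0`, `/tmp/out/seg1`, and `/tmp/out/seg2`.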
Options
Name | Default Value | Description |
---|---|---|
| | BBBB |
Writing/Reading messages to an HDFS filesystem
Message Headers
Header | Description |
---|---|
HDFS Producer
...
| | The file can be overwritten |
| | The buffer size used by HDFS |
| | The HDFS replication factor |
| | The size of the HDFS blocks |
| | It can be SEQUENCE_FILE, |
| | It can be LOCAL for local filesystem |
| | The type for the key in case of |
| | The type for the key in case of |
| | A string describing the strategy on |
| | When a file is opened for reading/ |
| | Once the file has been read is |
| | For the consumer, how much to wait |
| | The interval between the directory |
| | The pattern used for scanning the |
| | When reading a normal file, this is split |
HDFS Usage Samples
Example 1:
...