Status

Current state: Discarded - Superseded by KIP-875

Discussion thread: here

JIRA: here

...

The tool will read all of the persisted partition-offset pairs and write them out in JSON form to standard out. The usage is as follows:

bin/connect-offsets.sh [options]

where the options for export mode are as follows:

Parameter | Required | Description
--config | yes | Specifies the path to the Connect worker configuration. The configuration is used to determine how to access the offset storage and which converters to use to serialize and deserialize the offset information.
--connectors | no | Specifies a comma-separated list of connector names used to filter which partition-offset pairs are exported.
--from-beginning | no | Include all of the partition-offset pairs, including those that have since been overwritten by newer pairs. If this option is omitted, the tool outputs only the latest partition-offset pairs.
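
The difference between the default output and --from-beginning can be pictured as last-write-wins compaction over the history of stored pairs. The Python sketch below is illustrative only: it assumes the history is available as an ordered list of (connector, partition, offset) tuples, which the proposal does not specify, and latest_pairs is a hypothetical helper rather than part of the tool.

```python
import json

def latest_pairs(history):
    """Keep only the most recent offset for each (connector, partition).

    `history` is assumed to be ordered from oldest to newest.
    """
    latest = {}
    for connector, partition, offset in history:
        # Partitions are dicts, so serialize them with sorted keys
        # to obtain a hashable, order-independent lookup key.
        key = (connector, json.dumps(partition, sort_keys=True))
        latest[key] = (connector, partition, offset)
    return list(latest.values())

# Hypothetical history: the pair for file "a" is written twice.
history = [
    ("MyConnector", {"file": "a"}, {"pos": "10"}),
    ("MyConnector", {"file": "b"}, {"pos": "5"}),
    ("MyConnector", {"file": "a"}, {"pos": "20"}),
]

for pair in latest_pairs(history):
    print(pair)
```

Under this reading, --from-beginning would correspond to emitting the full history, while the default corresponds to the result of latest_pairs.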

...


For example, the following command will read the Connect worker configuration in the my-worker-config.properties file and write out all persisted partition-offset pairs to standard out:

bin/connect-offsets.sh --config=my-worker-config.properties

A sample of this output might be:

{
  "MyConnector": [{
    "partition": {
      "file": "a"
    },
    "offset": {
      "offsetKey1": "offsetValue1",
      "offsetKey2": "offsetValue2"
    }
  },{
    "partition": {
      "file": "b"
    },
    "offset": {
      "offsetKey1": "offsetValue3",
      "offsetKey2": "offsetValue4"
    }
  }]
}

Note that the output is a JSON document whose field names are the connector names. The value of each field is an array containing zero or more documents, each with a "partition" field and an "offset" field. The "partition" field contains a document whose string-valued fields represent the source-specific partition information, and the "offset" field contains a document whose string-valued fields represent the source-specific offset information.
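
To make the structure concrete, the following Python sketch walks a document of this shape and yields one tuple per partition-offset pair. It is a minimal illustration that uses only the standard json module; iter_pairs is a hypothetical helper name, and the sample string simply mirrors the output shown above.

```python
import json

# Sample export with the same shape as the output shown above.
sample = """
{
  "MyConnector": [
    {"partition": {"file": "a"},
     "offset": {"offsetKey1": "offsetValue1", "offsetKey2": "offsetValue2"}},
    {"partition": {"file": "b"},
     "offset": {"offsetKey1": "offsetValue3", "offsetKey2": "offsetValue4"}}
  ]
}
"""

def iter_pairs(document):
    """Yield (connector_name, partition, offset) for every entry in the export."""
    for connector, entries in document.items():
        for entry in entries:
            yield connector, entry["partition"], entry["offset"]

pairs = list(iter_pairs(json.loads(sample)))
for connector, partition, offset in pairs:
    print(connector, partition, offset)
```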

The --connectors parameter takes a comma-separated list of connector names, and the tool will output only the connector offset documents that match one of the named connectors:

bin/connect-offsets.sh --config=my-worker-config.properties --connectors=MyConnector,OtherConnector
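
The effect of the filter on the output document can be sketched in Python as follows. This is an assumption about the intended behavior rather than the tool's implementation: filter_connectors is a hypothetical helper, and trimming whitespace around names is a choice made here, not something the proposal states.

```python
def filter_connectors(document, connectors_arg):
    """Return only the entries whose connector name appears in the comma-separated list."""
    # Split the argument value and ignore empty items such as a trailing comma.
    names = {name.strip() for name in connectors_arg.split(",") if name.strip()}
    return {name: entries for name, entries in document.items() if name in names}

# Hypothetical export containing three connectors.
document = {
    "MyConnector": [{"partition": {"file": "a"}, "offset": {"offsetKey1": "offsetValue1"}}],
    "OtherConnector": [],
    "ThirdConnector": [{"partition": {"file": "z"}, "offset": {"offsetKey1": "offsetValue9"}}],
}

filtered = filter_connectors(document, "MyConnector,OtherConnector")
print(sorted(filtered))
```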

The tool's JSON output can be redirected to a file using standard shell techniques:

bin/connect-offsets.sh --config=my-worker-config.properties --connectors=MyConnector,OtherConnector > my-offsets.json

Importing

The tool will read all of the partition-offset pairs supplied on standard input and write them to the offset storage. Typically this will involve redirecting the JSON offset information from a file that was produced using the export mode described above:

bin/connect-offsets.sh [options] < my-offsets.json

where the options for import mode are as follows:

Parameter | Required | Description
--config | yes | Specifies the path to the Connect worker configuration. The configuration is used to determine how to access the offset storage and which converters to use to serialize and deserialize the offset information.
--dry-run | no | Defaults to "false". When set to "true", the tool will write out the actions that would be performed but will not actually modify any persisted partition-offset pairs.

Override or remove the source offset for one or more source partitions

A user can use the tool as mentioned above to obtain a copy of the partition-offset pairs. The user can modify the JSON, and send the updated JSON to the tool to update the persisted partition-offset pairs that appear in the file. For example, the following will read the partition-offset pairs in the specified file and update only those partition-offset pairs:

bin/connect-offsets.sh --config=my-worker-config.properties < my-offsets.json

This tool overwrites the partitions and offsets for the connectors as specified in the input. Any persisted partition-offset pair for a connector not included in the input will be left unmodified.
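
Editing the exported JSON by hand is error-prone, so the modification step could also be scripted. The Python sketch below is one possible approach, not part of the proposal: set_offset is a hypothetical helper, the new offset value is invented for illustration, and the export is held in a string here to keep the example self-contained (in practice it would be read from and written back to a file such as my-offsets.json).

```python
import json

def set_offset(document, connector, partition, new_offset):
    """Replace the offset of the entry whose partition matches; return True if found."""
    for entry in document.get(connector, []):
        if entry["partition"] == partition:
            entry["offset"] = new_offset
            return True
    return False

# Stand-in for the contents of a previously exported file.
exported = json.loads(
    '{"MyConnector": [{"partition": {"file": "a"},'
    ' "offset": {"offsetKey1": "offsetValue1"}}]}'
)

# "newValue" is an illustrative value only.
changed = set_offset(exported, "MyConnector", {"file": "a"}, {"offsetKey1": "newValue"})

# The resulting text is what would be supplied to the tool on standard input.
updated_json = json.dumps(exported, indent=2)
print(updated_json)
```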

Note that in addition to modifying the persisted partition-offset pairs, the tool can also be used to remove partition-offset pairs by setting the "offset" field in the JSON to null. For example, consider the following input supplied to the tool:


{
  "MyConnector": [{
    "partition": {
      "file": "a"
    },
    "offset": {
      "offsetKey1": "offsetValue1",
      "offsetKey2": "offsetValue2"
    }
  },{
    "partition": {
      "file": "b"
    },
    "offset": null
  }]
}

The tool will update the "MyConnector" connector's offsets for the partition with file "a" but will remove the offsets for the partition with file "b". Any other partition-offset pairs for this or any other connector will be unmodified.

...