The Data Lake sink stores incoming events in an InfluxDB database.
Implementation
org.apache.streampipes.sinks.internal.jvm.datalake
The implementation comprises four classes: a Data Lake class, a Data Lake Controller class, a Data Lake InfluxDB Client class, and a Data Lake Parameters class. The code is largely the same as that of the InfluxDB sink (org.apache.streampipes.sinks.databases.jvm.influxdb).
Data Lake Parameters Class
The parameter class defines the parameters needed to configure the sink.
parameter | description |
---|---|
influxDbHost | hostname/URL of the InfluxDB instance, including the http(s):// scheme |
influxDbPort | port of the InfluxDB instance |
databaseName | name of the database where events will be stored |
measureName | name of the Measurement where events will be stored (will be created if it does not exist) |
user | username for the InfluxDB server |
password | password for the InfluxDB server |
timestampField | field which contains the required timestamp (field type = http://schema.org/DateTime) |
batchSize | number of events collected in a buffer before they are written to the database |
flushDuration | maximum time in ms to wait for the buffer to reach the batch size before its contents are written to the database |
dimensionProperties | list containing the tag fields (scope = dimension property) |
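The parameters in the table above can be pictured as a simple immutable holder. The following is a minimal sketch only: the class name `DataLakeParamsSketch` and the helper `connectionUrl()` are hypothetical and are not taken from the StreamPipes code base. The helper assumes that host and port are joined with a colon to form the connection URL.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder mirroring the parameter table; not the actual StreamPipes class.
final class DataLakeParamsSketch {
    final String influxDbHost;               // includes the http(s):// scheme
    final int influxDbPort;
    final String databaseName;
    final String measureName;
    final String user;
    final String password;
    final String timestampField;
    final int batchSize;                     // number of buffered events
    final int flushDuration;                 // in milliseconds
    final List<String> dimensionProperties;  // fields stored as tags

    DataLakeParamsSketch(String influxDbHost, int influxDbPort, String databaseName,
                         String measureName, String user, String password,
                         String timestampField, int batchSize, int flushDuration,
                         List<String> dimensionProperties) {
        this.influxDbHost = influxDbHost;
        this.influxDbPort = influxDbPort;
        this.databaseName = databaseName;
        this.measureName = measureName;
        this.user = user;
        this.password = password;
        this.timestampField = timestampField;
        this.batchSize = batchSize;
        this.flushDuration = flushDuration;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.dimensionProperties =
                Collections.unmodifiableList(new ArrayList<>(dimensionProperties));
    }

    // Assumption: the connection URL is host and port joined by a colon.
    String connectionUrl() {
        return influxDbHost + ":" + influxDbPort;
    }
}
```

With `influxDbHost` set to `http://localhost` and `influxDbPort` set to `8086`, `connectionUrl()` returns `http://localhost:8086`.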
Data Lake Controller Class
The controller class declares the model used for display and configuration in the Pipeline Editor, and initializes the sink when the pipeline is invoked.
The measurement name and the timestamp field are derived from user input; the remaining parameters (except batch size and flush duration) are read from org.apache.streampipes.sinks.internal.jvm.config.SinksInternalJvmConfig. Batch size is fixed at 2000 events and flush duration at 500 ms.
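To illustrate how batch size and flush duration interact, here is a minimal sketch of a buffer that flushes either when it holds `batchSize` events or when `flushDurationMs` has passed since the last flush. The class `BatchBufferSketch` is hypothetical and is not the sink's actual code. It checks the elapsed time only when an event is added, whereas a real client would typically flush from a background timer. The caller passes in the current time so that the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of size- or time-triggered flushing.
final class BatchBufferSketch<T> {
    private final int batchSize;
    private final long flushDurationMs;
    private final Consumer<List<T>> writer;   // stands in for the database write
    private final List<T> buffer = new ArrayList<>();
    private long lastFlushMs;

    BatchBufferSketch(int batchSize, long flushDurationMs, long nowMs,
                      Consumer<List<T>> writer) {
        this.batchSize = batchSize;
        this.flushDurationMs = flushDurationMs;
        this.writer = writer;
        this.lastFlushMs = nowMs;
    }

    // Buffers the event, then flushes if the buffer is full
    // or the flush duration has elapsed since the last flush.
    void add(T event, long nowMs) {
        buffer.add(event);
        if (buffer.size() >= batchSize || nowMs - lastFlushMs >= flushDurationMs) {
            flush(nowMs);
        }
    }

    // Hands a copy of the buffered events to the writer and resets the timer.
    void flush(long nowMs) {
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
        lastFlushMs = nowMs;
    }
}
```

With the values stated above, the buffer would be created with a batch size of 2000 and a flush duration of 500, so a slow stream is still written at least every 500 ms as long as events keep arriving.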
Data Lake Class
The Data Lake class comprises six methods and controls how events are saved to the database, using the Data Lake InfluxDB Client for this purpose.
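As a rough picture of what saving an event involves, the sketch below splits an event into an InfluxDB-style timestamp, tags, and fields, using the `timestampField` and `dimensionProperties` parameters described earlier. The class `PointSketch` and its method `fromEvent` are hypothetical and do not reproduce the six methods of the actual class. The sketch assumes that the event is a flat map and that the timestamp is a numeric value in epoch milliseconds.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one InfluxDB point built from an event.
final class PointSketch {
    final String measurement;
    final long timestamp;
    final Map<String, String> tags;    // InfluxDB tag values are strings
    final Map<String, Object> fields;

    private PointSketch(String measurement, long timestamp,
                        Map<String, String> tags, Map<String, Object> fields) {
        this.measurement = measurement;
        this.timestamp = timestamp;
        this.tags = tags;
        this.fields = fields;
    }

    // Dimension properties become tags; every other property except
    // the timestamp becomes a field.
    static PointSketch fromEvent(String measureName, String timestampField,
                                 List<String> dimensionProperties,
                                 Map<String, Object> event) {
        Object ts = event.get(timestampField);
        if (!(ts instanceof Number)) {
            // Assumption: the timestamp is numeric (epoch milliseconds).
            throw new IllegalArgumentException(
                    "Missing or non-numeric timestamp field: " + timestampField);
        }
        Map<String, String> tags = new LinkedHashMap<>();
        Map<String, Object> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : event.entrySet()) {
            String key = entry.getKey();
            if (key.equals(timestampField)) {
                continue;
            }
            if (dimensionProperties.contains(key)) {
                tags.put(key, String.valueOf(entry.getValue()));
            } else {
                fields.put(key, entry.getValue());
            }
        }
        return new PointSketch(measureName, ((Number) ts).longValue(), tags, fields);
    }
}
```

Storing dimension properties as tags matters because InfluxDB indexes tags but not fields, so queries that filter or group by a dimension are cheaper.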
References:
- Apache StreamPipes Documentation: https://streampipes.apache.org/docs/docs/pe/org.apache.streampipes.sinks.internal.jvm.datalake/