...
Simply having multiple patterns might not be able to deal with situations like above, and the concept PatternProcessor
can handle them properly.
We propose to add PatternProcessor interface as below. The properties of the pattern processor are required by CEP operator in order to function correctly.
...
The general process of the example described in this section and some general implementation idea is shown in the wireframe below. Operator Coordinator on JobManager will read pattern processor updates from users' database and send pattern processor updates to subtasks, and as a result the matched & processed results produced by subtasks are influenced.
Proposed Changes
...
- Key Generating Operator computes the key of the input data according to the
KeySelector
provided byPatternProcessor
s, and attach the key along with the input data. If there areN
pattern processors, then there will beN
records with the same input data and different keys sent downstream. - Input data is sent to different downstream operator subtasks by calling
keyBy()
operation on them.keyBy()
uses the key generated in the previous step. - The next operator matches the input data to
Pattern
s provided byPatternProcessor
s, and when a matched sequence is generated, the operator calls the correspondingPatternProcessFunction
on that sequence and produce the result to downstream.
In the wireframe above, Key Generating Operator and Pattern Matching & Processing Operator uses information from the pattern processors and thus need a PatternProcessorDiscoverer
. Our design of collecting pattern processors and update them inside operators is shown in the wireframe below. In each of the two operators mentioned, there will be pattern processor managers inside each of their Operator Coordinators. The manager in the OperatorCoordinator serves to discover pattern processor changes and convert the information into pattern processors. Then the subtasks, after receiving the latest pattern processors from Operator Coordinator, will make the update.
Supporting multiple & dynamic pattern processors inside CEP operators
...
Given that the proposed design offers eventual consistency, there could be a period of time just after a pattern processor update happens when inconsistency exists among subtasks of CepOperators, which means they do not work on the same set of pattern processors. This inconsistency is caused by the fact that OperatorCoordinator cannot send OperatorEvent to subtasks simultaneously, and this period of time usually lasts for milliseconds. As can be illustrated in the graph below, fake alerts ("AC" instead of "ABC") might be generated during that period.
Compatibility, Deprecation, and Migration Plan
...