The Streams DSL (Domain Specific Language) is what's known as an "Embedded DSL". This means that when you program in the Streams DSL, you're not writing source code in a separate language with its own parser and runtime. Instead, you write the code by making library calls in a "host" language. In our case, the host language is Java (and Scala). This carries some tradeoffs, but for us the benefits mostly outweigh them.
One downside of embedded DSLs, though, is that they make it a little too easy for the authors to gloss over defining the grammar of the language. Maybe this is a little unfair, since not all parsers are grammar-based, but it seems like the act of writing a language parser puts you more in the frame of mind to think about the structure of your language than writing interfaces and method headers does.
For small, one-off DSLs, a defined grammar probably isn't that important. Users can easily understand the whole extent of the language at once, even if they occasionally have to look at the docs. But for a large and complex language, with lots of objects and operations, having a defined grammar is incredibly beneficial for both users and maintainers. From the user's side, using a programming language with a compact grammar is far easier than using one where each statement seemingly follows its own rules. Without a compact grammar, it's harder to predict in advance how to write a given statement, and users will constantly be facing compiler errors and referring to the docs to figure out where to put which arguments on which statements. As an analogy, imagine trying to have a conversation in which you have to adhere to completely different grammatical rules, depending on the subject of the sentence!
As maintainers, there are other benefits of sticking to a grammar. We can avoid debating whether we should add new overloads or not, counting how many new method signatures a change would create, debating over the names of config objects, etc. We also avoid a constant thrash of adding new overloads and deprecating old ones, which increases our maintenance surface area.
This is why we define a Streams DSL grammar and coerce our API to match it.
The grammar is informally specified on this page and should serve as a roadmap for future modifications to the DSL. Creating a formal specification is out of scope right now; we need experience to decide if it would be useful. Also, making a pass over the whole DSL to coerce it to comply with this grammar is left as an available task for whoever is interested (and would require KIPs).
This is a living document. As we live with the grammar over time, we will discover shortcomings of the current specification, and we will update the spec as needed.
StreamsBuilder is the entry point, and includes methods to produce the top-level DSLObjects
operand: a DSLObject
operation: operand.operator(Parameter?) | operand.operator(operand, Parameter?) | operand.operator(operand collection, Parameter?)
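The three operation forms above can be illustrated with a toy example. This is only a sketch under stated assumptions: ToyStream and DistinctParameters are hypothetical stand-ins invented for illustration, not part of the Streams DSL, and the optional "Parameter?" is modeled here as a pair of Java overloads.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical parameter object, named {DSLOperation}Parameters.
final class DistinctParameters {
    final String name;

    private DistinctParameters(String name) { this.name = name; }

    // Static factory named after the operation; "distinct" is an assumed default name.
    static DistinctParameters distinctParameters() { return new DistinctParameters("distinct"); }

    DistinctParameters withName(String name) { return new DistinctParameters(name); }
}

// Hypothetical DSLObject: an immutable, named list of strings standing in for a stream.
final class ToyStream {
    final String name;
    final List<String> records;

    ToyStream(String name, List<String> records) {
        this.name = name;
        this.records = Collections.unmodifiableList(new ArrayList<>(records));
    }

    // Form 1: operand.operator(Parameter?), with the parameter omitted.
    ToyStream distinct() {
        return distinct(DistinctParameters.distinctParameters());
    }

    // Form 1: operand.operator(Parameter?), with the parameter supplied.
    ToyStream distinct(DistinctParameters parameters) {
        // LinkedHashSet drops duplicates while keeping first-seen order.
        return new ToyStream(parameters.name, new ArrayList<>(new LinkedHashSet<>(records)));
    }

    // Form 2: operand.operator(operand, Parameter?); the parameter overload is omitted for brevity.
    ToyStream merge(ToyStream other) {
        return mergeAll(Collections.singletonList(other));
    }

    // Form 3: operand.operator(operand collection, Parameter?); parameter overload omitted.
    ToyStream mergeAll(Collection<ToyStream> others) {
        List<String> merged = new ArrayList<>(records);
        for (ToyStream other : others) {
            merged.addAll(other.records);
        }
        return new ToyStream("merge", merged);
    }
}
```

Every call has the same shape: an operand, an operator, at most one other operand (or collection of operands), and at most one parameter object. That regularity is the point of the grammar, whatever the real operations turn out to be.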
{DSLOperation}Parameters (e.g., FilterParameters, FlatTransformValuesParameters, ToParameters)
through. Plus, the past participle strategy isn't machine verifiable.
ktable.FilterParameters vs kstream.FilterParameters)
TableFilterParameters), which is more verbose, and also opens a Pandora's box where we might try to namespace only the operations that have collisions (like, we can just use ToStreamParameters because only the KTable API has this operation), but then later on want to add a new operation to one of the DSLObjects and actually create a collision (like maybe we decide to add toStream to the GlobalKTable API), but now it's worse because one of them is like GlobalKTableToStreamParameters and the other is just ToStreamParameters, which we just have to know only applies to the KTable API... in short, it would be a mess.
Windows ecosystem results in unintentional rigidity, which we now cannot remove: the worst kind of tech debt, because it can't be cleaned up. It also prevents us from trying to be clever over time and combine similar objects into complicated hierarchies, which only come back to bite us when we realize that the abstractions are leaking.
filterParameters(), flatTransformValuesParameters(), etc.
from and contain a description of the arguments, like fromSerdes(key serde, value serde)
as" but different argument lists). Also, guaranteed not to conflict with the optional parameters (below)
with and contain a description of the arguments, like withKeySerde(key serde) or withName(name)
with" but different argument lists). Also, guaranteed not to conflict with the "required arguments" version.
enable{FlagName} or disable{FlagName}, but cannot take any arguments
withLogging(false).
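As an illustration of how these naming conventions might fit together, here is a minimal sketch. It rests on several assumptions that the text above does not confirm: the class bodies are hypothetical, serdes are represented by plain strings, logging is assumed to be enabled by default, the no-argument factory is assumed to be for operations without required arguments, and the from-prefixed factory is assumed to be for operations with them.

```java
import java.util.Objects;

// Hypothetical parameter object for an operation with no required arguments.
final class FilterParameters {
    private final String name;      // optional argument; null when unset
    private final boolean logging;  // flag; assumed to default to true

    private FilterParameters(String name, boolean logging) {
        this.name = name;
        this.logging = logging;
    }

    // Static factory named after the operation.
    static FilterParameters filterParameters() {
        return new FilterParameters(null, true);
    }

    // Optional arguments start with "with" and describe the argument.
    FilterParameters withName(String name) {
        return new FilterParameters(Objects.requireNonNull(name, "name"), logging);
    }

    // Flags are enable{FlagName} / disable{FlagName} and take no arguments.
    FilterParameters enableLogging() { return new FilterParameters(name, true); }

    FilterParameters disableLogging() { return new FilterParameters(name, false); }

    String name() { return name; }

    boolean loggingEnabled() { return logging; }
}

// Hypothetical parameter object for an operation with required arguments.
final class ToParameters {
    private final String keySerde;    // stand-in for a real serde type
    private final String valueSerde;  // stand-in for a real serde type
    private final String name;        // optional argument; null when unset

    private ToParameters(String keySerde, String valueSerde, String name) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
        this.name = name;
    }

    // Required arguments start with "from" and describe the arguments.
    static ToParameters fromSerdes(String keySerde, String valueSerde) {
        return new ToParameters(
            Objects.requireNonNull(keySerde, "keySerde"),
            Objects.requireNonNull(valueSerde, "valueSerde"),
            null);
    }

    ToParameters withName(String name) {
        return new ToParameters(keySerde, valueSerde, Objects.requireNonNull(name, "name"));
    }

    String keySerde() { return keySerde; }

    String valueSerde() { return valueSerde; }

    String name() { return name; }
}
```

A call site would then read, for example, filterParameters().withName("only-large").disableLogging(). Each method returns a new immutable instance and there is no shared base class, so the two parameter types stay independent of each other.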