...
Allows us to measure the monitoring overhead.
Serializes and deserializes using the coder
Uses an Aggregator for the byte size counter
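The round trip described above can be sketched in plain Java. This is only an illustration under assumptions: the UTF-8 string encoding stands in for the real coder, the `AtomicLong` stands in for the Aggregator, and the names `Query0Sketch`, `roundTrip` and `bytesSeen` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

/** Plain-Java sketch of an encode/decode pass-through; not the Beam API. */
final class Query0Sketch {
    // Stand-in for the byte size aggregator (assumption).
    private final AtomicLong byteCounter = new AtomicLong();

    /** Encodes the event to bytes, counts the bytes, and decodes the event again. */
    String roundTrip(String event) {
        // Hypothetical "coder": UTF-8 encoding of a string event.
        byte[] encoded = event.getBytes(StandardCharsets.UTF_8);
        byteCounter.addAndGet(encoded.length);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    /** Total number of encoded bytes seen so far. */
    long bytesSeen() {
        return byteCounter.get();
    }
}
```

Calling `roundTrip` on every event returns the event unchanged while the counter accumulates the encoded size, which is the kind of pass-through work that makes monitoring overhead measurable.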
...
Query 1: What are the bid values in Euros? (Currency Conversion)
Simple map
Filter + ParDo to extract bids out of events
ParDo that outputs Bid objects with price converted
SELECT Istream(auction, DOLTOEUR(price), bidder, datetime) FROM bid [ROWS UNBOUNDED];
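As a rough illustration of the filter and map steps in Query 1, here is a plain-Java sketch over a finite list. The `Bid` class, the method names, and the fixed rate of 0.89 are assumptions made for the example; a real pipeline would express the same steps as a Filter and a ParDo.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of Query 1 over a finite list; not the Beam API. */
final class Query1Sketch {
    /** Minimal stand-in for a bid event (hypothetical type). */
    static final class Bid {
        final long auction;
        final long bidder;
        final long price;
        final long dateTime;

        Bid(long auction, long bidder, long price, long dateTime) {
            this.auction = auction;
            this.bidder = bidder;
            this.price = price;
            this.dateTime = dateTime;
        }
    }

    /** Converts a dollar price to euros at an assumed fixed rate of 0.89. */
    static long dolToEur(long dollarPrice) {
        return dollarPrice * 89 / 100;
    }

    /** Keeps only Bid events and outputs new Bid objects with the converted price. */
    static List<Bid> convert(List<Object> events) {
        List<Bid> converted = new ArrayList<>();
        for (Object event : events) {
            if (event instanceof Bid) {
                Bid bid = (Bid) event;
                converted.add(new Bid(bid.auction, bid.bidder, dolToEur(bid.price), bid.dateTime));
            }
        }
        return converted;
    }
}
```

Events that are not bids are dropped, and all other bid fields pass through unchanged, mirroring the SELECT list of the query.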
...
Query 2: Find bids with specific auction ids and show their bid price.
Illustrates simple filter
Filter + ParDo to extract bids out of events
Filter to keep bids with correct auctionId
ParDo that outputs AuctionPrice(auction, price) objects
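The filter and projection steps of Query 2 might look like the following plain-Java sketch. The set of wanted auction ids is an assumption for the example (the source does not say how the ids are chosen), and the `Bid` and `AuctionPrice` classes are minimal hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Plain-Java sketch of Query 2 over a finite list; not the Beam API. */
final class Query2Sketch {
    /** Minimal stand-in for a bid event (hypothetical type). */
    static final class Bid {
        final long auction;
        final long price;

        Bid(long auction, long price) {
            this.auction = auction;
            this.price = price;
        }
    }

    /** Output pair of auction id and bid price. */
    static final class AuctionPrice {
        final long auction;
        final long price;

        AuctionPrice(long auction, long price) {
            this.auction = auction;
            this.price = price;
        }
    }

    /** Keeps bids whose auction id is in the wanted set and projects them to AuctionPrice. */
    static List<AuctionPrice> select(List<Bid> bids, Set<Long> wantedAuctionIds) {
        List<AuctionPrice> selected = new ArrayList<>();
        for (Bid bid : bids) {
            if (wantedAuctionIds.contains(bid.auction)) {
                selected.add(new AuctionPrice(bid.auction, bid.price));
            }
        }
        return selected;
    }
}
```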
...
SELECT Istream(P.name, P.city, P.state, A.id) FROM Auction A [ROWS UNBOUNDED], Person P [ROWS UNBOUNDED] WHERE A.seller = P.id AND (P.state = 'OR' OR P.state = 'ID' OR P.state = 'CA') AND A.category = 10;
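To make the join conditions of the query above concrete, here is a plain-Java sketch that evaluates it once over finite lists with a hash join. It ignores the unbounded, incremental nature of the streaming query; the `Person` and `Auction` classes and the comma-separated row format are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java sketch of the join over finite lists; not the Beam API. */
final class Query3Sketch {
    static final class Person {
        final long id;
        final String name;
        final String city;
        final String state;

        Person(long id, String name, String city, String state) {
            this.id = id;
            this.name = name;
            this.city = city;
            this.state = state;
        }
    }

    static final class Auction {
        final long id;
        final long seller;
        final long category;

        Auction(long id, long seller, long category) {
            this.id = id;
            this.seller = seller;
            this.category = category;
        }
    }

    // States listed in the WHERE clause.
    private static final Set<String> STATES = new HashSet<>(Arrays.asList("OR", "ID", "CA"));

    /** Returns one "name,city,state,auctionId" row per matching auction. */
    static List<String> join(List<Auction> auctions, List<Person> persons) {
        // Build side: persons in the wanted states, indexed by id.
        Map<Long, Person> sellersById = new HashMap<>();
        for (Person person : persons) {
            if (STATES.contains(person.state)) {
                sellersById.put(person.id, person);
            }
        }
        // Probe side: category 10 auctions whose seller is in the index.
        List<String> rows = new ArrayList<>();
        for (Auction auction : auctions) {
            Person seller = sellersById.get(auction.seller);
            if (auction.category == 10 && seller != null) {
                rows.add(seller.name + "," + seller.city + "," + seller.state + "," + auction.id);
            }
        }
        return rows;
    }
}
```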
...
Query 4: What is the average selling price for each auction category?
Illustrates sliding windows and aggregation
Apply winning-bids
ParDo to key winning-bids by category
Apply sliding windows to cover a period of time
Apply Mean.perKey (key = category)
ParDo that outputs CategoryPrice(categoryId, avgPrice)
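The windowing and averaging steps of Query 4 can be sketched in plain Java as follows. The sketch assumes window starts are aligned to multiples of the period, uses a hypothetical `WinningBid` class, and identifies each result by a `windowStart/category` string key; none of these details come from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of sliding windows plus a per-key mean; not the Beam API. */
final class Query4Sketch {
    /** Minimal stand-in for a winning bid (hypothetical type). */
    static final class WinningBid {
        final long category;
        final long price;
        final long timestamp;

        WinningBid(long category, long price, long timestamp) {
            this.category = category;
            this.price = price;
            this.timestamp = timestamp;
        }
    }

    /** Starts of all sliding windows [start, start + size) that contain the timestamp. */
    static List<Long> windowStarts(long timestamp, long size, long period) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestamp - Math.floorMod(timestamp, period);
        for (long start = lastStart; start > timestamp - size; start -= period) {
            starts.add(start);
        }
        return starts;
    }

    /** Mean price per "windowStart/category" key. */
    static Map<String, Double> meanPerCategory(List<WinningBid> bids, long size, long period) {
        // Accumulator per key: [0] = sum of prices, [1] = number of bids.
        Map<String, long[]> sumAndCount = new TreeMap<>();
        for (WinningBid bid : bids) {
            for (long start : windowStarts(bid.timestamp, size, period)) {
                long[] acc = sumAndCount.computeIfAbsent(start + "/" + bid.category, k -> new long[2]);
                acc[0] += bid.price;
                acc[1]++;
            }
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, long[]> entry : sumAndCount.entrySet()) {
            means.put(entry.getKey(), (double) entry.getValue()[0] / entry.getValue()[1]);
        }
        return means;
    }
}
```

Because the windows overlap, one winning bid contributes to several averages, which is why the same category can appear under more than one window start.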
Query 5: Which auctions have seen the most bids in the last period?
...
Illustrates sliding windows and combiners (i.e. reducers) to compare the elements in the auctions collection
Input: (sliding) window (to have a result over a 1h period, updated every 1 min) collection of bid events
ParDo to replace bid elements by their auction id
Count.perElement to count the occurrences of each auction id
Combine.globally to select only the auctions with the maximum number of bids
BinaryCombineFn to compare one to one the elements of the collection (auction id occurrences, i.e. number of bids)
Return KV(auction id, max occurrences)
Output: AuctionCount(auction id, max occurrences) objects
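For the contents of a single window, the count and combine steps of Query 5 can be sketched in plain Java as below. The input is assumed to be the list of auction ids already extracted from the bids of one window; the `AuctionCount` class is a minimal stand-in, and on a tie this sketch keeps a single, arbitrarily chosen auction rather than all of them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;

/** Plain-Java sketch of count-per-element plus a binary max combine; not the Beam API. */
final class Query5Sketch {
    /** Output pair of auction id and number of bids. */
    static final class AuctionCount {
        final long auction;
        final long count;

        AuctionCount(long auction, long count) {
            this.auction = auction;
            this.count = count;
        }
    }

    /** Counts the occurrences of each auction id. */
    static Map<Long, Long> countPerAuction(List<Long> bidAuctionIds) {
        Map<Long, Long> counts = new HashMap<>();
        for (Long auctionId : bidAuctionIds) {
            counts.merge(auctionId, 1L, Long::sum);
        }
        return counts;
    }

    // Binary combine step: compares two elements and keeps the one with more bids.
    static final BinaryOperator<AuctionCount> MAX_BY_COUNT =
        (left, right) -> right.count > left.count ? right : left;

    /** Returns the auction with the most bids, or empty if there were no bids. */
    static Optional<AuctionCount> hottest(List<Long> bidAuctionIds) {
        return countPerAuction(bidAuctionIds).entrySet().stream()
            .map(entry -> new AuctionCount(entry.getKey(), entry.getValue()))
            .reduce(MAX_BY_COUNT);
    }
}
```

A pairwise comparison is enough here because taking a maximum is associative, so partial results can be combined in any order.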
...
Windows with large side effects on firing
ParDo to key events by their shardId (number of shards is a config item)
Apply fixed windows with composite triggering that fires when each sub-trigger (executed in order) fires
  Repeatedly
    after at least maxLogEvents in pane
    or finally when the watermark passes the end of the window
  Repeatedly
    after at least maxLogEvents in pane
    or processing time passes the first element in pane + lateDelay
With allowedLateness of 1 day (so that any late data will stall the pipeline and be noticeable)
GroupByKey to group events by shardId
ParDo to construct the outputStreams (fileNames contain shardId) and encode each event to that outputStream + form pairs with key = null key and value = outputFile (represents a fileName with various added information)
Apply fixed windows with the default trigger and lateness of 1 day to clear the complex triggering
GroupByKey all outputFiles together (they have the same key) to have one file per window
ParDo to write all the lines to files in Google Cloud Storage
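The sharding and count-based firing described above can be approximated for one window in plain Java. This is a simplified sketch under assumptions: events are represented by their ids, the shard is taken as the id modulo the number of shards, a pane fires as soon as it holds exactly `maxLogEvents` events, and the end of the input plays the role of the watermark passing the end of the window. Processing-time firing, lateness, and file output are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of per-shard panes within one window; not the Beam API. */
final class Query10Sketch {
    /** Hypothetical shard assignment: event id modulo the configured number of shards. */
    static int shardOf(long eventId, int numShards) {
        return (int) Math.floorMod(eventId, (long) numShards);
    }

    /** Groups the event ids of one window by shard and cuts each shard into panes. */
    static Map<Integer, List<List<Long>>> panesPerShard(
            List<Long> eventIds, int numShards, int maxLogEvents) {
        Map<Integer, List<List<Long>>> firedPanes = new TreeMap<>();
        Map<Integer, List<Long>> openPanes = new TreeMap<>();
        for (long eventId : eventIds) {
            int shard = shardOf(eventId, numShards);
            List<Long> pane = openPanes.computeIfAbsent(shard, s -> new ArrayList<>());
            pane.add(eventId);
            if (pane.size() >= maxLogEvents) {
                // Early firing: the pane has reached maxLogEvents.
                firedPanes.computeIfAbsent(shard, s -> new ArrayList<>()).add(pane);
                openPanes.remove(shard);
            }
        }
        // End of window: fire whatever is left in each shard's open pane.
        for (Map.Entry<Integer, List<Long>> entry : openPanes.entrySet()) {
            firedPanes.computeIfAbsent(entry.getKey(), s -> new ArrayList<>()).add(entry.getValue());
        }
        return firedPanes;
    }
}
```

Each fired pane corresponds to one batch of events that would be encoded to an output stream for its shard before the per-window regrouping step.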
...
Query 11 (not part of original NexMark): How many bids did a user make in each session in which they were active?
...