Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Nexmark is a suite of queries (pipelines) used to measure performance and non-regression in Beam (see https://beam.apache.org/documentation/sdks/java/nexmark/).

Links to the original Nexmark papers:

Some queries can be very complex. To ease their maintenance, here is a presentation of the architecture along with pseudo-code of the queries:


Components of NexMark

nexmark components (6).pngImage RemovedImage Added

  • Generator:

    • generation of timestamped events (bids, persons, auctions) correlated between each other

  • NexmarkLauncher:

    • creates sources that use the generator

    • queries pipelines launching, monitoring

  • Output metrics:

    • Each query includes ParDos to update metrics

    • execution time, processing event rate, number of results,                      but also invalid auctions/bids, …

  • Modes:

    • Batch mode: test data is finite and uses a BoundedSource

    • Streaming mode: test data is finite but uses an UnboundedSource to trigger streaming mode in runners

...

Query 1: What are the bid values in Euro's? (Currency Conversion)

  • Simple map

    • Filter + ParDo to extract bids out of events

    • ParDo that outputs Bid objects with price converted


Expand
titleSQL Interpretation
SELECT Istream(auction, DOLTOEUR(price), bidder, datetime)
FROM bid [ROWS UNBOUNDED];


Query 2: Find bids with specific auction ids and show their bid price.

  • Illustrates simple filter

    • Filter + ParDo to extractbids out of events

    • Filter to keepbids with correct auctionId

    • ParDo that outputs AuctionPrice(auction, price) objects


Expand
titleSQL Interpretation
SELECT Rstream(auction, price)
FROM Bid [NOW]
WHERE
   auction = 1007
   OR auction = 1020
   OR auction = 2001
   OR auction = 2019
   OR auction = 2087;

Query 3: Who is selling in particular US states?

...