You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 5 Current »

  • Syntax
    The syntax of the Upsert statement is similar to the syntax of the Insert statement and it is constructed as follows:

        upsert into dataset datasetname(
            query()
        );

  • Semantics
    An upsert operation couples a delete "if exists" with an insert based on the primary key of the record. The two operations are run atomically but the execution of the query doesn't happen under the same transaction context and a materialized result might be computed before upserting records into the dataset.

Implementation Details
-Compilation time
An upsert statement is compiled exactly as an insert statement with the differences only at the storage side of the produced plan. The logical operators used are the exact ones used in the case of the insert statement and the differences are only in their expected inputs and outputs. The difference is only in the expected input and output schema for the operators since an Upsert operator needs the existing values in addition to the new values.

-Runtime
A new flag has been added to AsterixLSMTreeInsertDeleteOperatorDescriptor to indicate when the operation is an upsert operation. If the flag is set, the created push runtime is a new NodePushable that extends LSMIndexInsertUpdateDeleteOperatorNodePushable and named AsterixLSMPrimaryUpsertOperatorNodePushable. The secondary upsert push runtime AsterixLSMSecondaryUpsertOperatorNodePushable extends LSMIndexInsertUpdateDeleteOperatorNodePushable and the upsert operation has a special commit operator "UpsertCommitRuntime" that extends the existing "CommitRuntime". With these changes, the upsert pipeline looks as follows:

  • The primary Upsert operator perform the following tasks:
  1. Lock on key
  2. Search using key
  3. Delete if found
  4. Insert new record
  5. Produce tuples that consist of the existing record "or NULL" and the new record
  • The secondary Upsert operator perform the following tasks:
  1. If the existing secondary key equals to the new secondary key, it is a NO OP.
  2. If a previous secondary key exists, it deletes the existing secondary key-primary key pair from the secondary index.
  3. If a new secondary key exists, it inserts the new secondary key-primary key pair from the secondary index.

    Example:
    data type: [int(pk),int].
    existing data:
    single record {1,5}

    a) upsert {1,5}
    In this case, the secondary index upsert the old secondary key == the new secondary key. Hence, it will be no op.
    b) upsert{1,2}
    In this case, the secondary keys are not equal. Hence, we check:
    (previous value exists(not null) which was {5,1}, so we delete it)
    (new value is {2,1} so we insert it).
    c) upsert {2,2}
    In this case, there is no old value, we will only insert the key {2,2} in secondary index.
  • The Upsert commit operator commits the transaction creating Upsert logs if an old value was deleted in the transaction and an Insert log otherwise.
  • When an Upsert log is flushed to disk, the operation counter is decremented twice since it performed two operations.

 

 

  • No labels