Documentation

Home > Documentation > MERGE Statement

MERGE Statement

Syntax

MERGE stream_expression_1, stream_expression_2
  USING field_identifier_m [GROUP BY field_identifier_g]

Substitutable Fields

stream_expression_n:

Either the unique identifier (name) of a stream or a parenthesized SELECT statement. If a stream has the desired output schema (including field names, types, and order), the stream identifier may be used directly. If it is necessary to rename, reorder, or change the type of a field, then the SELECT statement must be used.

field_identifier_m:

The tuple field used to merge the incoming streams into an ordered output stream.

field_identifier_g:

The tuple field used to group tuples prior to merging.

Discussion

MERGE takes two input streams with compatible schemas and produces one output stream with all the tuples from the original streams. MERGE has a strict guarantee of output order based on some user-specified expression over all tuples. Incoming tuples are pre-sorted in increasing order of some tuple field.

Optionally, tuples can also be grouped by one or more input fields. Any groups that you define are ordered independently when they are merged. MERGE stores arriving tuples in a buffer for each input port. It emits tuples when a new tuple's value (based on the field that was selected to merge on) is greater than or equal to the oldest tuple in the other buffer. If the group option has been selected, the tuples must also evaluate to the same group.

The stream_expression_n entries may be either the names of streams with identical tuple structure or a parenthesized SELECT statement. In a parenthesized SELECT statement, use the target list and AS keyword to enable merging of streams with incompatible tuple structure. The USING keyword identifies the merge field; GROUP BY identifies the grouping specification. These cannot be the same tuple fields.

MERGE generates a stream and may be used anywhere a stream expression is acceptable. Alternatively, the output could be "captured" in a stream, as illustrated in the following code fragments.

CREATE [OUTPUT] STREAM stream_identifier;
MERGE ... INTO stream_identifier;
CREATE [OUTPUT] STREAM stream_identifier AS
  MERGE ...;

Or, for an OUTPUT STREAM

MERGE ... => CREATE OUTPUT STREAM stream_identifier;