Documentation

Home > Documentation > VJOIN Statement

VJOIN Statement

Syntax

VJOIN target_list
  FROM {stream_identifier_1 | stream_expression_1} [[AS] stream_alias_1],
       {stream_identifier_2 | stream_expression_2} [[AS] stream_alias_2]
  WHERE join_field_identifier_2
    BETWEEN join_field_identifier_1 - value_1
            AND join_field_identifier_1 + value_2
    [AND predicate]
  [TIMEOUT timeout_value]

Substitutable Fields

target_list:

One or more entries, separated by commas, of the format target_list_entry.

target_list_entry:

A value, of the format scalar_expression [AS output_field_identifier], to be included in the result set returned by the statement. Values may be obtained from a tuple field or from a simple function. Optionally, the name for a value may be modified through an AS clause.

scalar_expression:

A value that will be included in the output tuple. Fields from either input tuple may be included in the output tuple. If desired, the output field may be renamed using the AS clause.

stream_identifier:

The streams containing the input tuples.

stream_expression:

A StreamSQL statement that produces a stream. The statement must be enclosed within parentheses.

stream_alias:

A unique identifier or name (alias) used only within this statement.

join_field_identifier:

The fields to be compared as the basis of the join; join_field_identifier_1 is name of the join field contained in the first stream while join_field_identifier_2 is name of the join field contained in the second stream. These fields may have different names but they must be of the same type. For tuples in the two input streams to be joined, the value of the join field in the second input stream must be within the range around the value of the join field in the first input stream [Low end of range = (value of join field in stream 1) - value_1; High end of range = (value of join field in stream 1) + value_2].

value:

value_1 sets the lower end of the join range; value_2 sets the upper value of the join range. These entries, which are the same data type as the join_field_identifier, may be equivalent or different values.

predicate:

A conditional statement that establishes additional constraints for the join, for example, stream_identifier_1.field_identifier==stream_identifier_2.field_identifier.

timeout_value:

An integer value, in seconds, after which tuples will be flushed from the buffer. The TIMEOUT functionality is not enabled until at least two tuples arrive on a stream.

Discussion

With a value based join, matches will only occur if the tuples satisfy the predicate condition and the values of their ordering fields are within a specified range. For the functionality to execute correctly, the value of the ordering field in each stream must be increasing; undefined behavior will result if the value of the ordering field in either stream decreases.

All incoming tuples are buffered and a tuple will remain in the buffer as long as its ordering field value remains within the join range of each newly arriving tuple on the other inputs stream. If the value of the ordering field in a newly arriving tuple causes the join range to be repositioned, then tuples whose ordering field values are outside of the repositioned range will be dropped from the operator.

VJOIN generates a stream and may be used anywhere a stream expression is acceptable. Alternatively, the output could be "captured" in a stream, as illustrated in the following code fragments.

CREATE [OUTPUT] STREAM stream_identifier;
VJOIN ... INTO stream_identifier;
CREATE [OUTPUT] STREAM stream_identifier AS
  VJOIN ...;

Or, for an OUTPUT STREAM

VJOIN ... => CREATE OUTPUT STREAM stream_identifier;