Categories:: Aggregate functions (Cardinality Estimation) , Window function syntax and usage

HLL_COMBINE¶

Combines (merges) input states into a single output state.

This allows scenarios where HLL_ACCUMULATE is run over horizontal partitions of the same table, producing an algorithm state for each table partition. These states can later be combined using HLL_COMBINE, producing the same output state as a single run of HLL_ACCUMULATE over the entire table.

See also:: HLL , HLL_ACCUMULATE , HLL_ESTIMATE

Syntax¶

HLL_COMBINE([DISTINCT] state)

Copy

Arguments¶

state: An expression that contains state information generated by a call to HLL_ACCUMULATE.

Usage notes¶

DISTINCT is supported syntactically, but has no effect.
The output of this function is not fully deterministic. Running this function on the same inputs might return different results at different times. The differences are typically small and are consistent with the fact that the HLL_* functions are approximation functions.

Examples¶

This example shows how to use the three related functions HLL_ACCUMULATE, HLL_ESTIMATE, and HLL_COMBINE.

Create a simple table and data:

-- Create a sequence to use to generate values for the table.
CREATE OR REPLACE SEQUENCE seq92;
CREATE OR REPLACE TABLE sequence_demo (c1 INTEGER DEFAULT seq92.nextval, dummy SMALLINT);
INSERT INTO sequence_demo (dummy) VALUES (0);

-- Double the number of rows a few times, until there are 8 rows:
INSERT INTO sequence_demo (dummy) SELECT dummy FROM sequence_demo;
INSERT INTO sequence_demo (dummy) SELECT dummy FROM sequence_demo;
INSERT INTO sequence_demo (dummy) SELECT dummy FROM sequence_demo;

Copy

Create a table that contains the “state” that represents the current approximate cardinality information for the table named sequence_demo:

CREATE OR REPLACE TABLE resultstate1 AS (
     SELECT hll_accumulate(c1) AS rs1
        FROM sequence_demo);

Copy

Now create a second table and add data. (In a more realistic situation, the user could have loaded more data into the first table and divided the data into non-overlapping sets based on the time that the data was loaded.)

CREATE OR REPLACE TABLE test_table2 (c1 INTEGER);
-- Insert data.
INSERT INTO test_table2 (c1) SELECT c1 + 4 FROM sequence_demo;

Copy

Get the “state” information for just the new data.

CREATE OR REPLACE TABLE resultstate2 AS 
  (SELECT hll_accumulate(c1) AS rs1 
     FROM test_table2);

Copy

Combine the “state” information for the two batches of rows:

CREATE OR REPLACE TABLE combined_resultstate (c1) AS 
  SELECT hll_combine(rs1) AS apc1
    FROM (
        SELECT rs1 FROM resultstate1
        UNION ALL
        SELECT rs1 FROM resultstate2
      )
      ;

Copy

Get the approximate cardinality of the combined set of rows:

SELECT hll_estimate(c1) FROM combined_resultstate;

Copy

Output:

+------------------+
| HLL_ESTIMATE(C1) |
|------------------|
|               12 |
+------------------+

Copy