Step 4. Update Data¶
So far in this tutorial, you have loaded a batch of JSON event data into a RAW_SOURCE
table, queried it, and created a EVENTS
table composed of event key values. In the scenario we described earlier in this tutorial, the JSON data flows from multiple sources through a web endpoint to your S3 bucket. You can write a script to pull new data files into the RAW_SOURCE
table. How do you then insert only new event data into the EVENTS
table?
There are numerous ways to differentiate your JSON data in the RAW_SOURCE
table and the flattened data in the EVENTS
table. This section suggests two options:
Comparison of Primary Key Data¶
Examine your JSON data for any values that are naturally unique and would be good candidates for a primary key. In our current example, let’s suppose that the combination of
src:device_type
andvalue:rv
would make a good composite primary key. These two JSON keys correspond to theDEVICE_TYPE
andRV
columns in theEVENTS
table.Note
Snowflake does not enforce the primary key constraint. Rather, the constraint serves as metadata identifying the natural key in the Information Schema.
Add the primary key constraint to the
EVENTS
table:ALTER TABLE events ADD CONSTRAINT pk_DeviceType PRIMARY KEY (device_type, rv);
Insert a new JSON event record into the
RAW_SOURCE
table:insert into raw_source select PARSE_JSON ('{ "device_type": "cell_phone", "events": [ { "f": 79, "rv": "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22", "t": 5769784730576, "v": { "ACHZ": 75846, "ACV": 098355, "DCA": 789, "DCV": 62287, "ENJR": 2234, "ERRS": 578, "MXEC": 999, "TMPI": 9 }, "vd": 54, "z": 1437644222811 } ], "version": 3.2 }');
Insert the new record in the
RAW_SOURCE
table into theEVENTS
table based on a comparison of the primary key values:insert into events select src:device_type::string , src:version::string , value:f::number , value:rv::variant , value:t::number , value:v.ACHZ::number , value:v.ACV::number , value:v.DCA::number , value:v.DCV::number , value:v.ENJR::number , value:v.ERRS::number , value:v.MXEC::number , value:v.TMPI::number , value:vd::number , value:z::number from raw_source , lateral flatten( input => src:events ) where not exists (select 'x' from events where events.device_type = src:device_type and events.rv = value:rv);
Querying the
EVENTS
table shows the added row:select * from EVENTS; +-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------+ | DEVICE_TYPE | VERSION | F | RV | T | ACHZ | ACV | DCA | DCV | ENJR | ERRS | MXEC | TMPI | VD | Z | |-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------| | server | 2.6 | 83 | "15219.64,783.63,48674.48,84679.52,27499.78,2178.83,0.42,74900.19" | 1437560931139 | 42869 | 709489 | 232 | 62287 | 2599 | 205 | 487 | 9 | 54 | 1437644222811 | | server | 2.6 | 1000083 | "8070.52,54470.71,85331.27,9.10,70825.85,65191.82,46564.53,29422.22" | 1437036965027 | 6953 | 346795 | 250 | 46066 | 9033 | 615 | 0 | 112 | 626 | 1437660796958 | | cell_phone | 3.2 | 79 | "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22" | 5769784730576 | 75846 | 98355 | 789 | 62287 | 2234 | 578 | 999 | 9 | 54 | 1437644222811 | +-------------+---------+---------+----------------------------------------------------------------------+---------------+-------+--------+-----+-------+------+------+------+------+-----+---------------+
Comparison of All Keys/Columns¶
If your JSON data doesn’t include any key values that are natural primary key candidates, you could compare all repeating JSON keys in the RAW_SOURCE
table with the corresponding column values in the EVENTS
table.
No changes to your existing EVENTS
table are required.
Insert a new JSON event record into the
RAW_SOURCE
table:insert into raw_source select parse_json ('{ "device_type": "web_browser", "events": [ { "f": 79, "rv": "122375.99,744.89,386.99,12.45,78.08,43.7,9.22,8765.43", "t": 5769784730576, "v": { "ACHZ": 768436, "ACV": 9475, "DCA": 94835, "DCV": 88845, "ENJR": 8754, "ERRS": 567, "MXEC": 823, "TMPI": 0 }, "vd": 55, "z": 8745598047355 } ], "version": 8.7 }');
Insert the new record in the
RAW_SOURCE
table into theEVENTS
table based on a comparison of all repeating key values:insert into events select src:device_type::string , src:version::string , value:f::number , value:rv::variant , value:t::number , value:v.ACHZ::number , value:v.ACV::number , value:v.DCA::number , value:v.DCV::number , value:v.ENJR::number , value:v.ERRS::number , value:v.MXEC::number , value:v.TMPI::number , value:vd::number , value:z::number from raw_source , lateral flatten( input => src:events ) where not exists (select 'x' from events where events.device_type = src:device_type and events.version = src:version and events.f = value:f and events.rv = value:rv and events.t = value:t and events.achz = value:v.ACHZ and events.acv = value:v.ACV and events.dca = value:v.DCA and events.dcv = value:v.DCV and events.enjr = value:v.ENJR and events.errs = value:v.ERRS and events.mxec = value:v.MXEC and events.tmpi = value:v.TMPI and events.vd = value:vd and events.z = value:z);
Querying the
EVENTS
table shows the added row:select * from EVENTS; +-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------+ | DEVICE_TYPE | VERSION | F | RV | T | ACHZ | ACV | DCA | DCV | ENJR | ERRS | MXEC | TMPI | VD | Z | |-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------| | server | 2.6 | 83 | "15219.64,783.63,48674.48,84679.52,27499.78,2178.83,0.42,74900.19" | 1437560931139 | 42869 | 709489 | 232 | 62287 | 2599 | 205 | 487 | 9 | 54 | 1437644222811 | | server | 2.6 | 1000083 | "8070.52,54470.71,85331.27,9.10,70825.85,65191.82,46564.53,29422.22" | 1437036965027 | 6953 | 346795 | 250 | 46066 | 9033 | 615 | 0 | 112 | 626 | 1437660796958 | | cell_phone | 3.2 | 79 | "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22" | 5769784730576 | 75846 | 98355 | 789 | 62287 | 2234 | 578 | 999 | 9 | 54 | 1437644222811 | | web_browser | 8.7 | 79 | "122375.99,744.89,386.99,12.45,78.08,43.7,9.22,8765.43" | 5769784730576 | 768436 | 9475 | 94835 | 88845 | 8754 | 567 | 823 | 0 | 55 | 8745598047355 | +-------------+---------+---------+----------------------------------------------------------------------+---------------+--------+--------+-------+-------+------+------+------+------+-----+---------------+
Next: Step 5. Congratulations!