Data Generation Functions¶
Data Generation functions allow you to generate data. Snowflake supports two types of data generation functions:
Random, which can be useful for testing purposes.
These functions produce a random value each time. Each value is independent of the other values generated by other calls to the function. The underlying algorithm produces pseudorandom values, and thus the values are not truly random or independent, but without knowing the algorithm, the values are essentially unpredictable, usually evenly distributed (if the sample size is large), and pseudoindependent of each other.
Controlled distribution, which can be useful for providing unique ID numbers for records that do not already have unique identifiers.
These functions produce values that are not independent. For example, the NORMAL function returns values that have an approximately “normal” (bellshaped) distribution based on a specified mean and standard deviation. Thus, each new value generated is at least indirectly influenced by previously generated values as the function tries to maintain the specified distribution. As another example, the SEQ family of functions return a sequence of values.
Note
The UNIFORM function is listed as a controlleddistribution function, but is intended to generate evenlydistributed values. In other words, it acts as though it’s a “random” function, but we refer to it as a controlled distribution function because the distribution is explicitly specified and because you can choose a datageneration function that produces nonuniform values over a large sample size.
List of Functions¶
Function Name 
Notes 

Random 

Returns a pseudorandom 64bit integer. 

Returns a random string of specified length. 

Returns a random RFC 4122compliant UUID as a formatted string. 

Controlled Distribution 

Returns a normaldistributed floating point number, with specified mean and standard deviation. 

Returns a uniformly random number within the specified range. 

Returns a Zipfdistributed integer. 

Returns a sequence of monotonically increasing integers. 
Usage Notes¶
Random distribution functions are deterministic.
Each random distribution function takes a generator expression,
gen
, as its last argument. The generator expression,gen
, can be constant or variable:If constant, then the result of the random distribution function is constant (unless there are other, variable arguments, which is currently only supported for the RANDSTR function).
If variable, then the result of the random distribution function is variable.
Generator expressions must be a type 64bit integer, although implicit conversions are allowed. Any expression that can be converted into a 64bit integer can be used as a generator expression.
The randomness of any random distribution function is directly linked to the randomness of its generator expression. For most practical purposes, the RANDOM data generation function is the best choice for randomlygenerated integer values.
Sequences generated by data generation functions are not guaranteed to be ordered and gapfree. This is because the numbers may be generated in parallel, in an unsynchronized fashion.
For more details about sequences in Snowflake, see Using Sequences.