snowflake.snowpark.DataFrameStatFunctions.sampleBy
- DataFrameStatFunctions.sampleBy(col: ColumnOrName, fractions: Dict[LiteralType, float]) → DataFrame
Returns a DataFrame containing a stratified sample without replacement, based on a dict that specifies the fraction for each stratum.
Example:
>>> df = session.create_dataframe([("Bob", 17), ("Alice", 10), ("Nico", 8), ("Bob", 12)], schema=["name", "age"])
>>> fractions = {"Bob": 0.5, "Nico": 1.0}
>>> sample_df = df.stat.sample_by("name", fractions)  # non-deterministic result
- Parameters:
col – The name of the column that defines the strata.
fractions – A dict that specifies the fraction to use for the sample for each stratum. If a stratum is not specified in the dict, the method uses 0 as the fraction.
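The rule described above, a per-stratum fraction with a default of 0 for strata missing from the dict, can be illustrated locally without a Snowflake session. The sketch below is not Snowpark's implementation: `sample_by_local` is a hypothetical helper, and it assumes each row is kept independently with probability equal to its stratum's fraction. The real sampling runs inside Snowflake and may behave differently in detail.

```python
import random
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Row = Tuple[Hashable, ...]


def sample_by_local(
    rows: Sequence[Row],
    key_index: int,
    fractions: Dict[Hashable, float],
    seed: Optional[int] = None,
) -> List[Row]:
    """Hypothetical local analogue of a stratified sample without replacement.

    Assumption: each row is kept independently with probability equal to
    the fraction configured for its stratum.
    """
    rng = random.Random(seed)
    sampled = []
    for row in rows:
        # Strata that are absent from `fractions` fall back to 0, so none
        # of their rows are ever selected.
        fraction = fractions.get(row[key_index], 0.0)
        # rng.random() is in [0.0, 1.0), so a fraction of 1.0 always keeps
        # the row and a fraction of 0.0 never does.
        if rng.random() < fraction:
            sampled.append(row)
    return sampled


rows = [("Bob", 17), ("Alice", 10), ("Nico", 8), ("Bob", 12)]
fractions = {"Bob": 0.5, "Nico": 1.0}
sample = sample_by_local(rows, key_index=0, fractions=fractions, seed=0)
```

With these inputs, the "Nico" row is always present in `sample` (fraction 1.0), "Alice" rows are never present (no entry in the dict, so the fraction is 0), and each "Bob" row is included or excluded at random. Because no row is visited twice, no row can appear more than once in the result.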