sdf_rhyper

Generate random samples from a hypergeometric distribution

Description

Generator method for creating a single-column Spark dataframe of i.i.d. samples from a hypergeometric distribution.

Usage

sdf_rhyper(
  sc,
  nn,
  m,
  n,
  k,
  num_partitions = NULL,
  seed = NULL,
  output_col = "x"
)

Arguments

Argument	Description
sc	A Spark connection.
nn	Sample size (the number of samples to generate).
m	The number of successes among the population.
n	The number of failures among the population.
k	The number of draws.
num_partitions	Number of partitions in the resulting Spark dataframe (default: default parallelism of the Spark cluster).
seed	Random seed (default: a random long integer).
output_col	Name of the output column containing sample values (default: "x").
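
As a minimal sketch of how the arguments fit together, the following assumes that sdf_rhyper follows the parameterization of base R's rhyper(), whose argument names (nn, m, n, k) it shares. That correspondence is an assumption here, not something stated on this page. The base R call runs without Spark; the commented-out lines show the analogous sdf_rhyper call and assume sparklyr and a local Spark installation are available.

```r
# Base R analogue, assuming the same parameterization: an urn holds
# m = 10 successes and n = 7 failures, and each sample counts the
# successes among k = 8 draws made without replacement.
set.seed(42)
local_samples <- rhyper(nn = 1000, m = 10, n = 7, k = 8)

# Each value must lie between max(0, k - n) = 1 and min(k, m) = 8.
range(local_samples)

# Analogous Spark call (requires sparklyr and a Spark installation):
# library(sparklyr)
# sc <- spark_connect(master = "local")
# sdf <- sdf_rhyper(sc, nn = 1000, m = 10, n = 7, k = 8, seed = 42)
# sdf_collect(sdf)  # local data frame; column named "x" by default
# spark_disconnect(sc)
```

With only 7 failures in the population, 8 draws always include at least one success, which is why the lower bound in this example is 1 rather than 0.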
