Apply an R Function in Spark
Applies an R function to a Spark object (typically, a Spark DataFrame).
spark_apply(x, f, columns = colnames(x), memory = TRUE, group_by = NULL,
  packages = TRUE, ...)
Arguments
| x | An object (usually a |
| f | A function that transforms a data frame partition into a data frame.
The function |
| columns | A vector of column names or a named vector of column types for the transformed object. Defaults to the names from the original object and adds indexed column names when not enough columns are specified. |
| memory | Boolean; should the table be cached into memory? |
| group_by | Column name used to group the data frame partitions. |
| packages | Boolean to distribute For offline clusters where |
| ... | Optional arguments; currently unused. |
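Examples
The following is a minimal sketch of how the arguments above might fit together, not a definitive recipe. It assumes a local Spark installation managed by sparklyr; the helper name `scale_partition`, the table name `"cars"`, and the derived column `mpg_x10` are illustrative only. The partition function is first checked on a plain R data frame, and the Spark calls run only when sparklyr and a local Spark installation are found.

```r
# A partition-level transformation: takes an R data frame, returns an R data frame.
# (scale_partition is a hypothetical helper used only for this illustration.)
scale_partition <- function(df) {
  df$mpg_x10 <- df$mpg * 10
  df
}

# Check the function locally on an ordinary data frame before sending it to Spark.
local_result <- scale_partition(head(mtcars[, c("mpg", "cyl")], 3))

# The Spark part runs only if sparklyr and a local Spark installation are available.
if (requireNamespace("sparklyr", quietly = TRUE) &&
    nrow(sparklyr::spark_installed_versions()) > 0) {
  sc <- sparklyr::spark_connect(master = "local")
  cars_tbl <- sparklyr::sdf_copy_to(sc, mtcars[, c("mpg", "cyl")], "cars",
                                    overwrite = TRUE)

  # Apply the function to each partition and name the resulting columns.
  scaled <- sparklyr::spark_apply(cars_tbl, scale_partition,
                                  columns = c("mpg", "cyl", "mpg_x10"))
  print(scaled)

  # Apply a function once per group of rows sharing the same value of cyl.
  per_group <- sparklyr::spark_apply(
    cars_tbl,
    function(df) data.frame(mean_mpg = mean(df$mpg)),
    group_by = "cyl"
  )
  print(per_group)

  sparklyr::spark_disconnect(sc)
}
```

Testing `f` on a local data frame first tends to make errors easier to diagnose than debugging them on Spark workers.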