# Function Reference

## Spark Operations

- Read Spark Configuration
- Manage Spark Connections
- Find a given Spark installation by version
- View Entries in the Spark Log
- Open the Spark web interface
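Taken together, these operations cover a typical session life cycle: configure, connect, inspect, disconnect. A minimal sketch, assuming this index describes sparklyr's R API (the `spark_config()`, `spark_connect()`, `spark_log()`, `spark_web()`, and `spark_disconnect()` names are an assumption, as the index itself does not name functions; a local Spark installation is also assumed):

```r
library(sparklyr)

# Optionally tweak configuration before connecting
config <- spark_config()
config$spark.executor.memory <- "2g"  # illustrative setting, not required

# Connect to a local Spark instance
sc <- spark_connect(master = "local", config = config)

# Inspect the most recent log entries and open the Spark web UI
spark_log(sc, n = 10)
spark_web(sc)

# Close the connection when done
spark_disconnect(sc)
```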
## Spark Data

- Read a CSV file into a Spark DataFrame
- Read from a JDBC connection into a Spark DataFrame
- Read a JSON file into a Spark DataFrame
- Read a Parquet file into a Spark DataFrame
- Read from a generic source into a Spark DataFrame
- Read from a Spark table into a Spark DataFrame
- Write a Spark DataFrame to a CSV file
- Write a Spark DataFrame to a JDBC table
- Write a Spark DataFrame to a JSON file
- Write a Spark DataFrame to a Parquet file
- Write a Spark DataFrame to a generic source
- Write a Spark DataFrame to a Spark table
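The read/write pairs share a common shape: readers take a connection, a table name, and a path, while writers take a DataFrame and a destination. A sketch assuming sparklyr's `spark_read_csv()` and `spark_write_parquet()` (the function names and file paths are illustrative assumptions):

```r
library(sparklyr)
sc <- spark_connect(master = "local")

# Read a CSV file into a Spark DataFrame, registered as table "flights"
flights_tbl <- spark_read_csv(sc, name = "flights", path = "data/flights.csv")

# ... transform as needed, then persist the result as Parquet
spark_write_parquet(flights_tbl, path = "data/flights_parquet")

spark_disconnect(sc)
```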
## Spark Tables

- Show the database list
- Cache a Spark Table
- Use a specific database
- Uncache a Spark Table
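A sketch of the table helpers, assuming sparklyr's `src_databases()`, `tbl_cache()`, and `tbl_uncache()` (assumed names; caching pins a table in cluster memory so repeated queries avoid re-reading from disk):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")

# Copy a local data frame into Spark as table "mtcars"
mtcars_tbl <- copy_to(sc, mtcars)

# List the available databases, then pin the table in memory
src_databases(sc)
tbl_cache(sc, "mtcars")

# Release the cached table when it is no longer needed
tbl_uncache(sc, "mtcars")

spark_disconnect(sc)
```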
## Spark DataFrames

- Create a DataFrame along a given object
- Bind multiple Spark DataFrames by row and column
- Broadcast hint for a Spark DataFrame
- Checkpoint a Spark DataFrame
- Coalesce a Spark DataFrame
- Copy an Object into Spark
- Create a DataFrame of a given length
- Mutate a Spark DataFrame
- Get the number of partitions of a Spark DataFrame
- Partition a Spark DataFrame
- Pivot a Spark DataFrame
- Model Predictions with Spark DataFrames
- Read a Column from a Spark DataFrame
- Register a Spark DataFrame
- Repartition a Spark DataFrame
- Model Residuals
- Randomly Sample Rows from a Spark DataFrame
- Separate a Vector Column into Scalar Columns
- Create a DataFrame for a numeric range
- Sort a Spark DataFrame
- Add a Unique ID Column to a Spark DataFrame
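These operations act directly on Spark DataFrames rather than through SQL. A sketch chaining a few of the entries above, assuming sparklyr's `sdf_copy_to()`, `sdf_sample()`, `sdf_sort()`, `sdf_with_unique_id()`, and `sdf_num_partitions()` (assumed names):

```r
library(sparklyr)
sc <- spark_connect(master = "local")

# Copy an R object into Spark
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_spark", overwrite = TRUE)

# Randomly sample rows, sort, and add a unique ID column
# (sparklyr replaces "." in column names with "_", so Sepal.Length
#  becomes Sepal_Length on the Spark side)
sampled <- sdf_sample(iris_tbl, fraction = 0.5, replacement = FALSE, seed = 42)
sorted  <- sdf_sort(sampled, "Sepal_Length")
with_id <- sdf_with_unique_id(sorted, id = "row_id")

# Inspect how the result is partitioned
sdf_num_partitions(with_id)

spark_disconnect(sc)
```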
## Spark Machine Learning

- Spark ML -- Alternating Least Squares (ALS) matrix factorization
- Spark ML -- Decision Trees
- Spark ML -- Generalized Linear Regression
- Spark ML -- Gradient-Boosted Tree
- Spark ML -- K-Means Clustering
- Spark ML -- Latent Dirichlet Allocation
- Spark ML -- Linear Regression
- Spark ML -- Logistic Regression
- Extract data associated with a Spark ML model
- Spark ML -- Multilayer Perceptron
- Spark ML -- Naive-Bayes
- Spark ML -- One vs Rest
- Spark ML -- Principal Components Analysis
- Spark ML -- Random Forests
- Spark ML -- Survival Regression
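Each of these routines fits a Spark MLlib model against a Spark DataFrame, so the data never leaves the cluster. A sketch using linear regression, assuming sparklyr's `ml_linear_regression()` (an assumed name; note that the interface has varied across sparklyr releases, with older versions taking `response`/`features` arguments and newer ones accepting an R formula):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Fit a linear regression of fuel efficiency on weight and cylinders
fit <- ml_linear_regression(mtcars_tbl,
                            response = "mpg",
                            features = c("wt", "cyl"))

# Standard R generics work on the fitted model
summary(fit)

spark_disconnect(sc)
```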
## Spark Feature Transformers

- Feature Transformation -- Binarizer
- Feature Transformation -- Bucketizer
- Feature Transformation -- CountVectorizer
- Feature Transformation -- Discrete Cosine Transform (DCT)
- Feature Transformation -- ElementwiseProduct
- Feature Transformation -- IndexToString
- Feature Transformation -- OneHotEncoder
- Feature Transformation -- QuantileDiscretizer
- Feature Transformation -- SQLTransformer
- Feature Transformation -- StringIndexer
- Feature Transformation -- VectorAssembler
- Feature Transformation -- Tokenizer
- Feature Transformation -- RegexTokenizer
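Feature transformers map one column of a Spark DataFrame to another and are designed to chain in a pipeline before a model fit. A sketch using the binarizer, assuming sparklyr's `ft_binarizer()` (an assumed name; argument spellings have also changed across releases, e.g. `input.col`/`output.col` in older sparklyr versus `input_col`/`output_col` later):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Binarize horsepower around a threshold, producing a 0/1 column
mtcars_tbl %>%
  ft_binarizer(input_col = "hp", output_col = "big_hp", threshold = 100)

spark_disconnect(sc)
```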
## Spark Machine Learning Utilities

- Spark ML -- Binary Classification Evaluator
- Spark ML -- Classification Evaluator
- Create Dummy Variables
- Create an ML Model Object
- Spark ML -- Feature Importance for Tree Models
- Options for Spark ML Routines
- Prepare a Spark DataFrame for Spark ML Routines
- Pre-process the Inputs to a Spark ML Routine
- Save / Load a Spark ML Model Fit
## Extensions

- Compile Scala sources into a Java Archive (JAR)
- Read configuration values for a connection
- Download default Scala compilers
- Discover the Scala Compiler
- Access the Spark API
- Runtime configuration interface for Hive
- Invoke a Method on a JVM Object
- Register a Package that Implements a Spark Extension
- Define a Spark Compilation Specification
- Default Compilation Specification for Spark Extensions
- Retrieve the Spark Connection Associated with an R Object
- Runtime configuration interface for Spark
- Retrieve a Spark DataFrame
- Define a Spark dependency
- Set the SPARK_HOME environment variable
- Retrieve a Spark JVM Object Reference
- Get the Spark Version Associated with a Spark Connection
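The extension API drops below the DataFrame abstraction and exposes the underlying JVM objects, so arbitrary Spark Java/Scala methods can be called from R. A sketch assuming sparklyr's `spark_context()`, `invoke()`, and `spark_dataframe()` (assumed names; the methods invoked are standard Spark API methods):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")

# Retrieve the underlying SparkContext and invoke a JVM method on it
ctx <- spark_context(sc)
invoke(ctx, "version")  # returns the Spark version string

# Drop from a dplyr tbl down to the underlying Spark DataFrame
# reference and call its count() method directly
mtcars_tbl <- copy_to(sc, mtcars)
sdf <- spark_dataframe(mtcars_tbl)
invoke(sdf, "count")

spark_disconnect(sc)
```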
## Distributed Computing

- Apply an R Function in Spark
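This applies a user-supplied R function to each partition of a Spark DataFrame, with the work distributed across the cluster. A sketch assuming sparklyr's `spark_apply()` (an assumed name; the supplied function receives a partition as an R data frame and must return a data frame):

```r
library(sparklyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Run an arbitrary R transformation on each partition in the cluster;
# the derived column here is purely illustrative
spark_apply(mtcars_tbl, function(df) {
  df$kml <- df$mpg * 0.425  # hypothetical mpg -> km/l conversion
  df
})

spark_disconnect(sc)
```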