# Function Reference

## Spark Operations

- Read Spark Configuration
- Manage Spark Connections
- Find a given Spark installation by version
- View Entries in the Spark Log
- Open the Spark web interface
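Taken together, these operations cover a typical session life cycle: configure, connect, inspect, disconnect. A minimal sketch, assuming this index describes sparklyr's R API (the `spark_config()`, `spark_connect()`, `spark_log()`, `spark_web()`, and `spark_disconnect()` names are an assumption, as the index itself does not name functions; a local Spark installation is also assumed):

```r
library(sparklyr)

# Optionally tweak configuration before connecting
config <- spark_config()
config$spark.executor.memory <- "2g"  # illustrative setting, not required

# Connect to a local Spark instance
sc <- spark_connect(master = "local", config = config)

# Inspect the most recent log entries and open the Spark web UI
spark_log(sc, n = 10)
spark_web(sc)

# Close the connection when done
spark_disconnect(sc)
```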
## Spark Data

- Read a CSV file into a Spark DataFrame
- Read from a JDBC connection into a Spark DataFrame
- Read a JSON file into a Spark DataFrame
- Read a Parquet file into a Spark DataFrame
- Read from a generic source into a Spark DataFrame
- Read from a Spark table into a Spark DataFrame
- Write a Spark DataFrame to a CSV file
- Write a Spark DataFrame to a JDBC table
- Write a Spark DataFrame to a JSON file
- Write a Spark DataFrame to a Parquet file
- Write a Spark DataFrame to a generic source
- Write a Spark DataFrame to a Spark table
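The read/write pairs share a common shape: readers take a connection, a table name, and a path, while writers take a DataFrame and a destination. A sketch assuming sparklyr's `spark_read_csv()` and `spark_write_parquet()` (the function names and file paths are illustrative assumptions):

```r
library(sparklyr)
sc <- spark_connect(master = "local")

# Read a CSV file into a Spark DataFrame, registered as table "flights"
flights_tbl <- spark_read_csv(sc, name = "flights", path = "data/flights.csv")

# ... transform as needed, then persist the result as Parquet
spark_write_parquet(flights_tbl, path = "data/flights_parquet")

spark_disconnect(sc)
```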
## Spark Tables

- Show the database list
- Cache a Spark Table
- Use a specific database
- Uncache a Spark Table
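A sketch of the table helpers, assuming sparklyr's `src_databases()`, `tbl_cache()`, and `tbl_uncache()` (assumed names; caching pins a table in cluster memory so repeated queries avoid re-reading from disk):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")

# Copy a local data frame into Spark as table "mtcars"
mtcars_tbl <- copy_to(sc, mtcars)

# List the available databases, then pin the table in memory
src_databases(sc)
tbl_cache(sc, "mtcars")

# Release the cached table when it is no longer needed
tbl_uncache(sc, "mtcars")

spark_disconnect(sc)
```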
## Spark DataFrames

- Create a DataFrame along a given object
- Bind multiple Spark DataFrames by row and column
- Broadcast hint for a Spark DataFrame
- Checkpoint a Spark DataFrame
- Coalesce a Spark DataFrame
- Copy an Object into Spark
- Create a DataFrame of a given length
- Mutate a Spark DataFrame
- Get the number of partitions of a Spark DataFrame
- Partition a Spark DataFrame
- Pivot a Spark DataFrame
- Model Predictions with Spark DataFrames
- Read a Column from a Spark DataFrame
- Register a Spark DataFrame
- Repartition a Spark DataFrame
- Model Residuals
- Randomly Sample Rows from a Spark DataFrame
- Separate a Vector Column into Scalar Columns
- Create a DataFrame for a numeric range
- Sort a Spark DataFrame
- Add a Unique ID Column to a Spark DataFrame
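These operations act directly on Spark DataFrames rather than through SQL. A sketch chaining a few of the entries above, assuming sparklyr's `sdf_copy_to()`, `sdf_sample()`, `sdf_sort()`, `sdf_with_unique_id()`, and `sdf_num_partitions()` (assumed names):

```r
library(sparklyr)
sc <- spark_connect(master = "local")

# Copy an R object into Spark
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_spark", overwrite = TRUE)

# Randomly sample rows, sort, and add a unique ID column
# (sparklyr replaces "." in column names with "_", so Sepal.Length
#  becomes Sepal_Length on the Spark side)
sampled <- sdf_sample(iris_tbl, fraction = 0.5, replacement = FALSE, seed = 42)
sorted  <- sdf_sort(sampled, "Sepal_Length")
with_id <- sdf_with_unique_id(sorted, id = "row_id")

# Inspect how the result is partitioned
sdf_num_partitions(with_id)

spark_disconnect(sc)
```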
## Spark Machine Learning

- Spark ML -- Alternating Least Squares (ALS) matrix factorization
- Spark ML -- Decision Trees
- Spark ML -- Generalized Linear Regression
- Spark ML -- Gradient-Boosted Tree
- Spark ML -- K-Means Clustering
- Spark ML -- Latent Dirichlet Allocation
- Spark ML -- Linear Regression
- Spark ML -- Logistic Regression
- Extract data associated with a Spark ML model
- Spark ML -- Multilayer Perceptron
- Spark ML -- Naive-Bayes
- Spark ML -- One vs Rest
- Spark ML -- Principal Components Analysis
- Spark ML -- Random Forests
- Spark ML -- Survival Regression
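Each of these routines fits a Spark MLlib model against a Spark DataFrame, so the data never leaves the cluster. A sketch using linear regression, assuming sparklyr's `ml_linear_regression()` (an assumed name; note that the interface has varied across sparklyr releases, with older versions taking `response`/`features` arguments and newer ones accepting an R formula):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Fit a linear regression of fuel efficiency on weight and cylinders
fit <- ml_linear_regression(mtcars_tbl,
                            response = "mpg",
                            features = c("wt", "cyl"))

# Standard R generics work on the fitted model
summary(fit)

spark_disconnect(sc)
```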
## Spark Feature Transformers

- Feature Transformation -- Binarizer
- Feature Transformation -- Bucketizer
- Feature Transformation -- CountVectorizer
- Feature Transformation -- Discrete Cosine Transform (DCT)
- Feature Transformation -- ElementwiseProduct
- Feature Transformation -- IndexToString
- Feature Transformation -- OneHotEncoder
- Feature Transformation -- QuantileDiscretizer
- Feature Transformation -- SQLTransformer
- Feature Transformation -- StringIndexer
- Feature Transformation -- VectorAssembler
- Feature Transformation -- Tokenizer
- Feature Transformation -- RegexTokenizer
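Feature transformers map one column of a Spark DataFrame to another and are designed to chain in a pipeline before a model fit. A sketch using the binarizer, assuming sparklyr's `ft_binarizer()` (an assumed name; argument spellings have also changed across releases, e.g. `input.col`/`output.col` in older sparklyr versus `input_col`/`output_col` later):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Binarize horsepower around a threshold, producing a 0/1 column
mtcars_tbl %>%
  ft_binarizer(input_col = "hp", output_col = "big_hp", threshold = 100)

spark_disconnect(sc)
```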
## Spark Machine Learning Utilities

- Spark ML -- Binary Classification Evaluator
- Spark ML -- Classification Evaluator
- Create Dummy Variables
- Create an ML Model Object
- Spark ML -- Feature Importance for Tree Models
- Options for Spark ML Routines
- Prepare a Spark DataFrame for Spark ML Routines
- Pre-process the Inputs to a Spark ML Routine
- Save / Load a Spark ML Model Fit
## Extensions

- Compile Scala sources into a Java Archive (JAR)
- Read configuration values for a connection
- Download default Scala compilers
- Discover the Scala Compiler
- Access the Spark API
- Runtime configuration interface for Hive
- Invoke a Method on a JVM Object
- Register a Package that Implements a Spark Extension
- Define a Spark Compilation Specification
- Default Compilation Specification for Spark Extensions
- Retrieve the Spark Connection Associated with an R Object
- Runtime configuration interface for Spark
- Retrieve a Spark DataFrame
- Define a Spark dependency
- Set the SPARK_HOME environment variable
- Retrieve a Spark JVM Object Reference
- Get the Spark Version Associated with a Spark Connection
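The extension API drops below the DataFrame abstraction and exposes the underlying JVM objects, so arbitrary Spark Java/Scala methods can be called from R. A sketch assuming sparklyr's `spark_context()`, `invoke()`, and `spark_dataframe()` (assumed names; the methods invoked are standard Spark API methods):

```r
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")

# Retrieve the underlying SparkContext and invoke a JVM method on it
ctx <- spark_context(sc)
invoke(ctx, "version")  # returns the Spark version string

# Drop from a dplyr tbl down to the underlying Spark DataFrame
# reference and call its count() method directly
mtcars_tbl <- copy_to(sc, mtcars)
sdf <- spark_dataframe(mtcars_tbl)
invoke(sdf, "count")

spark_disconnect(sc)
```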
## Distributed Computing

- Apply an R Function in Spark
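This applies a user-supplied R function to each partition of a Spark DataFrame, with the work distributed across the cluster. A sketch assuming sparklyr's `spark_apply()` (an assumed name; the supplied function receives a partition as an R data frame and must return a data frame):

```r
library(sparklyr)
sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Run an arbitrary R transformation on each partition in the cluster;
# the derived column here is purely illustrative
spark_apply(mtcars_tbl, function(df) {
  df$kml <- df$mpg * 0.425  # hypothetical mpg -> km/l conversion
  df
})

spark_disconnect(sc)
```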