Overview
SparkPlusPlus is a Scala framework for building Apache Spark applications with less boilerplate and a more consistent runtime model.
It currently focuses on batch Spark jobs and gives you a standard way to:
- load typed YAML configuration
- create and customize a SparkSession
- declare input and output datasets in YAML
- pass runtime context into your application logic
- keep reusable DataFrame utilities close to your jobs
If you are starting fresh, begin with Getting Started.