Spark 3.0 brings a new plugin framework to Spark. This framework allows users to plug custom code into the driver and executors, which enables advanced monitoring and custom metrics tracking. This set of APIs will help tune Spark better than before.
In this series of posts I will be discussing the different aspects of the plugin framework. This is the second post in the series, where we will understand the different APIs exposed by the framework. You can read all the posts in the series here.
Spark Plugin Interface
The top-level interface of the framework is SparkPlugin. It exposes the below two methods.
From the names of the methods, we can figure out that these are the entry points for specifying the driver and executor plugins. If a user wants to implement only one of them, they can return null from the other method.
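The interface lives in the org.apache.spark.api.plugin package; a minimal Scala sketch of an implementation (the class name is illustrative) looks like this:

```scala
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, SparkPlugin}

// Minimal sketch: MyCustomPlugin is a hypothetical name.
class MyCustomPlugin extends SparkPlugin {
  // Entry point for the driver-side plugin; return null to skip it.
  override def driverPlugin(): DriverPlugin = null

  // Entry point for the executor-side plugin; return null to skip it.
  override def executorPlugin(): ExecutorPlugin = null
}
```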
Driver Plugin
DriverPlugin is the interface for the driver-side plugin. It has the below methods, all of which are optional to override.
This method is called at the beginning of driver initialisation. It has access to the Spark context and the plugin context. The method returns a map which will be passed to the executor plugin.
This method is used for tracking custom metrics on the driver side.
This method is used for receiving RPC messages sent by the executors.
This method is called when the driver is shutting down.
Executor Plugin
The below are the methods exposed in the executor plugin interface, ExecutorPlugin.
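Putting the driver-side methods together, a sketch of a DriverPlugin implementation might look like the following (the class name and log messages are illustrative):

```scala
import java.util.{Collections, Map => JMap}
import org.apache.spark.SparkContext
import org.apache.spark.api.plugin.{DriverPlugin, PluginContext}

// Minimal sketch of a driver-side plugin.
class MyDriverPlugin extends DriverPlugin {
  // Called at the beginning of driver initialisation; the returned
  // map is handed to each executor plugin as extraConf.
  override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
    println(s"driver plugin started for app ${sc.applicationId}")
    Collections.singletonMap("sampleKey", "sampleValue")
  }

  // Register custom driver-side metrics; ctx.metricRegistry() exposes
  // a Dropwizard MetricRegistry for gauges, counters, etc.
  override def registerMetrics(appId: String, ctx: PluginContext): Unit = ()

  // Handle RPC messages sent from the executors.
  override def receive(message: AnyRef): AnyRef = {
    println(s"received message: $message")
    null
  }

  // Called when the driver is shutting down.
  override def shutdown(): Unit = println("driver plugin shutting down")
}
```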
This method is called when an executor is started. extraConf contains the parameters sent by the driver plugin.
This method is called when the executor shuts down.
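A corresponding executor-side sketch (again with an illustrative class name) could be:

```scala
import java.util.{Map => JMap}
import org.apache.spark.api.plugin.{ExecutorPlugin, PluginContext}

// Minimal sketch of an executor-side plugin.
class MyExecutorPlugin extends ExecutorPlugin {
  // Called when the executor starts; extraConf is the map returned
  // by the driver plugin's init() method.
  override def init(ctx: PluginContext, extraConf: JMap[String, String]): Unit = {
    println(s"executor plugin started with extraConf $extraConf")
  }

  // Called when the executor shuts down.
  override def shutdown(): Unit = println("executor plugin shutting down")
}
```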
Adding Spark Plugin
We can add our custom Spark plugins to a Spark session by setting the spark.plugins configuration on the session.
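For example, assuming our implementation is a hypothetical class com.example.MyCustomPlugin:

```scala
import org.apache.spark.sql.SparkSession

// com.example.MyCustomPlugin is a placeholder for your SparkPlugin class;
// it must be on the classpath of both the driver and the executors.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("plugin-example")
  .config("spark.plugins", "com.example.MyCustomPlugin")
  .getOrCreate()
```

The same configuration can be passed on the command line with --conf spark.plugins=com.example.MyCustomPlugin when using spark-submit.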
The Spark plugin framework brings powerful customization to the Spark ecosystem. In this post, we discussed the different interfaces provided by the plugin framework.