Autoscaling

To help you save resources and reduce management time, Spark dynamic allocation is now enabled in Data Flow.

Resource planning for data processing is a complex task. Resource usage is a function of data volume, and day-to-day volumes can vary, so the computational resources required change, too.

You can define a Data Flow cluster based on a range of executors, instead of a fixed number of executors. Spark provides a mechanism to dynamically adjust the resources an application occupies based on its workload. The application can relinquish resources when they are no longer used and request them again later when there is demand. Billing counts only the time a resource is used by the application; returned resources are not billed.
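
As a minimal sketch, the following shows how such an executor range might be expressed with Spark's standard dynamic allocation settings. The property names are standard Spark configuration keys, but the executor counts and application name are hypothetical placeholders; a managed service such as Data Flow may set some of these properties for you, and on self-managed Spark, dynamic allocation additionally requires an external shuffle service or shuffle tracking.

```python
from pyspark.sql import SparkSession

# A sketch of Spark dynamic allocation settings. The executor counts
# below are hypothetical; tune them to your workload.
spark = (
    SparkSession.builder
    .appName("autoscaling-example")
    # Let Spark grow and shrink the executor pool with demand.
    .config("spark.dynamicAllocation.enabled", "true")
    # Lower and upper bounds on the number of executors.
    .config("spark.dynamicAllocation.minExecutors", "2")
    .config("spark.dynamicAllocation.maxExecutors", "10")
    # Release executors that have been idle for this long.
    .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
    .getOrCreate()
)
```

With these settings, Spark starts within the configured bounds, requests more executors when tasks queue up, and releases executors that sit idle past the timeout, so you pay only while a resource is actually held by the application.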