Introduction
In this post, we talk about the Azure Batch service. It was created to run parallel batch workloads without having to worry about the resources those jobs need, because Azure Batch is responsible for creating and managing the pools of compute nodes. Imagine a lot of parallel batch workloads, each isolated from the others. At first glance, it looks similar to another Azure service, Virtual Machine Scale Sets (VMSS).
We can start processing parallel workloads with the Azure Batch service ONLY by using APIs and tools. These are used to create and manage a pool of compute nodes, and then to schedule the jobs and tasks that run on it.
With the Azure Batch service we pay for the resources used (compute & storage).
Azure Batch Use Cases
The table below lists some Azure Batch use cases, each with an example that shows where the Batch service fits.
Use Case | Example |
Financial Data Analysis | Analyzing the financial data of a bank |
Software Testing | A software development team can run multiple tests in parallel for an application |
AI Solution | A good example is a face recognition service |
Azure Batch Architecture
The image below shows a sample Azure Batch architecture.
Note |
---|
A Pool executes a job, and the Nodes (VMs) of the Pool execute one or more tasks/jobs. |
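To make the note above concrete, here is a minimal Python sketch of a pool whose nodes receive the tasks of a job. The class names, the node and task names, and the round-robin placement are all assumptions made for illustration; the real Batch scheduler decides placement on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A compute node (a VM) that holds the tasks assigned to it."""
    name: str
    tasks: List[str] = field(default_factory=list)


@dataclass
class Pool:
    """A pool is simply a collection of nodes in this toy model."""
    nodes: List[Node]


def assign_round_robin(pool: Pool, task_ids: List[str]) -> Dict[str, List[str]]:
    """Spread the tasks of one job over the nodes, one after another.

    Round-robin is an illustrative assumption, not the Batch algorithm.
    """
    if not pool.nodes:
        raise ValueError("the pool has no nodes")
    for index, task_id in enumerate(task_ids):
        pool.nodes[index % len(pool.nodes)].tasks.append(task_id)
    return {node.name: node.tasks for node in pool.nodes}


if __name__ == "__main__":
    demo_pool = Pool(nodes=[Node("node-0"), Node("node-1")])
    print(assign_round_robin(demo_pool, ["t1", "t2", "t3", "t4", "t5"]))
```

With two nodes and five tasks, one node ends up with three tasks and the other with two, which matches the "one or more tasks" wording of the note.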
Azure Batch Concepts
Service Quotas And Limits
In this part of the post, we go over the quotas and limits of the Azure Batch service. We should read the following tables carefully, because if we don’t understand what the quota and limit values mean, we might run into problems with our workloads later.
Resource Quotas
Service quotas are quite important for Azure Batch workloads, because a rough design is very likely to reach these limits.
Resource | Default Limit | Maximum Limit |
Batch accounts per region per subscription | 1-3 | 50 |
Dedicated cores per Batch account | 10-100 | N/A |
Low-priority cores per Batch account | 10-100 | N/A |
Active jobs and job schedules per Batch account | 100-300 | 1000 |
Pools per Batch account | 20-100 | 500 |
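As a rough design aid, the maximum limits from the table above can be checked against a planned deployment before anything is created. The sketch below is plain Python with no Azure SDK involved; the dictionary keys and the function name are hypothetical, and the numbers are copied from the table (rows with N/A are left out).

```python
from typing import Dict, List

# Maximum limits copied from the table above; keys are made-up identifiers.
MAX_LIMITS: Dict[str, int] = {
    "batch_accounts_per_region": 50,
    "active_jobs_and_job_schedules": 1000,
    "pools": 500,
}


def find_quota_problems(planned: Dict[str, int]) -> List[str]:
    """Return one message for every planned value above its maximum limit."""
    problems = []
    for resource, wanted in planned.items():
        limit = MAX_LIMITS.get(resource)
        # Resources that are not in the table are skipped, not rejected.
        if limit is not None and wanted > limit:
            problems.append(
                f"{resource}: planned {wanted} exceeds maximum {limit}"
            )
    return problems


if __name__ == "__main__":
    plan = {"pools": 600, "active_jobs_and_job_schedules": 200}
    for message in find_quota_problems(plan):
        print(message)
```

Keep in mind that a new account starts at the default limit, not the maximum, so a plan that passes this check may still need a quota increase.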
Pool Size Limits
The pool size is the number of nodes in the pool, where a single node is a virtual machine.
The next table shows the limits for the pool size.
Resource | Maximum Limit |
Compute nodes in inter-node communication enabled pool | |
Batch service pool allocation mode | 100 |
Batch subscription pool allocation mode | 80 |
Compute nodes in | |
Dedicated nodes | 2000 |
Low-priority nodes | 1000 |
We can choose between high-priority nodes, which are dedicated VMs, and low-priority nodes. Of course, there are some limitations, which can be found in this section of the post. |
Other Limits
All the other limits relate to the details of the Azure Batch workloads.
Resource | Maximum Limit |
Concurrent tasks per compute node | 4 x number of node cores |
Applications per Batch account | 20 |
Application packages per application | 40 |
Maximum task lifetime | 180 days |
If the workloads need a higher quota on an Azure Batch account, we can follow the directions at this link.
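The "concurrent tasks per compute node" limit in the table above is simple arithmetic: 4 x the number of node cores. A small Python sketch, with hypothetical function names, shows how a requested value could be clamped to that limit when sizing a pool.

```python
def max_concurrent_tasks(node_cores: int) -> int:
    """Upper bound from the table: 4 x the number of node cores."""
    if node_cores < 1:
        raise ValueError("a node needs at least one core")
    return 4 * node_cores


def clamp_task_slots(requested: int, node_cores: int) -> int:
    """Keep a requested per-node concurrency between 1 and the limit."""
    return max(1, min(requested, max_concurrent_tasks(node_cores)))


if __name__ == "__main__":
    # A 4-core node can run at most 4 x 4 = 16 tasks at the same time.
    print(max_concurrent_tasks(4))
    # Asking for 64 slots on an 8-core node is cut down to 32.
    print(clamp_task_slots(64, 8))
```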
Supported VM Sizes
When we create an Azure Batch Pool, it is very important to select the correct VM size for the nodes of the Pool.
The table below lists the VM sizes that an Azure Batch pool DOES NOT support.
Family | Unsupported sizes |
Basic A-Series | Basic_A0(A0) |
A-Series | Standard_A0 |
B-Series | All |
VM sizes which are supported for low-priority nodes
Family | Supported Sizes |
M-Series | Standard_M64ms |
M-Series | Standard_M128ms |
Virtual Machine Image Type
There are two types of images that we can choose between: Pre-configured and Custom. Of course, there are some differences between the two types:
Pre-configured Image | Custom Image |
The image already exists | Need to create a new one |
No need for updating & patching | Need patching & updating |
All custom software needs to be installed via pool config | No need for large changes in the pool config |
How It Works
In the following steps, we will see a quick demo of the Azure Batch Service.
Prerequisites
To proceed further with the demo we must be sure that we have all the following:
- Microsoft Azure Batch Account linked with an Azure Storage Account
- Visual Studio 2017 or .NET Core 2.1
Create The Azure Batch Account
Search for the Azure Batch Service
From the left main blade of the Azure Portal, select + Create a resource, type [Batch Service], and then select Create to create the Batch Service.
Basics Tab
In the Basics tab we have to fill in a few fields and then move on to the “Advanced” tab.
Setting | Value |
Subscription | Select a valid subscription |
Resource group | Select an existing or create a new Resource group |
Account name | Type a name for the Azure Batch Account, MUST be unique |
Location | Select a Location for the Batch Instance |
Select a Storage account | Select an existing storage account, or create one after the deployment completes and link it with the Azure Batch account. |
Advanced Tab
In the Advanced tab, we must choose a Pool allocation mode. There are two choices, Batch service and User subscription; for demo purposes we select Batch service.
Batch service | The pool VMs are created using behind-the-scenes Batch service subscriptions. |
User subscription | The pool VMs are created directly in the same subscription as the Batch account. |
Check this blog post about Azure Batch capabilities for more details.
Review + Create Tab
In the Review + create tab, we just need to check that the validation passed and click Create to start the Azure Batch Account deployment.
After the Azure Batch account is created, we are ready to see how it works. That is the juicy part of this post.
For this part of the demo we will use an existing project from GitHub, written by the user dlepow. |
Azure Batch Sample
First, we must go to GitHub and open the “Azure-Samples/batch-dotnet-ffmpeg-tutorial” repository, by clicking here.
Open the file BatchDotnetTutorialFfmpeg.sln
As we can see, some Dependencies are missing; for that reason we select Build – Rebuild Project
We must download and install the .NET Core SDK, and then build the project (Build – Build Solution).
After the build completes successfully, the Dependencies look fine.
If we get the following error after the build, we should close and reopen Visual Studio. |
Download & Install Application Packages
- Download the 64-bit ffmpeg 3.4 file from here.
- Upload the zip file “ffmpeg-3.4-win64-static.zip” to the Azure Batch service, from the left menu blade: Features – Applications – + Add
The Code Part
After the solution builds successfully, we must make some changes to the code.
public class Program
{
    // Update the Batch and Storage account credential strings below with the values unique to your accounts.
    // These are used when constructing connection strings for the Batch and Storage client objects.

    // Batch account credentials
    private const string BatchAccountName = "xxxxxxxxx";
    private const string BatchAccountKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==";
    private const string BatchAccountUrl = "https://xxxxxxxxxxxx.westeurope.batch.azure.com";

    // Storage account credentials
    private const string StorageAccountName = "xxxxxxxxxxxxxxxxxxxxx";
    private const string StorageAccountKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==";
The Batch and Storage Account Credentials are in the Batch Account dashboard in Azure Portal (see the image below).
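Hard-coded keys are fine for a quick demo, but they are easy to leak once the code reaches source control. One common alternative is to read the five values from environment variables. The sketch below shows the idea in Python with only the standard library; the variable names and the function name are assumptions for illustration and are not part of the sample project. In C#, the same approach could be built on Environment.GetEnvironmentVariable.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names, one per credential in the sample.
REQUIRED_SETTINGS = (
    "BATCH_ACCOUNT_NAME",
    "BATCH_ACCOUNT_KEY",
    "BATCH_ACCOUNT_URL",
    "STORAGE_ACCOUNT_NAME",
    "STORAGE_ACCOUNT_KEY",
)


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect all required settings, failing early if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not source.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_SETTINGS}


if __name__ == "__main__":
    try:
        credentials = load_credentials()
        print("loaded settings:", sorted(credentials))
    except KeyError as error:
        print(error)
```

Failing early with the full list of missing names is a deliberate choice: it avoids discovering one missing key at a time while the app is already talking to Azure.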
Running The App
We have completed all the necessary steps (downloads, installations, configuration, etc.). The next thing to do is simply to run the application.
This app processes media files in parallel using the ffmpeg tool. |
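The idea of "one task per media file" can be sketched in a few lines of Python. The example only builds the command lines and runs nothing. The basic `ffmpeg -i <input> <output>` form is standard ffmpeg usage, but the task ids, the file names, and the .mp4 to .mp3 conversion are assumptions for illustration; the way the sample project builds its own command lines may differ.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def build_task_commands(
    input_files: List[str], output_ext: str = ".mp3"
) -> Dict[str, List[str]]:
    """Return one ffmpeg command line per input file, keyed by a task id."""
    commands = {}
    for index, input_name in enumerate(input_files):
        # Same base name as the input, with the (assumed) output extension.
        output_name = PurePosixPath(input_name).with_suffix(output_ext).name
        commands[f"Task{index}"] = ["ffmpeg", "-i", input_name, output_name]
    return commands


if __name__ == "__main__":
    for task_id, command in build_task_commands(["a.mp4", "b.mp4"]).items():
        print(task_id, " ".join(command))
```

Because every task touches only its own input and output file, the tasks are independent of each other, which is what makes the workload easy to run in parallel.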
The first thing we see when the app runs is the following command prompt console.
Create Containers [Input] – [Output]
The Console App creates two Storage Containers (Input, Output).
Upload The Media Files
The second step is to upload the media files.
Create The Batch Pool
The app creates 5 low-priority nodes inside the Batch pool, and the 5 tasks that will run in parallel on those nodes. The next two images show the pool with its nodes in the Azure Portal.
The Running Tasks
The image below clearly shows how an Azure Batch parallel workload works.
The Final Results
For the final step there is nothing left for us to do: all 5 files are processed and created in the Output folder.
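As a local analogy for the run above, the fan-out of 5 independent tasks over 5 workers can be imitated with Python's standard thread pool. This is only a sketch: the worker function is a stand-in that renames a file instead of calling ffmpeg, and the file names and extensions are made up. Nothing here talks to Azure Batch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def process(media_file: str) -> str:
    """Stand-in for the real conversion: only derives the output name."""
    base_name = media_file.rsplit(".", 1)[0]
    return base_name + ".mp3"


def run_parallel(media_files: List[str], worker_count: int = 5) -> List[str]:
    """Process every file on its own worker and keep the input order."""
    with ThreadPoolExecutor(max_workers=worker_count) as executor:
        # map() yields results in the order of the inputs, not of completion.
        return list(executor.map(process, media_files))


if __name__ == "__main__":
    inputs = [f"video{number}.mp4" for number in range(1, 6)]
    print(run_parallel(inputs))
```

In Azure Batch the workers are the pool's VMs rather than threads, but the shape of the workload is the same: one independent unit of work per input file, collected at the end.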
Conclusion
In this post, we gave a quick introduction to Azure Batch, a service that is aimed mainly at developers but can also be a useful and very important tool for other groups, such as IT or, nowadays, DevOps.
See Also
- Batch service quotas and limits
- VM sizes for compute nodes in an Azure Batch pool
- Create an automatic scaling formula for scaling compute nodes in a Batch pool
- Use multi-instance tasks to run Message Passing Interface (MPI) applications in Batch
- Tutorial: Run a parallel workload with Azure Batch using the .NET API
- Azure Batch Videos – Channel 9
- Batch Pricing