
How to Run a Batch Job in Spring Batch

Published in Spring Batch Execution 7 mins read

Running a batch job in Spring usually means using Spring Batch, often together with Spring Boot, to process large volumes of data. This guide covers the essential steps for setting up and executing a batch job, along with a few practical considerations.

Core Components of Spring Batch

Before diving into execution, it's crucial to understand the fundamental building blocks of a Spring Batch application:

  • Job: The overarching process that consists of one or more Step instances. It defines the entire batch workflow.
  • Step: An independent, sequential phase of a Job. A Step can be item-oriented (processing data in chunks) or task-oriented (executing a single task).
  • ItemReader: Reads data from a source (e.g., database, file, message queue) one item at a time. The Step, not the reader, groups the items into chunks.
  • ItemProcessor: Processes an item read by the ItemReader, transforming or validating it before it's written. This component is optional.
  • ItemWriter: Writes processed items to a destination (e.g., database, file, another message queue).
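How the three item components cooperate in an item-oriented Step can be sketched in plain Java. This is a simplified model under stated assumptions, not Spring Batch's implementation: the class ChunkLoopSketch and the method runStep are hypothetical names, the reader is modeled as an Iterator, and transactions, skips, and retries are left out. It does reflect two behaviors of the framework: items are read one at a time up to the chunk size, and an item for which the processor returns null is filtered out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of a chunk-oriented step (hypothetical, not Spring Batch code).
final class ChunkLoopSketch {

    // Reads up to chunkSize items, processes each one, then writes the results
    // as a single chunk. Returns the number of chunks handled.
    static <I, O> int runStep(Iterator<I> reader,
                              Function<I, O> processor,
                              Consumer<List<O>> writer,
                              int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        int chunks = 0;
        while (reader.hasNext()) {
            // Read phase: one item at a time until the chunk is full or input ends.
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // Process phase: a null result drops the item from the chunk.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                O output = processor.apply(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // Write phase: the whole chunk goes to the writer in one call.
            // This sketch skips the call when nothing is left to write.
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
            }
            chunks++;
        }
        return chunks;
    }
}
```

With three input items and a chunk size of 2, the writer in this model is called twice: once with two items and once with the remaining one.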

Step-by-Step Guide to Running a Spring Batch Job

Executing a batch job in Spring Batch typically follows a structured approach, from initial setup to final deployment.

Step 1: Set Up Project Dependencies

Begin by including the necessary Spring Batch and Spring Boot dependencies in your project's build file. Spring Boot simplifies the configuration and setup of Spring Batch.

Maven Example

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>com.h2database</groupId>
        <artifactId>h2</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

For more details on setting up Spring Batch, refer to the Spring Batch Documentation.

Step 2: Define and Configure a Batch Job

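With dependencies in place, the next step is to define your batch job in a @Configuration class. The example below uses the Spring Batch 5 builder API (JobBuilder and StepBuilder, each taking a JobRepository). With Spring Boot 3, the core Spring Batch infrastructure is auto-configured, so @EnableBatchProcessing is not required; adding it causes Spring Boot's batch auto-configuration to back off. On older versions (Spring Batch 4 with Spring Boot 2), the configuration class is annotated with @EnableBatchProcessing instead.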

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class BatchConfig {

    @Bean
    public Job importUserJob(JobRepository jobRepository, Step step1) {
        return new JobBuilder("importUserJob", jobRepository)
                .start(step1)
                .build();
    }

    @Bean
    public Step step1(
        JobRepository jobRepository,
        PlatformTransactionManager transactionManager,
        ItemReader<String> reader,
        ItemProcessor<String, String> processor,
        ItemWriter<String> writer) {

        return new StepBuilder("step1", jobRepository)
                .<String, String>chunk(10, transactionManager) // Process 10 items at a time
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    // Example ItemReader (reads from a simple list)
    @Bean
    public ItemReader<String> reader() {
        return new org.springframework.batch.item.support.ListItemReader<>(
            java.util.Arrays.asList("Alice", "Bob", "Charlie"));
    }

    // Example ItemProcessor (converts to uppercase)
    @Bean
    public ItemProcessor<String, String> processor() {
        return item -> item.toUpperCase();
    }

    // Example ItemWriter (prints to console)
    @Bean
    public ItemWriter<String> writer() {
        return items -> {
            for (String item : items) {
                System.out.println("Writing item: " + item);
            }
        };
    }
}

Step 3: Trigger or Schedule the Batch Job

Once your job is defined, you need a mechanism to initiate its execution. Spring Batch offers several ways to trigger jobs:

Manual Triggering

You can launch a job on demand by calling JobLauncher yourself, either at application startup through a CommandLineRunner (shown below) or from a REST endpoint. This is useful for ad-hoc execution.

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class MyBatchApplication {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job importUserJob;

    public static void main(String[] args) {
        SpringApplication.run(MyBatchApplication.class, args);
    }

    // This CommandLineRunner will execute the job once the application starts
    @Bean
    public CommandLineRunner runJob() {
        return args -> {
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis()) // Unique job parameter
                    .toJobParameters();
            jobLauncher.run(importUserJob, jobParameters);
        };
    }
}
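The time parameter in the runner above matters because Spring Batch identifies a job instance by the job name plus its identifying parameters, and it refuses to launch an instance that has already completed. The following plain-Java sketch models only that identity rule. The class JobInstanceSketch and its methods are hypothetical, and the in-memory set stands in for the job repository.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java model of job-instance identity (hypothetical, not Spring Batch code).
final class JobInstanceSketch {

    // Stands in for the job repository's record of completed instances.
    private final Set<String> completedInstances = new HashSet<>();

    // Builds a stable key from the job name and its parameters, sorted by name.
    static String instanceKey(String jobName, Map<String, Object> parameters) {
        return jobName + new TreeMap<>(parameters);
    }

    // Returns true if the run is accepted, false if the same instance
    // (same job name, same parameters) was already recorded as completed.
    boolean launch(String jobName, Map<String, Object> parameters) {
        return completedInstances.add(instanceKey(jobName, parameters));
    }
}
```

In this model, launching importUserJob twice with time=1000 is accepted only the first time, while a launch with time=2000 is a new instance. A timestamp parameter therefore makes every run unique. The trade-off is that each launch is a fresh instance, not a restart of an earlier failed one.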

Scheduled Execution

For recurring batch jobs, Spring's @Scheduled annotation is a common choice. This allows you to define a fixed rate or a cron expression for job execution.

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling // Enable scheduling in your main application class or a configuration class
public class JobScheduler {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job importUserJob;

    // Six-field Spring cron (second minute hour day-of-month month day-of-week):
    // fires at second 0 of every fifth minute. Adjust the expression as needed.
    @Scheduled(cron = "0 */5 * * * *")
    public void runScheduledJob() {
        try {
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis())
                    .toJobParameters();
            JobExecution execution = jobLauncher.run(importUserJob, jobParameters);
            // The default JobLauncher is synchronous, so run() returns after the job ends
            System.out.println("Job 'importUserJob' finished with status: " + execution.getStatus());
        } catch (Exception e) {
            System.err.println("Error running job: " + e.getMessage());
        }
    }
}

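Scheduling must be enabled with @EnableScheduling on one Spring-managed class. The example places it on JobScheduler itself, although the main application class or a dedicated configuration class is the more common location. For more on scheduling, consult the Spring Boot Scheduling documentation.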
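To make the cron expression "0 */5 * * * *" concrete, the sketch below computes when such a schedule would next fire: at second 0 of the next minute that is a multiple of 5. It is a hand-written illustration of that one expression using java.time, not a cron parser, and the class name EveryFiveMinutesSketch is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hand-written model of the schedule "0 */5 * * * *" (hypothetical helper).
final class EveryFiveMinutesSketch {

    // Returns the first time strictly after 'now' with second 0
    // and a minute value divisible by 5.
    static LocalDateTime nextFireTime(LocalDateTime now) {
        // Drop seconds and nanoseconds, then move to the next whole minute.
        LocalDateTime candidate = now.truncatedTo(ChronoUnit.MINUTES).plusMinutes(1);
        while (candidate.getMinute() % 5 != 0) {
            candidate = candidate.plusMinutes(1);
        }
        return candidate;
    }
}
```

For example, a call at 10:02:30 yields 10:05:00, and a call at exactly 10:05:00 yields 10:10:00. In a Spring application, the class org.springframework.scheduling.support.CronExpression can evaluate arbitrary expressions, so a helper like this is only useful for reasoning about the schedule.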

External Schedulers

For more complex scheduling needs or enterprise environments, integrating with external schedulers like Cron, Kubernetes CronJobs, Apache Airflow, or Jenkins is common. In this scenario, your Spring Boot application acts as a standalone executable that the external scheduler invokes.

Execution Method | Description | Use Case
--- | --- | ---
Manual / On-Demand | Triggered via a CommandLineRunner, REST endpoint, or direct JobLauncher invocation. | Ad-hoc jobs, testing, single-run data migrations.
@Scheduled | Built-in Spring scheduling using @Scheduled annotations with cron expressions or fixed rates. | Recurring jobs within a single application instance.
External Schedulers | Invoked by external tools (cron, Kubernetes, Airflow). The Spring app runs as a standalone process. | Distributed environments, complex dependencies, robust monitoring.
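When an external tool such as Kubernetes or Jenkins runs the application as a standalone process, it generally judges success by the process exit code, so the job's final status should be reflected there. The sketch below shows one possible mapping. The class ExitCodeSketch and the specific non-zero values are assumptions for illustration; the status names follow Spring Batch's BatchStatus values.

```java
// Hypothetical mapping from a job's final status name to a process exit code.
final class ExitCodeSketch {

    static int exitCodeFor(String batchStatus) {
        switch (batchStatus) {
            case "COMPLETED":
                return 0;   // success: the scheduler treats the run as done
            case "STOPPED":
                return 2;   // assumed value: stopped on request, may be restarted
            default:
                return 1;   // FAILED, ABANDONED, UNKNOWN, or anything unexpected
        }
    }
}
```

In a Spring Boot application, a common idiom is to end main with System.exit(SpringApplication.exit(SpringApplication.run(MyBatchApplication.class, args))), which lets Spring Boot derive the exit code when the context closes. A custom mapping like the one above is only needed if the scheduler expects specific codes.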

Step 4: Configure Application Properties

Configure your application.properties or application.yml file to define the data source for Spring Batch's job repository (where job metadata is stored) and other batch-related settings.

# H2 Database configuration (for development)
spring.datasource.url=jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1;DB_CLOSE_ON_EXIT=FALSE
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=

# Spring Batch properties
spring.batch.jdbc.initialize-schema=always
# Set to false to prevent Spring Boot from running jobs automatically on startup (default: true)
spring.batch.job.enabled=false

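Setting spring.batch.job.enabled=false matters whenever you launch jobs yourself (through a CommandLineRunner, @Scheduled, or a REST endpoint). Without it, Spring Boot also runs the job automatically at startup, so it would execute in addition to your own trigger. If an external scheduler starts the application once per run, you can either leave the property at its default (true) and let Spring Boot launch the job, or keep it false and launch the job from a CommandLineRunner as shown earlier. The spring.batch.jdbc.initialize-schema=always property makes Spring Boot create the Spring Batch metadata tables at startup, which is convenient for development; production schemas are typically managed separately.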

Step 5: Run the Spring Boot Application

Finally, execute your Spring Boot application. Depending on your chosen triggering method (manual, scheduled, or external), the job will run accordingly.

You can run the application from your IDE or via the command line:

# Using Maven
./mvnw spring-boot:run

# Using Gradle
./gradlew bootRun

# Or by running the compiled JAR
java -jar target/your-batch-app.jar

Upon startup, if configured, scheduled jobs will begin at their designated times, or manual triggers will initiate the batch processes.

Practical Considerations

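  • Error Handling and Restartability: A failed job can be restarted by launching it again with the same identifying job parameters; Spring Batch then creates a new execution of the same job instance and skips steps that already completed. Because a chunk may be reprocessed after a failure, design your steps to be idempotent, and configure listeners for error handling.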
  • Job Parameters: Use job parameters to uniquely identify job executions and pass runtime-specific data to your batch jobs.
  • Monitoring: The standalone Spring Batch Admin project is no longer maintained. To monitor job status, view job history, and troubleshoot failures, query the job repository's metadata tables or build custom dashboards on top of them.
  • Chunk Processing: For optimal performance, configure chunk sizes carefully, balancing transaction overhead with memory usage.
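The advice on idempotent steps can be illustrated with a writer that upserts by key, so that rewriting the same chunk after a restart does not create duplicates. This is a minimal plain-Java sketch. The class IdempotentWriterSketch is hypothetical, and the in-memory map stands in for a database table keyed by a primary key.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of an idempotent writer (hypothetical, not Spring Batch code).
final class IdempotentWriterSketch {

    // Stands in for a table whose primary key is the user id.
    private final Map<Integer, String> table = new LinkedHashMap<>();

    // Upsert semantics: writing an existing id overwrites the row
    // instead of inserting a second copy.
    void write(List<Map.Entry<Integer, String>> chunk) {
        for (Map.Entry<Integer, String> row : chunk) {
            table.put(row.getKey(), row.getValue());
        }
    }

    int rowCount() {
        return table.size();
    }
}
```

Writing the same two-row chunk twice leaves two rows in this model, whereas an append-only writer would leave four. Against a real database, the equivalent is usually an upsert or merge statement, whose exact syntax depends on the database.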

By following these steps, you can effectively run and manage batch jobs within your Spring applications, taking full advantage of Spring Batch's robust features.