
Spring Batch

PHASE 1 — INTRODUCTION & ARCHITECTURE

 What is Batch Processing?

Batch Processing means processing bulk data automatically without user interaction.
Examples:

  • Monthly payroll calculation

  • Daily report generation

  • Large database updates or migrations

Spring Batch provides a robust, lightweight, and reusable framework to handle these jobs efficiently.

 

 Why Spring Batch?

Spring Batch is a framework designed to handle large-volume, enterprise-level batch jobs. It supports:

  • Reading/writing millions of records

  • Restart/retry logic

  • Transaction management

  • Logging and job monitoring

  • Scalability and partitioning for high performance

 

 Features

  • Transaction Management

  • Chunk-based Processing

  • Restart & Skip Logic

  • Job Parameters and Metadata

  • Multi-threaded / Parallel Execution

  • Integration with Spring Boot, JPA, JDBC, XML, JSON, etc.

 

Typical Batch Steps

A batch job consists of one or more Steps, each step usually having:

  1. Reader – Reads data (from DB, file, etc.)

  2. Processor – Processes or transforms data

  3. Writer – Writes output (DB, file, API, etc.)
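
To make these three roles concrete, here is a minimal, framework-light sketch of what a chunk-oriented step does with the reader, processor, and writer, stripped of transactions and metadata. It assumes the Spring Batch 5 APIs used throughout this article; the class name ChunkContractDemo and the sample data are illustrative only.

import java.util.List;
import org.springframework.batch.item.Chunk;                 // writer input type in Spring Batch 5
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ListItemReader;

public class ChunkContractDemo {
    public static void main(String[] args) throws Exception {
        // Reader: hands out one item per read() until null signals end of input
        ListItemReader<String> reader = new ListItemReader<>(List.of("john", "jane", "bob"));
        // Processor: transforms each item (returning null would filter it out)
        ItemProcessor<String, String> processor = item -> item.toUpperCase();
        // Writer: receives the accumulated chunk in a single call
        ItemWriter<String> writer = chunk -> chunk.getItems().forEach(System.out::println);

        // What the framework does per chunk, minus transactions and restart state:
        Chunk<String> chunk = new Chunk<>();
        String item;
        while ((item = reader.read()) != null) {
            chunk.add(processor.process(item));
        }
        writer.write(chunk);
    }
}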

 

High-Level Architecture

 

+----------------------------------------+
|            Batch Application           |
|   (Your Spring Boot Job Config)        |
+----------------------------------------+
                |
                ▼
+----------------------------------------+
|          Spring Batch Core             |
|  JobLauncher, JobRepository, Steps     |
+----------------------------------------+
                |
                ▼
+----------------------------------------+
|        Infrastructure Components       |
|  (DataSource, TransactionManager)      |
+----------------------------------------+
 

 Key Components:

Component – Description
Job – A batch job containing one or more Steps
Step – A single phase of a job (read-process-write or tasklet)
JobLauncher – Interface used to launch jobs programmatically
JobRepository – Stores job metadata (status, parameters, executions)
JobInstance – Represents a logical job run with specific parameters
JobExecution – Tracks the runtime execution status of a job
StepExecution – Tracks execution details of a single step

 

Use Cases

  • Month-end salary calculation

  • Interest computation

  • Data migration (CSV → DB)

  • Email notifications

 

PHASE 2 — SPRING BOOT SETUP

 Project Structure

spring-batch-boot/
└── src/
    ├── main/
    │   ├── java/
    │   │   └── com/niteshsynergy/
    │   │       ├── SpringBatchBootApplication.java
    │   │       ├── config/
    │   │       │   └── BatchConfig.java
    │   │       ├── job/
    │   │       │   ├── HelloWorldTasklet.java
    │   │       │   └── JobConfig.java
    │   └── resources/
    │       ├── application.yml
    │       └── data/input.csv
    └── pom.xml
 

 

 Maven Dependencies (pom.xml)

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
        http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <groupId>com.niteshsynergy</groupId>
   <artifactId>spring-batch-boot</artifactId>
   <version>1.0.0</version>

   <parent>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-parent</artifactId>
       <version>3.3.4</version>
   </parent>

   <dependencies>
       <!-- Spring Boot Web (optional for REST trigger) -->
       <dependency>
           <groupId>org.springframework.boot</groupId>
           <artifactId>spring-boot-starter-web</artifactId>
       </dependency>

       <!-- Spring Batch -->
       <dependency>
           <groupId>org.springframework.boot</groupId>
           <artifactId>spring-boot-starter-batch</artifactId>
       </dependency>

       <!-- H2 Database -->
       <dependency>
           <groupId>com.h2database</groupId>
           <artifactId>h2</artifactId>
           <scope>runtime</scope>
       </dependency>

       <!-- Lombok (optional) -->
       <dependency>
           <groupId>org.projectlombok</groupId>
           <artifactId>lombok</artifactId>
           <optional>true</optional>
       </dependency>

       <!-- Spring Boot Test -->
       <dependency>
           <groupId>org.springframework.boot</groupId>
           <artifactId>spring-boot-starter-test</artifactId>
           <scope>test</scope>
       </dependency>
   </dependencies>
</project>
 

Main Application

package com.niteshsynergy;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringBatchBootApplication {
   public static void main(String[] args) {
       SpringApplication.run(SpringBatchBootApplication.class, args);
   }
}
 

Basic Configuration

 

package com.niteshsynergy.config;

import org.springframework.context.annotation.Configuration;

@Configuration
public class BatchConfig {
   // Spring Boot auto-configures JobRepository, JobLauncher, JobExplorer, etc.
   // Note: with Boot 3.x, do NOT add @EnableBatchProcessing here; declaring it
   // tells Boot to back off from its batch auto-configuration.
}
 

Simple Tasklet Job

Note: Boot 3.x ships Spring Batch 5, where JobBuilderFactory / StepBuilderFactory no longer exist. Jobs and steps are built directly against the JobRepository and a transaction manager:

package com.niteshsynergy.job;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class JobConfig {

   @Bean
   public Tasklet helloWorldTasklet() {
       return (contribution, chunkContext) -> {
           System.out.println(">>> Hello World from Spring Batch!");
           return RepeatStatus.FINISHED;
       };
   }

   @Bean
   public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
       return new StepBuilder("step1", jobRepository)
               .tasklet(helloWorldTasklet(), transactionManager)
               .build();
   }

   @Bean
   public Job helloWorldJob(JobRepository jobRepository, Step step1) {
       return new JobBuilder("helloWorldJob", jobRepository)
               .incrementer(new RunIdIncrementer())
               .start(step1)
               .build();
   }
}
 

application.yml

spring:
 datasource:
   url: jdbc:h2:mem:batchdb
   driver-class-name: org.h2.Driver
   username: sa
   password: password

 batch:
   job:
     enabled: true
   jdbc:
     initialize-schema: always

logging:
 level:
   org.springframework.batch: INFO
 

Run Output

When you run the app:

>>> Hello World from Spring Batch!
 

 

Phase 3 — Creating a Basic Job (Job, Step, JobLauncher, Tasklet & lifecycle)

Goals

  • Show how to define Jobs and Steps in Spring Boot (annotation / Java config).

  • Show how to launch a job programmatically and on startup.

  • Demonstrate job lifecycle info (job parameters, incrementer).

1) HelloWorldTasklet — now as a reusable @Component

A Tasklet-based Step executes simple logic (good for small jobs or control-flow steps).

 

package com.niteshsynergy.job;

import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.stereotype.Component;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.StepContribution;

@Component
public class HelloWorldTasklet implements Tasklet {

   @Override
   public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
       System.out.println(">>> Hello World from Spring Batch Tasklet!");
       return RepeatStatus.FINISHED;
   }
}
 

2) Job config using JobBuilder / StepBuilder

Define beans for the Job and Step against the JobRepository. We use RunIdIncrementer so you can re-run the same job without supplying new parameters yourself.

package com.niteshsynergy.config;

import com.niteshsynergy.job.HelloWorldTasklet;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class HelloJobConfig {

   private final HelloWorldTasklet tasklet;

   public HelloJobConfig(HelloWorldTasklet tasklet) {
       this.tasklet = tasklet;
   }

   @Bean
   public Step helloStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
       return new StepBuilder("helloStep", jobRepository)
               .tasklet(tasklet, transactionManager)
               .build();
   }

   @Bean
   public Job helloJob(JobRepository jobRepository, Step helloStep) {
       return new JobBuilder("helloJob", jobRepository)
               .incrementer(new RunIdIncrementer())
               .start(helloStep)
               .build();
   }
}
 

3) Launching the job

  • Spring Boot auto-runs jobs if spring.batch.job.enabled=true (default is true in Boot).

  • You may also run programmatically using JobLauncher.

Example CommandLineRunner to launch a job programmatically (useful if you want to control parameters or conditionally run):

 

package com.niteshsynergy.runner;

import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;

@Component
public class JobLaunchingRunner implements CommandLineRunner {

   private final JobLauncher jobLauncher;
   private final Job helloJob;

   public JobLaunchingRunner(JobLauncher jobLauncher, Job helloJob) {
       this.jobLauncher = jobLauncher;
       this.helloJob = helloJob;
   }

   @Override
   public void run(String... args) throws Exception {
       JobExecution exec = jobLauncher.run(helloJob, new JobParametersBuilder()
           .addLong("time", System.currentTimeMillis()).toJobParameters());
       System.out.println("Job Status : " + exec.getStatus());
   }
}
 

Note: a completed JobInstance cannot be re-run with identical parameters. Either add a unique parameter yourself (like time above) or rely on the job's RunIdIncrementer; the incrementer is applied by Boot's startup runner and by JobOperator.startNextInstance, but not by a plain jobLauncher.run call.
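
If you want the incrementer applied for you from code, JobOperator can do it. A hedged sketch; it assumes a JobOperator bean is available (recent Boot versions auto-configure one) and that the job is registered under the name "helloJob" in the JobRegistry:

import org.springframework.batch.core.launch.JobOperator;
import org.springframework.beans.factory.annotation.Autowired;

@Autowired
private JobOperator jobOperator;

public void rerun() throws Exception {
    // Looks up the job, applies its JobParametersIncrementer,
    // and launches a fresh JobInstance
    jobOperator.startNextInstance("helloJob");
}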

 

Phase 4 — Reader, Processor, Writer (Chunk-oriented processing, CSV → DB example)

This is core Spring Batch usage. We’ll implement a chunk-oriented job that:

  • reads records from a CSV,

  • processes each record (e.g., transform/validate),

  • writes to a relational DB table.

 

1) Domain / DTO

package com.niteshsynergy.domain;

public class Person {
   private Long id;
   private String firstName;
   private String lastName;
   private Integer age;

   // constructors
   public Person() {}
   public Person(Long id, String firstName, String lastName, Integer age) {
       this.id = id; this.firstName = firstName; this.lastName = lastName; this.age = age;
   }

   // getters & setters
   public Long getId() { return id; }
   public void setId(Long id) { this.id = id; }
   public String getFirstName() { return firstName; }
   public void setFirstName(String firstName) { this.firstName = firstName; }
   public String getLastName() { return lastName; }
   public void setLastName(String lastName) { this.lastName = lastName; }
   public Integer getAge() { return age; }
   public void setAge(Integer age) { this.age = age; }

   @Override public String toString() {
       return "Person{id=" + id + ", firstName='" + firstName + '\'' +
              ", lastName='" + lastName + '\'' + ", age=" + age + '}';
   }
}
 

2) Sample CSV (src/main/resources/data/input.csv)

 

id,firstName,lastName,age
1,John,Doe,28
2,Jane,Smith,35
3,Bob,Brown,17
 

3) DB target table DDL (for example H2)

You can create the table on application startup (use schema.sql) or via JPA. Example schema.sql in src/main/resources:

CREATE TABLE person (
 id BIGINT PRIMARY KEY,
 first_name VARCHAR(100),
 last_name VARCHAR(100),
 age INTEGER
);
 

Spring Boot will pick up schema.sql and run it against the datasource on startup. (This is automatic for embedded databases such as H2; for an external database also set spring.sql.init.mode: always.)

4) FlatFileItemReader (CSV reader) — bean using builder

 

package com.niteshsynergy.batch;

import com.niteshsynergy.domain.Person;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;

@Configuration
public class ReaderConfig {

   @Bean
   public FlatFileItemReader<Person> reader() {
       return new FlatFileItemReaderBuilder<Person>()
               .name("personItemReader")
               .resource(new ClassPathResource("data/input.csv"))
               .linesToSkip(1) // skip the CSV header row
               .delimited()
               .names("id", "firstName", "lastName", "age")
               .targetType(Person.class) // sets up a BeanWrapperFieldSetMapper under the hood
               .build();
   }
}
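
If the file path should come from a job parameter (as with the input.file parameter in Phase 5), a step-scoped variant can late-bind it. A sketch, assuming the additional imports org.springframework.batch.core.configuration.annotation.StepScope and org.springframework.beans.factory.annotation.Value:

@Bean
@StepScope
public FlatFileItemReader<Person> parameterizedReader(
        @Value("#{jobParameters['input.file']}") String path) {
    return new FlatFileItemReaderBuilder<Person>()
            .name("personItemReader")
            .resource(new ClassPathResource(path)) // resolved at step start, not at context startup
            .linesToSkip(1)
            .delimited()
            .names("id", "firstName", "lastName", "age")
            .targetType(Person.class)
            .build();
}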
 

5) ItemProcessor (transform / validate)

Example: Filter out minors (age < 18) and uppercase names.

package com.niteshsynergy.batch;

import com.niteshsynergy.domain.Person;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
public class PersonProcessor implements ItemProcessor<Person, Person> {

   @Override
   public Person process(Person person) throws Exception {
       if (person.getAge() == null || person.getAge() < 18) {
           // return null to filter the item out (counted as filtered, not as a skip)
           return null;
       }
       person.setFirstName(person.getFirstName().toUpperCase());
       person.setLastName(person.getLastName().toUpperCase());
       return person;
   }
}
 

6) JdbcBatchItemWriter (write to DB)

We use a builder for JdbcBatchItemWriter. It maps bean properties to SQL parameters.

 

package com.niteshsynergy.batch;

import com.niteshsynergy.domain.Person;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.sql.DataSource;

@Configuration
public class WriterConfig {

   @Bean
   public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
       return new JdbcBatchItemWriterBuilder<Person>()
               .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
               .sql("INSERT INTO person (id, first_name, last_name, age) VALUES (:id, :firstName, :lastName, :age)")
               .dataSource(dataSource)
               .build();
   }
}
 

7) Chunk Step + Job configuration

 

package com.niteshsynergy.config;

import com.niteshsynergy.batch.PersonProcessor;
import com.niteshsynergy.domain.Person;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class ChunkJobConfig {

   private final ItemReader<Person> reader;
   private final PersonProcessor processor;
   private final ItemWriter<Person> writer;

   public ChunkJobConfig(ItemReader<Person> reader, PersonProcessor processor, ItemWriter<Person> writer) {
       this.reader = reader;
       this.processor = processor;
       this.writer = writer;
   }

   @Bean
   public Step personStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
       return new StepBuilder("personStep", jobRepository)
               .<Person, Person>chunk(5, transactionManager) // chunk size = commit interval
               .reader(reader)
               .processor(processor)
               .writer(writer)
               .build();
   }

   @Bean
   public Job personJob(JobRepository jobRepository, Step personStep) {
       return new JobBuilder("personJob", jobRepository)
               .incrementer(new RunIdIncrementer())
               .start(personStep)
               .build();
   }
}
 

  • chunk(5, transactionManager) sets the commit interval: 5 items per transaction. Tune it for throughput vs memory.

 

8) application.yml (update for CSV job)

 

spring:
 datasource:
   url: jdbc:h2:mem:batchdb;DB_CLOSE_DELAY=-1;MODE=MySQL
   driver-class-name: org.h2.Driver
   username: sa
   password:
 batch:
   job:
     enabled: true
   jdbc:
     initialize-schema: always

logging:
 level:
   org.springframework.batch: INFO
 

9) Running and expected behavior

  • On application start, Boot runs every Job bean it finds when spring.batch.job.enabled=true; set spring.batch.job.name=personJob to run just that one.

  • The reader reads each CSV line (skipping header), processor filters out minors (returns null for those) and uppercases names, writer inserts rows into person table in chunks.

  • Spring Batch creates its metadata tables (BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, etc.) automatically because initialize-schema: always.

Example console output includes Spring Batch job lifecycle logs and our System.out prints (if any).

 

Common tweaks & rules

Skip rows in processor

Return null to filter. Use SkipPolicy if you want to skip on exception rather than filtering.
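
For finer control than skipLimit plus exception classes, you can implement SkipPolicy yourself. A minimal sketch (the class name ParseErrorSkipPolicy is illustrative); attach it with .faultTolerant().skipPolicy(new ParseErrorSkipPolicy()):

import org.springframework.batch.core.step.skip.SkipPolicy;

public class ParseErrorSkipPolicy implements SkipPolicy {

    @Override
    public boolean shouldSkip(Throwable t, long skipCount) {
        // Skip bad numeric fields, but give up after 10 of them;
        // any other exception still fails the step.
        return t instanceof NumberFormatException && skipCount < 10;
    }
}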

Add logging per item

You can add an ItemWriteListener or ItemProcessListener to log items on read/process/write.

Using JPA instead of JDBC writer

Replace writer with JpaItemWriter<Person> if using JPA / Spring Data.

Handling headers / footers

FlatFileItemReader supports linesToSkip (header) and RecordSeparatorPolicy for custom formats.

Chunk size recommendation

Start with chunk size of 100–1000 for high throughput if records are small; for heavy processing, smaller size (10–100) may be better. Always test.

 

 

PHASE 5 — Job Parameters & Execution Control

Purpose

Every run of a batch job is identified by its Job Parameters.
They distinguish one run from another and are stored in the metadata tables (BATCH_JOB_INSTANCE identifies the instance; BATCH_JOB_EXECUTION_PARAMS holds the parameter values).

Without unique parameters, Spring Batch refuses to re-run a completed job.

 

Passing parameters programmatically

package com.niteshsynergy.runner;

import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.JobExecution;

@Component
public class ParameterizedJobRunner implements CommandLineRunner {

   private final JobLauncher jobLauncher;
   private final Job personJob;

   public ParameterizedJobRunner(JobLauncher jobLauncher, Job personJob) {
       this.jobLauncher = jobLauncher;
       this.personJob = personJob;
   }

   @Override
   public void run(String... args) throws Exception {
       JobExecution exec = jobLauncher.run(
               personJob,
               new JobParametersBuilder()
                       .addLong("time", System.currentTimeMillis())   // ensures uniqueness
                       .addString("input.file", "data/input.csv")
                       .toJobParameters()
       );
       System.out.println("Job executed with status: " + exec.getStatus());
   }
}
 

 Accessing parameters inside a Step

package com.niteshsynergy.job;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.stereotype.Component;

@Component
public class ParameterEchoTasklet implements Tasklet {

   @Override
   public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
       // Job parameters are always reachable through the step context at runtime.
       // (An @BeforeStep method on a tasklet only fires if the tasklet is also
       // registered as a listener; alternatively a @StepScope bean can inject
       // them via @Value("#{jobParameters['input.file']}").)
       String inputFile = (String) chunkContext.getStepContext()
               .getJobParameters().get("input.file");
       System.out.println("Input file parameter: " + inputFile);
       return RepeatStatus.FINISHED;
   }
}
 

 

Preventing duplicate job runs

If a job finished successfully with a specific parameter set, a rerun with the same parameters throws:

JobInstanceAlreadyCompleteException
 

Hence you either:

  • add a unique timestamp (as above), or

  • use a custom JobParametersIncrementer.

@Bean
public JobParametersIncrementer dailyIncrementer() {
   return parameters -> new JobParametersBuilder(parameters)
           .addLong("run.id", System.currentTimeMillis())
           .toJobParameters();
}
 

Then link it inside your Job builder.
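
For example, a sketch in the same Spring Batch 5 builder style used above (dailyJob and personStep are illustrative names):

@Bean
public Job dailyJob(JobRepository jobRepository, Step personStep,
                    JobParametersIncrementer dailyIncrementer) {
    return new JobBuilder("dailyJob", jobRepository)
            .incrementer(dailyIncrementer) // applied by Boot's runner or JobOperator.startNextInstance
            .start(personStep)
            .build();
}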

 

PHASE 6 — Error Handling / Skip / Retry / Listeners

 1. Skip Policy

Sometimes you want to skip faulty records instead of aborting the whole job.

.stepBuilderFactory.get("importStep")
   .<Person, Person>chunk(5)
   .reader(reader)
   .processor(processor)
   .writer(writer)
   .faultTolerant()
   .skipLimit(10)                       // max 10 skips allowed
   .skip(NumberFormatException.class)   // skip this exception type
   .build();
 

2. Retry Policy

For transient errors (e.g., network, DB), you can retry automatically.

.stepBuilderFactory.get("retryStep")
   .<Person, Person>chunk(5)
   .reader(reader)
   .processor(processor)
   .writer(writer)
   .faultTolerant()
   .retryLimit(3)
   .retry(org.springframework.dao.DeadlockLoserDataAccessException.class)
   .build();
 

3. Listeners

Listeners hook into lifecycle events of jobs or steps.

Example: JobExecutionListener

package com.niteshsynergy.listener;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.stereotype.Component;

@Component
public class JobCompletionListener implements JobExecutionListener {

   @Override
   public void beforeJob(JobExecution jobExecution) {
       System.out.println("Job Starting... Params: " + jobExecution.getJobParameters());
   }

   @Override
   public void afterJob(JobExecution jobExecution) {
       System.out.println("Job Finished with status: " + jobExecution.getStatus());
   }
}
 

Attach it to a job:

@Bean
public Job personJob(JobRepository jobRepository, Step personStep, JobCompletionListener listener) {
   return new JobBuilder("personJob", jobRepository)
           .incrementer(new RunIdIncrementer())
           .listener(listener)
           .start(personStep)
           .build();
}
 

StepExecutionListener works similarly (attach to a Step).
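
A hedged sketch of such a step-level listener (attach it with .listener(...) on the step builder; the class name StepTimingListener is illustrative):

package com.niteshsynergy.listener;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.stereotype.Component;

@Component
public class StepTimingListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        System.out.println("Step starting: " + stepExecution.getStepName());
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        System.out.println("Step " + stepExecution.getStepName()
                + " read " + stepExecution.getReadCount()
                + " items, wrote " + stepExecution.getWriteCount());
        return stepExecution.getExitStatus(); // keep the step's own exit status
    }
}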

4. Chunk listeners and logging

.stepBuilderFactory.get("personStep")
   .<Person, Person>chunk(5)
   .listener(new ItemProcessListener<>() {
       public void beforeProcess(Person item) { System.out.println("Processing: " + item); }
       public void afterProcess(Person item, Person result) { System.out.println("Done: " + result); }
       public void onProcessError(Person item, Exception e) { System.err.println("Error: " + e); }
   })
   ...
 

 

PHASE 7 — Advanced Job Flows (Conditional, Parallel, Partitioned)

 1. Conditional Flow (Control flow between steps)

You can link steps with conditions using the job builder's flow API.

@Bean
public Job conditionalJob(JobRepository jobRepository) {
   return new JobBuilder("conditionalJob", jobRepository)
           .start(step1())
               .on("COMPLETED").to(step2())
           .from(step1())
               .on("FAILED").to(failureHandlerStep())
           .end()
           .build();
}
 

  • on() matches exit status of a step.

  • You can route the next step accordingly.

  • Useful for decision logic after a validation or pre-check.
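
When the routing logic is richer than an exit-status match, a JobExecutionDecider can compute the status to route on. A small sketch (weekendDecider is an illustrative name; wire it between steps and match its result with .on("WEEKEND")...):

import java.time.DayOfWeek;
import java.time.LocalDate;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;

@Bean
public JobExecutionDecider weekendDecider() {
    return (jobExecution, stepExecution) -> {
        DayOfWeek day = LocalDate.now().getDayOfWeek();
        // The returned status is what .on("...") matches against
        return (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY)
                ? new FlowExecutionStatus("WEEKEND")
                : new FlowExecutionStatus("WEEKDAY");
    };
}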

2. Parallel Steps

You can run independent steps concurrently with Flow + split().

@Bean
public Job parallelJob(JobRepository jobRepository) {
   Flow flow1 = new FlowBuilder<Flow>("flow1").start(stepA()).build();
   Flow flow2 = new FlowBuilder<Flow>("flow2").start(stepB()).build();

   return new JobBuilder("parallelJob", jobRepository)
           .start(flow1)
           .split(new SimpleAsyncTaskExecutor()) // runs both flows in parallel
           .add(flow2)
           .end()
           .build();
}
 

3. Partitioning (Distributed Processing)

Partitioning allows splitting large datasets logically across multiple worker threads or even remote worker nodes.

a) Master step

@Bean
public Step masterStep(JobRepository jobRepository, Step slaveStep) {
   return new StepBuilder("masterStep", jobRepository)
           .partitioner("slaveStep", rangePartitioner())
           .step(slaveStep)
           .gridSize(4)
           .taskExecutor(new SimpleAsyncTaskExecutor())
           .build();
}
 

b) Partitioner

@Bean
public Partitioner rangePartitioner() {
   return gridSize -> {
       Map<String, ExecutionContext> map = new HashMap<>();
       int range = 1000;
       for (int i = 0; i < gridSize; i++) {
           ExecutionContext context = new ExecutionContext();
           context.putInt("min", i * range);
           context.putInt("max", (i + 1) * range - 1);
           map.put("partition" + i, context);
       }
       return map;
   };
}
 

c) Slave step

@Bean
public Step slaveStep(JobRepository jobRepository, PlatformTransactionManager transactionManager,
                     ItemReader<Person> reader, ItemProcessor<Person, Person> processor,
                     ItemWriter<Person> writer) {
   return new StepBuilder("slaveStep", jobRepository)
           .<Person, Person>chunk(100, transactionManager)
           .reader(reader)
           .processor(processor)
           .writer(writer)
           .build();
}
 

Each partition gets its own execution context (min, max) for filtering or paging data.
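
To actually consume min/max, bind them into a step-scoped reader. A sketch assuming a JDBC cursor reader over the person table (imports: JdbcCursorItemReader and its builder from org.springframework.batch.item.database, StepScope, @Value, and Spring's BeanPropertyRowMapper):

@Bean
@StepScope
public JdbcCursorItemReader<Person> partitionedReader(
        DataSource dataSource,
        @Value("#{stepExecutionContext['min']}") Integer min,
        @Value("#{stepExecutionContext['max']}") Integer max) {
    return new JdbcCursorItemReaderBuilder<Person>()
            .name("partitionedReader")
            .dataSource(dataSource)
            // Each partition reads only its own id range
            .sql("SELECT id, first_name, last_name, age FROM person WHERE id BETWEEN ? AND ?")
            .preparedStatementSetter(ps -> {
                ps.setInt(1, min);
                ps.setInt(2, max);
            })
            .rowMapper(new BeanPropertyRowMapper<>(Person.class))
            .build();
}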

 

4. Multi-Threaded Step

Simpler alternative to partitioning (single step, multiple threads):

.stepBuilderFactory.get("multiThreadStep")
   .<Person, Person>chunk(50)
   .reader(reader)
   .processor(processor)
   .writer(writer)
   .taskExecutor(new SimpleAsyncTaskExecutor())
   .throttleLimit(8)  // number of concurrent threads
   .build();
 

5. Best Practices for Advanced Jobs

Concern – Tip
Thread safety – Each ItemReader / ItemWriter must be thread-safe or partition-isolated.
DB load – Tune chunk & grid sizes to balance commits vs contention.
Restartability – Store partition info in the ExecutionContext.
Monitoring – Use JobExplorer or custom listeners to trace partitions.

 

PHASE 8 — Transactions, Restartability & Metadata

 

 Transaction Management

Spring Batch automatically manages transactions for chunk-oriented steps.
Each chunk (e.g., 5 items) runs within a transaction:

  • If all 5 succeed → commit

  • If any item fails → the whole chunk rolls back (and is retried or skipped only if fault tolerance is configured)

Configuration:

.stepBuilderFactory.get("transactionStep")
   .<Person, Person>chunk(5)
   .reader(reader)
   .processor(processor)
   .writer(writer)
   .transactionManager(new DataSourceTransactionManager(dataSource))
   .build();
 

 

Key Parameters:

Property – Meaning
chunk(size, txManager) – number of items per transaction (the commit interval)
commit-interval – older XML equivalent of the chunk size
rollback policy – customize rollback behavior (e.g., a no-rollback exception list)
isolation level – can be set through a TransactionAttribute

 

 

Restartability

Spring Batch jobs are restartable by default.
If a job fails, it can resume from where it stopped — provided:

  1. The Job allows restart (jobs are restartable by default; calling .preventRestart() forbids it)

  2. You haven’t changed input data locations/IDs drastically

  3. JobRepository is persistent (usually DB, not in-memory)

@Bean
public Job restartableJob(JobRepository jobRepository) {
   return new JobBuilder("restartableJob", jobRepository)
           .incrementer(new RunIdIncrementer())
           // restartable by default; add .preventRestart() only to forbid restarts
           .start(restartableStep())
           .build();
}
 

When a restart happens:

  • Completed steps don’t re-run.

  • Failed/incomplete steps resume.

 

Job Metadata Tables (auto-created by Spring Boot)

With initialize-schema: always, Spring Boot creates these tables in the configured database (H2, MySQL, etc.):

Table – Description
BATCH_JOB_INSTANCE – One row per logical job run (unique params)
BATCH_JOB_EXECUTION – Each execution attempt
BATCH_JOB_EXECUTION_PARAMS – Job parameters
BATCH_STEP_EXECUTION – Each step execution's details
BATCH_STEP_EXECUTION_CONTEXT – Serialized state/context per step
BATCH_JOB_EXECUTION_CONTEXT – Context carried across job restarts

 

You can query them in the H2 console (JDBC URL jdbc:h2:mem:batchdb; enable the console with spring.h2.console.enabled: true) to debug or audit job history.

 

Commit & Rollback Example

Simulate failure every few records:

@Component
public class FaultyProcessor implements ItemProcessor<Person, Person> {
   private int count = 0;

   @Override
   public Person process(Person item) {
       count++;
       if (count % 4 == 0) {
           throw new RuntimeException("Simulated error on record " + count);
       }
       return item;
   }
}
 

If chunk size = 5 → when the failure hits, that chunk rolls back; the step then fails unless fault tolerance is configured (with .faultTolerant().retryLimit(...) the chunk is retried up to the limit).

Rule:

  • Always use a persistent JobRepository (not in-memory) for restartability.

  • Keep readers/writers idempotent — same input shouldn't produce duplicate effects if the job restarts (see the MERGE sketch after this list).

  • Avoid non-deterministic logic in processor (like random numbers).
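
As an example of the idempotency rule, H2 (and several other databases) support MERGE, so a restarted job can rewrite the same rows without duplicate-key failures. A sketch replacing the INSERT writer from Phase 4 (idempotentWriter is an illustrative name):

@Bean
public JdbcBatchItemWriter<Person> idempotentWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Person>()
            .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
            // MERGE ... KEY (id) updates the row if it already exists (H2 syntax)
            .sql("MERGE INTO person (id, first_name, last_name, age) KEY (id) "
                    + "VALUES (:id, :firstName, :lastName, :age)")
            .dataSource(dataSource)
            .build();
}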

 

PHASE 9 — Integration & Scheduling

 

1. Launching Batch Jobs via REST API

You can expose an endpoint that triggers a job manually.

package com.niteshsynergy.controller;

import org.springframework.web.bind.annotation.*;
import org.springframework.batch.core.*;
import org.springframework.batch.core.launch.JobLauncher;
import java.util.Date;

@RestController
@RequestMapping("/batch")
public class JobController {

   private final JobLauncher jobLauncher;
   private final Job personJob;

   public JobController(JobLauncher jobLauncher, Job personJob) {
       this.jobLauncher = jobLauncher;
       this.personJob = personJob;
   }

   @GetMapping("/start")
   public String startJob(@RequestParam(defaultValue = "data/input.csv") String file) throws Exception {
       JobParameters params = new JobParametersBuilder()
               .addString("input.file", file)
               .addDate("run.date", new Date())
               .toJobParameters();

       JobExecution exec = jobLauncher.run(personJob, params);
       return "Job Status: " + exec.getStatus();
   }
}
 

 

Now visiting:

 

http://localhost:8080/batch/start
 

starts the job dynamically.

 

2. Scheduling Jobs (Automatic Runs)

You can schedule jobs with Spring Boot’s @EnableScheduling.

 

package com.niteshsynergy.scheduler;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.batch.core.*;
import org.springframework.batch.core.launch.JobLauncher;

@Component
public class BatchScheduler {

   private final JobLauncher jobLauncher;
   private final Job personJob;

   public BatchScheduler(JobLauncher jobLauncher, Job personJob) {
       this.jobLauncher = jobLauncher;
       this.personJob = personJob;
   }

   // Every day at 2 AM
   @Scheduled(cron = "0 0 2 * * ?")
   public void runNightlyJob() throws Exception {
       jobLauncher.run(
               personJob,
               new JobParametersBuilder()
                       .addLong("time", System.currentTimeMillis())
                       .toJobParameters());
       System.out.println("Scheduled job triggered!");
   }
}
 

Enable scheduling:

 

@SpringBootApplication
@EnableScheduling
public class SpringBatchBootApplication {
   public static void main(String[] args) {
       SpringApplication.run(SpringBatchBootApplication.class, args);
   }
}
 

3. Integration with Spring Data JPA

Instead of plain JDBC, you can use JPA to persist entities.

@Bean
public JpaItemWriter<Person> jpaWriter(EntityManagerFactory emf) {
   JpaItemWriter<Person> writer = new JpaItemWriter<>();
   writer.setEntityManagerFactory(emf);
   return writer;
}
 

Now the same job persists through JPA, provided Person is mapped as an entity (a minimal sketch follows) and spring-boot-starter-data-jpa is on the classpath.
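
A minimal sketch of that mapping, assuming Boot 3 (Jakarta Persistence); the annotations go on the existing Person class:

import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

@Entity
@Table(name = "person")
public class Person {
    @Id
    private Long id;          // assigned from the CSV, so no @GeneratedValue
    private String firstName; // maps to FIRST_NAME under the default naming strategy
    private String lastName;
    private Integer age;
    // constructors, getters & setters as shown in Phase 4
}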

4. Integration with Messaging or Cloud

Spring Batch integrates easily with:

  • Spring Cloud Task (for micro-batch jobs)

  • Kafka (via custom ItemReader/Writer)

  • Spring Integration (for event-driven triggering)

 

PHASE 10 — Monitoring, Testing & Best Practices

 

1. Job Monitoring Tools

Spring Batch Admin is deprecated (still useful as a reference).
Modern options:

  • Spring Boot Actuator

  • Custom REST Endpoints

  • JobExplorer / JobOperator APIs

@Autowired
private JobExplorer jobExplorer;

public void printLastJobStatus() {
   JobInstance lastInstance = jobExplorer.getLastJobInstance("personJob");
   if (lastInstance != null) {
       JobExecution lastExecution = jobExplorer.getLastJobExecution(lastInstance);
       System.out.println("Last run of " + lastInstance.getJobName()
               + " finished with status: " + lastExecution.getStatus());
   }
}
 

2. Unit Testing Batch Jobs

Use @SpringBatchTest + @SpringBootTest.

package com.niteshsynergy;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.batch.test.context.SpringBatchTest;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.batch.core.*;

@SpringBatchTest
@SpringBootTest
class BatchJobTests {

   @Autowired
   private JobLauncherTestUtils jobLauncherTestUtils;

   @Test
   void testJobRunsSuccessfully() throws Exception {
       JobExecution jobExecution = jobLauncherTestUtils.launchJob(
               new JobParametersBuilder()
                       .addLong("time", System.currentTimeMillis())
                       .toJobParameters()
       );
        assertEquals(BatchStatus.COMPLETED, jobExecution.getStatus());
   }
}
 

3. Logging & Metrics

Use Boot’s built-in metrics with Actuator:

<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
 

Add in application.yml:

management:
 endpoints:
   web:
     exposure:
       include: health,metrics,info
 

Monitor job execution counts and durations via:

/actuator/metrics/spring.batch.job
 

4. Performance Tuning Rules

Area – Recommendation
Chunk size – Tune based on record size and commit frequency
Reader/Writer – Prefer streaming readers for large files
Transactions – Keep chunk sizes moderate to reduce rollback overhead
Parallelism – Use multi-threaded or partitioned steps
Indexing – Index DB columns used in readers/writers
JVM – Increase heap and use G1GC for heavy loads
 

5. Common Pitfalls

  • Don't use an in-memory DB in production; always persist metadata.

  • Don't modify job parameters mid-run.

  • Don't share stateful readers across threads without synchronization.

  • Don't forget unique parameters, or you'll hit JobInstanceAlreadyCompleteException.

 
Final Project Setup:

com/niteshsynergy/
├── SpringBatchBootApplication.java
├── config/
│   ├── BatchConfig.java
│   ├── ChunkJobConfig.java
│   └── HelloJobConfig.java
├── domain/
│   └── Person.java
├── job/
│   ├── HelloWorldTasklet.java
│   ├── ParameterEchoTasklet.java
│   └── FaultyProcessor.java
├── batch/
│   ├── ReaderConfig.java
│   ├── WriterConfig.java
│   └── PersonProcessor.java
├── listener/
│   └── JobCompletionListener.java
├── controller/
│   └── JobController.java
├── scheduler/
│   └── BatchScheduler.java
└── runner/
    ├── JobLaunchingRunner.java
    └── ParameterizedJobRunner.java
 

Done!
 
 
 
 
 