This post summarizes JDBC Batch Operations as described in the official Spring documentation.

JDBC Batch Operations

Most JDBC drivers provide improved performance if you batch multiple calls to the same prepared statement. By grouping updates into batches, you limit the number of round trips to the database.

In short, most JDBC drivers perform better when several calls to the same prepared statement are grouped into one batch, because grouping the updates reduces the number of round trips to the database.
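To make the round-trip saving concrete, here is a minimal plain-JDBC sketch of what batching looks like underneath any template class. The PlainJdbcBatch class and the ActorRow record are hypothetical names made up for this illustration; the SQL is borrowed from the t_actor example later in this post, and only standard java.sql calls are used. How many network round trips a batch really costs depends on the driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class PlainJdbcBatch {

    /** Hypothetical row holder for this sketch; not part of the original post. */
    record ActorRow(long id, String firstName, String lastName) {}

    /**
     * Queues one parameter set per row on a single PreparedStatement and
     * submits them together with executeBatch().
     */
    static int[] updateActors(Connection connection, List<ActorRow> rows) throws SQLException {
        String sql = "update t_actor set first_name = ?, last_name = ? where id = ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            for (ActorRow row : rows) {
                ps.setString(1, row.firstName());
                ps.setString(2, row.lastName());
                ps.setLong(3, row.id());
                ps.addBatch(); // adds this parameter set to the batch; nothing is executed yet
            }
            // Submits the whole batch; the result has one update count per queued entry.
            return ps.executeBatch();
        }
    }
}
```

The JdbcTemplate variants described below wrap this same addBatch/executeBatch pattern, so the gain they offer is the one the driver offers here.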

Basic Batch Operations with JdbcTemplate

You accomplish JdbcTemplate batch processing by implementing two methods of a special interface, BatchPreparedStatementSetter, and passing that implementation in as the second parameter in your batchUpdate method call. You can use the getBatchSize method to provide the size of the current batch. You can use the setValues method to set the values for the parameters of the prepared statement. This method is called the number of times that you specified in the getBatchSize call. The following example updates the t_actor table based on entries in a list, and the entire list is used as the batch:

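In other words, you do batch processing with JdbcTemplate by implementing the two methods of the BatchPreparedStatementSetter interface and passing that implementation as the second argument to batchUpdate.

getBatchSize reports the size of the current batch, and setValues sets the parameter values of the prepared statement. setValues is called once per entry, as many times as getBatchSize returns, and receives the zero-based index i of the current entry.

The example below updates the t_actor table from the entries in a list, using the whole list as a single batch.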

public class JdbcActorDao implements ActorDao {

	private JdbcTemplate jdbcTemplate;

	public void setDataSource(DataSource dataSource) {
		this.jdbcTemplate = new JdbcTemplate(dataSource);
	}

	public int[] batchUpdate(final List<Actor> actors) {
		return this.jdbcTemplate.batchUpdate(
				"update t_actor set first_name = ?, last_name = ? where id = ?",
				new BatchPreparedStatementSetter() {
					public void setValues(PreparedStatement ps, int i) throws SQLException {
						Actor actor = actors.get(i);
						ps.setString(1, actor.getFirstName());
						ps.setString(2, actor.getLastName());
						ps.setLong(3, actor.getId().longValue());
					}
					public int getBatchSize() {
						return actors.size();
					}
				});
	}

	// ... additional methods
}

If you process a stream of updates or reading from a file, you might have a preferred batch size, but the last batch might not have that number of entries. In this case, you can use the InterruptibleBatchPreparedStatementSetter interface, which lets you interrupt a batch once the input source is exhausted. The isBatchExhausted method lets you signal the end of the batch.

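When you read from a file or process a stream of updates, you may have a preferred batch size, but the last batch may hold fewer entries than that. For this case there is the InterruptibleBatchPreparedStatementSetter interface, which lets you cut a batch short once the input source is exhausted.

Its isBatchExhausted method is how you signal that the batch has ended.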

Batch Operations with a List of Objects

Both the JdbcTemplate and the NamedParameterJdbcTemplate provides an alternate way of providing the batch update. Instead of implementing a special batch interface, you provide all parameter values in the call as a list. The framework loops over these values and uses an internal prepared statement setter. The API varies, depending on whether you use named parameters. For the named parameters, you provide an array of SqlParameterSource, one entry for each member of the batch. You can use the SqlParameterSourceUtils.createBatch convenience methods to create this array, passing in an array of bean-style objects (with getter methods corresponding to parameters), String-keyed Map instances (containing the corresponding parameters as values), or a mix of both.

Both JdbcTemplate and NamedParameterJdbcTemplate offer a second way to run a batch update. Instead of implementing a special batch interface, you pass all parameter values in the call as a list, and the framework iterates over them with an internal prepared statement setter.

The API differs depending on whether you use named parameters. With named parameters you supply an array of SqlParameterSource, one element per batch entry. The SqlParameterSourceUtils.createBatch convenience methods build that array from bean-style objects (whose getters correspond to the parameters), from String-keyed Map instances (whose values are the parameters), or from a mix of both.

Put simply: skip the special interface, hand over the parameter values as a list or array, and the framework loops over them and runs the batch for you.

The following example shows a batch update using named parameters:

public class JdbcActorDao implements ActorDao {

	private NamedParameterJdbcTemplate namedParameterJdbcTemplate;

	public void setDataSource(DataSource dataSource) {
		this.namedParameterJdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
	}

	public int[] batchUpdate(List<Actor> actors) {
		return this.namedParameterJdbcTemplate.batchUpdate(
				"update t_actor set first_name = :firstName, last_name = :lastName where id = :id",
				SqlParameterSourceUtils.createBatch(actors));
	}

	// ... additional methods
}

For an SQL statement that uses the classic ? placeholders, you pass in a list containing an object array with the update values. This object array must have one entry for each placeholder in the SQL statement, and they must be in the same order as they are defined in the SQL statement.

The following example is the same as the preceding example, except that it uses classic JDBC ? placeholders:

For SQL that uses the classic ? placeholders, you pass a list of object arrays holding the update values. Each object array needs exactly one entry per placeholder, in the same order in which the placeholders appear in the SQL statement.

Put simply: put the values for each statement into one array, collect those arrays in a list, and make sure the number and order of the values match the ? placeholders exactly. The snippet below shows the idea with a small user table.

List<Object[]> batchArgs = List.of(
    new Object[]{"Alice", 30, 1},
    new Object[]{"Bob", 25, 2},
    new Object[]{"Charlie", 40, 3}
);

jdbcTemplate.batchUpdate(
    "UPDATE user SET name = ?, age = ? WHERE id = ?",
    batchArgs
);

All of the batch update methods that we described earlier return an int array containing the number of affected rows for each batch entry. This count is reported by the JDBC driver. If the count is not available, the JDBC driver returns a value of -2.

Every batch update method described so far returns an int array holding the number of affected rows for each batch entry. The JDBC driver reports that number; when the count is not available, the driver returns -2.
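The -2 value is the standard java.sql.Statement.SUCCESS_NO_INFO constant, so the returned array should not be summed blindly. Below is a small sketch of how the result could be read; the UpdateCounts class and its method names are hypothetical helpers for this post, not Spring API.

```java
import java.sql.Statement;

final class UpdateCounts {

    /** Sums only the row counts the driver actually reported (values >= 0). */
    static long knownRowTotal(int[] counts) {
        long total = 0;
        for (int count : counts) {
            if (count >= 0) {
                total += count;
            }
        }
        return total;
    }

    /** Counts entries that succeeded but whose row count is unknown (-2). */
    static int unknownEntries(int[] counts) {
        int unknown = 0;
        for (int count : counts) {
            if (count == Statement.SUCCESS_NO_INFO) {
                unknown++;
            }
        }
        return unknown;
    }
}
```

For a result such as {1, 0, -2, 3}, knownRowTotal gives 4 and unknownEntries gives 1. A check like "every entry must equal 1" would wrongly fail on a driver that reports -2 for successful statements.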

In such a scenario, with automatic setting of values on an underlying PreparedStatement, the corresponding JDBC type for each value needs to be derived from the given Java type. While this usually works well, there is a potential for issues (for example, with Map-contained null values). Spring, by default, calls ParameterMetaData.getParameterType in such a case, which can be expensive with your JDBC driver. You should use a recent driver version and consider setting the spring.jdbc.getParameterType.ignore property to true (as a JVM system property or via the SpringProperties mechanism) if you encounter a specific performance issue for your application.

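In this scenario the values are set on the underlying PreparedStatement automatically, so the JDBC type of each value has to be derived from its Java type. That usually works, but it can go wrong, for example with null values held in a Map.

By default Spring calls ParameterMetaData.getParameterType in such cases, which can be expensive depending on the JDBC driver. Use a recent driver version, and if your application runs into a specific performance problem, consider setting the spring.jdbc.getParameterType.ignore property to true, either as a JVM system property or through the SpringProperties mechanism.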
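As a sketch of the system-property route: the property name is the one given in the documentation, while the GetParameterTypeFlag class is a hypothetical holder for this post. I am assuming the property has to be set before Spring's JDBC classes are first used, since flags like this are typically read once at startup.

```java
final class GetParameterTypeFlag {

    /** Property name taken from the Spring documentation quoted above. */
    static final String PROPERTY = "spring.jdbc.getParameterType.ignore";

    /**
     * Sets the flag programmatically. Assumption: call this at the very start
     * of main(), before any Spring JDBC code has run.
     */
    static void ignoreParameterTypeLookup() {
        System.setProperty(PROPERTY, "true");
    }
}
```

The same effect is available without code by starting the JVM with -Dspring.jdbc.getParameterType.ignore=true, which avoids any ordering concern.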

As of 6.1.2, Spring bypasses the default getParameterType resolution on PostgreSQL and MS SQL Server. This is a common optimization to avoid further roundtrips to the DBMS just for parameter type resolution which is known to make a very significant difference on PostgreSQL and MS SQL Server specifically, in particular for batch operations. If you happen to see a side effect, for example, when setting a byte array to null without specific type indication, you may explicitly set the spring.jdbc.getParameterType.ignore=false flag as a system property (see above) to restore full getParameterType resolution.

As of Spring 6.1.2, the default getParameterType resolution is bypassed on PostgreSQL and MS SQL Server. This is a common optimization that avoids extra round trips to the DBMS made only to resolve parameter types, and it is known to make a significant difference on those two databases, especially for batch operations.

If you notice a side effect, for example when setting a byte array to null without an explicit type, you can set spring.jdbc.getParameterType.ignore=false as a system property to restore full getParameterType resolution.

Alternatively, you could consider specifying the corresponding JDBC types explicitly, either through a BatchPreparedStatementSetter (as shown earlier), through an explicit type array given to a List<Object[]> based call, through registerSqlType calls on a custom MapSqlParameterSource instance, through a BeanPropertySqlParameterSource that derives the SQL type from the Java-declared property type even for a null value, or through providing individual SqlParameterValue instances instead of plain null values.

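Alternatively, you can specify the JDBC types explicitly. The options are a BatchPreparedStatementSetter (shown earlier), an explicit type array passed to a List<Object[]>-based call, registerSqlType calls on a custom MapSqlParameterSource, a BeanPropertySqlParameterSource (which derives the SQL type from the declared Java property type even for a null value), or individual SqlParameterValue instances in place of plain null values.

When Spring sets values on a PreparedStatement for you, it has to work out the JDBC type of each value. Getting that right avoids errors, but the lookup can be costly or can fail in some situations, so specifying the types yourself is sometimes the better choice.

Concretely, when Spring writes values to the database through JdbcTemplate and related classes, it may call getParameterType() internally to answer "what type is this value?". That usually works, but when the value is null, including a null held in a Map, the type may not be inferred correctly or the lookup may be slow.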
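Of these options, the explicit type array is the easiest to show without Spring on the classpath. The sketch below prepares the two inputs for the List<Object[]>-based call: the rows (one of which contains a null) and one java.sql.Types constant per ? placeholder. TypedBatchArgs, sampleRows and requireAligned are hypothetical names for this post; the Spring call itself appears only in a comment.

```java
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

final class TypedBatchArgs {

    // One JDBC type per "?" in:
    // update t_actor set first_name = ?, last_name = ? where id = ?
    static final int[] ARG_TYPES = {Types.VARCHAR, Types.VARCHAR, Types.BIGINT};

    /** Sample rows; the null last_name is the case where type inference can struggle. */
    static List<Object[]> sampleRows() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"Alice", null, 1L});
        rows.add(new Object[] {"Bob", "Smith", 2L});
        return rows;
    }

    /** Fails fast when a row does not have exactly one value per declared type. */
    static void requireAligned(List<Object[]> rows, int[] argTypes) {
        for (int i = 0; i < rows.size(); i++) {
            if (rows.get(i).length != argTypes.length) {
                throw new IllegalArgumentException(
                        "row " + i + " has " + rows.get(i).length
                                + " values but " + argTypes.length + " types were declared");
            }
        }
    }

    // With Spring available, the call would look like:
    // jdbcTemplate.batchUpdate(sql, TypedBatchArgs.sampleRows(), TypedBatchArgs.ARG_TYPES);
}
```

Because the type of the null is stated up front as VARCHAR, no lookup is needed to find out what the second placeholder expects.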

This issue also turned up as the root cause of a performance degradation described by Toss Tech when running batch operations with JdbcTemplate.

https://sangyunpark99.tistory.com/entry/Toss-Tech-Spring-JDBC-%EC%84%B1%EB%8A%A5-%EB%AC%B8%EC%A0%9C-%EB%84%A4%ED%8A%B8%EC%9B%8C%ED%81%AC-%EB%B6%84%EC%84%9D%EC%9C%BC%EB%A1%9C-%ED%8C%8C%EC%95%85%ED%95%98%EA%B8%B0

[Toss Tech] Spring JDBC performance problems, identified through network analysis

That post summarizes the Toss Tech article "Spring JDBC 성능 문제, 네트워크 분석으로 파악하기". Source: https://toss.tech/article/engineering-note-7

Batch Operations with Multiple Batches

The preceding example of a batch update deals with batches that are so large that you want to break them up into several smaller batches. You can do this with the methods mentioned earlier by making multiple calls to the batchUpdate method, but there is now a more convenient method. This method takes, in addition to the SQL statement, a Collection of objects that contain the parameters, the number of updates to make for each batch, and a ParameterizedPreparedStatementSetter to set the values for the parameters of the prepared statement. The framework loops over the provided values and breaks the update calls into batches of the size specified.

The following example shows a batch update that uses a batch size of 100:

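The batch updates shown so far become unwieldy when a batch is so large that you would rather split it into several smaller ones. You could do that by calling batchUpdate several times with the methods above, but there is a more convenient method.

Besides the SQL statement, it takes a Collection of objects that carry the parameters, the number of updates per batch, and a ParameterizedPreparedStatementSetter that sets the parameter values of the prepared statement. The framework iterates over the values and splits the update calls into batches of the given size.

The example below uses a batch size of 100.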

public class JdbcActorDao implements ActorDao {

	private JdbcTemplate jdbcTemplate;

	public void setDataSource(DataSource dataSource) {
		this.jdbcTemplate = new JdbcTemplate(dataSource);
	}

	public int[][] batchUpdate(final Collection<Actor> actors) {
		int[][] updateCounts = jdbcTemplate.batchUpdate(
				"update t_actor set first_name = ?, last_name = ? where id = ?",
				actors,
				100,
				(PreparedStatement ps, Actor actor) -> {
					ps.setString(1, actor.getFirstName());
					ps.setString(2, actor.getLastName());
					ps.setLong(3, actor.getId().longValue());
				});
		return updateCounts;
	}

	// ... additional methods
}

The batch update method for this call returns an array of int arrays that contains an array entry for each batch with an array of the number of affected rows for each update. The top-level array’s length indicates the number of batches run, and the second level array’s length indicates the number of updates in that batch. The number of updates in each batch should be the batch size provided for all batches (except that the last one that might be less), depending on the total number of update objects provided. The update count for each update statement is the one reported by the JDBC driver. If the count is not available, the JDBC driver returns a value of -2.

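This call returns an array of int arrays: one entry per batch, each holding the affected-row counts of the updates in that batch. The length of the top-level array is the number of batches that ran, and the length of each second-level array is the number of updates in that batch.

Every batch contains exactly the configured batch size of updates, except the last one, which may contain fewer depending on the total number of update objects.

Each update count is the value reported by the JDBC driver; when the count is not available, the driver returns -2.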
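The shape of the int[][] result follows from simple arithmetic, which the sketch below spells out. BatchShape and its methods are hypothetical helpers for this post; they only model the sizes described above and do not call Spring.

```java
final class BatchShape {

    /**
     * Returns the expected length of each second-level array when
     * totalUpdates objects are split into batches of batchSize.
     */
    static int[] expectedBatchSizes(int totalUpdates, int batchSize) {
        if (totalUpdates < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalUpdates must be >= 0 and batchSize > 0");
        }
        int batches = (totalUpdates + batchSize - 1) / batchSize; // ceiling division
        int[] sizes = new int[batches];
        for (int i = 0; i < batches; i++) {
            // Full batches first; the last one takes whatever remains.
            sizes[i] = Math.min(batchSize, totalUpdates - i * batchSize);
        }
        return sizes;
    }

    /** Counts how many individual updates a returned int[][] covers. */
    static int countUpdates(int[][] updateCounts) {
        int updates = 0;
        for (int[] batch : updateCounts) {
            updates += batch.length;
        }
        return updates;
    }
}
```

With 250 actors and a batch size of 100, expectedBatchSizes yields {100, 100, 50}: three batches, the last one short. Comparing countUpdates on the real result against the size of the input collection is a cheap sanity check that nothing was skipped.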
