Topic properties and retention policies determine how Apache Kafka stores data, how long it keeps that data, and how durable messages are. Tuning these settings lets you control topic behavior, meet data retention requirements, and use storage efficiently. This article walks through configuring topic properties and retention policies in Kafka, with code samples, reference links, and resources to guide you through the process.
Configuring Topic Properties:
- Topic-Level Configuration:
  - Kafka lets you customize the behavior of individual topics. The number of partitions and the replication factor are chosen when a topic is created, while topic-level configs such as the cleanup policy (cleanup.policy) override the broker defaults for that topic.
- Number of Partitions:
  - The number of partitions determines the parallelism and scalability of message processing. Within a consumer group, each partition is read by only one consumer, so more partitions allow more consumers to work in parallel and raise throughput. The partition count can be increased later but not decreased.
- Replication Factor:
  - The replication factor determines the number of replicas for each partition. Replicating data across multiple brokers provides durability and fault tolerance: with a replication factor of N, a partition can survive up to N-1 broker failures without losing committed messages.
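To make the link between partition count and consumer parallelism concrete, here is a minimal sketch that deals partitions out to the members of one consumer group in turn. The class name, method name, and consumer IDs are hypothetical, and the round-robin loop is a simplification for illustration, not the code of Kafka's built-in assignors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionAssignmentSketch {

    // Deal partition numbers 0..numPartitions-1 out to the consumers in turn.
    // Simplified illustration only; Kafka's real assignors are more involved.
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("at least one consumer is required");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three partitions, four consumers: the fourth consumer receives nothing.
        System.out.println(assign(3, List.of("c1", "c2", "c3", "c4")));
        // prints {c1=[0], c2=[1], c3=[2], c4=[]}
    }
}
```

The idle fourth consumer shows why adding consumers beyond the partition count does not add throughput.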
Code Sample: Creating a Topic with Configured Properties in Java
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;
    import java.util.Collections;
    import java.util.Properties;
    import java.util.concurrent.ExecutionException;

    public class KafkaTopicConfigurationExample {
        public static void main(String[] args) {
            String topicName = "my_topic";
            int numPartitions = 3;
            // A replication factor of 2 requires at least two brokers in the cluster
            short replicationFactor = 2;

            Properties properties = new Properties();
            properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            // try-with-resources closes the admin client when the block exits
            try (AdminClient adminClient = AdminClient.create(properties)) {
                // Create the topic with configured properties
                NewTopic newTopic = new NewTopic(topicName, numPartitions, replicationFactor);
                // get() blocks until the broker confirms the topic was created
                adminClient.createTopics(Collections.singleton(newTopic)).all().get();
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        }
    }
Reference Link: Apache Kafka Documentation – Topic-Level Configurations – https://kafka.apache.org/documentation/#topicconfigs
Configuring Retention Policies:
- Log Compaction:
  - Kafka supports log compaction, enabled by setting cleanup.policy to compact. Compaction guarantees that at least the most recent value for each key is retained, while older records for the same key are eventually removed. This is useful when the topic acts as a changelog of the current state of entities.
- Time-Based Retention:
  - The retention.ms setting defines how long messages are kept in a topic. Messages older than this period become eligible for deletion. Deletion happens one log segment at a time, so some messages can remain slightly longer than the configured period.
- Size-Based Retention:
  - The retention.bytes setting defines the maximum size a log can reach before older segments are deleted. The limit applies to each partition, not to the topic as a whole, so a topic can occupy up to the limit multiplied by its number of partitions.
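The effect of log compaction can be sketched with plain Java collections. The example below is an illustration only: the class and method names are hypothetical, and the loop simply keeps the last value seen for each key. In Kafka itself, compaction runs in the background on closed log segments, so older duplicates can stay visible for a while.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LogCompactionSketch {

    // Return the end state of a compacted log: one record per key, holding the
    // last value written for that key, ordered by the position of that last write.
    static Map<String, String> compact(List<Map.Entry<String, String>> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : log) {
            // Remove first so the key moves to the position of its newest record
            latest.remove(record.getKey());
            latest.put(record.getKey(), record.getValue());
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> log = List.of(
                Map.entry("user1", "a"),
                Map.entry("user2", "b"),
                Map.entry("user1", "c"));
        System.out.println(compact(log));
        // prints {user2=b, user1=c}
    }
}
```

The earlier value for user1 is gone, while user2, which was written only once, is kept unchanged.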
Code Sample: Configuring Retention Policies for a Topic in Java
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;
    import org.apache.kafka.common.config.TopicConfig;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutionException;

    public class KafkaTopicRetentionExample {
        public static void main(String[] args) {
            String topicName = "my_topic";
            long retentionTimeMs = 86400000L; // 24 hours

            Properties properties = new Properties();
            properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient adminClient = AdminClient.create(properties)) {
                ConfigResource topicResource = new ConfigResource(ConfigResource.Type.TOPIC, topicName);

                // Describe the topic to retrieve its current configuration
                Config topicConfig = adminClient.describeConfigs(Collections.singleton(topicResource))
                        .all().get().get(topicResource);
                System.out.println("Current setting: " + topicConfig.get(TopicConfig.RETENTION_MS_CONFIG));

                // Build a SET operation for the retention time configuration
                ConfigEntry retentionEntry =
                        new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, String.valueOf(retentionTimeMs));
                AlterConfigOp setRetention = new AlterConfigOp(retentionEntry, AlterConfigOp.OpType.SET);

                // incrementalAlterConfigs changes only the listed entries and leaves the rest untouched
                Map<ConfigResource, Collection<AlterConfigOp>> updateConfigs = new HashMap<>();
                updateConfigs.put(topicResource, Collections.singleton(setRetention));
                adminClient.incrementalAlterConfigs(updateConfigs).all().get();
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        }
    }
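The sample above changes only retention.ms. As a rough sketch of how time-based and size-based retention can be expressed together, the helper below builds the corresponding topic settings as plain strings. The class name, method name, and chosen limits are hypothetical; the keys cleanup.policy, retention.ms, and retention.bytes are Kafka's topic config names.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

class RetentionSettingsSketch {

    // Build topic-level settings for delete-based retention.
    // Kafka expects every config value as a string.
    static Map<String, String> retentionSettings(Duration maxAge, long maxBytesPerPartition) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("cleanup.policy", "delete");
        settings.put("retention.ms", String.valueOf(maxAge.toMillis()));
        settings.put("retention.bytes", String.valueOf(maxBytesPerPartition));
        return settings;
    }

    public static void main(String[] args) {
        // Keep data for 7 days or 1 GiB per partition, whichever limit is reached first
        Map<String, String> settings = retentionSettings(Duration.ofDays(7), 1024L * 1024 * 1024);
        settings.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

A map like this can be passed to NewTopic.configs(...) when creating a topic, or each entry can be wrapped in a ConfigEntry and an AlterConfigOp with the SET operation to update an existing topic.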
Reference Link: Apache Kafka Documentation – Log Compaction and Retention – https://kafka.apache.org/documentation/#compactionandretention
Helpful Video: “Apache Kafka – Configuring Retention and Cleanup Policies” by Simplilearn – https://www.youtube.com/watch?v=_xF7k5jpzRQ
Conclusion:
Configuring topic properties and retention policies in Apache Kafka allows you to optimize the behavior of individual topics, ensure data durability, and manage storage space effectively. By configuring the number of partitions, replication factor, log compaction, and retention time, you can customize topics to suit your application’s requirements.
In this article, we explored the process of configuring topic properties and retention policies in Kafka. The provided code samples demonstrated the creation of a topic with configured properties and the configuration of retention policies. The reference links to the official Kafka documentation and the suggested video resource offer further insights into topic configuration and retention policies.
By understanding and effectively configuring topic properties and retention policies, you can build scalable, reliable, and efficient data streaming applications using Apache Kafka.