KSQL, an innovative language developed by Confluent, is a game-changer for anyone dealing with real-time data streaming using Apache Kafka. KSQL simplifies the complexity of writing and deploying powerful, scalable stream processing logic. It achieves this by providing an SQL-like syntax that many data professionals are already comfortable with.
This article is designed to be a comprehensive guide to KSQL, diving deep into its practical usage for data filtering, transformation, aggregation, and joining operations. We’ll explore how to leverage KSQL to unlock Kafka’s full potential. By the end, you’ll be able to integrate KSQL into your real-time data pipeline confidently and efficiently.
1. Introduction to KSQL
KSQL is a declarative, SQL-like stream processing language for Apache Kafka. It allows developers to write real-time, scalable, fault-tolerant stream processing applications on top of Kafka’s robust architecture.
KSQL is an excellent tool for scenarios that require continuous computation and aggregation over data, such as real-time analytics, online data enrichment, and anomaly detection.
2. Setting Up KSQL
First, you’ll need to start a KSQL server:
$ confluent local services ksql-server start
Then, you can interact with KSQL using the KSQL CLI:
$ ksql
With these two processes running, you’re ready to dive into KSQL’s stream processing capabilities.
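Once connected, you can sanity-check the setup from the CLI. The following standard KSQL statements list the Kafka topics the server can see and any streams or tables already registered:

```sql
-- List Kafka topics visible to the KSQL server
SHOW TOPICS;

-- List streams and tables registered in KSQL
SHOW STREAMS;
SHOW TABLES;
```

On a fresh installation, SHOW STREAMS and SHOW TABLES will return empty lists until you create your first objects.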
3. Understanding KSQL Data Models: Streams and Tables
In KSQL, data is categorized into Streams and Tables, two fundamental structures with distinct properties.
- Stream: A Stream in KSQL is an unbounded sequence of structured data records. Each record represents an individual event, which is immutable. In other words, once an event has occurred, it cannot be changed.
- Table: A Table in KSQL represents the current state of a computation: each key maps to its latest value, and that value is updated as new events arrive.
4. Building Blocks of KSQL
Let’s now delve into some practical examples demonstrating the use of KSQL for stream processing.
4.1 Creating a Stream
In KSQL, you can create a stream from a Kafka topic. The following command creates a stream named orders_stream, which contains three fields.
CREATE STREAM orders_stream (orderId INT, customerId INT, amount DOUBLE) WITH (KAFKA_TOPIC='orders_topic', VALUE_FORMAT='JSON', KEY='orderId');
This command creates a stream from the ‘orders_topic’ Kafka topic. It’s important to specify the correct VALUE_FORMAT (in this case ‘JSON’) and the key for each record (‘orderId’).
4.2 Creating a Table
Creating a KSQL table from a Kafka topic is quite similar to creating a stream. The following command creates a table named customers_table.
CREATE TABLE customers_table (customerId INT, customerName VARCHAR, age INT) WITH (KAFKA_TOPIC='customers_topic', VALUE_FORMAT='JSON', KEY='customerId');
The table creation syntax is almost identical to that of creating a stream. It creates a table from the ‘customers_topic’ Kafka topic, with ‘customerId’ as the key for each record.
4.3 Filtering a Stream
KSQL provides an intuitive, SQL-like syntax to filter a stream based on specific criteria. Below, we create a new stream large_orders that includes only orders with an amount greater than 100.
CREATE STREAM large_orders AS SELECT * FROM orders_stream WHERE amount > 100;
This command reads records from the ‘orders_stream’ stream, keeps only those where the order amount is greater than 100, and writes them to a new stream, ‘large_orders’.
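To inspect the filtered results interactively, you can run a transient query against the new stream from the CLI. Note that in classic KSQL a plain SELECT streams results until you cancel it, while newer ksqlDB versions require an explicit EMIT CHANGES clause for such push queries:

```sql
-- Classic KSQL: streams matching rows until cancelled (Ctrl+C)
SELECT * FROM large_orders;

-- Newer ksqlDB versions: push queries need EMIT CHANGES
SELECT * FROM large_orders EMIT CHANGES;
```

Unlike CREATE STREAM AS SELECT, a transient query does not write to a topic; it only prints results to your session.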
4.4 Transforming a Stream
KSQL provides a range of transformation capabilities. In the following example, we add a column that calculates the tax for each order.
CREATE STREAM orders_with_tax AS SELECT *, amount * 0.07 AS tax FROM orders_stream;
Here, we’ve created a new stream, ‘orders_with_tax’, that includes a new ‘tax’ column, computed as 7% of the order amount.
4.5 Aggregating a Stream
KSQL provides robust aggregation features similar to SQL. Below, we calculate the total order amount for each customer.
CREATE TABLE total_amount_per_customer AS SELECT customerId, SUM(amount) AS total_amount FROM orders_stream GROUP BY customerId;
This command creates a table ‘total_amount_per_customer’ that aggregates the total order amount per customer from the ‘orders_stream’ stream.
4.6 Joining a Stream and a Table
KSQL enables joining streams and tables, providing deeper insights from your data. The following command enriches orders_stream with customer information from customers_table.
CREATE STREAM orders_enriched AS SELECT o.orderId, o.amount, c.customerName FROM orders_stream o LEFT JOIN customers_table c ON o.customerId = c.customerId;
This creates a new stream, ‘orders_enriched’, which combines data from ‘orders_stream’ and ‘customers_table’ based on matching ‘customerId’ values.
4.7 Windowed Aggregations
With KSQL, you can perform windowed aggregations. This command calculates the total amount per customer in the last hour:
CREATE TABLE hourly_total_amount AS SELECT customerId, SUM(amount) AS total_amount FROM orders_stream WINDOW TUMBLING (SIZE 1 HOUR) GROUP BY customerId;
This command generates a table that aggregates the total order amount per customer for every hour from the ‘orders_stream’ stream.
4.8 Creating a Session Windowed Table
You can create session windowed tables to capture user activity sessions. Below is an example:
CREATE TABLE session_activity AS SELECT customerId, COUNT(*) AS activity_count FROM user_clicks_stream WINDOW SESSION (30 MINUTES) GROUP BY customerId;
This command creates a table ‘session_activity’ that counts the events in each customer’s activity session from the ‘user_clicks_stream’ stream, where a session is closed after 30 minutes of inactivity.
4.9 Persisting Queries
KSQL allows you to persist the results of a query into a new Kafka topic using the CREATE STREAM/TABLE AS SELECT statements.
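In fact, every CREATE STREAM AS SELECT or CREATE TABLE AS SELECT statement in the previous sections starts a persistent query: KSQL keeps it running and continuously writes its output to a backing Kafka topic. By default the topic takes the name of the stream or table, but you can set it explicitly in the WITH clause. A minimal sketch, where the topic name and partition count are illustrative:

```sql
-- Persist large orders to an explicitly named topic.
-- 'large_orders_out' and PARTITIONS=3 are illustrative choices.
CREATE STREAM large_orders_persisted
  WITH (KAFKA_TOPIC='large_orders_out', PARTITIONS=3, VALUE_FORMAT='JSON') AS
  SELECT * FROM orders_stream WHERE amount > 100;
```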
4.10 Terminating Persistent Queries
Finally, you can terminate a persistent query using the TERMINATE command.
TERMINATE CTAS_TOTAL_AMOUNT_PER_CUSTOMER_7;
This command terminates a persistent query with the ID ‘CTAS_TOTAL_AMOUNT_PER_CUSTOMER_7’.
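If you don’t know the query ID, you can list all running persistent queries first. KSQL generates IDs with a CSAS_ or CTAS_ prefix derived from the statement that created the query:

```sql
-- List running persistent queries and their IDs
SHOW QUERIES;
```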
Conclusion
In the world of real-time data streaming, KSQL provides a familiar SQL-like syntax to unleash the full power of Apache Kafka. Whether you’re a data engineer looking to streamline your data pipeline, a data analyst aiming to extract real-time insights, or a developer looking to build real-time applications, KSQL can significantly simplify your interaction with Kafka.
Remember, the essence of mastering KSQL lies not just in being able to write KSQL queries but also in understanding when and how to apply it in your stream processing tasks. The better you grasp KSQL, the more time and effort you can save on the intricacies of stream processing.
With the continuous advancements in Apache Kafka and its ecosystem, KSQL will surely play a pivotal role in shaping how developers approach real-time data processing. Embrace the power of KSQL, and you’ll discover a new level of efficiency and sophistication in your data streaming tasks.