When You Increase Kafka Partitions

Source: https://arpitbhayani.me/blogs/kafka-partitions Date: 2025-12-13

Partitions sit right in the middle of how Kafka works. They define ordering, parallelism, and how far it can scale. But what actually happens when you need more of them? How does Kafka grow that number, what happens to the data you already have, and which guarantees stay intact after the change?



This essay takes you through the complete picture of partition expansion in Kafka. We will

  • cover the mechanics of adding partitions,
  • understand why existing data does not move,
  • explore the implications for key-based ordering, and
  • dive deep into the consumer rebalancing protocols

    Why Increase Partitions

    Before diving into the mechanics, let us understand the scenarios that drive partition expansion. Kafka partitions determine several critical aspects of your system.

    Throughput scales with partitions. Producers can write to different partitions in parallel, and consumers in a group can read from different partitions simultaneously. If you have three partitions and three consumers, each consumer handles one partition. Adding more partitions allows you to add more consumers and increase parallelism.

    Storage distribution improves with more partitions. Each partition is hosted on a broker (node), so more partitions means better distribution of data across your cluster. This prevents any single broker from becoming a storage bottleneck.

    Consumer concurrency is bounded by partition count. You cannot have more active consumers in a consumer group than partitions. If your topic has four partitions, the fifth consumer will sit idle. Scaling consumer processing power requires scaling partition count.
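
    The bound described above reduces to a one-line calculation. A minimal sketch, where `active_and_idle` is a hypothetical helper name and a single topic with a single consumer group is assumed:

```python
def active_and_idle(consumers: int, partitions: int) -> tuple[int, int]:
    """Return (active, idle) consumer counts for one consumer group on one topic."""
    # Each partition is read by at most one consumer in the group,
    # so at most `partitions` consumers can be doing work.
    active = min(consumers, partitions)
    return active, consumers - active


print(active_and_idle(consumers=5, partitions=4))  # (4, 1)
```

    With four partitions and five consumers, one consumer has nothing to read until the partition count grows.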

    Common scenarios for partition expansion include:

    • Traffic growth exceeding current throughput capacity
    • Adding consumers to process backlogs faster
    • Proactive scaling based on projected growth
    • Rebalancing load across a cluster after adding new brokers

    Mechanics of Adding Partitions

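    Kafka provides two primary methods for adding partitions to an existing topic. The first is the CLI approach using the kafka-topics.sh script with its --alter option; the second is programmatic, through the Admin client's createPartitions call.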

    The number you specify is the total partition count, not the number to add. If your topic currently has 3 partitions and you want to add 3 more, you specify 6.
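
    To make the "total, not delta" rule concrete, here is a small sketch that only builds the kafka-topics.sh argument list and does not run it. The helper name, the topic name `orders`, and the broker address `localhost:9092` are placeholders chosen for illustration.

```python
def alter_partitions_command(topic: str, current: int, to_add: int,
                             bootstrap_server: str = "localhost:9092") -> list[str]:
    """Build the argument list for raising a topic's partition count."""
    if to_add <= 0:
        # Kafka only allows the partition count to grow.
        raise ValueError("to_add must be positive")
    total = current + to_add  # --partitions takes the new total, not the delta
    return [
        "kafka-topics.sh",
        "--bootstrap-server", bootstrap_server,
        "--alter",
        "--topic", topic,
        "--partitions", str(total),
    ]


print(" ".join(alter_partitions_command("orders", current=3, to_add=3)))
```

    Going from 3 to 6 partitions produces a command that ends in `--partitions 6`.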

    When you execute the command, Kafka performs several operations:

    1. The controller updates the topic metadata in the cluster to reflect the new partition count.
    2. New partition directories are created on the brokers (nodes) assigned to host them.
    3. The partition metadata is propagated to all brokers.
    4. Producers and consumers receive updated metadata on their next refresh cycle.

    Importantly, this operation is additive only. Kafka does not support reducing the number of partitions. Once you add partitions, they are permanent unless you delete and recreate the topic.

    What Happens to Existing Data

    When you add partitions to a Kafka topic, existing data does not move. The messages already written to partitions 0, 1, and 2 stay exactly where they are. The new partitions 3, 4, and 5 start empty.

    This choice has real consequences. Right after you expand a topic, the old partitions still hold all the historical messages while the new ones start out empty. Over time, as new messages arrive, this imbalance gradually corrects itself.

    Consider a topic with 3 partitions that has been running for a month with 100 million messages. After expanding to 6 partitions, you have 100 million messages spread across partitions 0 to 2, and zero messages in partitions 3 to 5. If your consumers care about total lag or throughput parity, this imbalance matters.
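
    A rough way to see how slowly the imbalance fades is to compute the share of all messages still sitting in the original partitions. This is a sketch under two simplifying assumptions that are not from the article: new messages spread evenly across all partitions, and retention does not delete anything.

```python
def old_partition_share(existing: int, new_messages: int,
                        old_count: int, new_count: int) -> float:
    """Fraction of all messages that sit in the original partitions."""
    # Existing messages never move; new ones are assumed to spread evenly
    # over all partitions (an idealisation of a well-distributed key space).
    in_old = existing + new_messages * old_count / new_count
    return in_old / (existing + new_messages)


for new in (0, 100_000_000, 900_000_000):
    share = old_partition_share(100_000_000, new, old_count=3, new_count=6)
    print(f"{new:>11,} new messages -> {share:.0%} in partitions 0-2")
```

    Under these assumptions the share drops from 100% to 75% after another 100 million messages, and is still 55% after 900 million. A perfectly balanced topic would sit at 50%.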

    So why doesn’t Kafka shuffle the old data into the new partitions? It comes down to how Kafka is built.

    Kafka is built for append-only, immutable logs. Moving data between partitions would require rewriting message offsets, breaking consumer position tracking, and creating complex consistency challenges. The engineering complexity and operational risk of automatic redistribution far outweigh the benefits.

    If you need to redistribute existing data across more partitions, you must do it yourself. The standard approach is to create a new topic with the desired partition count, stream all data from the old topic to the new topic, and switch your producers and consumers to the new topic.
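
    The copy step can be modelled in memory to see why it keeps per-key order intact. In this sketch a topic is a list of partitions, each a list of (key, value) records. `crc32` is a stand-in hash used only for illustration (Kafka's default partitioner uses Murmur2), and both function names are hypothetical.

```python
import zlib


def partition_for_key(key: str, partitions: int) -> int:
    # crc32 is a stand-in hash for illustration, not Kafka's Murmur2.
    return zlib.crc32(key.encode()) % partitions


def migrate(old_topic: list[list[tuple[str, int]]],
            new_count: int) -> list[list[tuple[str, int]]]:
    """Replay every (key, value) record into a fresh topic with new_count partitions."""
    new_topic = [[] for _ in range(new_count)]
    for partition in old_topic:
        for key, value in partition:  # read each old partition in offset order
            new_topic[partition_for_key(key, new_count)].append((key, value))
    return new_topic


# Build a 3-partition topic with 5 ordered events per key, then copy it to 6 partitions.
old = [[] for _ in range(3)]
for seq in range(5):
    for key in ("user-1", "user-2", "user-3", "user-4"):
        old[partition_for_key(key, 3)].append((key, seq))
new = migrate(old, 6)
print(sum(len(p) for p in new))  # 20
```

    Because all records for a key live in one old partition and are replayed in offset order, they arrive in the same relative order in their single new partition. A real migration also has to handle writes that arrive during the copy, which this model leaves out.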

    The Key Ordering Problem

    One of the trickiest parts of adding partitions is what it does to key ordering.

    When a producer sends a message with a key, Kafka uses a deterministic hash function to select the partition. The default partitioner uses the Murmur2 hashing algorithm.

    The critical part is the modulus operation against the partition count. When you change the number of partitions, the modulus changes, and keys that previously went to one partition may now go to a different partition.
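
    A short sketch of the hash-then-modulus step, and of how many keys are affected when the count changes. Consecutive integers stand in for real Murmur2 hash values here, and both function names are hypothetical.

```python
def partition_for_hash(key_hash: int, partitions: int) -> int:
    """Map a key hash to a partition the way a hash-mod partitioner does."""
    # Mask to a non-negative 31-bit value first, then take the modulus.
    return (key_hash & 0x7FFFFFFF) % partitions


def moved_fraction(old_count: int, new_count: int, sample: int = 12_000) -> float:
    """Share of hash values whose partition changes when the count changes."""
    moved = sum(
        partition_for_hash(h, old_count) != partition_for_hash(h, new_count)
        for h in range(sample)
    )
    return moved / sample


print(moved_fraction(4, 6))
```

    Going from 4 to 6 partitions, about two-thirds of the hash values land on a different partition. Doubling from 3 to 6 moves half of them.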

    Here’s a concrete example. Suppose you have a topic with 4 partitions and a message key “user-123”. The Murmur2 hash of “user-123” might be 7654329. With 4 partitions, this maps to partition 1 (7654329 % 4 = 1).

    Now you expand to 6 partitions. The same key “user-123” with the same hash 7654329 now maps to partition 3 (7654329 % 6 = 3).

    Here is what this means for your app.

    Before partition expansion, all messages for “user-123” went to partition 1 and were processed by a single consumer. The consumer saw all messages for this user in order.

    After partition expansion, new messages for “user-123” go to partition 3 while old messages remain in partition 1. If a consumer is processing both partitions, messages for “user-123” are no longer guaranteed to be processed in order.

    Even worse, different consumers might process the old and new messages for the same key. Consumer A might be assigned partition 1 (old messages) while Consumer B handles partition 3 (new messages). You now have concurrent processing of what should be sequential updates.

    This can mess up patterns like event sourcing, where the order of events matters for rebuilding state. It breaks stream processing aggregations, where you accumulate state per key. It breaks any business logic that assumes all events for an entity arrive at the same processor.

    Maintaining Ordering Guarantees

    With ordering on the line, the question becomes: how do you expand partitions without breaking things for keyed workloads?

    The first strategy is to accept the temporary inconsistency. If your old data has been fully consumed and processed, the ordering disruption only affects keys with messages straddling the expansion. For append-only analytics or logging use cases, this is often acceptable.

    The second strategy is the dual-write approach during a transition period. Before expanding partitions, start writing to both the old topic and a new topic with more partitions. Continue reading from the old topic until it drains. Then switch consumers to the new topic.

    The third strategy is to use application-level ordering. Instead of relying on Kafka’s partition-based ordering, include a sequence number or timestamp in your messages. Your consumer logic reorders messages regardless of arrival sequence.
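
    One way to sketch application-level ordering is a small per-key resequencer on the consumer side. It assumes the producer stamps each key's messages with consecutive sequence numbers starting at 0, which is an assumption of this example rather than something Kafka provides. The class name is hypothetical.

```python
from collections import defaultdict


class Resequencer:
    """Releases messages per key in sequence order, buffering any that arrive early."""

    def __init__(self):
        self._next = defaultdict(int)      # key -> next expected sequence number
        self._pending = defaultdict(dict)  # key -> {seq: payload} waiting for a gap to fill

    def accept(self, key, seq, payload):
        """Take one message and return the payloads that are now deliverable in order."""
        if seq < self._next[key]:
            return []  # already delivered: drop the duplicate
        self._pending[key].setdefault(seq, payload)
        ready = []
        while self._next[key] in self._pending[key]:
            ready.append(self._pending[key].pop(self._next[key]))
            self._next[key] += 1
        return ready


r = Resequencer()
print(r.accept("user-123", 1, "second"))  # []
print(r.accept("user-123", 0, "first"))   # ['first', 'second']
```

    The buffer grows without bound if a sequence number never arrives, so a production version would need a timeout or a size limit.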

    For Kafka Streams applications, the situation is more complex.

    Streams maintains state stores backed by changelog topics, and the state is partitioned. Adding partitions to input topics can cause keys to be redistributed, but the associated state does not follow them.

    Hence, the recommended approach for stateful Kafka Streams applications is to create new topics rather than expanding existing ones.

    Consumer Group Rebalancing

    When the partition count changes, consumers have to reshuffle their work. That’s handled by Kafka’s rebalancing protocol.

    The original rebalancing protocol used an eager, stop-the-world approach. When rebalancing triggered, every consumer in the group stopped processing. The group coordinator recalculated partition assignments. All consumers received new assignments and resumed. During this window, throughput dropped to zero.

    This was problematic for large consumer groups. A group with 100 consumers and 200 partitions might spend 30 seconds or more in rebalancing, during which no messages are processed.

    Kafka 2.4 introduced the incremental cooperative rebalancing protocol through KIP-429. Instead of stopping all consumers, this protocol works in phases. Consumers who need to give up partitions release them first. In a subsequent rebalance, those partitions are assigned to other consumers. Consumers who keep their partitions continue processing throughout.

    This significantly reduces the impact of rebalancing. Instead of 30 seconds of zero throughput, you might see individual partitions pause for a few seconds while overall throughput remains high.
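
    The difference between the two protocols can be sketched as a question of which partitions each consumer has to give up when moving from an old assignment to a new one. This is a simplified model with hypothetical function names, not the actual protocol messages.

```python
Assignment = dict[str, set[int]]  # consumer -> partitions it owns


def eager_revocations(old: Assignment, new: Assignment) -> Assignment:
    """Eager protocol: every consumer gives up everything before reassignment."""
    return {consumer: set(parts) for consumer, parts in old.items()}


def cooperative_revocations(old: Assignment, new: Assignment) -> Assignment:
    """Cooperative protocol: a consumer only gives up partitions it is losing."""
    return {
        consumer: parts - new.get(consumer, set())
        for consumer, parts in old.items()
    }


# Expanding a topic from 3 to 6 partitions: nobody loses anything they already own.
old = {"A": {0, 1}, "B": {2}}
new = {"A": {0, 1, 3}, "B": {2, 4, 5}}
print(eager_revocations(old, new))        # {'A': {0, 1}, 'B': {2}}
print(cooperative_revocations(old, new))  # {'A': set(), 'B': set()}
```

    In the expansion case the cooperative model revokes nothing, so consumers keep processing their existing partitions while the new ones are handed out.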

    Kafka 4.0 brought the next generation consumer rebalance protocol through KIP-848. This moves the assignment logic from the consumer group leader to the broker-side group coordinator. Rebalancing becomes fully asynchronous, meaning consumers not affected by the partition changes experience no interruption at all.

    Partition Assignment Strategies

    Kafka provides several partition assignment strategies, each with different behaviors during rebalancing.

    The RangeAssignor assigns partitions on a per-topic basis. It sorts partitions and consumers, then divides partitions among consumers as evenly as possible. This can lead to imbalanced assignments across multiple topics: whenever a topic's partition count does not divide evenly among the consumers, the same first consumers receive the extra partitions.
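
    A minimal sketch of range-style assignment for a single topic, written from the description above rather than taken from Kafka's source. The function name, consumer names, and topic names are illustrative.

```python
def range_assign(consumers: list[str], partition_count: int) -> dict[str, list[int]]:
    """Range-style assignment for one topic: contiguous blocks, extras to the first consumers."""
    members = sorted(consumers)
    base, extra = divmod(partition_count, len(members))
    assignment, start = {}, 0
    for i, member in enumerate(members):
        size = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + size))
        start += size
    return assignment


# Two topics with 3 partitions each, two consumers.
for topic in ("orders", "payments"):
    print(topic, range_assign(["c1", "c2"], 3))
```

    Both topics give c1 partitions 0 and 1 and c2 only partition 2, so c1 ends up with four partitions and c2 with two.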

    The RoundRobinAssignor distributes partitions across consumers in a round-robin fashion across all subscribed topics. This achieves a better balance but does not attempt to preserve assignments during rebalancing.
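
    A sketch of round-robin assignment, assuming every consumer subscribes to every topic. The function name and the topic names are illustrative, and this is a simplified model of the idea rather than Kafka's implementation.

```python
from itertools import cycle


def round_robin_assign(consumers: list[str],
                       topics: dict[str, int]) -> dict[str, list[tuple[str, int]]]:
    """Deal every (topic, partition) pair out to consumers in turn."""
    all_partitions = [
        (topic, p) for topic in sorted(topics) for p in range(topics[topic])
    ]
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    # zip stops when the partition list is exhausted, so cycle() is safe here.
    for member, tp in zip(cycle(members), all_partitions):
        assignment[member].append(tp)
    return assignment


result = round_robin_assign(["c1", "c2"], {"orders": 3, "payments": 3})
print({member: len(parts) for member, parts in result.items()})  # {'c1': 3, 'c2': 3}
```

    The same two topics that left a range-style assignment lopsided now split three and three.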

    The StickyAssignor aims for balanced distribution like RoundRobin but also minimizes partition movements during rebalancing. If Consumer 2 leaves, partitions are redistributed with minimal changes to Consumer 1’s assignments.

    The CooperativeStickyAssignor adds cooperative protocol support to the StickyAssignor. This is the recommended choice for most production deployments.

    When you add partitions, the assignment strategy determines how quickly and smoothly work is redistributed. With sticky assignors, existing assignments remain stable while new partitions are assigned to consumers with capacity.
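
    The sticky behaviour during expansion can be approximated with a greedy rule: leave every existing assignment untouched and hand each new partition to whichever consumer currently owns the fewest. This is a deliberately simplified sketch with a hypothetical function name, not the real StickyAssignor algorithm.

```python
def sticky_expand(current: dict[str, list[int]],
                  new_partitions: list[int]) -> dict[str, list[int]]:
    """Keep every existing assignment and give each new partition to the least-loaded consumer."""
    assignment = {member: list(parts) for member, parts in current.items()}
    for partition in sorted(new_partitions):
        # Ties are broken by consumer name so the result is deterministic.
        target = min(assignment, key=lambda m: (len(assignment[m]), m))
        assignment[target].append(partition)
    return assignment


print(sticky_expand({"c1": [0, 1], "c2": [2]}, [3, 4, 5]))
# {'c1': [0, 1, 4], 'c2': [2, 3, 5]}
```

    Partitions 0, 1, and 2 stay where they were, and the three new partitions even out the load.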

    Exactly-once Semantics and Partition Expansion

    Exactly-once in Kafka is built on two mechanisms. Idempotent producers ensure that retried messages are not duplicated within a partition. Transactions ensure atomic writes across multiple partitions.

    Idempotent producers work per-partition. Each producer is assigned a Producer ID (PID), and each message includes a sequence number. The broker deduplicates based on PID, partition, and sequence number.

    When you add partitions, idempotent production continues to work for the original partitions. New partitions start fresh with their own sequence tracking. There is no semantic issue here.
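
    A toy model of the broker-side check helps show why new partitions cause no trouble. Each partition keeps its own record of the last sequence number seen per producer. The real broker tracks batches and producer epochs, which this sketch leaves out, and the class name is hypothetical.

```python
class PartitionLog:
    """Toy broker-side log for one partition with per-producer duplicate detection."""

    def __init__(self):
        self.records = []
        self._last_seq = {}  # producer id -> last sequence number appended

    def append(self, pid: int, seq: int, value: str) -> str:
        last = self._last_seq.get(pid, -1)
        if seq <= last:
            return "duplicate"      # a retry of something already written: ignore it
        if seq != last + 1:
            return "out_of_order"   # a gap means an earlier write is missing: reject
        self.records.append(value)
        self._last_seq[pid] = seq
        return "appended"


log = PartitionLog()
print(log.append(pid=7, seq=0, value="a"))  # appended
print(log.append(pid=7, seq=0, value="a"))  # duplicate
print(log.append(pid=7, seq=1, value="b"))  # appended
```

    A newly added partition is simply a fresh `PartitionLog` whose sequence tracking starts empty, independent of the state held by partitions 0 to 2.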

    Transactions are more nuanced. A transaction can span multiple partitions, and Kafka ensures that either all writes succeed or none do.

    Adding partitions during an active transaction does not affect the transaction. The new partitions are simply available for future transactions. However, if your transaction logic assumes certain partition counts or key-to-partition mappings, expanding partitions mid-operation could violate those assumptions.

    For stream processing applications using exactly-once semantics, the transactional ID should encode partition information to ensure correct fencing after failures.

    This pattern ensures that each input partition has a dedicated transactional producer, maintaining exactly-once guarantees across restarts and rebalances.
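
    One possible naming scheme for such IDs is sketched below. `transactional.id` is the producer setting being derived; the `app-topic-partition` format, the application name `billing`, and the helper names are assumptions made for illustration.

```python
def transactional_id(app_id: str, topic: str, partition: int) -> str:
    """Derive a stable transactional.id for the producer that owns one input partition."""
    return f"{app_id}-{topic}-{partition}"


def ids_for_topic(app_id: str, topic: str, partition_count: int) -> list[str]:
    """List the transactional IDs for every partition of a topic."""
    return [transactional_id(app_id, topic, p) for p in range(partition_count)]


before = ids_for_topic("billing", "orders", 3)
after = ids_for_topic("billing", "orders", 6)
print(after[:3] == before)  # True: existing partitions keep their IDs
print(after[3:])            # ['billing-orders-3', 'billing-orders-4', 'billing-orders-5']
```

    With a scheme like this, expanding the topic adds new IDs for the new partitions without changing the IDs already in use.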

    Best Practices

    1. Plan partition counts for future growth. Over-partition initially if you expect significant traffic increases. Having idle partitions costs little, while under-partitioning forces painful migrations.
    2. Verify that your consumers handle rebalancing correctly and that your application tolerates the temporary ordering disruption for keyed messages.
    3. Use sticky assignors to minimize partition movement.
    4. Monitor during and after expansion. Watch consumer lag, rebalance metrics, and application error rates for at least an hour after expanding partitions.
    5. For stateful applications, prefer creating new topics over expanding existing ones. The migration effort is worth the consistency guarantees.

    Conclusion

    Growing the number of partitions in Kafka is easy to do, but the ripple effects can get complicated. The kafka-topics.sh command finishes in seconds, and it pays to understand what happens afterward.

    The gist is that the existing data stays where it is, key-to-partition mappings change and break ordering assumptions, consumer groups rebalance with configurable levels of disruption, and stateful stream processing applications need special care.