【Kafka】Best Practice

Posted by 西维蜀黍 on 2021-11-03, Last Modified on 2022-02-19

Naming

  • Valid characters for Kafka topics are the following:
    • ASCII alphanumerics
    • .
    • _
    • -
  • Avoid using any non-English words.
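
The naming rules above can be checked before a topic is created. The following is a minimal sketch, not an official validator: the function name `is_valid_topic_name` is made up for illustration, and the 249-character limit and the rejection of "." and ".." come from Kafka's own topic validation rather than from the list above.

```python
import re

# Characters the list above allows: ASCII alphanumerics, '.', '_' and '-'.
_VALID_TOPIC = re.compile(r"[A-Za-z0-9._-]+")

# Kafka itself also caps topic names at 249 characters and rejects "." and "..".
MAX_TOPIC_LENGTH = 249


def is_valid_topic_name(name: str) -> bool:
    """Return True if `name` only uses characters allowed in a topic name."""
    if name in (".", "..") or len(name) > MAX_TOPIC_LENGTH:
        return False
    # fullmatch with '+' also rejects the empty string.
    return _VALID_TOPIC.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("app_log_sg", "app log", "日志"):
        print(candidate, is_valid_topic_name(candidate))
```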

Retention Time

  • The retention time is how long message data is kept on Kafka. The default in this guideline is 3 days (note that an unconfigured Kafka broker defaults to 7 days, log.retention.hours=168), and it is NOT recommended to exceed 7 days.
  • It is recommended to reduce the retention time if the data size grows.
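
Retention is usually set per topic through the `retention.ms` topic config, which takes milliseconds. As a rough sketch, the helper below converts days to that value and enforces the 7-day ceiling recommended above; the helper name and the `topic_config` dict are hypothetical and not tied to any particular client library.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def retention_ms(days: int) -> int:
    """Convert a retention period in days to milliseconds for `retention.ms`."""
    # Assumption: enforce the "no more than 7 days" recommendation from the text.
    if not 0 < days <= 7:
        raise ValueError("retention should be between 1 and 7 days")
    return days * MS_PER_DAY


# Topic-level config as it could be passed to an admin tool or client.
topic_config = {"retention.ms": str(retention_ms(3))}
```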

Data Size

  • How to calculate the data size per partition: (the max length of each msg) x (the max num of msgs per day) x (retention days) ÷ (the num of partitions)
  • The data size per topic/partition should not exceed 100GB. Keeping each topic under 100GB avoids a long failover time if a broker goes down or the whole Kafka cluster gets network partitioned. If a topic grows too large, shard it into multiple topics or add partitions.
  • It is recommended to enable data compression on the client side to reduce the data size and bandwidth.
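
The formula above is easy to turn into a quick capacity check. This is only a sketch under stated assumptions: the numbers in the example are invented, 1 GB is taken as 10^9 bytes, messages are assumed to be spread evenly across partitions, and replication and compression are ignored.

```python
GB = 10**9  # assumption: decimal gigabytes
LIMIT_BYTES = 100 * GB  # the 100GB ceiling recommended above


def partition_data_size_bytes(max_msg_bytes: int, max_msgs_per_day: int,
                              retention_days: int, num_partitions: int) -> float:
    """(max msg length) x (max msgs per day) x (retention days) / (partitions)."""
    return max_msg_bytes * max_msgs_per_day * retention_days / num_partitions


def within_limit(size_bytes: float) -> bool:
    """Return True if the estimated size stays under the 100GB ceiling."""
    return size_bytes <= LIMIT_BYTES


# Hypothetical workload: 1 KiB messages, 50M messages/day, 3 days, 12 partitions.
size = partition_data_size_bytes(1024, 50_000_000, 3, 12)
print(f"{size / GB:.1f} GB per partition, within limit: {within_limit(size)}")
```

Running the same calculation with a longer retention or fewer partitions shows quickly when a topic should be sharded or repartitioned.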

Partition Num

Partitioning splits a topic into multiple partitions. It can speed up consumption through parallel processing if you have multiple consumer instances in a consumer group.

  • Don't split a topic into too many partitions: use no more than 96, and preferably fewer than 64.
  • Note that the number of partitions can be increased dynamically, but it cannot be decreased for an existing topic.
  • It's probably a good idea to limit the number of partitions per broker to 100 x b x r, where b is the number of brokers in a Kafka cluster and r is the replication factor. So with 5 brokers and a replication factor of 3, we should keep the total below 1500 partitions in one Kafka cluster.
    • Leader election takes about 5ms per partition, and initializing the metadata from ZooKeeper takes about 2ms per partition, so recovery takes longer when a topic has many partitions.
  • The message data should be evenly distributed across multiple partitions.
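
The partition budget and the recovery costs quoted above are simple arithmetic, sketched below. The function names are made up for illustration, and the 5ms and 2ms figures are the per-partition estimates from the list, not measured values; the two costs are reported separately because the text does not say they always occur together.

```python
ELECTION_MS_PER_PARTITION = 5   # leader election cost quoted above
METADATA_MS_PER_PARTITION = 2   # ZooKeeper metadata init cost quoted above


def partition_budget(brokers: int, replication_factor: int) -> int:
    """Rule of thumb from the text: 100 x b x r."""
    return 100 * brokers * replication_factor


def leader_election_ms(partitions: int) -> int:
    """Estimated time to elect new leaders for `partitions` partitions."""
    return partitions * ELECTION_MS_PER_PARTITION


def metadata_init_ms(partitions: int) -> int:
    """Estimated time to load metadata for `partitions` partitions."""
    return partitions * METADATA_MS_PER_PARTITION


budget = partition_budget(brokers=5, replication_factor=3)
print(budget, leader_election_ms(budget), metadata_init_ms(budget))
```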

Topic Num

Sharding splits a topic into multiple topics. For example,

  • The topic app_log can be split into multiple topics: app_info_log, app_error_log, app_fatal_log, …
  • It is also recommended to shard a topic by region: app_log_sg, app_log_my, app_log_vn, …
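
On the producer side, region-based sharding comes down to choosing the topic name before sending. A minimal sketch, assuming the region codes from the example above and a hypothetical helper name `topic_for_region`:

```python
# Region codes taken from the example above; extend as needed.
REGIONS = {"sg", "my", "vn"}


def topic_for_region(base_topic: str, region: str) -> str:
    """Return the sharded topic name, e.g. app_log + sg -> app_log_sg."""
    region = region.lower()
    if region not in REGIONS:
        # Fail loudly instead of silently creating an unexpected topic.
        raise ValueError(f"unknown region: {region!r}")
    return f"{base_topic}_{region}"


print(topic_for_region("app_log", "SG"))
```

Rejecting unknown regions keeps a typo from producing to a topic that nobody consumes.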
