西维蜀黍

【Distributed System】Replication - Leaderless Replication

Background

Single-leader and multi-leader replication are based on the idea that a client sends a write request to one node (the leader), and the database system takes care of copying that write to the other replicas. A leader determines the order in which writes should be processed, and followers apply the leader’s writes in the same order.

Some data storage systems take a different approach, abandoning the concept of a leader and allowing any replica to directly accept writes from clients. Some of the earliest replicated data systems were leaderless, but the idea was mostly forgotten during the era of dominance of relational databases. It once again became a fashionable architecture for databases after Amazon used it for its in-house Dynamo system. vi Riak, Cassandra, and Voldemort are open source datastores with leaderless replication models inspired by Dynamo, so this kind of database is also known as Dynamo-style.

  ...


【Distributed System】Replication - Multi-leader Replication (Master–master or Active/active Replication)

Multi-leader Replication (Master–master or Active/active Replication)

So far we have only considered replication architectures using a single leader. Although that is a common approach, there are interesting alternatives.

Leader-based replication has one major downside: there is only one leader, and all writes must go through it. If you can’t connect to the leader for any reason, for example due to a network interruption between you and the leader, you can’t write to the database.

A natural extension of the leader-based replication model is to allow more than one node to accept writes. Replication still happens in the same way: each node that processes a write must forward that data change to all the other nodes. We call this a multi-leader configuration (also known as master–master or active/active replication). In this setup, each leader simultaneously acts as a follower to the other leaders.

  ...


【Distributed System】Multi-region Active-active Architecture

演化

单机架构

我们从最简单的开始讲起。

假设你的业务处于起步阶段,体量非常小,那你的架构是这样的:

这个架构模型非常简单,客户端请求进来,业务应用读写数据库,返回结果,非常好理解。

但需要注意的是,这里的数据库是「单机」部署的,所以它有一个致命的缺点:一旦遭遇意外,例如磁盘损坏、操作系统异常、误删数据,那这意味着所有数据就全部「丢失」了,这个损失是巨大的。

  ...


【Distributed System】Scalability

Vertical Scalability

Vertical scaling, referred to as “scale up”, means the process of adding more power (CPU, RAM, etc.) to your servers. Horizontal scaling, referred to as “scale-out”, allows you to scale by adding more servers into your pool of resources.

When traffic is low, vertical scaling is a great option, and the simplicity of vertical scaling is its main advantage. Unfortunately, it comes with serious limitations.

  • Vertical scaling has a hard limit. It is impossible to add unlimited CPU and memory to a single server.
  • Vertical scaling does not have failover and redundancy. If one server goes down, the website/app goes down with it completely.
  ...


【Architecture】System Design - ID Mapping

Scope Title Bi-Directional Scalable Highly Available ID Mapping Question Description The e-commerce platform would like to maintain its own user id space so it can support users from different business / products Design a highly scalable, highly available bi-directional user id mapping service to bind an external user id to an internal e-commerce platform user id. map(external_uid) ==> internal_uid reverse_map(internal_uid) ==> external_uid Describe key components for your design, including database selection and design, cache design, logical flow etc.   ...