There are a few techniques that enable a database to scale out with more servers and therefore more capacity. Some techniques work better for specific types of databases or for specific vendors. Let’s look at each technique in turn.
Database Scaling Pattern #1: Read Replicas
Read replicas allow data to be available for reading across any number of servers, called “slaves”. One server remains the “master” and accepts any incoming write requests, along with read requests. This technique is common for relational databases, as most vendors support replication of data to multiple read-only servers. The more read replicas installed, the more read-based queries may be scaled.
In addition to supporting a greater scale of read-only queries, the read replica strategy is used to offload long-running reports that require long-running queries to dedicated servers and away from typical user traffic. This technique is also used for high availability of the database, since the master read/write server can be replaced by a slave server as it is an exact copy.
To take advantage of the read replica scaling technique, applications must be designed to direct read-based queries to any one of the read replicas, while directing database changes (create, update, delete) to the master server. As you likely noted, this technique works best when you have limited data changes but need to scale out read queries.
Database Scaling Pattern #2: Multi-Master
While the read replica technique allows for scaling out reads, what happens if you need to scale out to a large number of writes as well? The multi-master technique may be used to allow any client to write data to any database server. This enables all read replicas to be a master rather than just slaves. This enables applications to scale out the number of reads and writes. However, this also requires that our applications generate universally unique identifiers, also known as “UUIDs”, or sometimes referring to as globally unique identifiers or “GUIDs”. Otherwise, two rows in the same table on two different servers might result in the same ID, causing a data collision during the multi-master replication process.
Database Scaling Pattern #3: Partitioning
Very large data sets often produce so much data that any one server cannot access or modify the data by itself without severly impacting scale and performance. This kind of problem cannot be solved through read replicas or multi-master designs. Instead, the data must be separated in some way to allow it to be easily accessible.
Horizontal partitioning, also called “sharding”, distributes data across servers. Data may be partioned to different server(s) based on a specific customer/tenant, date range, or other sharding scheme. Vertical partioning separates the data associated to a single table and groups it into frequently accessed and rarely accessed. The pattern chosen allows for the database and database cache to manage less information at once. In some cases, data patterns may be selected to move data across multiple filesystems for parallel reading and therefore increased performance.
Database Scaling Pattern #4: Connection Pooling
When an application needs to communicate with a database, it will need to make a client connection, send the requested query, and receive the results from the database. Since there is network overhead associated with establishing the client connection to the database, connections are often established by the application server initially, then the connection is reused. As you scale your application, you will want to have more connections available to service each incoming request. This technique is called connection pooling.
With connection pooling, the application server creates a pool of connections to the database, preventing the need to establish a new database connection on each incoming request.
Database Scaling Pattern #5: Connection Load Balancing
Each database vendor is designed to handle a maximum number of client connections from application servers. This means that as you scale your software, a database will be required to handle more concurrent connections from the increasing number of application servers available. To overcome this limitation while ensuring your application scales, a technique known as connecting load balancing is required.
A database load balancer supports a higher number of concurrent connections from your application servers. The balancer is then responsible for distributing the incoming database query to one of the available servers using its connection pool. This not only reduces the number of overall client connections that a database must support, it also removes the need for applications to perform connection failover to another server in the event of a database server failure. Instead, the load balancer handles this job directly.