Shen Jian: Best practices in 58 city database architecture

Disclaimer: This article was first published on CSDN. Reproduction in any form without permission is prohibited; for inquiries, contact the commissioning editor listed at the end of this article.

Basic database concepts

This section covers basic concepts, mainly so that everyone agrees on what a few database terms mean.

First is the "single database". This is how databases are used at the very beginning, and almost every business starts with one.

Next is "sharding". Sharding solves the problem of large data volume: when the amount of data becomes very large, it is split horizontally. Some databases support auto-sharding. 58 city used MongoDB for two years, but later found its auto-sharding hard to control. There was no telling when data would be migrated, and the migration took coarse-grained locks that blocked reads and writes, so the business saw jitter and latency spikes. The business could not accept that, so the data has now been moved back to MySQL.

Once the data is sharded, you face the "data routing" problem: each request has to be routed to the database that holds its data. Three data routing methods are in common use on the Internet:

(1) The first is routing by data range. For example, with two shards, one covers the range 0 to 100 million and the other covers 100 million to 200 million, and requests are routed accordingly.
The advantage of this approach is that it is very simple and scales well: if two shards are not enough, just add a shard for 200 million to 300 million.
The disadvantage is that although the data is evenly distributed, with roughly the same amount in each database, the request load is not. In some business scenarios, for example, newly registered users are more active, so the shard holding the higher range carries a heavier request load.

(2) The second is routing by hash. For example, with two shards, take the key modulo 2 to find the shard.
The advantage of this method is that routing is very simple, the data is evenly distributed, and the request load is evenly distributed too.
The disadvantage is that if the two shards grow too large and must become three, migrating the data is troublesome. In other words, expansion is constrained.

(3) The third is a routing service. The two methods above share a drawback: the business lines are coupled to the routing rule, so when the rule changes, every business line has to upgrade in step. A routing service decouples the business lines from the routing rule. Before each database access, the business line calls the routing service to find out which database actually stores the data.
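The three routing methods can be sketched as follows. This is a minimal illustration, not 58 city's implementation: the shard names, the 100-million range width, and the RoutingService class are assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical shard names and boundaries, chosen only for illustration.
RANGE_BOUNDS = [100_000_000, 200_000_000]  # exclusive upper bound of each range shard
RANGE_SHARDS = ["db_range_0", "db_range_1"]
HASH_SHARDS = ["db_hash_0", "db_hash_1"]


def route_by_range(user_id: int) -> str:
    """Pick the shard whose range contains user_id."""
    index = bisect_right(RANGE_BOUNDS, user_id)
    if user_id < 0 or index >= len(RANGE_SHARDS):
        raise KeyError(f"no range shard covers id {user_id}")
    return RANGE_SHARDS[index]


def route_by_hash(user_id: int) -> str:
    """Pick the shard by taking the id modulo the shard count."""
    return HASH_SHARDS[user_id % len(HASH_SHARDS)]


class RoutingService:
    """Stand-in for a routing service: callers ask it where a key lives,
    so the rule can change without touching the callers."""

    def __init__(self, rule):
        self._rule = rule

    def set_rule(self, rule) -> None:
        self._rule = rule

    def locate(self, user_id: int) -> str:
        return self._rule(user_id)
```

With the first two functions the caller has the rule compiled in. With RoutingService the caller only knows `locate`, which is the decoupling described in (3), at the cost of one extra lookup per database access.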

Next are "grouping" and "replication". They address two things: scaling read performance and keeping reads highly available.

Experience shows that most Internet businesses read far more than they write. On Taobao or Jingdong, browsing and searching for goods may account for 99% of requests; writes happen only when an order is placed or paid for.

On 58 city, searching posts, viewing list pages, and viewing detail pages are all read requests. Publishing a post is a write request, and the volume of write requests is comparatively small.

As you can see, most Internet scenarios are read-heavy, so read performance is the first thing to become a bottleneck. How can that be solved quickly?

Generally, reads and writes are separated and read databases are added to scale read performance. This also keeps reads available: if one read database goes down, another can continue to serve.

The common database setup combines "sharding" and "grouping": sharding copes with large data volume, and grouping raises read performance and keeps reads available. About 80% of Internet companies use this database software architecture.

Availability architecture practice

Everyone uses databases. Besides designing table structures around the business and indexes around the access patterns, you should also consider data availability at design time. Availability divides into read high availability and write high availability.

Start with a common way to achieve read high availability. How can high availability of database reads be ensured? The way to solve any availability problem is redundancy.

Site availability is solved with redundant sites, service availability with redundant services, and data availability with redundant data.

A single database cannot guarantee read high availability, so the database is replicated: if one copy goes down, the others can still serve. Replicas are how read availability is ensured.

Data redundancy has a side effect: it raises consistency problems.

With a single database, all reads and writes land on the same database, every read returns the latest data, and there is no consistency problem.

But once the data is copied to several places for availability, the copies are certainly not synchronized in real time. There is a replication delay, so stale data may be read. How to resolve master-slave inconsistency is explained later.

Many Internet companies run a database architecture of one master with two slaves, or one master with three slaves. This cannot guarantee write high availability, because writes go to only one database. It is a single point, and if it goes down, writes are affected. So why does everyone still use this architecture?

As just mentioned, for most Internet companies 99% of the traffic is reads, so the write database is not the main problem. If the write database goes down, perhaps only 1% of users are affected.

Making writes highly available has a fairly large impact on the database software architecture, and it is not necessarily worth it: solving 1% of the problem introduces 80% of the complexity. That is why many Internet companies do not solve write high availability at all.

How can this problem be solved? The idea is again redundancy: read high availability makes the read database redundant, and write high availability makes the write database redundant. Turn the one write database into two and set up dual-master replication between them. When one goes down, write traffic is switched to the other automatically, which gives write high availability.

What problems does using dual-master replication for write high availability bring?

As mentioned above, using redundancy for availability raises consistency problems. The two masters replicate to each other with a delay, and many companies rely on database features such as auto-increment IDs. In a dual-master setup, suppose one master's ID goes from 10 to 11 and, before that change has been replicated, the other master takes a write and also goes from 10 to 11. The two-way replication then fails and data is lost.

There are two solutions to this dual-master ID conflict:

(1) One is to have the two masters generate IDs from different initial values with the same step size. One database starts from 0 (0, 2, 4, 6, 8), the other starts from 1 (1, 3, 5, 7, 9), and the step size is 2, so the data replicated between the two sides never conflicts.
(2) The other is not to use the database's auto-increment ID at all and to have the business layer guarantee that IDs do not conflict.
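Solution (1) can be illustrated with two generators that share a step size but start from different values. This is only a sketch of the arithmetic; the function and variable names are made up for the example. In MySQL itself, the server variables auto_increment_increment and auto_increment_offset exist for this kind of interleaving.

```python
from itertools import count, islice


def id_sequence(start: int, step: int):
    """Endless ID sequence: start, start + step, start + 2 * step, ..."""
    return count(start, step)


# Two hypothetical masters, same step, different initial values.
master_a = id_sequence(start=0, step=2)
master_b = id_sequence(start=1, step=2)

ids_a = list(islice(master_a, 5))  # even IDs only
ids_b = list(islice(master_b, 5))  # odd IDs only
```

Because the two sequences never share a value, a row inserted on one master cannot collide with a row inserted on the other while replication is still catching up.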

58 city does not use either of the two ways above to ensure read and write availability. Instead, 58 city runs a "dual master used as master-slave" setup.

Although it looks like dual-master replication, all reads and writes land on one master. The other master takes no read or write traffic and is purely a standby. When the active master goes down, traffic switches to the other master automatically. All of this is transparent to the business lines.

With this scheme, all reads and writes land on one master, so there is no consistency problem caused by replication lag. It has two shortcomings, however:

First, the utilization of database resources is only 50%;
Second, read performance cannot be scaled by adding read databases.
How 58 city's database architecture scales read performance is covered in the next section.
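The "dual master used as master-slave" idea can be sketched in a few lines. The class and node names below are hypothetical, and failure detection is left out; the point is only that every request goes to a single active node and the standby takes over on failover.

```python
class ActiveStandbyPair:
    """Two masters replicate to each other, but only one takes traffic."""

    def __init__(self, active: str, standby: str):
        self.active = active
        self.standby = standby

    def target(self) -> str:
        """Every read and every write goes to the active master."""
        return self.active

    def fail_over(self) -> None:
        """Swap roles once the active master is detected as down."""
        self.active, self.standby = self.standby, self.active


pair = ActiveStandbyPair("master_1", "master_2")
before = pair.target()
pair.fail_over()
after = pair.target()
```

Callers only ever call `target()`, which is what makes the switch transparent to them. The idle standby is also where the 50% resource utilization comes from.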

Read performance architecture practice

How can database read performance be improved? First, look at the traditional approaches:

(1) The first approach is to add slaves and so raise read performance. What is the problem with it? The more slaves there are, the lower the write performance, the longer replication takes, and the more likely the data is to be inconsistent.

(2) The second common approach is to add a cache. Caching is one of the most widely used ways to improve system performance, especially on the Internet. In the usual setup, the business lines sit upstream, a read/write-separated master-slave database sits at the bottom, and a cache is added alongside it.

For write operations: evict the entry from the cache first, then write the database.

For read operations: read the cache first. On a cache hit, return the data. On a cache miss, read from the slave and then put the data that was read into the cache.

This is the common way to use a cache.
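The write and read paths just described look roughly like this. It is a minimal in-memory sketch: the dictionaries stand in for a real cache and real databases, the class name is made up, and replication to the slave is assumed to be instantaneous.

```python
class CachedStore:
    """Cache in front of a master (writes) and a slave (reads)."""

    def __init__(self):
        self.cache = {}
        self.master = {}
        self.slave = {}

    def write(self, key, value):
        self.cache.pop(key, None)  # evict the cache entry first
        self.master[key] = value   # then write the database
        self.slave[key] = value    # assumption: replication is instantaneous

    def read(self, key):
        if key in self.cache:      # cache hit: return straight away
            return self.cache[key]
        value = self.slave.get(key)  # cache miss: read from the slave
        if value is not None:
            self.cache[key] = value  # and populate the cache
        return value
```

Evicting on write, rather than updating the cached value, keeps the write path simple: the next read repopulates the entry from the database.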

Under an unusual sequence of events, this traditional cache usage leads to a serious consistency problem. Consider the following sequence:

(1) first a write request arrives, evicts the cache entry, and writes the database;
(2) then a read request arrives, reads the cache, misses, and reads from the slave. The write has not yet been replicated to the slave, so the read gets stale data and puts that stale data into the cache;
(3) finally, master-slave replication completes.

After this sequence the stale data sits in the cache and nothing evicts it, so the database and the cache stay inconsistent.
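The sequence can be replayed deterministically with a small simulation. Replication is modelled as an explicit `replicate()` call so the lag can be controlled; the class and key names are assumptions for the example.

```python
class LaggingReplicaStore:
    """Cache plus master and slave, with replication applied on demand."""

    def __init__(self):
        self.cache, self.master, self.slave = {}, {}, {}
        self.pending = []  # writes the slave has not seen yet

    def write(self, key, value):
        self.cache.pop(key, None)
        self.master[key] = value
        self.pending.append((key, value))

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.slave.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def replicate(self):
        for key, value in self.pending:
            self.slave[key] = value
        self.pending.clear()


store = LaggingReplicaStore()
store.write("post:1", "old")
store.replicate()                   # starting point: both databases hold "old"

store.write("post:1", "new")        # (1) evict the cache, write the master
stale = store.read("post:1")        # (2) miss, slave still has "old", cache it
store.replicate()                   # (3) replication completes
still_stale = store.read("post:1")  # served from the cache, not the database
```

After step (3) both databases hold the new value, yet every later read is answered from the cache with the old one until something evicts the entry.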

58 city also uses a cache to improve read performance. Does it run into the same data consistency problem? Read on.

Consistency architecture practice

58 city uses a "service + cache + database" bundle to guarantee data consistency. Because 58 city uses the "dual master used as master-slave" architecture for read and write high availability, reads and writes all go to one master. Stale data is never read from a "read database", so the inconsistency between database and cache described above does not arise.

With the traditional approach, how is master-slave inconsistency solved? Let's take a look.

Why are master and slave inconsistent? As mentioned above, replication between the write database and the read database has a delay, so stale data may be read from the slave. The common solution is to introduce middleware. The business layer does not access the database directly but goes through the middleware, which records the key of every write request. If a read request for the same key arrives within the master-slave replication window, the middleware routes it to the master, because the slave may not have finished replicating and may still hold the old data. This keeps the data consistent.
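The routing decision such middleware makes can be sketched as below. This is an illustrative model, not any particular product: the class name, the fixed window length, and the injectable clock are assumptions for the example.

```python
import time


class ReadWriteRouter:
    """Route reads of recently written keys to the master."""

    def __init__(self, window_seconds: float, clock=time.monotonic):
        self.window = window_seconds  # assumed upper bound on replication lag
        self.clock = clock
        self.last_write = {}          # key -> time of its most recent write

    def route_write(self, key) -> str:
        self.last_write[key] = self.clock()
        return "master"

    def route_read(self, key) -> str:
        written_at = self.last_write.get(key)
        if written_at is not None and self.clock() - written_at < self.window:
            return "master"           # the slave may still hold the old value
        self.last_write.pop(key, None)  # window has passed; forget the key
        return "slave"
```

The window has to be at least as long as the worst replication delay you expect. A window that is too short lets stale reads through, while one that is too long only sends more reads to the master than necessary.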

The middleware approach is ideal, so why do most Internet companies not use it to keep master and slave consistent? Because the technical bar for database middleware is fairly high. Some large companies, such as Baidu, Tencent, and Ali, may have their own middleware, but not every Internet company does. Moreover, many Internet businesses do not have such strict requirements for data consistency. For example, if a post published on 58 city only shows up in search 5 seconds later, the user experience is hardly affected.

Besides middleware, routing both reads and writes to the master is another common way to solve master-slave inconsistency. This is what 58 city does.

With master-slave inconsistency solved, the second thing to solve is inconsistency between the database and the cache. As just described, with the traditional cache usage stale data can easily get into the cache. How can that be solved?

There are two practices: the first is a double cache eviction mechanism, and the second is to set an expiration time on every item (provided that cache misses are acceptable).

(1) Double cache eviction. In the traditional approach, a write operation evicts the cache entry first and then writes the master. As mentioned above, stale data may enter the cache during the master-slave replication window. If a second, asynchronous eviction is issued after the write, then even if stale data did enter the cache during the window, it is evicted again.
(2) Set a timeout on every item, for example 10 minutes. Even in the worst-case sequence where stale data enters the cache, it stays stale for at most ten minutes. The side effect is that every ten minutes a read request for the key will fall through to the database, but we consider the extra pressure this puts on the slave to be very small.
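Both practices can be combined in one small sketch. The names are hypothetical, and `schedule` stands for whatever mechanism runs a callback after the replication window (in a real service it might wrap `threading.Timer`); here it is passed in so the example stays deterministic.

```python
import time


class ExpiringCache:
    """In-memory cache where every item carries an expiration time."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.items = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:  # expired: treat as a miss
            del self.items[key]
            return None
        return value

    def put(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)

    def evict(self, key):
        self.items.pop(key, None)


def write_with_double_eviction(cache, master, key, value, schedule):
    """Evict, write the master, then schedule a second eviction."""
    cache.evict(key)                    # first eviction
    master[key] = value                 # write the database
    schedule(lambda: cache.evict(key))  # second eviction, run after the window
```

The second eviction removes anything a racing read cached during the window, and the expiration time bounds how long stale data can survive if that second eviction is ever lost.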

Scalability architecture practice

Scalability is another point to consider when designing a database architecture. First, here is a very neat scheme from 58 city that expands capacity in seconds. What problem does it solve? The database was originally split horizontally into N databases and now needs to be expanded to 2N databases.

Assume the data was originally split into two databases by hash, so that there is an odd database and an even database.

Step 1: promote the slave. The slave at the bottom is promoted to the top (in practice no action is taken);
Step 2: modify the configuration. At this point the expansion is complete. There used to be 2 shards, and the configuration now says 4; no data is moved in the process. The original even shard becomes two shards, 0 and 2, and the odd shard becomes 1 and 3. Databases 0 and 2 have no data conflicts. The only cost is that dual-master availability is lost for a short time after the expansion.

Step 3 is the finishing work: remove the old dual-master replication, add new dual-master replication to restore availability, and delete the redundant data, since each database used to hold all of its shard's data and now serves only half of it. These three tasks can be done slowly after the fact. The expansion itself, promoting the slave and modifying the configuration, completes in seconds, which is very neat.

The drawback of this scheme is that it can only expand N databases to 2N: 2 becomes 4, 4 becomes 8. It cannot expand 2 databases to 3, or 2 databases to 5. So how can that kind of expansion be done?
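The reason no data has to move is plain modular arithmetic: a key with remainder r modulo 2 has remainder r or r + 2 modulo 4, and both members of the old pair already hold every such key. The check below is a small sketch with made-up names that confirms this for a sample of keys.

```python
def shard_for(key: int, shard_count: int) -> int:
    """Hash routing: shard index is the key modulo the shard count."""
    return key % shard_count


keys = range(1000)
old = {k: shard_for(k, 2) for k in keys}  # before: 2 shards
new = {k: shard_for(k, 4) for k in keys}  # after: 4 shards

# A key would have to move only if its new shard did not descend from its
# old shard, that is, if new_shard % 2 differed from old_shard.
moved = [k for k in keys if new[k] % 2 != old[k]]

from_even = {new[k] for k in keys if old[k] == 0}  # shards fed by the even pair
from_odd = {new[k] for k in keys if old[k] == 1}   # shards fed by the odd pair
```

The same argument works for any N to 2N, which is also why the scheme does not extend to 2 to 3: a key's remainder modulo 3 says nothing about which of the two old databases held it.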

There are many kinds of database scalability requirements, for example the 2-to-3 and 2-to-5 database expansions just mentioned. Product managers often change requirements, so adding attributes to a table is also routine. This year's database conference also presented some online schema change schemes built on triggers, but triggers have the following limitations:

First, triggers have a fairly large impact on database performance;
Second, a trigger only works within a single database. Internet businesses have very large data volumes and heavy traffic, their databases are spread across different physical machines, and a trigger cannot reach across them.

Finally, there is one more kind of expansion requirement: changing the underlying storage medium. The data was stored in MongoDB and now needs to be stored in MySQL. This is also an expansion requirement, though a rare one. How can these three kinds of requirements be handled?

The method is to migrate the data into a new database, and there are several ways to do that. The first is to stop the service. If your business can accept it, this method is strongly recommended. Game companies, for example, take their servers down for maintenance between 1 a.m. and 2 a.m., and may well be merging or splitting game zones or migrating databases during that window.

If the business cannot stop the service and needs a smooth migration, the double-write approach solves the problem.

(1) The first step of a double-write migration is to upgrade the service. The service used to write one database; now a new database is set up and the service writes both. Take the change of storage medium as an example: the old database was Mongo, a new MySQL database is prepared, and every write interface of the service is upgraded to write both databases.

(2) The second step is to write a small program that migrates the data. For example, an offline program that re-splits the data of two databases into three, or one that copies a user table with only three attributes into a table with five attributes. The migration should be rate-limited. Once it finishes, are the two databases consistent? As long as double writing was switched on beforehand, and barring accidents, the data on both sides should be consistent.

When can an accident happen? While a row is being migrated, a delete for that same row arrives. The service deletes it from both databases, and then the migration program inserts the row into the new database. In this extreme case the data on the two sides becomes inconsistent.

(3) The third step, which is recommended, is to write a small script that compares the data on both sides and repairs any differences it finds. Once the repair is done, the data is considered consistent. Double writing is then switched to single writing against the new database, and the migration is complete.
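The three steps can be sketched end to end with dictionaries standing in for the old and new databases. All names are hypothetical. One assumption goes beyond the text: the migration uses `setdefault` so that it never overwrites a row the service has already double-written.

```python
class DoubleWriteService:
    """Step 1: every write and delete goes to both databases."""

    def __init__(self, old_db: dict, new_db: dict):
        self.old_db, self.new_db = old_db, new_db

    def put(self, key, row):
        self.old_db[key] = row
        self.new_db[key] = row

    def delete(self, key):
        self.old_db.pop(key, None)
        self.new_db.pop(key, None)


def migrate(old_db: dict, new_db: dict) -> None:
    """Step 2: copy existing rows, keeping rows already double-written."""
    for key, row in list(old_db.items()):
        new_db.setdefault(key, row)


def diff(old_db: dict, new_db: dict) -> list:
    """Step 3a: keys whose rows differ or exist on one side only."""
    keys = old_db.keys() | new_db.keys()
    return sorted(k for k in keys if old_db.get(k) != new_db.get(k))


def repair(old_db: dict, new_db: dict) -> None:
    """Step 3b: make the new database match the old one."""
    for key in diff(old_db, new_db):
        if key in old_db:
            new_db[key] = old_db[key]
        else:
            del new_db[key]  # row deleted during migration, then re-inserted
```

The old database is treated as the source of truth throughout, because it is still the one serving online reads until the switch.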

The advantages of this approach:

First, the change is small and has little impact on the service. Single writes become double writes, and two small tools are developed: a migration program that reads data from one database and inserts it into the other, and a verification program that compares the data in the two databases.

Second, it can be rolled back at any time, so the risk is low. If a problem is found at any step, the operation can be stopped immediately. For example, if something goes wrong while migrating the data, drop the new database and migrate again. Before the switch, all online reads and writes are served by the old database; only after the switch does the new database serve traffic. This is our very neat way of migrating smoothly.


This talk first introduced the concepts of single database, sharding, replication, grouping, and routing rules. Sharding solves the problem of large data volume; replication and grouping scale read performance and keep reads available. Sharding brings in routing, and there are three common routing methods: by range, by hash, or through a separate routing service.

How can data availability be ensured? The idea is redundancy, but redundancy leads to data inconsistency. 58 city's availability practice is dual masters used as master and slave: all read and write traffic goes to one database while the other stands by, and when the active master goes down, traffic migrates to the other master automatically. However, resource utilization is only 50%, and read performance cannot be scaled by adding slaves.

For read performance, the traditional practice is to add slaves or add a cache. The problem is that master and slave may be inconsistent. 58 city's approach is a bundle of service plus cache plus database to solve these problems.

For consistency, there are two ways to solve master-slave inconsistency. One is to add middleware that records which keys have been written and routes reads of those keys to the master within the replication window. The other is to force reads to the master. For consistency between database and cache, our practice is double eviction: when a write request arrives, evict the cache entry, write the database, and then do a delayed second eviction. The second practice is to set a timeout on every item.

For scalability, this talk shared 58 city's very neat scheme for expanding N databases to 2N in seconds, and a smooth double-write migration scheme that handles expanding 2 databases to 3, adding fields to the database, and changing the underlying storage medium.

Speaker: Shen Jian, technical director of 58 home, senior architect at 58 city, and head of the 58 city technical committee. He was previously a senior engineer at Baidu, where he took part in the development of several major projects, including Baidu HI. After joining 58 city, he was responsible for rebuilding 58 city's instant messaging, payment system, and amortization system. He has also taken part in the design and implementation of the database middleware, the 58 city recommendation system, the 58 city merchant platform app, the 58 city second-hand trading platform app, and other systems and projects.

This article is compiled from the keynote speech that Shen Jian, technical director of 58 home, recently gave at the Beijing stop of the UPYUN Architecture and Operations Conference. The UPYUN Architecture and Operations Conference is a large technical conference hosted by UPYUN, a leading domestic provider of new-generation cloud CDN services. Aimed at operations and architecture practitioners, it invites front-line architects and operations experts to share hands-on material, in order to promote the research and application of operations and architecture technology on the Internet and mobile Internet.

(Editor in charge: Qian Shuguang, who covers architecture and algorithms. For coverage requests or submissions, email qianshg@PROG3.COM; to discuss further on WeChat, add qshuguang2008 with a note giving your name, company, and position.)

The CSDN senior architect group includes many leading architects from well-known Internet companies. Architects are welcome to add WeChat qshuguang2008 to apply to join the group, with a note giving name, company, and position.