Hello, I'm Weige, author of "RocketMQ Technology Insider". I have been named an outstanding evangelist of the official RocketMQ community and ranked in the top 2 of the CSDN 2020 blog rankings. I am currently a senior architect in the technical platform department of Zhongtong Express, mainly responsible for the R&D and rollout of products such as full-link stress testing, message middleware and data synchronization. I have operated and maintained message clusters at the scale of 100 billion messages, so besides hands-on experience I have also studied the source code in depth and systematically. You are welcome to follow me so we can grow together.
This topic started from a conversation with an expert at Meituan about connection reuse in Netty and ended on the problem of a single node holding a very large number of connections in a microservice architecture. I think it is worth discussing with you.
For example, the microservice architecture of an e-commerce system is shown in the figure below:
Briefly, the system consists of:
- Gateway domain
The entry gateway of the whole microservice system, deployed as a cluster.
- Coupon domain
Provides coupon services for the e-commerce system.
- Points domain
Provides the member points service of the shopping mall.
- Payment domain
Provides services related to order payment.
- Order domain
Mainly provides the order placing service. To place an order it has to call the coupon, points and payment services, as well as inventory and other services not shown in the figure.
The architecture diagram above is simple and clear, but what happens if each domain has more than 5000 deployed instances?
Deploying 5000 instances per domain may sound unimaginable, something that hardly exists in reality. At today's top-tier Internet companies, however, even that number may not be enough. Such companies usually deploy across multiple data centers as well; the scale discussed here is the scale inside a single data center.
If the microservice architecture above uses Alibaba's open source Dubbo framework and the order service performs client-side load balancing, the number of connections held by one machine in the order domain can easily exceed 15,000. Real e-commerce business is very complex, and the services to be called are far more numerous than those outlined in the figure. This puts great pressure on the machine's memory and network scheduling, and in serious cases it affects the stability of the system, making timeouts and other exceptions likely.
This article uses Dubbo as the microservice framework, but the idea is not limited to Dubbo.
The reason a single node can so easily exceed 20,000 connections is the load balancing mechanism.
The coupon service is deployed on 5000 nodes. As its consumer, the order service obtains the whole provider list from the registry. When it needs to call the coupon service, it performs load balancing on the client side: a load balancing algorithm selects one provider to call from the 5000 providers in the list. In other words, 5000 connections have to be created for this single service.
Note that the order placing service does not only call the coupon service; it calls other services too. As a result, the number of connections an order placing instance creates grows with the number of services it calls.
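The arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope estimate, assuming the consumer keeps one long-lived connection per provider instance; the class name `ConnectionEstimate` and the `inventory` entry with 5000 instances are illustrative assumptions, not figures from a real deployment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Rough estimate only: assumes one long-lived connection per provider instance.
class ConnectionEstimate {

    // A consumer that load-balances over the full provider list of every
    // service it calls ends up holding the sum of all provider instances.
    static long connectionsPerConsumer(Map<String, Integer> providersPerService) {
        long total = 0;
        for (int instances : providersPerService.values()) {
            total += instances;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> downstream = new LinkedHashMap<>();
        downstream.put("coupon", 5000);
        downstream.put("points", 5000);
        downstream.put("payment", 5000);
        System.out.println(connectionsPerConsumer(downstream)); // 15000

        // Hypothetical fourth dependency, e.g. inventory, also with 5000 instances.
        downstream.put("inventory", 5000);
        System.out.println(connectionsPerConsumer(downstream)); // 20000
    }
}
```

Three downstream services already give 15,000 connections per order instance, and one more dependency of the same size brings it to 20,000.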
How do we break the deadlock?
First, we need to understand the underlying purpose of load balancing:
- Avoid a single point of failure
Because service providers are deployed as a cluster, the client can select one of them according to some algorithm when it initiates a call. If one provider goes down, the client is not affected, which provides a high availability guarantee.
- Achieve high concurrency
A single node is limited by memory and connections, and its capacity is not enough to carry a large number of requests. It therefore has to rely on multiple nodes serving together as a cluster. It is very common for a single service to be deployed on 5000+ nodes at large companies.
In general, load balancing is mainly a client-side concern. Its basic requirement is to spread traffic as evenly as possible across the service nodes and make full use of the processing capacity of the cluster.
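As a minimal sketch of what client-side load balancing means here, the code below picks a provider at random for each call. It is a plain random strategy written for illustration, not Dubbo's own implementation; `RandomLoadBalance` and the provider names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative random selection on the client side; not Dubbo's LoadBalance API.
class RandomLoadBalance {

    // Pick one provider uniformly at random from the list the client holds.
    static String select(List<String> providers, Random random) {
        if (providers.isEmpty()) {
            throw new IllegalStateException("no provider available");
        }
        return providers.get(random.nextInt(providers.size()));
    }

    // Simulate a number of calls and count how many land on each provider.
    static int[] distribute(List<String> providers, int calls, Random random) {
        int[] hits = new int[providers.size()];
        for (int i = 0; i < calls; i++) {
            hits[providers.indexOf(select(providers, random))]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> providers = Arrays.asList("coupon-1", "coupon-2", "coupon-3", "coupon-4");
        int[] hits = distribute(providers, 10000, new Random());
        // Each provider should receive roughly a quarter of the calls.
        System.out.println(Arrays.toString(hits));
    }
}
```

If one provider is removed from the list, the remaining ones still serve the calls, which is the high availability property described above.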
But does a client have to hold connections to every service provider to achieve load balancing? I don't think so. Please see the following diagram:
The core idea is to group the clients and the servers.
From the perspective of a single client, it is not necessary to hold the list of all service providers, so there is no need to create TCP connections to all of them. It only holds connections to some providers and leaves the rest to other clients.
From the perspective of all clients taken together, however, every service provider can still be called, so load is still balanced across all providers.
With this grouping, the number of connections held by a single node (whether client or server) can be reduced several times over, and the effect is very significant.
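The effect of grouping can be sketched with a simple assignment rule. The rule used here (index modulo the number of groups) and the choice of 4 groups are assumptions for illustration only; the article does not prescribe how groups are formed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative grouping: client i and provider j belong to group (index % groups).
class GroupedConnections {

    // Providers a given client connects to: only those in the client's own group.
    static List<Integer> providersFor(int clientIndex, int providerCount, int groups) {
        int group = clientIndex % groups;
        List<Integer> result = new ArrayList<>();
        for (int p = group; p < providerCount; p += groups) {
            result.add(p);
        }
        return result;
    }

    // Number of distinct providers reachable by at least one client.
    static int coveredProviders(int clientCount, int providerCount, int groups) {
        Set<Integer> covered = new HashSet<>();
        for (int c = 0; c < clientCount; c++) {
            covered.addAll(providersFor(c, providerCount, groups));
        }
        return covered.size();
    }

    public static void main(String[] args) {
        // One client now holds 1250 connections instead of 5000.
        System.out.println(providersFor(0, 5000, 4).size()); // 1250
        // All 5000 providers are still called by some client.
        System.out.println(coveredProviders(5000, 5000, 4)); // 5000
    }
}
```

With G groups of equal size, each client holds about 1/G of the connections it held before, while the clients as a whole still reach every provider.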
Can we use the existing mechanism in Dubbo?
The answer is yes. The feature was not designed for the purpose discussed in this article, but it fits well.
The specific methods are as follows:
Both the client and the server can be given a tag. Suppose order Service-1 is tagged C1. Although order Service-1 still obtains the list of all payment service providers (4 instances) from the registry, routing is performed before load balancing, and under the tag routing mechanism order Service-1 can only access providers that also carry the tag C1. As a result, connections are only created to the payment service instances tagged C1, which reduces the number of connections.
This solves the problem neatly. To close the article, let's take a closer look at the Dubbo routing mechanism. Please refer to the official schematic diagram:
For details of the Dubbo routing mechanism, see another blog post by the author: "Gray publishing scheme of Dubbo service governance".
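The "route first, then load balance" order described above can be sketched in plain Java. This is a simulation of the idea and does not use Dubbo's router API; the class `TagRoutingSketch`, the addresses, and the split of the 4 payment providers into two tagged C1 and two tagged C2 are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Simulation of tag routing applied before load balancing; not Dubbo's real API.
class TagRoutingSketch {

    static final class Provider {
        final String address;
        final String tag;

        Provider(String address, String tag) {
            this.address = address;
            this.tag = tag;
        }
    }

    // Routing step: keep only providers whose tag matches the consumer's tag.
    static List<Provider> route(List<Provider> all, String consumerTag) {
        List<Provider> matched = new ArrayList<>();
        for (Provider p : all) {
            if (consumerTag.equals(p.tag)) {
                matched.add(p);
            }
        }
        return matched;
    }

    // Load balancing step: runs only on the routed subset.
    static Provider select(List<Provider> candidates, Random random) {
        return candidates.get(random.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        List<Provider> payment = Arrays.asList(
                new Provider("payment-1", "C1"),
                new Provider("payment-2", "C1"),
                new Provider("payment-3", "C2"),
                new Provider("payment-4", "C2"));

        // order Service-1 is tagged C1, so it only ever needs 2 of the 4 providers.
        List<Provider> candidates = route(payment, "C1");
        System.out.println(candidates.size()); // 2
        System.out.println(select(candidates, new Random()).tag); // C1
    }
}
```

Because the load balancer only ever sees the routed subset, a consumer tagged C1 has no reason to connect to providers carrying other tags.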