
Cloud Foundry Service Gateway source analysis


Cloud Foundry is an open-source platform-as-a-service product that gives developers the freedom to choose their cloud platform, development framework and application services. In Cloud Foundry, services are where an application's more advanced features come from: because services exist, users can deploy applications faster and manage them more simply. First, a brief introduction to Cloud Foundry services.

Cloud Foundry services currently fall into three groups: (1) database services, both NoSQL and relational, such as Redis, MongoDB, Neo4j, CouchDB, MySQL and Postgres; (2) storage services, such as Vblob and Filesystem; (3) other services, such as the RabbitMQ message queue and the Memcached distributed in-memory object cache. Seen as components, a service consists of two parts: the service gateway and the service node. Seen as a workflow, a service has two key functions: creating a service instance, and binding one or more service instances to a running app. There are other important features as well, such as creating a snapshot of a service, but they are outside the scope of this source analysis.

In my view, studying a Cloud Foundry component should start from two angles: first, how the component starts up; second, how it behaves while interacting with users. The same applies to the source of the Cloud Foundry service gateway.

As for the source, the relevant Cloud Foundry repositories on GitHub are mainly vcap-services-base, vcap-services and cloud_controller. cloud_controller is the control module, and much of the information about services is forwarded to the service gateway over HTTP, so the gateway and cloud_controller are closely related through their communication.

With the preparation done, let us start analyzing the source, again from the two angles of startup and operation.

1. Gateway startup

          

The gateway startup code is in asynchronous_service_gateway.rb, under the cloudfoundry-vcap-services-base\lib\base directory. This is one of the most important Ruby files for studying the gateway. The startup code itself is very clear:

def initialize(opts)
  super
  setup(opts)
end

As everyone knows, the initialize method runs every time a new instance of a class is created. So when the gateway starts, initialize runs first and then calls setup. As its name implies, setup performs the gateway's main startup configuration. Reading setup in the same Ruby file, it does the following: assigns instance variables, sets up the heartbeat and the exit handler, adds unknown handles, and registers feedback handlers.


1.1 Setting up the heartbeat and the exit handler

The instance-variable assignments in setup are straightforward, so they are not repeated here. The heartbeat is named after the human one: a person's heart beats roughly once a second, and if it stops, the person is either in suspended animation or dead. The same idea applies to a gateway node: the heartbeat tells others that the node is alive and gives them the URL at which it can be found. The heartbeat is sent by the gateway to the cloud controller, as follows:

EM.add_periodic_timer(@hb_interval) { send_heartbeat }
EM.next_tick { send_heartbeat }

As you can see, the gateway sends the heartbeat to the cloud controller in a loop: once on the next tick, and then every @hb_interval. EM and add_periodic_timer belong to EventMachine; for details, see our lab's blog post on EventMachine. The send_heartbeat method is defined as follows:

def send_heartbeat
  @logger.info("Sending info to cloud controller: #{@offering_uri}")

  # Build the registration request from the stored headers and offering JSON
  req = create_http_request(
    :head => @cc_req_hdrs,
    :body => @svc_json
  )

  http = EM::HttpRequest.new(@offering_uri).post(req)

  http.callback do
    if http.response_header.status == 200
      @logger.info("Successfully registered with cloud controller")
    else
      @logger.error("Failed registering with cloud controller, status=#{http.response_header.status}")
    end
  end

  http.errback do
    @logger.error("Failed registering with cloud controller: #{http.error}")
  end
end

The source is very clear. It first logs "Sending info to cloud controller: #{@offering_uri}", then builds an HTTP request req, then creates a new HttpRequest instance and POSTs req with it. If the status of the reply is 200, the registration succeeded. The key line is http = EM::HttpRequest.new(@offering_uri).post(req). HttpRequest is provided by a gem installed on the gateway node, and it is used to send a REST request to the cloud controller over HTTP.
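The request that send_heartbeat builds can be approximated with Ruby's standard library. The following is only a sketch under stated assumptions: it uses synchronous Net::HTTP instead of the gateway's EM::HttpRequest, it builds the request without sending it, and the URL, header and body values are hypothetical placeholders rather than what the gateway really sends.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build (but do not send) the POST that registers a service offering.
# offering_uri, headers and offering stand in for @offering_uri,
# @cc_req_hdrs and @svc_json; their contents here are assumptions.
def build_heartbeat_request(offering_uri, headers, offering)
  uri = URI.parse(offering_uri)
  request = Net::HTTP::Post.new(uri.request_uri, headers)
  request.body = JSON.generate(offering)
  request
end

# As in send_heartbeat, only HTTP 200 counts as a successful registration.
def registered?(status)
  status == 200
end

heartbeat = build_heartbeat_request(
  'http://cc.example.com/services/v1/offerings',               # hypothetical URL
  { 'Content-Type' => 'application/json' },                    # hypothetical headers
  { 'label' => 'mysql-5.1', 'url' => 'http://10.0.0.5:8080' }  # hypothetical body
)
puts "#{heartbeat.method} #{heartbeat.path}"
```

Keeping request construction separate from sending makes it easy to inspect what would be registered before any network traffic happens.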

All REST requests received by the cloud controller are dispatched through the cloud_controller/cloud_controller/config/routes.rb file. The send_heartbeat request is matched by the line post 'services/v1/offerings' => 'services#create', :as => :service_create, which maps it to the create method in service_controller.rb. The create method is implemented as follows (part of the code omitted):

def create
  ......
  req = VCAP::Services::Api::ServiceOfferingRequest.decode(request_body)
  ......
  success = nil
  # "core" services are stored with a nil provider
  svc = Service.find_by_label_and_provider(req.label, req.provider == "core" ? nil : req.provider)
  if svc
    CloudController.logger.debug("Found svc = #{svc.inspect}")
    ......
    svc.update_attributes!(attrs)
  else
    svc = Service.new(req.extract)
    ......
    svc.save!
  end

  render :json => {}
end

This code decodes the incoming request, stores it in req, and then calls find_by_label_and_provider. When I started I was not very familiar with Ruby or the Rails framework, so I used grep to look for this method while studying the code and could never find its definition. Later I learned that the method is provided by ActiveRecord.

ActiveRecord is the object-relational mapping (ORM) layer provided by Rails. It follows the basic ORM pattern: tables map to classes, rows map to objects, and columns map to object attributes. Unlike many ORM libraries that rely on extensive configuration, ActiveRecord keeps configuration to a minimum. Thanks to ActiveRecord, code that reads and writes the database does not need hand-written SQL; it only has to follow the expected naming format. For example, find_by_label_and_provider is simply a database query whose search attributes are label and provider. So find_by_label_and_provider(req.label, req.provider == "core" ? nil : req.provider) is equivalent to finding the record where label = req.label and provider = (req.provider == "core" ? nil : req.provider). The return value is assigned to svc. If svc exists, the existing database record is updated; if not, a new Service record is saved to the database. The send_heartbeat method can therefore be seen as registration with cloud_controller, and the create method is what stores the registered information in cloud_controller's database.
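The idea behind a dynamic finder can be illustrated in plain Ruby. The sketch below is not ActiveRecord's implementation: ToyServiceTable is a hypothetical in-memory stand-in that turns a find_by_<a>_and_<b> call into an attribute match via method_missing, and register imitates the update-or-insert branch of the create method shown earlier.

```ruby
# Toy stand-in for a table-backed model. This is NOT ActiveRecord's code;
# it only imitates how a find_by_<a>_and_<b> call becomes an attribute query.
class ToyServiceTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Intercept find_by_x_and_y(...) and match each named attribute in order.
  def method_missing(name, *args)
    match = name.to_s.match(/\Afind_by_(.+)\z/)
    return super unless match

    attrs = match[1].split('_and_').map(&:to_sym)
    raise ArgumentError, "expected #{attrs.size} values" unless attrs.size == args.size

    wanted = attrs.zip(args).to_h
    @rows.find { |row| wanted.all? { |key, value| row[key] == value } }
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end

  # Update the row if it already exists, otherwise insert it.
  def register(label, provider, attrs)
    provider = nil if provider == 'core' # "core" is stored as no provider
    row = find_by_label_and_provider(label, provider)
    if row
      row.merge!(attrs)
      :updated
    else
      @rows << { label: label, provider: provider }.merge(attrs)
      :created
    end
  end
end

table = ToyServiceTable.new
puts table.register('mysql-5.1', 'core', url: 'http://10.0.0.5:8080')
puts table.register('mysql-5.1', 'core', url: 'http://10.0.0.6:8080')
```

Because the finder is synthesized from the method name at call time, searching the source tree for its definition with grep finds nothing, which matches the experience described above.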

Since cloud_controller's database has come up, a brief explanation is in order. cloud_controller uses Postgres as its database. The stored information includes the URLs that the various service gateways register with cloud_controller, service binding information, service credentials, and so on. To understand the storage model you have to look inside the database; in our experiments we used pgAdmin to connect to Postgres and inspect the schema and how the records change.

For a deeper understanding of the heartbeat, you can also read the gateway and cloud_controller logs.


1.2 Sending the deactivation notice

When the gateway is about to go offline, it notifies cloud_controller by calling send_deactivation_notice. The main code is as follows:

Kernel.at_exit do
  if EM.reactor_running?
    # We can't stop others from killing the event-loop here. Let's hope that they play nice
    send_deactivation_notice(false)
  else
    EM.run { send_deactivation_notice }
  end
end

# Lets the cloud controller know that we're going away
def send_deactivation_notice(stop_event_loop=true)
  @logger.info("Sending deactivation notice to cloud controller: #{@offering_uri}")

  req = create_http_request(
    :head => @cc_req_hdrs,
    :body => @deact_json
  )

  http = EM::HttpRequest.new(@offering_uri).post(req)

  http.callback do
    if http.response_header.status == 200
      @logger.info("Successfully deactivated with cloud controller")
    else
      @logger.error("Failed deactivation with cloud controller, status=#{http.response_header.status}")
    end
    EM.stop if stop_event_loop
  end

  http.errback do
    @logger.error("Failed deactivation with cloud controller: #{http.error}")
    EM.stop if stop_event_loop
  end
end

The implementation is essentially the same as send_heartbeat: the gateway sends a REST request to cloud_controller, cloud_controller answers with an HTTP response, and the gateway checks the response and then, when stop_event_loop is true, stops the EventMachine event loop.


1.3 Adding the required handles at gateway startup

While it works, the gateway uses service configuration and binding information, and it keeps this information in memory on its own node. So every time the gateway starts, it has to send cloud_controller a request to fetch handles; cloud_controller looks up the service information that this gateway is responsible for and returns it.

I think keeping the data in the node's memory has both advantages and disadvantages. The gateway is a single node, while there can be several service_node nodes, so in-memory storage is not safe. I think this could be improved by storing the gateway's handles in a database, so that the gateway would not need to fetch handles from cloud_controller at startup. In that case the service information would be stored in two places: the gateway and cloud_controller.

The code that adds the handles follows. First, EM adds a periodic timer that calls fetch_handles. Each call to fetch_handles sends an HTTP request to cloud_controller. Once a 200 status comes back, the response resp has been fetched successfully, and it is passed to the update_callback block, which hands resp.handles to update_handles and then cancels the timer @fetch_handle_timer. The update_handles method is implemented in provision.rb.

# Add any necessary handles we don't know about
update_callback = Proc.new do |resp|
  @provisioner.update_handles(resp.handles)
  @handle_fetched = true
  EM.cancel_timer(@fetch_handle_timer)

  # TODO remove it when we finish the migration
  current_version = @version_aliases && @version_aliases[:current]
  if current_version
    @provisioner.update_version_info(current_version)
  else
    @logger.info("No current version alias is supplied, skip update version in CCDB.")
  end
end
@fetch_handle_timer = EM.add_periodic_timer(@handle_fetch_interval) { fetch_handles(&update_callback) }
EM.next_tick { fetch_handles(&update_callback) }

# Fetches canonical state (handles) from the Cloud Controller
def fetch_handles(&cb)
  return if @fetching_handles

  @logger.info("Fetching handles from cloud controller @ #{@handles_uri}")
  @fetching_handles = true

  req = create_http_request :head => @cc_req_hdrs
  http = EM::HttpRequest.new(@handles_uri).get(req)

  http.callback do
    @fetching_handles = false
    if http.response_header.status == 200
      @logger.info("Successfully fetched handles")
      begin
        resp = VCAP::Services::Api::ListHandlesResponse.decode(http.response)
      rescue => e
        @logger.error("Error decoding reply from gateway:")
        @logger.error("#{e}")
        next
      end
      cb.call(resp)
    else
      @logger.error("Failed fetching handles, status=#{http.response_header.status}")
    end
  end

  http.errback do
    @fetching_handles = false
    @logger.error("Failed fetching handles: #{http.error}")
  end
end
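The "poll until the first success, then cancel the timer" pattern in this listing can be shown without EventMachine. This is a simplified sketch, not the gateway's code: ToyTimer and fetch_until_success are hypothetical names, each call to tick stands in for one firing of the periodic timer, and a nil reply stands in for a failed HTTP request.

```ruby
# Tiny stand-in for a periodic timer: each call to tick is one timer firing.
class ToyTimer
  attr_reader :cancelled

  def initialize(&block)
    @block = block
    @cancelled = false
  end

  def tick
    @block.call(self) unless @cancelled
  end

  def cancel
    @cancelled = true
  end
end

# Poll a fake cloud controller until the first successful reply.
# replies is a queue of canned results; nil means the request failed.
def fetch_until_success(replies, max_ticks)
  attempts = 0
  handles = nil
  timer = ToyTimer.new do |t|
    attempts += 1
    reply = replies.shift
    if reply
      handles = reply # stand-in for passing the handles to update_handles
      t.cancel        # the first success stops further polling
    end
  end
  max_ticks.times { timer.tick }
  { attempts: attempts, handles: handles, cancelled: timer.cancelled }
end

result = fetch_until_success([nil, nil, [{ 'service_id' => 'a' }]], 5)
puts result[:attempts]
```

With two failures followed by a success, the timer fires three times and then goes quiet for the remaining ticks, which is the behavior the update callback achieves by cancelling the fetch timer.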
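The update_handles method mentioned above is implemented in provision.rb, in the same directory as asynchronous_service_gateway.rb. As the code shows, the handles returned by cloud_controller are passed to update_handles, which checks that each handle is well formed and then saves the handle's configuration, credentials and service_id. They are stored in prov_svcs, the in-memory store that the gateway sets up, as mentioned earlier. The code is as follows: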

def update_handles(handles)
  @logger.info("[#{service_description}] Updating #{handles.size} handles")
  handles.each do |handle|
    unless verify_handle_format(handle)
      @logger.warn("Skip not well-formed handle:#{handle}.")
      next
    end

    h = handle.deep_dup
    @prov_svcs[h['service_id']] = {
      :configuration => h['configuration'],
      :credentials => h['credentials'],
      :service_id => h['service_id']
    }
  end
  @logger.info("[#{service_description}] Handles updated")
end
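To make the validate-then-index step concrete, here is a minimal stand-alone sketch. The format check is an assumption: well_formed? is a hypothetical stand-in for verify_handle_format, whose real rules are not shown in this article, and Marshal is used as a simple deep copy in place of deep_dup.

```ruby
# Hypothetical format check: a handle needs these three keys, and
# configuration/credentials must be hashes. The real verify_handle_format
# may check more (or different) things.
REQUIRED_KEYS = %w[service_id configuration credentials].freeze

def well_formed?(handle)
  handle.is_a?(Hash) &&
    REQUIRED_KEYS.all? { |key| handle.key?(key) } &&
    handle['configuration'].is_a?(Hash) &&
    handle['credentials'].is_a?(Hash)
end

# Copy every valid handle into an in-memory table keyed by service_id.
def index_handles(handles, table = {})
  handles.each do |handle|
    next unless well_formed?(handle) # skip malformed handles

    copy = Marshal.load(Marshal.dump(handle)) # deep copy, in place of deep_dup
    table[copy['service_id']] = {
      configuration: copy['configuration'],
      credentials: copy['credentials'],
      service_id: copy['service_id']
    }
  end
  table
end

prov_svcs = index_handles([
  { 'service_id' => 'svc-1', 'configuration' => { 'plan' => 'free' },
    'credentials' => { 'name' => 'db1' } },
  { 'service_id' => 'svc-2' } # malformed: skipped
])
puts prov_svcs.keys.inspect
```

Copying each handle before storing it means later changes to the decoded response cannot silently alter what the gateway has cached.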


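1.4 Checking for orphans at gateway startup

What is an orphan? An orphan is a service instance that has already been created on a service node but whose existence has never been reported to cloud_controller. For example, suppose the service node creates the instance and notifies the service gateway, but a network failure occurs while the gateway is replying to cloud_controller. cloud_controller then tells the end user that creation failed, and of course nothing is written to cloud_controller's database. The result is a useless instance on the service_node that wastes resources. Orphans are different from unbound instances: unbound instances are common in practice, but they all have records in cloud_controller, whereas orphans do not. This situation is rare, but the source code still accounts for it.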
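Following that definition, detecting orphans amounts to a set difference between what a node reports and what cloud_controller knows. The sketch below is an assumption-based illustration, not the gateway's orphan-check code: find_orphans and the sample ids are hypothetical.

```ruby
require 'set'

# Instances that exist on a node but are unknown to the cloud controller.
def find_orphans(node_instances, controller_handles)
  known = controller_handles.map { |handle| handle['service_id'] }.to_set
  node_instances.reject { |instance_id| known.include?(instance_id) }
end

on_node = %w[svc-1 svc-2 svc-3] # ids reported by a service node (sample data)
in_ccdb = [{ 'service_id' => 'svc-1' }, { 'service_id' => 'svc-3' }]
puts find_orphans(on_node, in_ccdb).inspect
```

An unbound instance would still appear in the controller's list, so it is never reported here; only instances with no controller record count as orphans.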


2. Provisioning a service

           

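First, from the code's point of view, it is worth explaining the difference between provisioning a service and create. Provisioning a service means supplying and creating an instance of a service, whereas create is actually the process by which the gateway registers itself with cloud_controller.

After studying the gateway code, I found that the main steps of provisioning a service instance are as shown in the following figure: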


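After all this, it is clear that many of the communication patterns between components are similar. For example, after VMC sends an HTTP request to cloud_controller, the request passes through routes.rb and is forwarded to the provision method in service_controller.rb. In that method, cloud_controller looks up the URL and port of the corresponding service in its database and then sends the request to that gateway.

When the gateway receives the request, it first runs a series of checks on it. If the request resolves to post '/gateway/v1/configurations', then after decoding and so on, the following code is executed: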

@provisioner.provision_service(req) do |msg|
  if msg['success']
    async_reply(VCAP::Services::Api::GatewayHandleResponse.new(msg['response']).encode)
  else
    async_reply_error(msg['response'])
  end
end

The real provision work happens in @provisioner.provision_service, which is implemented in provision.rb. The code is long, so only the typical parts are covered here.

The gateway first picks suitable nodes according to the plan in the request sent by cloud_controller. An example helps explain what a node's plan is. In a production environment the same service usually has several nodes (node_0, node_1, ...), and each node describes its capacity or features and sends that description to the gateway. Suppose node_0's plan is to provision databases smaller than 20 MB and node_1's plan is to provision databases larger than 30 MB. If the request from cloud_controller asks for the plan for databases larger than 30 MB, the gateway uses the following code to put node_1 into plan_nodes.

plan_nodes = @nodes.select { |_, node| node["plan"] == plan }.values

Selecting plan_nodes is not enough; the next step is to select by version, meaning the version of the service. version_nodes is selected from plan_nodes, and the main code is as follows:

version_nodes = plan_nodes.select { |node|
  node["supported_versions"] != nil && node["supported_versions"].include?(version)
}

With version_nodes chosen, the selection might seem nearly done, but the gateway also considers how heavily the qualifying nodes are being used; here Cloud Foundry looks at the capacity of each service node. So best_node is picked from version_nodes using max_by, which returns the node with the highest score, that is, the largest capacity. The main code is as follows:

best_node = version_nodes.max_by { |node| node_score(node) }
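Putting the three selection steps together, a runnable sketch might look like the following. The node data, the pick_node helper and the scoring rule are assumptions for illustration: node_score here simply returns a hypothetical 'available_capacity' field, which may differ from what the gateway actually scores on.

```ruby
# Hypothetical score: more free capacity is better.
def node_score(node)
  node['available_capacity']
end

# Filter by plan, then by supported version, then take the best score.
def pick_node(nodes, plan, version)
  plan_nodes = nodes.select { |_, node| node['plan'] == plan }.values
  version_nodes = plan_nodes.select do |node|
    !node['supported_versions'].nil? && node['supported_versions'].include?(version)
  end
  version_nodes.max_by { |node| node_score(node) } # nil when nothing qualifies
end

# Sample nodes keyed by node id, as a gateway might track them.
SAMPLE_NODES = {
  'node_0' => { 'id' => 'node_0', 'plan' => 'free', 'supported_versions' => ['5.1'], 'available_capacity' => 40 },
  'node_1' => { 'id' => 'node_1', 'plan' => 'paid', 'supported_versions' => ['5.1', '5.5'], 'available_capacity' => 10 },
  'node_2' => { 'id' => 'node_2', 'plan' => 'paid', 'supported_versions' => ['5.5'], 'available_capacity' => 25 },
  'node_3' => { 'id' => 'node_3', 'plan' => 'paid', 'supported_versions' => nil, 'available_capacity' => 90 }
}.freeze

best = pick_node(SAMPLE_NODES, 'paid', '5.5')
puts best['id']
```

Note that node_3 has the most capacity but is excluded because it advertises no supported versions, which is exactly what the nil check in the version filter guards against.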

After these steps, the gateway prepares the data and sends the request over NATS. When the response arrives, it stores the returned credentials in prov_svcs. Control then returns to asynchronous_service_gateway.rb, which replies to cloud_controller.

The unprovision flow is roughly the same as provision, so it is not covered here.


3. Binding a service

        

"Bind a service" is really a phrase that is missing its subject: there is naturally an app, and it is the app that needs to bind a service. In practice, each bind operation updates the app's own manifest and then sends an update_app instruction.

The main flow of binding a service is as follows:


The main code flow is still very similar to provision: VMC first issues the command to cloud_controller, and routes.rb dispatches the task. In service_controller.rb, the bind method binds the app to the provisioned instance.

def bind
  req = VCAP::Services::Api::CloudControllerBindRequest.decode(request_body)

  app = ::App.find_by_collaborator_and_id(user, req.app_id)
  raise CloudError.new(CloudError::APP_NOT_FOUND) unless app

  cfg = ServiceConfig.find_by_name(req.service_id)
  raise CloudError.new(CloudError::SERVICE_NOT_FOUND) unless cfg
  raise CloudError.new(CloudError::FORBIDDEN) unless cfg.provisioned_by?(user)

  binding = app.bind_to_config(cfg)

  resp = {
    :binding_token => binding.binding_token.uuid,
    :label => cfg.service.label
  }
  render :json => resp
end

First, request_body is decoded. Then the app that belongs to the given user is looked up in cloud_controller's database, followed by the configuration for service_id. Next comes binding = app.bind_to_config(cfg), which is implemented in cloud_controller/app/models/app.rb. That method first creates a binding_token, then builds a bind request, and then sends the bind request with the following code:

client = VCAP::Services::Api::ServiceGatewayClient.new(svc.url, svc.token, svc.timeout)
handle = client.bind(req.extract)

This method is implemented in a file named service_gateway_client.rb, which is installed on the cloud_controller node as a gem.

So the bind command is sent by cloud_controller, and the recipient is naturally the gateway. The gateway's handling code is as follows:

# Binds a previously provisioned instance of the service to an application
post '/gateway/v1/configurations/:service_id/handles' do
  @logger.info("Binding request for service=#{params['service_id']}")

  req = VCAP::Services::Api::GatewayBindRequest.decode(request_body)
  @logger.debug("Binding options: #{req.binding_options.inspect}")

  @provisioner.bind_instance(req.service_id, req.binding_options) do |msg|
    if msg['success']
      async_reply(VCAP::Services::Api::GatewayHandleResponse.new(msg['response']).encode)
    else
      async_reply_error(msg['response'])
    end
  end
  async_mode
end

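As you can see, the key line is the call to bind_instance, which is implemented in provision.rb. Its implementation is fairly long, but on closer reading it splits into a few parts: find the node_id of the service, build the request, send the bind request over NATS, and wait for the response.

Once the bind succeeds, control returns to the code segment shown above. The gateway updates prov_svcs and replies to cloud_controller. When cloud_controller receives resp, it saves it in the service_binding table.

At this point the bind has, in principle, been completed. When the app is updated, the binding information already exists in cloud_controller, so the app carries the binding information with it. When the app is started, it uses the credentials required by the service to access the service directly, without going through the gateway.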
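How a started app reads its bound credentials can be sketched as follows. This example rests on an assumption not stated in the article: that the binding information reaches the app as JSON in the VCAP_SERVICES environment variable, keyed by service label. The credentials_for helper and the sample payload are hypothetical.

```ruby
require 'json'

# Return the credentials hash of the first bound instance whose label
# starts with the given prefix, or nil if nothing matches.
def credentials_for(vcap_services_json, label_prefix)
  services = JSON.parse(vcap_services_json)
  services.each do |label, instances|
    next unless label.start_with?(label_prefix)
    return instances.first['credentials'] unless instances.empty?
  end
  nil
end

# Sample payload; in a running app this would come from ENV['VCAP_SERVICES'].
SAMPLE_VCAP_SERVICES = JSON.generate({
  'mysql-5.1' => [
    { 'name' => 'mydb', 'label' => 'mysql-5.1', 'plan' => 'free',
      'credentials' => { 'hostname' => '10.0.0.9', 'port' => 3306, 'user' => 'u1' } }
  ]
})

creds = credentials_for(SAMPLE_VCAP_SERVICES, 'mysql')
puts "#{creds['hostname']}:#{creds['port']}"
```

The app connects to the hostname and port in the credentials itself, which is why the gateway is no longer involved once the bind has completed.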


4. Some thoughts on the gateway

       

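Having finished analyzing the source, I have some views of my own on the gateway component.

I look at the gateway mainly from two angles: performance and security.

First, performance. When I learned how the gateway is implemented, I was surprised that it stores its information in memory. The longer the gateway runs, the more provision and bind information accumulates, and it is hard to guarantee that memory will never become a problem. Query time is not the concern, even when a lot of information is stored, because lookups in memory take almost no time. But if the gateway crashes, everything it stored is lost because it was held in memory. Only when the gateway is restarted does it fetch the lost information from cloud_controller again, and if there is a lot of it, this takes a long time. If the gateway is also unstable, this overhead becomes almost intolerable.

Given this weakness, storing the information in a lightweight database on the gateway would improve matters. Because the stored data is persistent, there is no cost of fetching it again from cloud_controller after a gateway crash. On the other hand, database I/O takes a fair amount of time.

So if the gateway runs stably, memory is naturally the first choice; if the gateway handles a large amount of information in an unstable environment, storing the information in a database is a good option.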
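The persistence trade-off can be illustrated with a very small sketch. It uses a JSON file as a stand-in for the lightweight database suggested above, which is a deliberate simplification; save_handles, load_handles and the cache file name are hypothetical and are not part of the gateway.

```ruby
require 'json'
require 'tmpdir'

# Write the in-memory handle table to disk so a restart can reload it.
def save_handles(path, table)
  File.write(path, JSON.generate(table))
end

# Load the table back; a missing file means nothing has been cached yet.
def load_handles(path)
  return {} unless File.exist?(path)

  JSON.parse(File.read(path))
end

restored = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'prov_svcs.json') # hypothetical cache file
  table = { 'svc-1' => { 'credentials' => { 'name' => 'db1' } } }
  save_handles(path, table)      # done whenever the table changes
  restored = load_handles(path)  # what a restarted gateway would do
end
puts restored.keys.inspect
```

Every write now costs a disk operation, which is the I/O overhead mentioned above; in exchange, a restart reads a local file instead of asking cloud_controller for every handle again.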

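The other aspect is security. Anyone familiar with Cloud Foundry knows that each service has only one gateway (a single node). With a single node, do we need to worry about load, and do we need to worry about the node going down?

On load: many users need to provision services and many apps need to bind services, so could the volume of requests crash the gateway? For this to happen there is a precondition, namely that a great many users are provisioning services and a great many apps are binding services. Take bind as an example. Bind is a one-off operation: once it completes, the app has nothing more to do with the gateway, and relative to an app's lifecycle it occupies a tiny period of time. A large number of apps all binding services at once is unlikely in the first place, and even if it did happen, the DEAs that run those apps would already be overloaded. So relative to Cloud Foundry as a whole, high concurrency at the gateway is not very significant, because the bottleneck is not here.

The single node going down, however, is definitely a big problem. Once the gateway is down, apps that have already completed a bind are not affected at all, but for apps that still need to bind or provision a service it is fatal. This shows that a multi-gateway design is very necessary. It is also something that Cloud Foundry enthusiasts and Cloud Foundry staff will have to address in the future.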


Please credit the source when reposting.

This document mostly reflects my own understanding, so it certainly has shortcomings and errors in places. I hope it is of some help to people who are getting to know Cloud Foundry services. If you are interested in this area and have better ideas or suggestions, please contact me.

My email address: shlallen@zju.edu.cn
Sina Weibo: Fu Ruqing lotus seed

      



