• OpenCloud 2015 conference ended successfully

    The OpenCloud 2015 conference, hosted by CSDN, was held successfully at the National Convention Center on April 16-18. The 2015 OpenStack Technology Conference, the 2015 Spark Technology Summit, the 2015 Container Technology Summit, and three in-depth hands-on training sessions were highly praised by instructors and attendees alike, and the in-depth talks given by more than 40 frontline experts drew applause.

Live record in text and photos

  • April 17th
  • April 18th
  • 18:05 Wan Lintao, director of development and operations at Shanghai Dianrong, described the important role Docker plays in Dianrong's DevOps process. He focused on the practical details of using Docker to simplify development environment configuration, deploy applications rapidly, and operate production environments flexibly. He finds Docker very helpful in reducing configuration and deployment work. For example, continuous integration and delivery often require several sets of test environments, and Docker images make it possible to deploy them quickly. However, Docker still needs to improve in networking, security, management, and other areas. At present, everything except the firewall, load balancer, and database has been moved to Docker, because the load balancer does not require frequent, repetitive configuration or dynamic management. In the CI process, Dianrong uses Docker as follows: Git manages the code, Jenkins compiles it, Docker packages the build into an image, and the image is finally delivered to the runtime environment to run.

  • 17:35 Chen Fanglu, senior engineer in Tencent's underlying services group, spoke on "Container technology in Tencent SNG's middle tier", introducing the use of container technology through the evolution of middle-tier management. The talk covered two container technologies, LXC and Docker: which types of virtualization Tencent considered, which resources are isolated, how per-container statistics isolation was implemented by optimizing the kernel, how LXC is managed in production to build the largest LXC cluster, and Tencent's exploration of Docker and its attempts to integrate Docker with the Hive cluster. He said Tencent has optimized Docker in three respects. The first is bug fixes, such as the rm option not taking effect when a container exits with a non-zero status, and bind-mounted config paths that cannot be cleaned up. The second is Docker's resource management policy: with a hard-limit memory policy, user processes are easily killed, resources are more likely to be wasted, and users are required to estimate their services' resource needs very accurately. The last is the dimensions of resource management: Docker manages only two dimensions, CPU and memory, which is a shortcoming relative to virtual machines and needs to improve for shared cloud computing environments.

  • 16:40 Chen Guancheng, senior researcher at IBM Research China, gave a talk on building the SuperVessel big data public cloud with OpenStack, Docker, and Spark. According to Chen, SuperVessel is a public cloud built on OpenStack and POWER7/POWER8 that offers Spark as a Service, Docker as a Service, Cognitive Computing as a Service, and other services. He also explained why these technologies were chosen. There were two reasons for choosing OpenStack: 1. an active community, with more contributors than competing projects; 2. support for Docker. There were three reasons for choosing Docker: 1. far lower resource usage than KVM; 2. very fast startup; 3. containers can be built incrementally, restored, and reused. Spark was chosen for four reasons: 1. it is fast; 2. it is unified; 3. its ecosystem is developing rapidly; 4. it has been ported to Power. In his closing summary, he said that Spark + OpenStack + Docker runs well on OpenPOWER servers and that Docker services make DevOps simpler, and he stressed the importance of monitoring everything.

  • 16:40 Shuren Technology founder Wang Pujun spoke on "Combining Docker and Mesos". He said Apache Mesos can run many types of distributed systems on the same cluster of machines, sharing resources more dynamically and efficiently. It provides failure detection, task distribution, task tracking, task monitoring, low-level resource management, and fine-grained resource sharing, and it can scale to thousands of nodes. Mesos has supported Docker since version 0.20, and the Mesos + Docker combination provides a very powerful platform for deploying applications and services in a cluster environment. Because Docker is well suited to releasing applications, combining it with Mesos greatly simplifies the deployment of distributed applications. Docker's excellent portability complements Mesos's powerful distributed resource management: distributed applications built on Mesos and Docker are elastic, make it easy to build highly reliable enterprise applications, and greatly ease enterprise deployment of distributed applications. He added that Mesos focuses only on resource scheduling, exposing resources to upper-level computing frameworks such as Spark and Hadoop to consume.

  • 16:00 Tian Yi, manager of AsiaInfo's big data platform R&D department, focused on hands-on experience from several projects. One example was the Spark-based rework of a user tag analysis platform. Initially, telecom data and Internet data were explored, monitored, and analyzed using a database, TCL scripts, and SQL. This approach had many problems: the number of tags kept growing, the database load was too high, and scaling was expensive; the number of columns in the tag table grew with the number of tags, reaching 2,000 at some sites, which could only be handled by splitting tables, so queries required joins; and tag and metric computation could not escape the constraints of SQL, so machine learning algorithms could not be integrated quickly. The first rework replaced SQL with Spark SQL on HDFS. The benefits were clear: the Spark SQL + Parquet scheme effectively guaranteed query efficiency; the original system needed no major changes; and the query system could scale out in parallel. But there were also new problems, such as the extra steps of exporting data from the database and loading it into HDFS, and the extra step of converting text data to Parquet format. The second rework moved the original database to HDFS and replaced the TCL scripts with Spark SQL. This further improved the scalability of the whole system, and the two Spark SQL deployments could share the system's computing resources according to their different busy and idle periods. After the release of Spark 1.3.0, the External Data Source API was further enhanced, DataFrame offered rich support for many data sources, and DataFrame provided a DSL for manipulating data. These helped the project free its tag analysis algorithms from SQL entirely; the front end can also pull data through external data sources, reducing dependence on the ETL system. The DataFrame-based code is only one tenth the size of the original program, which greatly improves readability.
Other projects analyzed in similar depth included the Spark Streaming-based rework of a content recognition platform.

  • 16:00 Wang, open source strategy planning expert at Huawei's Open Source Competence Center, spoke on "Applied research on Docker container technology in NFV scenarios". He discussed the business challenges of NFV/SDN scenarios and, taking the use of container technology to solve telecom service problems as an example, explained some of the problems encountered during implementation. He said that with the trend toward ICT convergence, telecom networks are gradually adopting IT technologies such as traditional virtualization (Xen, KVM, VMware). Objectively speaking, these technologies bring some benefits but cannot fully solve the problems of telecom networks. For example, telecom networks have very strict transmission latency requirements, and frequent jitter deals a fatal blow to customer experience. Compared with traditional virtualization, the strengths of Container/Docker are lightweight, efficient execution and application deployment. Many IT and Internet enterprise applications were architected with decoupling and distribution fully in mind, so migrating them to Container/Docker meets little resistance. Migration in the telecom field is different: the biggest challenge today is adjusting the software architecture and decomposing telecom services while still guaranteeing real-time performance and reliability, which is already a challenge for traditional virtualization technology.

  • 15:20 Following his morning talk, "New directions for Spark in 2015", Spark Streaming project lead Tathagata Das (TD) introduced the past year's feature updates to Spark Streaming, real-world use cases, and upcoming features. TD said that over the past year the Python API for Spark Streaming, streaming MLlib algorithms, the Kafka stream API, the libraries, and the system infrastructure have all been updated. In practice, education publishing group Pearson, big data solutions provider Guavus, and video site Netflix all use Spark Streaming in their businesses. Pearson moved from Storm to Spark early on and uses Spark to update student learning models from student activities and events, while Netflix analyzes trends in TV shows and movies in real time. Looking ahead, TD revealed that Spark Streaming will be improved in its libraries, ease of use, and performance.

  • 15:20 Jingdong senior architect Tian Qi gave an in-depth look at container technology, focusing on its low-level building blocks: cgroups, namespaces, and Device Mapper. He explained that cgroups provide resource management for processes (memory, CPU, IO, and so on), do not depend on namespaces and can be used on their own, and expose their management functions through a VFS interface. cgroups provide a general framework, and each subsystem is responsible for its own implementation. The problem with cgroups and namespaces is that namespace isolation is incomplete and many kinds of namespaces are needed. IO control in cgroups has more problems: bandwidth control works only with the CFQ scheduler, which is not suitable for high-speed hardware; the general throttling policy is inelastic; and buffered IO cannot be controlled accurately. He also believes that although Docker is developing rapidly, it still needs to improve, and because container technology depends heavily on kernel features, using it involves choices: customize the kernel, pick features and work around the gaps, or build a kernel team. Similar choices apply to Docker's storage driver (overlayfs or device mapper) and to distributed back-end storage for images (in-house development or customized open source).
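
The remark that cgroup management is exposed through a VFS interface can be illustrated with a small sketch. It assumes a cgroup v1-style layout, in which each subsystem is a directory tree of plain control files such as memory.limit_in_bytes and tasks. The helper names, the root path, and the group name are hypothetical and were not part of the talk.

```python
from pathlib import Path


def read_control_file(cgroup_root, subsystem, group, name):
    """Read one control file, e.g. <root>/memory/<group>/memory.limit_in_bytes."""
    path = Path(cgroup_root) / subsystem / group / name
    return path.read_text().strip()


def write_control_file(cgroup_root, subsystem, group, name, value):
    """Write a value to a control file.

    On a real cgroup v1 mount (commonly /sys/fs/cgroup) this is how a limit
    is changed: no special system call, just an ordinary file write.
    """
    path = Path(cgroup_root) / subsystem / group / name
    path.write_text(f"{value}\n")


def attach_task(cgroup_root, subsystem, group, pid):
    """Place a task under the group's limits by writing its PID to 'tasks'."""
    write_control_file(cgroup_root, subsystem, group, "tasks", pid)
```

Because the interface is just files, the same helpers can be exercised against a scratch directory; pointing them at a real cgroup mount requires root privileges and an existing group directory.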

  • 14:50 Huang Jie, R&D manager at Intel's big data technology center, gave a detailed explanation of Spark in three areas: memory management, IO improvements, and computation optimization. An on-site interactive poll found that nearly 80% of the several hundred attendees had used or were preparing to use Spark. Of that 80%, 10% expect to use Spark for advanced machine learning and graph analysis, 10% for complex interactive OLAP/BI, and 10% for real-time stream computing. Huang Jie said Spark will play an important role in big data and will become the main platform for next-generation big data on IA.

  • 14:50 Chen Kai, chief architect at Alauda and former chief architect on the Microsoft Azure team, spoke on "Container management at cloud scale". He began with the story of "the king's shoes" to describe the relationship between virtual machines and containers: a virtual machine encapsulates and isolates the resources of a physical machine, while a container packages an application. Docker has simplified container technology by fully packaging an application's external dependencies so that it can run repeatably in a standard way. But Docker alone is not enough: how should containers be managed? Chen Kai compared three management options: Mesos, Kubernetes, and Docker's own Machine, Swarm, and Compose. Finally, he introduced the overall architecture and implementation of Alauda's AlaudaCloud. AlaudaCloud has been in beta testing since March 11, and more than 100,000 containers have been started on it so far.

  • 14:30 Cloudera senior architect Tian Fengzhan (Phil Tian) spoke on "Spark-driven intelligent data analysis applications". He believes Spark will replace MapReduce as Hadoop's general-purpose computing framework, mainly because Spark integrates well with the Hadoop community while receiving ever broader support from the community and vendors, and because it excels at data science and machine learning. During the talk, Dr. Tian used concrete cases from several companies to show Spark's value. Conviva analyzes traffic patterns in real time and controls traffic more precisely to optimize end users' online video experience; for Conviva, the main value of Spark is rapid prototyping, sharing business logic between offline and online computation, and open source machine learning algorithms. Yahoo uses Spark to accelerate its advertising model training pipeline, making feature extraction 3x faster, and uses collaborative filtering for content recommendation; for Yahoo, the main value of Spark is lower data pipeline latency, iterative machine learning, and efficient P2P broadcast.

  • 14:00 Zhang Haining, senior architect for cloud native applications at VMware's China R&D center and one of the earliest technology evangelists in the Cloud Foundry Chinese community, spoke on using a pipeline to manage container applications. Gartner forecasts that DevOps will become the absolute mainstream by 2018, so some existing development technologies need to adjust. VMware actively supports Docker as a new technology. At the bottom layer are products such as NSX and ESXi, with Linux and containers above them. For this layer, VMware will release a batch of new open source products next week. As for managing the process between developers and operations staff efficiently, VMware's vRealize Code Stream can do this.

  • 13:30 Pan Yongzhi, technical director of operations development at Meituan Cloud, described how Docker is used at Meituan. The goal was to solve three problems: resource contention (multiple applications building at the same time compete for the release machine's compute and IO resources), environment conflicts (applications with different build environment dependencies are released on one machine), and security (application build scripts run on the shared release machine, where a bug in a script may affect the machine's normal operation). Meituan tried a solution that combines multiple workers with isolated release environments: the build runs inside a Docker container and is triggered automatically when code is submitted; releases use the application package directly; built packages are uploaded to MSS, the Meituan Cloud object storage service, and at release time the package is pulled from MSS and released. CoreOS templates are also supported. Pan Yongzhi explained that there were three reasons for not using virtual machines: the deep integration between virtual machines and the CMDB (starting one triggers a series of actions), slow speed, and high resource consumption.

  • 13:20 The first speaker of the afternoon session of the 2015 Spark Technology Summit was Huang, senior technical expert at Alibaba's Taobao technology department, whose topic was "Graph and stream combined: dynamic graph computation based on Spark Streaming and GraphX". He first introduced the development of GraphX and of Streaming + MLlib, and noted that in practice Taobao also ran into new problems and challenges. He summed up the advantages of combining streams and graphs in two points. Finer models: compared with ordinary operators, the more powerful graph operators can achieve better accuracy and efficiency. Performance optimization: graph operators can avoid time-consuming RDD operations. He emphasized the following points to watch for when combining streams and graphs. Resource guarantees: streaming jobs are long-running, so cores, workers, and memory must be allocated sensibly to ensure that no serious delay occurs most of the time. Spikes and fluctuations: in a real online environment the amount of data per cycle fluctuates, and switching data sources or backfilling data also produces spikes; the approach is to first compute a threshold for the system's processing capacity from the per-cycle input volume and processing time of the previous N cycles, and then shave peaks in the next cycle according to that threshold.
Hangs: delivering too many messages may cause the job to hang, so the message size must be limited. Data backlog: when a cycle's input exceeds the system's processing capacity, processing is pushed into the next cycle and data piles up. The remedy is a data buffer pool that shaves the peaks: estimate the processing time from each cycle's input volume; if the estimated time is greater than the threshold time, put the excess into the buffer pool; if the estimated time is less than the threshold time, release a corresponding proportion of data from the buffer pool.
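
The threshold and buffer pool ideas described above can be sketched in a few lines. This is a simplified illustration, not the speaker's implementation: it uses a record count per cycle as a stand-in for estimated processing time, and the names capacity_threshold and PeakShaver are hypothetical.

```python
from collections import deque


def capacity_threshold(history, cycle_seconds):
    """Estimate how many records fit into one cycle.

    history holds (records_in, processing_seconds) pairs for the last N cycles.
    """
    total_records = sum(records for records, _ in history)
    total_seconds = sum(seconds for _, seconds in history)
    if total_seconds == 0:
        raise ValueError("history contains no processing time")
    throughput = total_records / total_seconds  # records per second
    return int(throughput * cycle_seconds)


class PeakShaver:
    """Buffer pool that holds back input above the per-cycle threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pool = deque()

    def plan_cycle(self, incoming):
        """Return the records to process in this cycle."""
        batch = list(incoming[: self.threshold])
        # Anything above the threshold is parked in the buffer pool.
        self.pool.extend(incoming[self.threshold:])
        # If this cycle has spare capacity, release buffered records.
        while self.pool and len(batch) < self.threshold:
            batch.append(self.pool.popleft())
        return batch
```

In this sketch a spike of five records against a threshold of three leaves two records in the pool, and they are released in a later, quieter cycle. A production version would also need to bound the pool size and decide whether buffered data should be processed before new data.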

  • 13:10 The afternoon agenda of the 2015 Container Technology Summit began. The afternoon host was Ma, founder of the Docker Chinese community.

  • 11:50 Baidu senior software engineer Ma Xiaolong shared Baidu's engineering practice with Spark, covering two parts: "Spark at Baidu" and "Spark on Baidu's public cloud". When explaining Tachyon, Ma Xiaolong first described the problems Baidu faced, that is, why it uses Tachyon: data nodes and compute nodes may not be in the same data center, and cross-data-center access adds latency. He then shared Baidu's solution: use Tachyon as a transparent cache layer, so that cold queries read data from remote storage nodes and hot queries read directly from Tachyon. Through these efforts, Baidu ultimately achieved a performance improvement of more than 10x on warm and hot queries.
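
The cold-query and hot-query paths described above follow the general read-through cache pattern, sketched below. This is a toy illustration only: it does not use Tachyon's API, and the class name ReadThroughCache and the remote_read callable are hypothetical stand-ins for the cache layer and the remote storage nodes.

```python
class ReadThroughCache:
    """Toy cache layer sitting between compute nodes and remote storage."""

    def __init__(self, remote_read):
        # remote_read: callable mapping a key to its data; stands in for a
        # slow read from a storage node in another data center.
        self._remote_read = remote_read
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._cache:
            # Hot query: served directly from the cache layer.
            self.hits += 1
            return self._cache[key]
        # Cold query: fetch from remote storage, then keep a local copy.
        self.misses += 1
        data = self._remote_read(key)
        self._cache[key] = data
        return data
```

The caller uses the same read call in both cases, which is what makes the cache layer transparent. A real deployment would also need eviction and invalidation, which this sketch leaves out.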

  • 11:50 Siyuan CTO Wang Xu spoke on "Security isolation for Docker in multi-tenant environments", exploring how VM and container technology can be combined to find a compromise between isolation and performance. Wang Xu believes that by integrating the container and the image, Docker makes DevOps deployments that used to rely on complex configuration management simple. Docker is a revolution in the field of operations and can push operations services to a higher level. He said that Docker sits between traditional virtual machines and software packages in size, and that Docker containers have shorter life cycles than traditional VMs and are more numerous. For application architecture, applications on Docker should as far as possible be stateless, redeployable, and horizontally scalable, so traditional applications may need to be adjusted. For operations, metrics collection and monitoring may change somewhat: traditional monitoring for operating systems and virtual machines cannot simply be moved to Docker and needs to be adjusted for the number and life cycle of containers. In the future, he hopes to combine VM and container technology to strengthen isolation between containers that belong to different tenants or have different security requirements, while avoiding the excessive performance loss that strong isolation can cause.

  • 11:20 Red Hat senior solution architect Cai Shu spoke on combining Docker and OpenShift. He explained that the third-generation OpenShift platform will be developed on top of Docker and Google's Kubernetes. Atomic is Red Hat's latest Docker-based platform project, and its position within Red Hat is somewhat similar to that of oVirt, the VM-based virtualization platform. The project appears to be inspired by CoreOS and aims to create a platform system, suited to bare metal or cloud environments, for deploying application containers. On top of Docker's original namespace and cgroup technology, it integrates SELinux support to further strengthen security isolation. Docker also announced commercial deployment support for the Docker registry and Docker on Red Hat Enterprise Linux and OpenShift, along with a JumpStart program to guide organizations and teams.

  • 11:10 Lian Cheng, Databricks engineer, Spark committer, and one of the main developers of Spark SQL, gave a detailed talk on "Structured data analysis with Spark SQL". He introduced many of the new features in Spark 1.3, with a focus on DataFrame. DataFrame evolved from SchemaRDD and provides a higher-level API abstraction that is very similar in form to data frames in R and Python. DataFrame versus RDD is somewhat like the difference between dynamic and static languages, and in many scenarios the advantages of DataFrame are clearer. In version 1.3, Spark further improved the external data source API and made optimization smarter. Through a lightweight abstraction, DataFrame supports many kinds of data sources, such as Hive, S3, HDFS, Hadoop, Parquet, MySQL, HBase, and dBase, so it is easy to carry out many kinds of data analysis on top of it. Spark Core has far less code than Hadoop, and Spark SQL's code is leaner still, so it is much more readable.

  • 10:30 Tencent senior engineer Wang Lianhui gave an in-depth talk on the application and optimization of Spark at Tencent. At the beginning of this year, the Spark cluster in Tencent's TDW (Tencent Distributed Data Warehouse) had reached the following scale: Gaia cluster nodes, 8,000+; HDFS storage, 150 PB+; new data per day, 1 PB+; tasks per day, 1M+; data processed per day, 10 PB+. Wang Lianhui said that Tencent started with Spark 0.6 in 2013 and currently uses Spark 1.2. Typical applications fall into three areas: predicting the probability that a user clicks an ad; computing the number of common friends between two friends; and Spark SQL and DAG jobs for ETL. Tencent has gone further on optimization, for example: the application development experience; dynamic scale-up and scale-down of resources for ETL jobs; starting the reduce stage before the map stage has completed; predicting a stage's partition count from the size of its data; assigning a driver to each Spark SQL session; count(distinct) optimization; and sort-based GroupBy/Join.

  • 10:30 Li Zefan, Linux contributor and Linux kernel development engineer at Huawei, spoke on "History, current state, and prospects of container technology in the Linux kernel". He described how the cornerstones of Docker, cgroups, namespaces, and other Linux kernel features, were developed, how they are progressing in the community today, and how, amid the boom set off by Docker, the kernel community will provide more complete container technology from the bottom up. Li Zefan believes the advantage of containers is that they provide virtualization at close to the efficiency of a physical machine, achieve a higher instance density than virtual machines, and start within seconds. Docker adds the innovation of layering, which makes releasing and deploying software very convenient. Containers and Docker therefore have a wide range of application scenarios, with application software architecture drawing on service-based architecture. He also pointed out that Docker is not yet mature enough: its networking is weak (for example, cross-host container networking), it has security issues (the daemon is a single point of failure), and it is difficult to run traditional workloads on it (for example, the challenges of CT services).

  • 10:00 The first speaker at the 2015 Container Technology Summit was Ray Tsang, open source contributor and Sr. Developer Advocate at Google. He spent an hour explaining Kubernetes, the technology open sourced by Google, in detail, covering pods, services, the kubelet, and more. He also gave some practical demonstrations and answered audience questions, including questions about how Kubernetes is used at Google. Ray said that Kubernetes itself manages containers, while the processes run inside the containers. It now supports not only Docker but also Rocket.

  • 09:30 The second talk of the 2015 Spark Technology Summit came from Zhou Hucheng, researcher at Microsoft Research Asia. His topic was "Spark and Applications inside Microsoft Ecosystem". Drawing on Spark SQL, GraphX, MLlib, and other components, he shared in detail Microsoft's experience building its internal Spark ecosystem.

  • 09:00 On the second day of OCC 2015, Spark Streaming lead Tathagata Das (TD) gave the first talk of the 2015 Spark Technology Summit. TD first shared the state of Spark in 2014: contributors grew from 150 to 500, and the code base grew from 190,000 lines to 370,000 lines. Spark has also been deployed in more than 500 production environments. TD then summarized Spark's focus in 2014: enterprise applications; richer libraries; a more scalable and higher-performance core engine; and a wider range of out-of-the-box scenarios. He also revealed Spark's development directions for 2015: machine learning, making it usable by more people, and richer platform interfaces.

  • 09:00 The 2015 Spark Technology Summit was chaired by Qiniu technical director Chen Chao. Seeing many attendees standing to listen, Chen Chao said he was pleased with how Spark is developing.

  • 09:00 Hosted by CSDN with the support of the CSDN expert advisory group, the OpenCloud 2015 conference entered its second day. The 2015 Container Technology Summit opened, with the morning session hosted by Jingdong chief technical adviser Weng Chi. He introduced the lineup of invited speakers and said he hoped everyone would take something away from the summit.

  • 08:55 The 2015 Spark Technology Summit and the 2015 Container Technology Summit are here. There are 23 speakers across the whole day, all sharing solid, practical content. Which speaker's talk do you admire most? Tell us your choice through @CSDN cloud computing on Weibo or WeChat, and a mystery gift awaits!

Sina Weibo (#OCC2015#)@CSDN cloud computing

April 16th

OpenStack hands-on technical training

Enterprise records co-founder Li Mingyu
Zhejiang, China Cloud Mdt InfoTech Ltd product manager
OpenStack hands-on technical training venue

Docker hands-on technical training

DaoCloud co-founder Yu Yong
DaoCloud co-founder and vice president of R&D Guo Feng
Communication between the audience and the lecturer
Docker hands-on technical training venue

April 17th

OpenStack Technology Conference

Spark hands-on technical training

Qiniu Cloud Storage technical director Chen Chao
Databricks engineer Lian Cheng
Intel China Research Institute senior engineer Yin Xusen
Spark hands-on technical training venue

April 18th

Spark Technology Summit

Container Technology Summit

Related information

Conference schedule