• 2015 China Big Data Technology Conference concludes successfully

    The ninth China Big Data Technology Conference was held in Beijing on December 10-12, 2015. Its 16 sub-forums comprised six technology forums (databases, deep learning, recommendation systems, security, and others), seven application forums (finance, manufacturing, transportation and travel, Internet, medical and health, education, and others), and three hot-topic forums. [detailed]

  • Venustech vice president Pan Zhuting: interpreting the big data technology trends for 2016

    Platform development is driving the industry ecosystem, deep analytics is driving intelligent big data applications, and visualization is driving the popularization of big data. Data management rights and data sovereignty are drawing attention. The Internet, finance, and health sectors remain hot, while smart cities, enterprise data, and industrial data are new growth points. Open source, evaluation, and competitions have fostered a healthy talent and technology ecosystem, but big data security and privacy remain worrying. [detailed]

Conference highlights

December 10th general assembly

China Mobile Suzhou R&D center assistant general manager and CTO Sun Shaoling
Chinese Academy of Sciences researcher, CCF big data expert committee secretary Cheng Xueqi
Department of computer science and engineering, Ohio State University, Zhang Xiaodong
Venustech vice president and CCF big data expert committee deputy secretary general Pan Zhuting
Alibaba group CTO Wang Jian
China Unicom Group, the general manager of the Ministry of information and e-commerce business division, Fan Jian, chief architect
IBM vice president, general manager of Greater China hardware system Guo Rensheng
HUAWEI IT product line big data solutions planning director Xu Xinghai
HUAWEI telecom software big data chief technical planning Qu Bo
Govan, academician of Chinese Academy of Engineering
Databricks co-founder and Spark chief architect Xin Shi
Primeton information technology Limited by Share Ltd CTO Jiao Lieyan
Jingdong cloud platform chief architect and head of the system technology department Liu Haifeng
Star Ring Technology founder and CTO Sun Yuanhao
Cloudera R&D engineer and Kudu inventor Todd Lipcon
Michigan State University professor Jin Rong
Research Institute of computing technology, Chinese Academy of Sciences, deputy researcher (host)
Ant financial services group security intelligence director Chen Jidong (host)
Associate research fellow, Institute of Computing Technology, Chinese Academy of Sciences, Jin Xiaolong (host)

December 11th special forum

Big data policies, regulations, and standardization sub-forum

At the big data policies, regulations, and standards sub-forum on the 11th, five experts delivered keynote speeches on big data policy, regulation, and standardization: Li Haiying, director of the legal department at the policy and economics research institute of the Telecommunications Research Institute of the Ministry of Industry and Information Technology; Liu Yi, research director for telecom business technology at Gartner; Wang Chunzheng Hui, president of the Information Industry Development Strategy Research Institute at Nanjing University of Posts and Telecommunications; Lu Wei, deputy director of the big data technology research center at Tsinghua University; and Jiang Chunyu, big data project manager at the China Institute of Information and Communication Technology.[report]

Big data infrastructure sub-forum

At the big data infrastructure sub-forum on the 11th, experts from Ali cloud, Hulu, Beijing Yi HENGCHUANG sources, Alibaba, enterprises and recorded, and the Institute of Computing Technology shared the key technical points of big data infrastructure, from platform design and construction to test standards, along with practical problems and solutions.[report]

Database sub-forum

At the database sub-forum on the 11th, experts and professors from the South common, Northwestern Polytechnical University, the China Mobile research and development center in Suzhou, East China Normal University, and Pivotal shared their experience with data management and transaction processing in the big data era. The forum was chaired by Zhou Aoying, president of the Institute of Data Science and Engineering, East China Normal University.[report]

Deep learning forum

At the deep learning forum on the afternoon of the 11th, Horizon Robotics senior engineer Yu Yinan, Alibaba iDST voice group senior expert Zhi Jie Yan, Xiamen University professor Ji Rongrong, Huazhong University of Science and Technology professor and national security engineering center deputy director Bai Xiang, and Microsoft Research Asia researcher Hong Chuntao shared applications of deep learning in image recognition, speech recognition, visual search, and character recognition, as well as the evolution of open source deep learning frameworks.[report]

Financial big data sub-forum

At the financial big data sub-forum on the 11th, technical experts from Ant Financial, safe technology, Huawei, the Singapore Management University School of Information Systems, and should letter discussed how to use big data for risk management, credit assessment, and building a society-wide credit system.[report]

Industrial and manufacturing big data sub-forum

Six experts and professors from Tsinghua University, the Baosteel Central Research Institute, Harbin Industrial University, Sany Heavy Industry Huaxing digital company, State Key Laboratory, and Hangzhou University of Electronic Science and Technology shared the applications and challenges of big data in industry and manufacturing. The forum was co-chaired by Wang Hongzhi, associate professor and doctoral supervisor at the Harbin Institute of Technology, and Jin Xiaolong, associate researcher at the Institute of Computing Technology, Chinese Academy of Sciences.[report]

Data market and trading forum

At the data market and trading forum on the morning of the 11th, Long Xin He Chen, deputy secretary general of the Zhongguancun High Data Industry League; Gong Jing, DataHub product director at AsiaInfo intelligence data company; Xiao Yonghong, co-founder and vice president of data Hall (Beijing) Technology Co., Ltd.; Wangxin Rui, managing partner of the Beijing law firm Allen & overy; and Han Han, senior engineer at China's information and Communication Research Institute, shared their experiences and views on data trading.[report]

Medical health and biological big data sub-forum

At the closely watched medical health and biological big data sub-forum, five experts from the Institute of Computing Technology of the Chinese Academy of Sciences, Shenzhen University, Tongji University, national defense science and, and Huada gene (liuzhiyong, 100053, Huang Deshuang, Shao Liang Peng, and Liu) delivered keynote speeches, drawing on their own research, on applications of big data in medical health and biology.[report]

HUAWEI big data technology

December 12th Special Forum

Network and communications big data sub-forum

On the 12th, the network and communications big data sub-forum was held successfully, chaired by Zhang Baofeng, deputy director of the HUAWEI Noah's Ark laboratory. As the foundation of big data platforms, networks and communications do not directly generate much revenue, but they underpin the future Internet of Things and guarantee that information can deliver its value. The forum focused on the challenges that big data poses for networks and communications and on enterprises' exploration and practice in this field.[report]

Big data analysis and ecosystem

At the big data analysis and ecosystem sub-forum on the 12th, seven experts from Hortonworks, IBM, Jingdong, Baidu, eBay, UnionPay Chi Hui, and Nanjing University discussed the development of big data analysis and its ecosystem. The forum was chaired by Zhang Guangyan, an associate professor of computer science at Tsinghua University.[report]

Recommendation system sub-forum

At the recommendation system sub-forum on the morning of the 12th, Baidu infrastructure senior architect Shen Guolong, FreeWheel vice president of technology Li Yang, Sina Weibo algorithm technical director Jiang Guibin, Jingdong digital marketing data senior director Wanhao, and hunting hired chief data officer single art shared applications of recommendation algorithms and machine learning in search, advertising, social networking, e-commerce, and recruitment.[report]

Big data security sub-forum

At the big data security sub-forum on the afternoon of the 12th, Alibaba data security team director Zheng Bin, Tencent secure cloud department assistant general manager Li Xuyang, Qihoo Zhuo 360 senior technical manager, minglue data technology partner Yang Wei, Venus senior researcher Zhou Tao, move software data analysis and mining engineer Gao Jiafeng and colleague Shenjie, and beauty co-founder and CTO Liang Kun gathered with many attendees and gave talks across the field of big data security.[report]

Traffic and tourism big data sub-forum

At the transportation and tourism big data sub-forum on the 12th, six experts and professors from the Beijing city traffic monitoring and dispatching center, China car, High German, Ctrip, road ox, and way home shared how government and enterprises use big data in transportation and tourism. The forum was chaired by CSDN deputy editor in chief Dong Shixiao.[report]

Internet big data sub-forum

At the Internet big data sub-forum on the afternoon of the 12th, Liu Wei, research and development director at the drops Machine Learning Research Institute; He Zhongjun, Baidu chief architect and head of machine translation technology; Liu Yanwei, head of the Jingdong Mall big data research and development department; Renmin University of China associate professor Dou Zhicheng; a vice president of state double technology; Rui Bao Liu, vice president of Beijing TRS Information Technology Co., Ltd.; and the co-founder and COO of micro all the tax silver shared applications of big data in mobile travel, machine translation, data platforms, analysis engines, credit, and other areas.[Reported (on)]

Educational big data sub-forum

At the educational big data sub-forum on the morning of the 12th, who learn with vice president Luo Bin used big data to decode O2O education, explaining how the Internet age has redrawn the boundary between teachers and students and broken down distance. Fluent English said co-founder and chief scientist Lin Hui described a big data processing and mining architecture and the specific analysis methods and practices that help users learn languages more effectively. Beijing quantum science and technology limited liability company CTO Ding Wenpeng shared his experience with Simhash-based search and optimization. Talkweb information architect Xu Chongbo introduced the internal relationships and the core of a solution to the problem of image character recognition.[report]
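The talk summary names Simhash but gives no detail. As general background only, here is a minimal Python sketch of the standard Simhash idea: hash each token, let every token vote on each bit, and compare fingerprints by Hamming distance. The tokenizer, the use of MD5 as a mixing hash, and the function names are illustrative assumptions, not the speaker's implementation.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumption for this sketch)


def simhash(text: str) -> int:
    """Return a 64-bit Simhash fingerprint of the text's word tokens."""
    votes = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Map the token to a stable 64-bit integer (MD5 used only as a mixer).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def most_similar(query: str, candidates: list) -> str:
    """Return the candidate whose fingerprint is closest to the query's."""
    q = simhash(query)
    return min(candidates, key=lambda c: hamming_distance(q, simhash(c)))
```

Texts that share most of their tokens tend to produce fingerprints with a small Hamming distance, which is why the technique is commonly used for near-duplicate lookup.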

Social governance big data sub-forum

At the social governance big data sub-forum on the 12th, experts and professors from Ali, ZTE, South China Normal University, Peking University, the Chinese People's Public Security University, and Beijing University of Aeronautics and Astronautics shared applications of big data in social governance, key technical points, and practical problems and their solutions.[report]

Linux on Power IBM algorithm Marathon Challenge

Pan Zhuting: big data applications will increasingly serve people's livelihood

  • Venustech vice president and CCF big data expert committee deputy secretary general Pan Zhuting

    Big data's greatest strength is decision analysis, and what lies behind it is essentially a kind of Internet service system. As it extends into more livelihood areas, such as health data and health care services, the capabilities are packaged at the back end, while the front end must give people a good interface that makes the system easy to use. [detailed]

Zhang Jingliang: focusing on putting big data into practice in industry

  • Branch DNT Technology Co., Ltd big data product manager Zhang Jingliang

    Given how difficult it currently is to put big data into practice in industry, branch DNT's approach is to combine it with existing business data as much as possible, so that customers can transition smoothly to the new big data business model. [detailed]

Liu Yanwei: architecture design and implementation of Jingdong's real-time data platform

  • Jingdong big data platform research and development lead Liu Yanwei

    Jingdong's big data platform supports all of Jingdong's businesses, including orders, merchants, self-operated sales, warehousing, distribution, customer service, finance, and O2O. It mainly handles three basic tasks: data access, storage, and computing. It consists of the Hadoop-based offline data platform JDW and the real-time data platform JRDW, which is built on Kafka and Storm. [detailed]
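The 15:00 entry in the graphic record below describes the real-time side as a data bus that stores each entity's data at topic granularity and lets several downstream consumers read the same data asynchronously. A minimal in-memory Python sketch of that pattern follows. The class and method names are hypothetical, and this is neither Jingdong's implementation nor the Kafka API; it only illustrates per-topic logs with independent consumer offsets.

```python
from collections import defaultdict


class RealTimeBus:
    """Toy message bus: one append-only log per topic, one offset per consumer."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> list of messages
        self._offsets = defaultdict(int)   # (topic, consumer) -> next index to read

    def publish(self, topic, message):
        """Write a message once; any number of consumers can read it later."""
        self._topics[topic].append(message)

    def poll(self, topic, consumer, max_messages=10):
        """Return the next unread messages for this consumer and advance its offset."""
        start = self._offsets[(topic, consumer)]
        batch = self._topics[topic][start:start + max_messages]
        self._offsets[(topic, consumer)] = start + len(batch)
        return batch


# Usage: one write to the "orders" topic, two independent readers.
bus = RealTimeBus()
bus.publish("orders", {"order_id": 1, "amount": 99.0})
bus.publish("orders", {"order_id": 2, "amount": 15.5})
for_search = bus.poll("orders", consumer="search-index")
for_report = bus.poll("orders", consumer="reporting")
```

Because each consumer keeps its own offset, a slow consumer does not block the producer or the other consumers, which is the asynchronous property the summary attributes to the bus.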

Shi Dongfeng: exploration and practice of big data algorithms on the IBM Power 8 platform

  • IBM Greater China hardware system server solutions vice president Shi Dongfeng

    In private clouds and data centers, hardware failures become inevitable as the number of machines grows, and energy consumption becomes a major expense for users. Scale-out cloud architectures built on clusters of generic x86 servers have many problems that users must face. In response, IBM's products based on Power 8 technology significantly improve efficiency. [detailed]

Sun Yuanhao: distributed systems are bound to replace the relational database

  • Star Ring Technology founder and CTO Sun Yuanhao

    The data processing market can be divided into three segments: transactional, analytical, and unstructured data. Of the three, the transactional segment may account for about one third and the analytical segment for more than one third. Judging by how the market is developing, Hadoop will eventually take over the analytical market completely, because its performance and functionality are gradually beginning to surpass the incumbents. The turning point is probably around 2018, and the data warehouse will change greatly as well. [detailed]

Ren Xinqi: association mining makes data more intelligent

  • Bright data SCOPA Product Manager Ren Xinqi

    Traditional big data companies have always stressed how much data they can handle and what performance they can achieve, and have seldom asked how much value the data really provides to customers. Processing large volumes of data is not in itself truly revolutionary. What matters is whether the data can automatically provide intelligence to users and solve their real problems. [detailed]

Jiao Lieyan: how traditional enterprises can play in the era of big data

  • Primeton information technology Limited by Share Ltd CTO Jiao Lieyan

    Among traditional industries, the financial industry's gap with the Internet in big data applications is relatively small, and it mainly concerns the digitization of people. Early on, the financial industry used big data for historical data queries. Now, like Internet companies, it uses a variety of information to digitize behavior and apply labels, enabling marketing, credit, and risk assessment, so that data is truly used and delivers value. [detailed]

Assembly Pavilion

Beyondsoft booth
HUAWEI booth
Hunting and recruitment network booth
Ming and a little data booth
Mu class network booth
Primeton booth
Tiancheng Sheng Industry Exhibition
Star Ring Technology booth
Gold data booth
The dawn of the branch
Branch DNT booth
Mobile software booth
Cloudera booth
HGST booth
CSDN software mall booth
View booth

Graphic record

  • December 10th
  • December 11th
  • December 12th
  • 17:40[Internet big data sub-forum] The co-founder and COO of micro public tax bank gave a speech entitled "The application of government data in the field of credit investigation". The main shortcomings of the existing credit system in meeting financial needs include an incomplete credit evaluation framework, outdated evaluation technology, missing and messy data, resources that are hard to unify, evaluation results with no predictive power, and evaluation models that do not apply generally. The proposed approach uses a big data platform for technical support and big data technology to optimize performance.

  • 17:30[Big data analysis and ecosystem sub-forum] Professor Yi Hwa Huang of the Pasa data laboratory in the Department of Computer Science at Nanjing University spoke on "Octopus: an R-based cross-platform system for big data machine learning and data analysis". He said big data machine learning is a research topic at the intersection of two major fields, machine learning and big data processing.

  • 17:20[Big data security sub-forum] Liang Kun, co-founder and CTO of beauty company, gave a keynote on the Sentry financial real-time risk control system. He said real-time risk control is becoming more and more important if the banking industry is to keep developing at high speed. Sentry is a real-time transaction risk assessment system based on big data technology. For every transaction, in real time: (1) the business system sends the transaction information to the risk control system; (2) the system detects abnormal behavior and suspicious scenarios in the transaction; (3) it computes a risk coefficient for the transaction from the "evidence" found; (4) it feeds the risk coefficient back to the business system. The open source components Sentry uses include the distributed storage system Cassandra, the real-time computing system Storm, and the distributed consensus protocol implementation ZooKeeper.

  • 17:00[Internet big data sub-forum] Rui Bao Liu, vice president of Beijing TRS Information Technology Co., Ltd., gave a speech on data as the engine of government development in the Internet era. Government-led smart city construction is facing a bottleneck, and smart cities are becoming market-driven. The first "wisdom" is to make full use of Internet channels: a smart city is a systemic whole that cannot be separated from the support of the entire ecological chain and ecosystem. The second is data support for government regulation: placing data in space and time, comparing in real time to find anomalies, mining relationships in the data, and building index models. Big data makes each individual real. For example, machine learning is used to build a risk assessment model for online lending platforms, and the quantitative results feed into management and supervision to protect rights and interests, with real-time risk analysis of key monitored targets. The third is that as more and more of people's behavior is digitized, decisions need data support.

  • 16:50[Big data security sub-forum] The assistant general manager of Tencent's secure cloud department gave a speech on using big data against social fraud. He first explained that security covers three areas: national security, concerning public opinion, violent terrorism, and espionage; enterprise security, concerning information security and business security; and public safety, concerning information leaks and social fraud. Many solutions exist, including multi-dimensional data collection, effective dimension selection, and machine learning, but they have not worked very well against social fraud. Tencent therefore applied methods it had previously used in its security business to social fraud, with good results. They call the approach "dyeing", which includes following the trail (Shuntengmogua) to find the fraudsters' den. In the confrontation stage, attention must also be paid to confidentiality, mutual interference, and other methodological points. Finally, he introduced the big data evaluation system in detail.

  • 16:40[Social governance big data sub-forum] Beihang University professor Wu Junjie gave an in-depth talk on big data and social computing, centered on research progress in social computing and public opinion management. Big data on the pulse of the city, such as traffic, environment, energy consumption, medical, and emergency data, together with the coordinated evolution of online society and physical society, brings new opportunities for social perception, and big data thinking has taken deep root. From the perspective of social systems, the main modes of innovation and communication in the big data era include cross-network information propagation, emergent group behavior, human-machine-object coordination and symbiotic intelligence, and multi-center governance. Social computing still faces challenges, however, including big data modeling, short text analysis, big data computing, and interdisciplinary work.

  • 16:30[Big data analysis and ecosystem sub-forum] UnionPay Hui Chi co-founder and CTO Chiron presented UnionPay Hui Chi's solutions for consumption data. Relying on comprehensive big data resources and its own technical means, the company goes deep into individual industries. Data security, data privacy, and data ownership are the three foundations of the big data industry chain. Business intelligence collects, manages, analyzes, and transforms data so that the necessary information can be obtained from it. By analyzing target customer characteristics with big data, the company delivers an integrated solution covering the financial credit cycle. Wisdom quanyun collects all kinds of internal and external data and uses Hadoop infrastructure for query, statistics, and analysis.

  • 16:20[Big data security sub-forum] In his keynote "Data security under big data", Alibaba Group data security director Zheng Bin said that the IT era, centered on controlling data flows, is giving way to the DT era, which is based on data sharing and aims to activate productivity, and that big data is a new factor of production. The new infrastructure of "Internet plus", namely cloud, network, and terminal (cloud: cloud computing and big data; network: the Internet; terminal: devices and apps), is what activates big data.

  • 16:10[Internet big data sub-forum] Huang Yongjian, vice president of the country's dual technology, gave a speech entitled "Mining the gold mine of user behavior data". He said user behavior data is large in volume, collectable, and accurate, and comes in both structured and unstructured forms. Marketing decisions are moving earlier: 60% of buyers have completed their decision-making before they contact sales staff. The questions are how to optimize and improve the user experience, and how to design products that better match users' interests. The solution he presented runs through collection, analysis, reporting, decision making, forecasting, and action. He illustrated multi-dimensional data analysis techniques with an example. With data fully linked under big data, enterprises' difficult problems can be solved through data analysis.

  • 16:05[Big data analysis and ecosystem sub-forum] eBay software engineer and Apache Kylin PMC member Zhong Jian gave a speech entitled "Apache Kylin data visualization practice". The key to Kylin is precomputation: dimensions such as color and size are extracted and processed on Hadoop in advance. Using two cases from eBay and Jingdong, he showed Kylin's rich visual interfaces and powerful data processing capability in detail. Kylin has been integrated with Apache Zeppelin through secondary development. Developers can plug their own back end into the Zeppelin architecture simply by writing an interpreter, and the corresponding Zeppelin statements run the same way on other platforms.

  • 16:00[Social governance big data sub-forum] Professor Mei Jianming of the Chinese People's Public Security University spoke on "Big data and the prevention of terrorism". Terrorist organizations use the Internet for financing and money laundering, activities that are hidden, real-time, and easy to shield from regulation. In this era, prevention is the key to counter-terrorism. Big data offers an opportunity to guard against terrorism, but the challenges are also obvious, and they come from technology, law, culture, institutions, and other areas. On the technical side, the data comes from many sources, takes many forms, is highly redundant, changes quickly, and is only weakly linked. There are also many difficulties on the legal, cultural, and institutional sides.

  • 15:40[Internet big data sub-forum] Renmin University of China associate professor Dou Zhicheng gave a speech entitled "Internet analysis engines in the big data era". Manually edited directories have several problems: directory navigation only suits a small number of sites; manual editing is costly and users struggle to find sites; and directories lead users to portal sites rather than to information, which does not match user needs. Search engines have not changed their basic model in 20 years: the web page is the basic unit; the result is a simple list of "ten blue links"; and users obtain information by reading through the results. This model cannot meet users' needs on large-scale Internet data, and some higher-order information access needs are not well served.

  • 15:30[Social governance big data sub-forum] Gu Jiafeng, director of the research and development department at the Chinese Social Science Research Center of Peking University, gave a speech on new concepts for scientific social surveys in China in the big data era. Data bears on human social behavior and the environment. Since its founding in 2006, the center has carried out the Chinese families tracking survey (CFPS) and a number of other initiatives, including a precise survey system based on big data technology, a survey metadata management system, and, in cooperation with the Peking University Library, the integration of data resources to build an integrated service platform for data sharing and use.

  • 15:25[Big data analysis and ecosystem sub-forum] Baidu big data research and development engineer clothing country base spoke on "Big data analysis with Elasticsearch". Elasticsearch is a distributed architecture originally built on the Lucene search engine. In recent years it has transformed from a full-text retrieval system into a data analysis platform. He believes the rise of Elasticsearch in recent years is inseparable from its ecosystem. Its advantages include multi-dimensional analysis, real-time operation, and ease of use.

  • 15:20[Social governance big data sub-forum] The forum was chaired by Zheng Xiaolong of the State Key Laboratory of Management and Control for Complex Systems at the Institute of Automation, Chinese Academy of Sciences.

  • 15:20[Big data security sub-forum] minglue technology partner Yang Wei spoke on building a secure enterprise big data platform. He said enterprise security risks today come not only from outside the enterprise but also from inside, along with service risks. He described two aspects of a "safe" Hadoop platform, security and reliability, and focused on the hidden dangers of a Hadoop platform that lacks security configuration, including: with the SIMPLE authentication mechanism, anyone can impersonate the superuser; file access control based on local Linux user and group information can be exploited by malicious users; without authorization and with only coarse-grained data access control, key data can be obtained without permission; and the underlying file storage is open, so stolen files expose their data content.

  • 15:10[Social governance big data sub-forum] South China Normal University computer school professor Zhao Gansen spoke on "Data fusion and security in big data governance". He shared social governance cases built on collaborative data fusion, including financial credit mining, tracking of criminal accomplices, tax management, governance of major guess investment, and data security control. Taking the governance of major guess investment as an example, the first step of relational querying under collaborative fusion is to break down data islands, after which isolated tables must be linked and the data reconstructed.

  • 15:00[Internet big data sub-forum] Liu Yanwei, head of the big data research and development department at Jingdong Mall, gave a speech entitled "Implementation and application of the Jingdong real-time data platform", sharing its architecture and implementation. In the basic flow, data is written through JDBUS (the "data through train") to the real-time data bus and then fed into the real-time data platform. The data through train is a powerful data transport system. Its value lies in turning complex underlying technology into a product, so that everyone can do data mining work. The real-time data bus is a temporary data store that sits between data access and downstream consumers. Its standard message format reduces the cost of connecting heterogeneous systems; data is accessed once and consumed by multiple users, giving an asynchronous architecture; and each single entity's data is stored at topic granularity. The real-time computing platform is a stream computing platform built on Storm, a unified real-time computing cluster that maximizes use of the company's resources, including people, technology, and hardware.

  • 14:50[Big data security sub-forum] Data analysis and data mining engineer Gao Jiafeng and a colleague gave a speech on "Research and practical application of information security algorithms for telecom operators". Gao Jiafeng said that traditional methods of intercepting harassing calls, phone scams, and spam messages rely on statistics and the experience of business staff, and are costly and inefficient. To show how data can do this work more efficiently, he demonstrated China Mobile's algorithm- and model-based data testing process. The big data information security management platform comprises a platform layer, an algorithm layer, and an application layer. It provides powerful data processing that raises the overall value of information security, and it uses Spark, Hadoop, and other open source technology.

  • 14:45[Big data analysis and ecosystem sub-forum] Du Yufu, chief data architect of the Jingdong group cloud platform, spoke on "Building a big data ecosystem". Data analysis covers data acquisition, storage, modeling, analysis, and application, and he described in detail the use of Spark, GraphX, Flume, streaming, and other technologies at each stage. The point of an ecosystem is joint operation among partners to achieve a win-win outcome. The Jingdong ecological cloud is a cloud service offered to users: the upper layer provides analysis tools; the cloud layer provides cloud storage, cloud analysis, and sea of clouds; and developers get data clusters and real-time analysis. Administrators manage the data gateway through the Jingdong data cloud. He closed by stressing that any data should have its own market, otherwise it has no value.

  • 14:30[Internet big data sub-forum] He Zhongjun, Baidu chief architect and head of machine translation technology, gave a speech on Internet machine translation. Earlier statistical machine translation needs a bilingual translation model and also a language model of the target language. The phrase-based approach does not translate well: it struggles to use global information, its process is complex, and it consumes a lot of resources. The deep learning approach makes full use of global information and produces fluent target text. The model is small and the steps are simple, although the network structure is complex.

  • 14:20 [Social Governance Big Data sub-forum] Deng Hui, vice minister of intelligence products at ZTEsoft, shared "Applications of and Reflections on Big Data in Government Management". The government is a natural big data trader. In the course of its cooperation projects, ZTEsoft found that infrastructure is relatively easy to solve, whereas the technical architecture has to be worked out from scratch to address the problems of government agencies. The key components include the master data management platform, big data acquisition terminal, big data center, big data analyzer, big data server, visualization server, and big data client. For the data center, he used the subscription library model as an example to walk through the data analysis process. The cooperation process has its difficulties, such as complex data formats and low data quality. Finally, he shared pitfalls in the big data analysis process, such as feature extraction, choosing evaluation methods, and evolving the algorithm.

  • 14:12 [Big Data Analysis and Ecosystem sub-forum] Hong Jianxun, big data product director in IBM's data and analytics division, shared "Spark, Data, and Design to Meet Change". Customers have adopted Spark and begun innovating with IBM, building applications that collect and analyze massive data, for example for chronic disease prevention. Data is the basis of enterprise competition, and the research direction of data analysis has been shifting toward manufacturing and industry. IBM's simplified big data analysis framework consists of front-end data acquisition, preprocessing, data mining, and visual analysis. With IBM's SQL-on-Hadoop technology, users can access data in different ways. Finally, he said that the big data ecosystem is more than just Spark or Hadoop, that developers should reuse existing technology, and that big data discussed apart from business value is not real big data.

  • 14:00 [Big Data Security sub-forum] Zhou Tao, a senior fellow at Venustech, focused on how to avoid "big" data analysis in security analysis. He first introduced enterprise security data, including its causes and characteristics and the challenges of security big data. He believes attacks have changed, from conventional malicious code in the past to data breaches triggered by APTs today. The main countermeasure is to move away from a passive mode of incident response and to discover threats proactively from more basic data.

  • 13:50 [Big Data Analysis and Ecosystem sub-forum] Hortonworks engineer Yu Zhihong (Ted Yu) spoke on "Recent Developments in HBase". He explained the latest HBase progress in three parts: backup of bulk-loaded HFiles, end-to-end off-heap read path optimization, and HBase on Slider.

  • 13:55 [Big Data Security sub-forum] Zhang Zhuo, senior technical manager at Qihoo 360, said in his keynote that the need for faster defense and the lack of data are the two major difficulties Internet security currently faces. Trading space for time is the response strategy he proposed. He also disclosed for the first time that Qihoo 360 has introduced deep learning into the security field, where it is used for network asset identification, asset classification, and alarm log data analysis.

  • 13:40 [Internet Big Data sub-forum] Machine learning research institute director Liu Wei described how machine learning algorithms are integrated into the major product lines of Didi Chuxing. These include real-time traffic detection; pickup time estimation (ETA), which predicts across space, time, and traffic conditions, works at second-level granularity, and runs on an engine that responds to hundreds of thousands of requests; order matching, which computes 500 million solutions per second for an NP-hard problem and doubled the carpooling success rate of the express service within one month; a global intelligent dispatch system, which can detect supply and demand tension early, determine preferences at different times, dispatch orders to drivers, and match on driver and vehicle profiles with a hashing algorithm; and a mobile social networking platform.

  • 13:35 [Social Governance Big Data sub-forum] Wei Hong of the Alibaba Group security department spoke on "Thoughts on Internet Plus Social Governance". The Alibaba security department was established in 2005, currently has a staff of nearly 2,000, and is committed to fighting crime in an O2O fashion (risk control online, joint enforcement offline). Cyberspace has changed physical space, and the traditional governance model has been challenged. The strengths of the Internet, such as data analysis and cloud computing power, can also be brought to social governance. Using the advantages of cyberspace to create a new governance model in turn helps solve the governance problems encountered in physical space.

  • 13:30 The BDTC 2015 Social Governance Big Data sub-forum officially began, hosted by Luo Shengmei, chief architect of cloud computing and big data at ZTE.

  • 13:20 The Big Data Analysis and Ecosystem sub-forum began, hosted by Zhang Guangyan of the Department of Computer Science, Tsinghua University.

  • 13:20 [Internet Big Data sub-forum] On the afternoon of the third day of BDTC, the Internet Big Data sub-forum officially began, hosted by Wen Jirong, professor and deputy dean of the School of Information, Renmin University of China. The forum covers mobile travel, machine translation, data platforms, Internet analytics, Internet finance, and other aspects of the Internet.

  • 13:10 The Big Data Security sub-forum officially began, hosted by Pan Zhuting, chief strategy officer of Venustech.

  • 12:20 [Recommendation Systems sub-forum] The chief data officer of Liepin gave a talk titled "Using Reinforcement Learning Algorithms to Improve Recommendation Effectiveness". Reinforcement learning provides a theoretical framework for adaptive intelligent systems. He described the multi-armed bandit ("slot machine") algorithm, which sets aside a small percentage of traffic for testing and observing returns while otherwise choosing the current best strategy. Thompson sampling uses Bayesian theory: it samples from the current posterior and selects the option with the largest sampled return. The MAB model can help with UI optimization, recommendation strategy testing, user interest detection, and content testing. The contextual MAB model can go deeper and do better.

  • 12:10 [Transportation and Tourism Big Data sub-forum] The BI director of Tujia shared "Data Analysis in Business Practice at Tujia". He said the value of data lies in combining technology with business. Through several cases drawn from Tujia's business, Qin Yong explained Tujia's design methods and result outputs and how analysis is put into practice to drive the business. Finally, he said: "Data analysis must target a specific point and a specific business; there is no fixed method or mathematical model for data analysis; rotating data analysts through business units increases the value of analysis results and strengthens the interaction between data and business."

  • 12:00 [Education Big Data sub-forum] In photo-based question search, recognizing the text in the image is both the key and the difficulty. The overall recognition problem can be decomposed into key sub-problems such as word block extraction, word block recognition, line structure analysis, and formula structure analysis. On stage, extension information architect Xu Chongbo analyzed in detail how these problems relate to one another and the core solutions. Taking word block extraction and recognition as an example, the former proceeds from background gray-level balancing and local values through component extraction to merging components into word blocks; the latter generates training samples by rendering printed text to images and applying random background colors, noise, Gaussian blur, and distortion transforms.

  • 11:45 [Network and Communications Big Data sub-forum] The last speaker of the forum was Zeng Jia, senior researcher at Huawei Noah's Ark Lab, whose talk focused on the key technical challenges of telecom big data. He first introduced the unique advantages of telecom spatio-temporal data. He said that data in seven dimensions forms the foundation of telecom spatio-temporal data, that the spatio-temporal data analysis platform needs to be connected to production systems, and that many business trials have been carried out: spatio-temporal labels, credit labels, and real-time analysis applied to real-time advertising, precision marketing, financial credit, mobile digital trajectories, and other areas.

  • 11:25 [Recommendation Systems sub-forum] Wan Hao, senior director of digital marketing at Jingdong, gave a talk titled "Applications of Big Data Technology in Jingdong Advertising". He believes that big data is not just about large volumes of data; what matters more is insight into the data. Data insight refers to patterns mined from the data that can be applied to products and improve results. Advertising is the most important big data application.

  • 11:15 [Transportation and Tourism Big Data sub-forum] Tuniu big data director Meng Jingci gave a talk titled "Tuniu Travel Big Data Applications". At present, Tuniu's business falls into three areas: travel, financial technology, and film and television media. The tourism industry is characterized by non-standardized products, high/low frequency, strong time sensitivity, and high average customer spend, which gives rise to needs for resource bundling, risk reduction and risk management, and price management. Tuniu uses solutions such as optimal inventory and pricing models, revenue management, and price forecasting based on financial derivatives. He concluded that travel products and finance need to be closely combined to achieve the best revenue management.

  • 11:10 [Network and Communications Big Data sub-forum] Huang Xinping, R&D director at Parallel Technology Co., Ltd., focused on applying big data technology to a 7 x 24 online data center operations system and to high-performance computing and analysis. He first explained that Parallel Technology mainly provides large-scale real-time monitoring and management of data center clusters, application performance management, and analysis and mining of operations data. He stressed that traditional operations and maintenance services lack detailed analysis, professional operations teams, and professional data center management software, which makes it hard to solve increasingly complex management problems in increasingly complex systems. Parallel Technology's innovation is 7 x 24 online operations and maintenance combined with on-site services for Internet data centers, which reduces the pressure to purchase operations software and hire professional administrators and automates data acquisition and analysis. He also discussed data acquisition with the Paramon software. Finally, he introduced application optimization cases.

  • 11:05 [Transportation and Tourism Big Data sub-forum] Lei, senior big data analysis manager at Ctrip, shared "Ctrip's Basic Data Architecture in Practice". Data at an OTA company differs greatly across business lines and is highly complex; typical scenarios where its big data is put into practice include analysis reports and the user center. In Ctrip's data architecture, the data source layer handles event-tracking specifications and standards checks; the data access layer ingests logs both offline and in real time; the engine and algorithm layer provides engines, algorithms, and audited user profiles; the general configuration layer exports data to a general precision marketing platform that is under development; and the precision application layer serves recommendation columns and personalized advertising.

  • 11:00 [Education Big Data sub-forum] Ding Wenpeng, CTO of quantum science and technology education, shared "SimHash-Based Deduplication and Search". Several general-purpose similarity measures exist, such as Hamming distance, the more mathematically formulated Jaccard index, edit distance, and vector space cosine similarity. Comparing the similarity of two questions pairwise with these general measures is relatively inefficient, however, so locality-sensitive hashing such as MinHash and SimHash is needed. The speaker showed with examples how to use SimHash to deduplicate questions. In practice, they cluster directly on near-duplicate features: a new question is located to its cluster through an index and is added only if it is similar to the questions already in that cluster, which prevents chaining. At search time, k=3 is used for a further round of deduplication to keep the search results at their best.

  • 10:50 [Recommendation Systems sub-forum] Sina Weibo algorithm technical director Jiang Guibin gave a talk titled "Big-Data-Driven Social Recommendation at Weibo". He covered the role and positioning of recommendation, the relationship between big data and recommendation, data-driven Weibo recommendation, and business recommendation. He sees recommendation as playing the roles of accelerator and regulator. As an accelerator, it speeds the spread of quality information, the building of high-value relationships, and user growth. As a regulator, it optimizes the structure of the user relationship network and controls the initiation and spread of information.

  • 10:40 [Network and Communications Big Data sub-forum] Shen Wenming, deputy chief engineer of the No. 7 Research Institute of China Electronics Technology Group Corporation and chief engineer of its Communications Design Institute, said in his keynote that, facing the challenges brought by the development of mobile networks, operators need more scientific network planning and more intelligent network optimization. On how to base network planning and optimization on big data, he described a "service + software" model that integrates service capabilities, service content, and the service team on a big data platform. To achieve this in planning and design, the company is working to break through in two key capabilities: multi-vendor, multi-interface data analysis and high-precision positioning on the network side.

  • 10:30 [Recommendation Systems sub-forum] The next part was hosted by Lu Yilei, vice president of technology and chief architect at AdMaster. He briefly introduced the topics of the next three speakers, covering social networking, e-commerce, and recruitment, and invited the next speaker to the stage.

  • 10:20 [Transportation and Tourism Big Data sub-forum] Fang Xing, senior traffic big data expert at AutoNavi (Gaode), spoke on "Big Data to Guide Travel". He first presented a data analysis of traffic conditions in China. The AutoNavi map processes data in real time to publish live traffic conditions and events and to support real-time navigation. Its technical architecture includes online services for front-end applications, online data storage such as HBase, unified data storage on Alibaba ODPS, and real-time log collection from production servers. By combining the existing road network with historical speed data, it performs road network data mining, traffic prediction, rendering, and release.

  • 10:00 [Recommendation Systems sub-forum] FreeWheel technology vice president Li Yang gave a talk titled "FreeWheel's Big-Data-Based Measurement Practice for Emerging Video Advertising". He covered three aspects: measurement methods for emerging video advertising, prediction of user gender and age, and ad completion rate. He noted that the challenge in ad measurement remains that the actual effect of advertising is hard to measure and that video advertising lacks effective measurement; at present, most measurement is based on user feedback. The new video ad measurement methods are of three kinds: digital ratings, ad completion, and viewable impressions. For video and web page feature extraction, two features, Genre and Topic, are extracted from the name and a brief description and then smoothed using similar users. The features for predicting ad completion rate span three dimensions: Ad, User, and Context.

  • 09:50 [Education Big Data sub-forum] Lin Hui, co-founder and chief scientist of Liulishuo, explained speech big data and its application in language learning. Big data has four aspects: volume, variety, velocity, and veracity. To support adaptive learning, Liulishuo has accumulated 2.5 million hours of speech data, which means that speech recognition on a single machine would take about 60 years (assuming a real-time factor of 0.2); a 2 million+ CPU cluster is needed to finish the processing in a short period of time. Liulishuo's speech data processing architecture includes real-time algorithm services, voice analysis services, deep models, Redis, Kafka, and more.

  • 09:40 [Network and Communications Big Data sub-forum] Zhang Zhuyu, chief data analyst in the data division of China Telecom's cloud computing branch, focused on China Telecom's data applications. Starting from the capabilities of telecom data, he said that China Telecom processes network and communications data on thousands of nodes and gathers 500 million records daily. He then introduced the production and service process of the China Telecom data platform, which has a data access capacity of 50 TB, a transmission capacity of 100 TB, and a computing capacity of 200 TB. Next he gave a detailed introduction to China Telecom's data applications, including Chart for risk prevention and control and precision marketing; Kunpeng for crowd profiling, business site selection, and traffic heat maps; and the Dragon big data PaaS platform, which opens up data with telecom-grade security. Finally, he stressed that China Telecom has always upheld openness and integration in building a big data application ecosystem.

  • 09:30 [Transportation and Tourism Big Data sub-forum] Shenzhou Zhuanche chief architect Li Laisi shared "Large-Scale Spatio-Temporal Data Processing in Practice at Shenzhou Zhuanche". At data-driven Shenzhou Zhuanche, safety is the foundation, pursued through both technology and management; efficiency is the key, driven by technology; and growth is the goal. Vehicle telematics and OBD data are processed in real time, while offline analysis and machine learning produce prediction models that serve front-end scheduling and pricing. Space is divided into grid cells, and supply and demand are forecast for each cell over time to improve scheduling efficiency. In closing, he said that companies can deploy non-core components, large-scale marketing campaigns, third-party integrations, and short-term compute-intensive tasks in the cloud (IaaS).

  • 09:20 [Recommendation Systems sub-forum] Shen Guolong, senior architect in Baidu's infrastructure department, gave a talk titled "BML: Baidu's Large-Scale Machine Learning Cloud Platform in Practice". He described big data processing as comprising six modules: data, collection, storage, transformation, analysis, and business scenarios. He also shared Baidu's big data processing infrastructure, mainly explaining the large-scale machine learning algorithm framework ELF (Essential Learning Framework), whose characteristics he summarized as easy to use and efficient. He summed up the elements of successful machine learning as follows. First, data: data collection, linking multiple data sets, clear and clean data sources, and a combination of online and offline data. Second, system: fast, low-cost implementation, an efficient algorithm library that supports rapid scaling, and A/B testing and model iteration mechanisms. Third, evaluation criteria: coverage, confidence, diversity, adoption rate, novelty, privacy, predictive AUC, NDCG, revenue fluctuation, human experience, and other indicators, along with the impact on the overall system.

  • 09:15 [Network and Communications Big Data sub-forum] Wei Min Yang, deputy general manager of the network management center at China Mobile's Fujian company, said in his keynote that in the mobile Internet era, where "experience is king", the traditional network operation and maintenance system has found it hard to keep up. The company has therefore set out to build a proactive operation and maintenance system oriented to customer perception and has put forward the idea of a "five-element, five-order sample space replacement method". It is also working to build a centrally managed performance ecosystem and to keep developing applications so that "a hundred flowers bloom".

  • 09:10 [Recommendation Systems sub-forum] Feng Shicong, co-founder and CTO of Beijing Ming Software System Co., Ltd., announced the start of the Recommendation Systems sub-forum.

  • 09:08 [Transportation and Tourism Big Data sub-forum] Deputy Director Zhang of the Beijing traffic operation monitoring and dispatching center spoke on "Construction and Application of Beijing's Comprehensive Traffic Data System". He introduced the basic situation of the Beijing traffic operation monitoring and dispatching center (TOCC) and then discussed the monitoring data system used for comprehensive traffic operation analysis in Beijing. Drawing on analyses of urban road network traffic, rail transit, and taxi operations, the system automatically generates integrated traffic analysis reports at multiple levels of granularity. At the end of his speech, he said that openly sharing data resources, tools, and environments provides integrated support for professional institutions and personnel.

  • 09:05 The Transportation and Tourism Big Data sub-forum of the 2015 China big data technology conference officially began, chaired by CSDN deputy editor in chief Dong Shixiao.

  • 09:05 The Network and Communications Big Data sub-forum began, hosted by Zhang Baofeng, assistant director of the HUAWEI Noah's Ark laboratory.

  • 09:03 The Recommendation Systems sub-forum began, hosted by Feng Shicong, co-founder and CTO of Beijing Ming Software System Co., Ltd.

  • 09:00 [Education Big Data sub-forum] Luo Bin, vice president of big data at Genshuixue, shared "Decoding Education O2O with Big Data". The Genshuixue team has been in business for a year and a half. He shared the value of big data at the marketing level: how to find and attract potential users accurately through sound data analysis. Luo Bin's team analyzed user conversion rates across multiple channels. They found that the channels through which users get information are fragmented and that, when choosing courses and teachers, many users place more trust in recommendations from people they know. Teachers on Internet platforms tend to be highly educated and young; among the active teachers on the platform, 42% are individual teachers rather than teachers from institutions. On the platform, 26.2% of teachers buy other teachers' courses, and the roles of teachers and students will change considerably in the Internet era. Finally, he shared the search model that maximizes teachers' GMV (transaction volume) on the platform, as well as the value of disclosing platform mechanisms and data to teachers.

  • 09:00 The third day of the BDTC 2015 China big data technology conference begins.
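
The Thompson sampling idea mentioned in the 12:20 recommendation system entry above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: it assumes binary rewards (clicked or not), a uniform Beta(1, 1) prior per arm, and hypothetical names (`ThompsonSampler`, `simulate`, the click rates in the usage note) that do not come from the talk.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms."""

    def __init__(self, n_arms):
        # Assumption: Beta(1, 1) prior per arm, i.e. uniform over the click rate.
        self.alpha = [1] * n_arms  # 1 + observed successes
        self.beta = [1] * n_arms   # 1 + observed failures

    def select(self, rng):
        # Draw one plausible click rate per arm from its posterior and
        # play the arm whose draw is largest.
        draws = [rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, clicked):
        # Bayesian update of the Beta posterior for the arm that was played.
        if clicked:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1


def simulate(true_rates, rounds, seed=0):
    """Run the sampler against hypothetical click rates; return pulls per arm."""
    rng = random.Random(seed)
    sampler = ThompsonSampler(len(true_rates))
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = sampler.select(rng)
        clicked = rng.random() < true_rates[arm]
        sampler.update(arm, clicked)
        pulls[arm] += 1
    return pulls
```

Calling `simulate([0.05, 0.5], 2000)` with these made-up rates should send most pulls to the second arm, which is the behavior that makes the method useful for testing recommendation strategies without a fixed exploration budget.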
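
The SimHash deduplication described in the 11:00 education entry above can be sketched roughly as below. The details are assumptions made for illustration: tokens are hashed with MD5 truncated to 64 bits, every token has weight 1, and k=3 is read as a Hamming distance threshold on the fingerprints. The function names are hypothetical and the talk's clustering index is not reproduced.

```python
import hashlib


def hash64(token):
    """64-bit hash of a token (assumption: first 8 bytes of its MD5 digest)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(tokens, bits=64):
    """SimHash fingerprint: a per-bit majority vote over the token hashes."""
    counts = [0] * bits
    for token in tokens:
        h = hash64(token)
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(tokens_a, tokens_b, k=3):
    """Treat two token lists as near-duplicates if fingerprints differ in <= k bits."""
    return hamming(simhash(tokens_a), simhash(tokens_b)) <= k
```

Because similar token sets yield fingerprints that differ in only a few bits, a new question can be compared against stored fingerprints with cheap bit operations instead of a full edit-distance or cosine computation for every pair.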
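
The "about 60 years" figure in the 09:50 education entry above can be checked with a short calculation. The sketch assumes that a real-time factor of 0.2 means each hour of audio takes 0.2 hours of compute, and that work divides evenly across workers; the function name and the `workers` parameter are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365


def processing_years(audio_hours, real_time_factor, workers=1):
    """Wall-clock years to process the audio, assuming perfectly parallel workers."""
    compute_hours = audio_hours * real_time_factor
    return compute_hours / workers / HOURS_PER_YEAR


# 2.5 million hours at a real-time factor of 0.2 on a single worker
years = processing_years(2_500_000, 0.2)
print(round(years, 1))  # 57.1
```

Under these assumptions the total is 500,000 compute hours, or roughly 57 years on one machine, which is consistent with the rounded figure quoted in the talk.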

Sina Weibo: #BDTC 2015# @CSDN cloud computing

Conference highlights

Crowds at the sign-in desk
Participants exchanging business cards
CCTV reporter conducting an on-site interview
BTV reporter conducting an on-site interview
Conference partner wall
Participants standing in front of the agenda board

Conference schedule

General assembly, December 10th

Time | Topic | Speaker
09:00-09:05 | Opening of the general assembly and introduction of guests | Sun Shaoling, assistant to the general manager and CTO, China Mobile Suzhou R&D center
09:05-09:10 | Speech by the secretary general of the CCF big data expert committee | Cheng Xueqi, researcher, Institute of Computing, Chinese Academy of Sciences; secretary general, CCF big data expert committee
09:10-09:15 | Speech by the chairman of the conference | Zhang Xiaodong, chair of the Department of Computer Science and Engineering and Robert M. Critchfield Professor, Ohio State University
09:15-09:35 | Interpretation of the development trend of big data technology in 2016 | Pan Zhuting, vice president, Venustech; deputy secretary general, CCF big data expert committee
09:35-10:15 | Internet, Data, and Computing | Wang Jian, CTO, Alibaba group
10:15-10:50 | Discussion on the application of network and communications big data in tourism and credit investigation | Fan Jian, deputy general manager and chief architect, China Unicom Group Co., Ltd.
10:50-11:25 | The future of cognitive workloads requires a new IT infrastructure | Guo Rensheng, vice president, IBM; general manager, Greater China hardware systems
11:25-12:00 | Big data platforms and business practice that continuously support business innovation | Xu Xinghai, big data solutions planning director, HUAWEI IT product line; Qu Bo, chief big data technology planner, HUAWEI telecom software
13:00-13:40 | Analysis and search of multimedia data | Gao Wen, academician of the Chinese Academy of Engineering
13:40-14:15 | Spark development: review of 2015, outlook for 2016 | Xin Shi (Reynold Xin), Databricks co-founder and Spark chief architect
14:15-14:50 | Cold thinking amid the big data heat | Sun Shaoling, assistant to the general manager and CTO, China Mobile Suzhou R&D center
14:50-15:25 | Enterprise business systems in the big data era: transformation to 3 | Jiao Lieyan, CTO, Primeton Information Technology Co., Ltd.
15:25-16:00 | The evolution of large-scale in-memory databases, 2014 to 2016 | Liu Haifeng, chief architect, system technology department, Jingdong cloud platform
16:00-16:35 | The technological evolution and key characteristics of the modern data warehouse | Sun Yuanhao, founder and CTO, Transwarp Technology
16:35-17:10 | Randomized Algorithms for Big Data: Making the Impossible Possible | Jin Rong, tenured professor, Michigan State University
17:10-17:45 | Kudu: Storage for Fast Analytics on Fast Data | Todd Lipcon, R&D engineer, Cloudera; Kudu inventor
Special forums, December 11th-12th

December 11th (Friday)

Time | Forum | Forum chair(s)
Morning | Policies, regulations, and standardization | Sun Shaoling
Morning | Database | Zhou Aoying, Qian Ling
Morning | Financial big data | Chen Jidong, Wang Jianzong
Morning | Data market and transactions | He Hongling, Qi Hongwei
Afternoon | Big data infrastructure | Zhang Wensong, Zhou Haojie
Afternoon | Deep learning | Shan Shiguang, Yu Kai
Afternoon | Industry and manufacturing big data | Wang Hongzhi, Jin Xiaolong
Afternoon | Medical health and biological data | Hu Bin

HUAWEI big data technology (separate application required)

December 12th (Saturday)

Time | Forum | Forum chair(s)
Morning | Network and communications big data | Zhang Baofeng, Zhang Yunyong
Morning | Recommendation systems | Feng Shicong, Lu Yilei
Morning | Transportation and tourism big data | Dong Shixiao
Morning | Education big data | Lin Shiding, Yang Dong
Afternoon | Big data analysis and ecosystem | Zhang Guangyan
Afternoon | Big data security | Pan Zhuting, Tan Xiaosheng
Afternoon | Internet big data | Wen Jirong, Liu Jiang
Afternoon | Social governance big data | Luo Shengmei, Zeng Dajun
All day | IBM Linux on Power algorithm marathon challenge (separate application required) |