The author, Andrew C. Oliver, is a professional software consultant and the president and founder of Open Software Integrators, a big data consulting company based in Durham, North Carolina. After using Hadoop for a long time, he has found 12 things that really hurt its ease of use.
Pangu is the distributed file system of Alibaba Cloud's Apsara (Feitian) platform. During the Apsara 5K project, the cluster grew rapidly to 5,000 nodes, and that rapid growth posed no small challenge to the performance of Pangu Master. This piece shares the optimizations used to keep Pangu Master performing well.
ODPS is a distributed platform for processing massive data sets, offering rich data processing functions and a flexible programming framework. This article covers the challenges ODPS faces, its technical architecture, migration from Hadoop to ODPS, and practical considerations in applying it, giving a first look at the current state and prospects of ODPS.
Spark's advantages in in-memory computing are well known. Yet many people still believe that Hadoop remains the stronger choice for computing on disk-resident data. To dispel this misconception, Databricks, together with AWS, recently completed the Daytona GraySort benchmark and set a new record.
In just two years, Apache Drill has gained the support and contributions of more than 40 companies, and in recent days MapR has added it to its big data platform, offering it to users as a developer preview.
Built jointly by the editors of CSDN and "Programmer" magazine together with the community, this mailing covers the big data and Hadoop fields and distills the most essential Hadoop and big data technical content. It is sent every Thursday.