Intel, EMC release Hadoop distributions

Supporters of using Apache Hadoop for processing big data have two big boosters on their side: Intel Corp. and EMC Greenplum, each of which this week released its own distribution of the Hadoop software for storing and processing large amounts of data.
Chipmaker Intel said Tuesday that its version of the open source software, which includes a manager for deployment, is optimized for its Xeon processors and includes encryption that supports Intel’s AES New Instructions for security on its CPUs.
Meanwhile on Monday the Greenplum division of EMC announced Pivotal HD, which integrates the Greenplum database with Apache Hadoop.
 
They join three other commercial Hadoop distributions (Cloudera, Hortonworks and MapR) as options likely to appeal to organizations.
 
Forrester Research analyst Mike Gualtieri found the announcements exciting. It makes sense for Intel to get into the fray because Hadoop is a storage and data processing platform, he said. As a chip maker it can help with getting data more efficiently into Hadoop, he added. He also noted that Intel says it’s not trying to compete with enterprise software companies, which try to lock customers into their technologies.
 
Gualtieri was less impressed with Greenplum’s announcement, even though its Pivotal HD software includes a SQL database. One of Hadoop’s shortcomings is accessing data through SQL. However, he noted that there are other solutions. Cloudera is working on a project called Impala to put a fast SQL layer on top of Hadoop, he pointed out.

The ability to process big data – broadly defined as data bigger than most analytics software can handle – could bring big benefits to business, argues Intel. But “only a small fraction of the world is able to extract meaning from all of this information because the technologies, techniques and skills available today are either too rigid for the data types or too expensive to deploy.”

The optimizations made for the networking and IO technologies in the Intel Xeon processor platform also enable new levels of analytic performance, Intel said in a news release.
Analyzing one terabyte of data, which would previously take more than four hours to fully process, can now be done in seven minutes, it claims, thanks to a combination of Intel hardware and the company’s Hadoop distribution.
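To put Intel's claim in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it treats "more than four hours" as exactly four hours, assumes a decimal terabyte (1,000,000 MB), and uses illustrative function names that do not come from Intel.

```python
# Rough check of the claimed improvement. The figures come from the article;
# the rounding of "more than four hours" to exactly 4 hours is an assumption.
TERABYTE_MB = 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB


def speedup(before_minutes: float, after_minutes: float) -> float:
    """How many times faster the new run is than the old one."""
    return before_minutes / after_minutes


def throughput_mb_per_s(data_mb: float, minutes: float) -> float:
    """Average processing rate in megabytes per second."""
    return data_mb / (minutes * 60)


before_minutes = 4 * 60  # "more than four hours", taken as a lower bound
after_minutes = 7

print(f"Speedup: at least {speedup(before_minutes, after_minutes):.1f}x")
print(f"Before: about {throughput_mb_per_s(TERABYTE_MB, before_minutes):.0f} MB/s")
print(f"After: about {throughput_mb_per_s(TERABYTE_MB, after_minutes):.0f} MB/s")
```

Under those assumptions the claim works out to a speedup of at least roughly 34 times, with the average processing rate rising from about 69 MB/s to about 2,381 MB/s across the cluster.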

The proprietary management software is aimed at simplifying the deployment, configuration and monitoring of the Hadoop processing cluster. Optimal performance can be had through an automatic tuner, Intel says.

Intel [Nasdaq: INTC] said it is also contributing enhancements to the open source code covering the YARN distributed processing framework, the Hadoop Distributed File System and the Hive SQL query functions.

Intel Distribution for Apache Hadoop will be sold with technical support by solution and service providers. Support options include round-the-clock coverage seven days a week, or coverage eight hours a day, five days a week.

Partners supporting Intel Hadoop include Cisco Systems Inc., Cray, Dell, Red Hat, SAP, SAS, Teradata and a range of others.
 
Greenplum says its distribution significantly expands Hadoop by adding tools including a command centre for monitoring the file system, virtualization extensions and Isilon support; installation, configuration and management tools; and support for the Spring framework.

It also includes a relational database Greenplum calls HAWQ that has its own execution engine.

According to a Greenplum blog, HAWQ is “hundreds of times faster” than Hadoop’s Hive data warehouse.

Howard Solomon
Currently a freelance writer, I'm the former editor of ITWorldCanada.com and Computing Canada. An IT journalist since 1997, I've written for several of ITWC's sister publications including ITBusiness.ca and Computer Dealer News. Before that I was a staff reporter at the Calgary Herald and the Brampton (Ont.) Daily Times. I can be reached at hsolomon [@] soloreporter.com