SecTor 2013: Consultant scorns Apache Hadoop’s lack of security

The open source Apache Hadoop platform for big data management, built around an advanced distributed file system, has had a meteoric rise since its release in 2006, but an industry consultant warns its security protection is equivalent to that of software released in 1993.

Kevvie Fowler, a risk consulting partner at KPMG Canada, compared Hadoop’s security to that of Windows for Workgroups 3.11 in a talk at the SecTor security conference in Toronto on Wednesday.

“There’s not a lot of security in the operating system for Windows for Workgroups … and that’s similar to Apache Hadoop.”

And given that Hadoop clusters can hold huge amounts of data, he said the risks are significant.

In fact, Fowler was baffled that organizations put up with software that’s so unprotected. But he suggested it follows a pattern.

“Business, to try to improve itself, in a lot of cases trumps security. It’s not the correct approach but if you look at it what business did is took a technology that had no business being in an enterprise and said, ‘You know what? I’m going to become smarter and more agile and make better decisions about my business. I’m going to take this technology, this nuclear waste, and stick it in my organization because it’s going to help me in the immediate future.’  Not looking at the security ramifications.”

Often companies initiate a small big data project, and when it demonstrates business value it is expanded, Fowler said – and by that time it’s too late if there are security holes.

Security professionals need to alert management when projects are at an early stage, he said.

Apache Hadoop isn’t the only version of the platform. A number of software companies have taken it and added capabilities — Intel Hadoop, for example, comes with encryption built-in.

Meanwhile, Fowler offered eight steps to better secure Apache Hadoop clusters:

–If you don’t need sensitive financial or personal information, don’t put it in Apache Hadoop. Once in, it’s hard to erase data in the clusters. Obfuscate any sensitive data that has to go in, and do so before it goes in;

–Use a configuration management tool to deploy and manage nodes and clusters in a consistent way. If necessary, there are free tools such as Puppet;

–Lock the front door. “It’s almost comical” that Hadoop doesn’t have default user authorization. Set that up before allowing any users to access data. He advises using Kerberos – it’s not easy but it offers secure authentication;

–Secure the underlying operating system by hardening servers and encrypting data at rest. Otherwise, to anyone who logs into the system, Apache Hadoop looks like a group of files;

–Use transmission-level security, otherwise data from Hadoop goes through your infrastructure in plain text;

–Have a choke point to stop intruders, such as a VPN that logs and controls user access before anyone reaches the cluster;

–Secure Hadoop-related applications, such as Apache Hive for creating data warehouses and Apache HBase, a NoSQL database. Many of the SQL injection vulnerabilities found in SQL databases are also present in HiveQL, he said. And a number of other databases connect directly to Hadoop, he added, so attacks can be layered.

“You can spend all the time in the world securing your Hadoop without securing your applications and you’re going to have a huge disaster on your hands.”

Fowler also noted that the latest versions of Hive (Server 2) have the ability to revoke access to the warehouse, but any version of Hive Server 1.x secures only the metadata and not the underlying data.

–Ensure your incident response and forensics program incorporates big data technology.
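
Two of the steps above lend themselves to a short illustration: obfuscating sensitive fields before they are loaded, and rejecting untrusted input before it is placed into a HiveQL string. The Python sketch below is not from Fowler's talk. The field names, the secret key and the helper functions are all hypothetical, and it uses only the standard library rather than any Hadoop or Hive API. It replaces sensitive values with a keyed hash (HMAC-SHA256), which is one possible form of obfuscation, and it accepts a table name only if it matches a strict allow-list pattern.

```python
import hashlib
import hmac
import re

# Hypothetical secret; a real deployment would fetch this from a key store.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical set of fields treated as sensitive in this example.
SENSITIVE_FIELDS = {"name", "card_number"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return the HMAC-SHA256 of value as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def obfuscate_record(record: dict) -> dict:
    """Copy a record, replacing sensitive fields with keyed hashes."""
    return {
        field: pseudonymize(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Letters, digits and underscores only, and no leading digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str) -> str:
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


if __name__ == "__main__":
    # The sensitive field is hashed before the record would be loaded.
    print(obfuscate_record({"name": "Alice", "amount": 10}))
    # A user-supplied table name is checked before it is used in a query string.
    print("SELECT * FROM " + safe_identifier("sales_2013"))
```

The same input always produces the same hash, so obfuscated records can still be joined or counted, while recovering the original value requires the key. The identifier check is deliberately strict. It covers only table and column names, and values supplied by users would still need to go through whatever parameter binding the client library provides.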

Howard Solomon
Currently a freelance writer, I'm the former editor of ITWorldCanada.com and Computing Canada. An IT journalist since 1997, I've written for several of ITWC's sister publications including ITBusiness.ca and Computer Dealer News. Before that I was a staff reporter at the Calgary Herald and the Brampton (Ont.) Daily Times. I can be reached at hsolomon [@] soloreporter.com
