Wednesday, July 24, 2013

OSCON Getting Hadoop, Hive and HBase up and running

The "Getting Hadoop, Hive and HBase up and running in fifteen minutes" session page on OSCON.

Mark Grover (Cloudera)
4:10pm Wednesday, 07/24/2013

Hadoop is a distributed batch processing system.
Installing and configuring Hadoop projects is hard.
Integration testing is not done across versions.

Apache Bigtop addresses this issue
- generates packages of various Hadoop ecosystem components for various distros
- provides deployment code for various projects
- convenience artifacts available, e.g. hadoop-conf-pseudo to fake multiple nodes on a local workstation
- integration testing of latest project releases

Follow the steps on the wiki:

github.com/markgrover/oscon-bigtop/blob/master/readme.md

Hive
- allows SQL-syntax queries against data in Hadoop
- does not claim to be SQL compliant, but is very close; it does not support correlated sub-queries
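
To illustrate the correlated sub-query point, here is a minimal sketch of how such a query can usually be rewritten as a join against a FROM-clause sub-query. This is not from the talk: the employees table and its rows are made up, and SQLite (via Python's sqlite3 module) stands in for Hive only so the example runs locally. The assumption is that the rewritten form is the kind of query Hive accepts.

```python
import sqlite3

# SQLite is a stand-in for Hive so the example runs locally.
# The table name, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ann", "eng", 120), ("bob", "eng", 100), ("cat", "ops", 90), ("dan", "ops", 95)],
)

# Correlated sub-query: the inner query refers to the outer row (e.dept).
correlated = """
SELECT e.name FROM employees e
WHERE e.salary = (SELECT MAX(e2.salary) FROM employees e2 WHERE e2.dept = e.dept)
ORDER BY e.name
"""

# Rewrite: aggregate once in a FROM-clause sub-query, then join on dept.
rewritten = """
SELECT e.name FROM employees e
JOIN (SELECT dept, MAX(salary) AS max_salary FROM employees GROUP BY dept) m
  ON e.dept = m.dept
WHERE e.salary = m.max_salary
ORDER BY e.name
"""

correlated_rows = conn.execute(correlated).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(correlated_rows)
print(rewritten_rows)
```

Both queries return the highest-paid employee in each department; only the second avoids referring to the outer row from inside the sub-query.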

Code for demo:
github.com/markgrover/oscon-bigtop
