30 Aug 2011

Hadoop Utility Classes

Some handy classes for working with Hadoop, MapReduce, and HBase.

IdentityMapper  / IdentityReducer


org.apache.hadoop.mapreduce.Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

org.apache.hadoop.mapreduce.Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

jar : hadoop-core.jar

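If your mapper or reducer simply writes its input to its output unchanged, use these base classes directly: their default map() and reduce() implementations pass every record through as-is. No need to recreate them.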

Shell  / ShellCommandExecutor

org.apache.hadoop.util.Shell
org.apache.hadoop.util.Shell.ShellCommandExecutor

jar : hadoop-core.jar

Handy for executing commands on the local machine and inspecting their output.

StringUtils

org.apache.hadoop.util.StringUtils

jar : hadoop-core.jar

Lots of functions for dealing with strings. I will highlight a few.

StringUtils.byteDesc() : user-friendly / human-readable byte lengths

How many megabytes is 10000000 bytes? This will tell you.
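To show the idea without needing Hadoop on the classpath, here is a minimal plain-Java sketch of the same kind of conversion. It is not Hadoop's implementation: the class name ByteDescSketch, the 1024-based units, and the two-decimal format are my own assumptions, and the exact string that StringUtils.byteDesc() returns may differ between Hadoop versions.

```java
import java.util.Locale;

// Stand-in sketch, NOT Hadoop's code: same idea as StringUtils.byteDesc().
class ByteDescSketch {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    // Divide by 1024 until the value fits the largest sensible unit.
    static String describe(long bytes) {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.ROOT keeps the decimal separator a '.' on every machine.
        return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(describe(10000000L)); // 9.54 MB
    }
}
```

In a real job you would just call StringUtils.byteDesc(10000000L) and log the result.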

StringUtils.byteToHexString() / StringUtils.hexStringToByte() : convert byte arrays to hex strings and vice versa

We deal with byte arrays a lot in Hadoop / MapReduce. This is a handy way to print them when debugging.
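As a rough illustration of what such a round trip looks like, here is a plain-Java sketch that does not depend on Hadoop. The class HexSketch and its method names are hypothetical; in a Hadoop project you would call the StringUtils methods instead.

```java
import java.nio.charset.StandardCharsets;

// Stand-in sketch, NOT Hadoop's code: hex-encode and hex-decode a byte array.
class HexSketch {
    // Two lowercase hex digits per byte, high nibble first.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }

    // Inverse of toHex(); assumes an even-length string of valid hex digits.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] rowKey = "row1".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(rowKey)); // 726f7731
    }
}
```

Printing keys and values this way makes non-printable bytes visible in task logs.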

StringUtils.formatTime() : human-readable elapsed time

How long is 100000000 ms? It works out to 27 hours, 46 minutes and 40 seconds; formatTime() does that conversion for you.
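The arithmetic behind that conversion can be sketched in a few lines of plain Java. This is only an illustration: the class ElapsedTimeSketch and its "h/m/s" output format are my own choices, and the string produced by StringUtils.formatTime() is formatted differently.

```java
// Stand-in sketch, NOT Hadoop's code: break a millisecond count into h/m/s.
class ElapsedTimeSketch {
    static String format(long millis) {
        long hours = millis / 3_600_000L;                // whole hours
        long minutes = (millis % 3_600_000L) / 60_000L;  // leftover whole minutes
        long seconds = (millis % 60_000L) / 1000L;       // leftover whole seconds
        return hours + "h " + minutes + "m " + seconds + "s";
    }

    public static void main(String[] args) {
        System.out.println(format(100000000L)); // 27h 46m 40s
    }
}
```

A typical use is timing a job: record System.currentTimeMillis() before and after, then format the difference.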

Hadoop Cluster Status

ClusterStatus : org.apache.hadoop.mapred.ClusterStatus

jar : hadoop-core.jar

Find out how many nodes are in the cluster, how many mappers and reducers it has, and so on.

HBase Handy Classes

Bytes

org.apache.hadoop.hbase.util.Bytes

jar : hbase*.jar

Handy utility for dealing with bytes and byte arrays.

Bytes.toBytes() : convert values (String, int, long, ...) to byte arrays

Bytes.add() : concatenate byte arrays, e.g. to create composite keys
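To make the composite-key idea concrete, here is a minimal sketch in plain Java, so it runs without the HBase jar. The key layout (a UTF-8 user id followed by an 8-byte big-endian timestamp) and the names CompositeKeySketch, compositeKey and timestampOf are assumptions for illustration only. With HBase on the classpath, the same key could be built as Bytes.add(Bytes.toBytes(userId), Bytes.toBytes(timestamp)).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Stand-in sketch, NOT HBase's code: build and take apart a composite row key.
class CompositeKeySketch {
    // Key layout (hypothetical): UTF-8 user id, then 8-byte big-endian timestamp.
    static byte[] compositeKey(String userId, long timestamp) {
        byte[] user = userId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(user.length + Long.BYTES)
                .put(user)
                .putLong(timestamp)
                .array();
    }

    // The timestamp always occupies the last 8 bytes of the key.
    static long timestampOf(byte[] key) {
        return ByteBuffer.wrap(key, key.length - Long.BYTES, Long.BYTES).getLong();
    }

    public static void main(String[] args) {
        byte[] key = compositeKey("user42", 1314662400000L);
        System.out.println(key.length);       // 14
        System.out.println(timestampOf(key)); // 1314662400000
    }
}
```

Putting the fixed-width part last keeps it easy to extract, since the variable-length user id needs no separator.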

Sujee Maniyam
Sujee is a founder and principal at Elephant Scale, where he provides consulting and training on Big Data technologies.

1 Comment:


  • By Ashish Jha 22 Jan 2015

    Hi,
    I am getting the following error while running an HBase MapReduce program.
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
    at mrd.hbase.training.hbasemr.main(hbasemr.java:50)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)

    I am writing the code in Eclipse and there is no error at compile time. I get this error when I export the whole project as a jar and try to run it with the command below.

    notroot@ubuntu:~$ hadoop jar lab/programs/HbaseTraining.jar mrd.hbase.training.hbasemr

    My configuration file is hbase-site.xml:

    hbase.rootdir = hdfs://localhost:8020/hbase
    dfs.replication = 1
    hbase.master = localhost:60000
    hbase.cluster.distributed = true
    hbase.zookeeper.quorum = localhost

    The Java path is set in hbase-env.sh. Everything looks correct, but I am still getting the error above.


Copyright 2015 Sujee Maniyam (