Thomas Dudziak's Blog

How to install Scribe with HDFS support on Ubuntu Karmic

Prerequisites

Install the prerequisites (you may need more; my system already had a number of packages installed):

sudo apt-get install bison flex sun-java6-jdk ruby1.8-dev ant

Create a build folder

We won’t install Scribe or Thrift system-wide; instead, everything stays confined to a single folder. Create it like this:

mkdir scribe-build
cd scribe-build
mkdir dist

The dist folder will contain the binary distribution of scribe once we’re done, including all libraries.
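The build commands in the following sections spell the install prefix as `pwd`/../dist each time. A minimal sketch of an alternative, assuming a hypothetical helper called make_build_tree and a variable called DIST (neither is part of Scribe or of the steps in this post): it creates the folders and prints the absolute path of dist, which can then be reused as the prefix.

```shell
# Hypothetical helper (not part of Scribe): create <root>/dist and print
# the absolute path of the dist folder.
make_build_tree() {
    root="${1:-scribe-build}"
    mkdir -p "$root/dist" || return 1
    # Resolve the absolute path in a subshell so the caller's
    # working directory is left unchanged.
    (cd "$root/dist" && pwd)
}

# Usage (DIST is an assumed variable name):
#   DIST=$(make_build_tree scribe-build)
#   ./configure --prefix="$DIST"
```

An absolute path has the advantage that it stays valid no matter how deep in the source tree a configure script is run.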

Install Boost

On Ubuntu, you can simply install boost via the package manager:

sudo apt-get install libboost1.40-dev libboost-filesystem1.40-dev

These are the only two parts of boost that are needed. Also, please make sure to get at least version 1.40.
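To check which Boost version the development package actually provides, one option is to read the BOOST_VERSION macro from Boost's version.hpp header. The sketch below assumes the header lives at /usr/include/boost/version.hpp (the usual location for the Ubuntu package, but verify it on your system); the function names are made up for illustration.

```shell
# Print the numeric BOOST_VERSION from a Boost version.hpp,
# e.g. 104000 for Boost 1.40.0. The default header path is an assumption.
boost_version() {
    awk '$1 == "#define" && $2 == "BOOST_VERSION" { print $3; exit }' \
        "${1:-/usr/include/boost/version.hpp}"
}

# Succeed if the given numeric version is at least 1.40.0 (104000).
boost_is_new_enough() {
    [ "${1:-0}" -ge 104000 ]
}

# Usage:
#   boost_is_new_enough "$(boost_version)" || echo "Boost is older than 1.40"
```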

If you want to install from source instead, download boost version 1.40 or newer from http://www.boost.org/ (current version is 1.41.0) and then unpack it into the scribe-build folder. After that, cd to the created folder and build it:

cd boost_1_41_0
./bootstrap.sh --prefix=`pwd`/../dist
./bjam install
cd ..

Install Libevent

Again, libevent can simply be installed via the package manager:

sudo apt-get install libevent-dev

On Karmic, this will install libevent (if not installed already) and libevent development files for version 1.4.11 or newer. If you want to install it from source, download the 1.4.x source distribution from http://www.monkey.org/~provos/libevent/ (1.4.13 is the current version) and unpack it into the scribe-build folder. Then cd into the generated folder and build it:

cd libevent-1.4.13-stable
./configure --prefix=`pwd`/../dist
make
make install
cd ..

Thrift and FB303

Download version 0.2.0-incubating from http://incubator.apache.org/thrift/download and unpack it into scribe-build. This should generate a folder scribe-build/thrift-0.2.0. To build it, run:

cd thrift-0.2.0
export PY_PREFIX=`pwd`/../dist
export JAVA_PREFIX=`pwd`/../dist
./configure --prefix=`pwd`/../dist \
    --with-boost=`pwd`/../dist \
    --with-libevent=`pwd`/../dist
make
make install
cd ..

This will most likely throw an error when trying to set up the Ruby bindings, since the build won’t be allowed to write into the system directory. This is due to a bug in the Thrift build scripts: I could not find a way to tell them to install the Ruby bindings locally. However, the parts that we need will have been installed successfully, so let’s move on.
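Since the build ends with an error, it is worth confirming that the libraries really did land in dist/lib before continuing. The following is a rough sketch with hypothetical helper names (has_lib, missing_libs); it only looks for files named lib<name>.so* or lib<name>.a and does not verify that they are usable. Scribe links against both libthrift and libthriftnb later on, so those are the two names to check at this point.

```shell
# Succeed if <libdir> contains a shared or static library named lib<name>.
has_lib() {
    libdir="$1"; name="$2"
    for f in "$libdir/lib$name".so* "$libdir/lib$name".a; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Print every library name from the argument list that is missing in <libdir>.
missing_libs() {
    libdir="$1"; shift
    for name in "$@"; do
        has_lib "$libdir" "$name" || echo "$name"
    done
}

# Usage, from the scribe-build folder:
#   missing_libs dist/lib thrift thriftnb
```

If this prints nothing, both libraries were found. The same check can be repeated later with fb303 and hdfs added to the list.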

Next build the FB303 project:

cd contrib/fb303
export PY_PREFIX=`pwd`/../../../dist
./bootstrap.sh \
    --with-thriftpath=`pwd`/../../../dist \
    --with-boost=`pwd`/../../../dist \
    --prefix=`pwd`/../../../dist
make
make install
cd ../../..

Libhdfs

Scribe currently requires libhdfs 0.20.1 with patches applied – the stock version from the Hadoop 0.20.1 distribution won’t work. You can either use the Cloudera 0.20.1 distribution which has these patches applied, or use a newer version – presumably 0.21 works, but I haven’t tried it.

On Ubuntu, you can either install the Cloudera Hadoop distribution via debian packages, or you can compile it from source. The Debian/Ubuntu setup steps are described here:
http://archive.cloudera.com/docs/_apt.html.

However, we are going to compile libhdfs from source to get an independent library. Download it from
http://archive.cloudera.com/cdh/testing/hadoop-0.20.1+152.tar.gz
and unpack it into the scribe-build folder. This will create a hadoop-0.20.1+152 folder, so let’s go there:

cd hadoop-0.20.1+152

Unfortunately, we also need to tweak two files by adding this line

#include <stdint.h>

right under the existing

#include <string>


in these two files

src/c++/utils/api/hadoop/SerialUtils.hh
src/c++/pipes/api/hadoop/Pipes.hh
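If you prefer not to edit the headers by hand, the change can be scripted. This is a sketch only: add_stdint_include is a hypothetical helper, and it assumes each header contains a line starting with #include <string> to anchor on, so check the files before relying on it. It inserts the new line after the first such match and does nothing if the include is already present.

```shell
# Hypothetical helper: insert '#include <stdint.h>' after the first
# '#include <string>' line of a header, unless it is already there.
add_stdint_include() {
    file="$1"
    # Idempotent: skip files that already include stdint.h.
    if grep -q '^#include <stdint\.h>' "$file"; then
        return 0
    fi
    tmp="$file.tmp.$$"
    awk '{ print }
         !done && /^#include <string>/ { print "#include <stdint.h>"; done = 1 }' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage, from the hadoop-0.20.1+152 folder:
#   add_stdint_include src/c++/utils/api/hadoop/SerialUtils.hh
#   add_stdint_include src/c++/pipes/api/hadoop/Pipes.hh
```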

Once you’ve done that, run:

cd src/c++/libhdfs
./configure --enable-shared \
    JVM_ARCH=tune=k8 \
    --prefix=`pwd`/../../../../dist
make
make install
cd ../../../..

Note that the missing include seems to have been fixed in the 0.20.1+169.89 Cloudera release, so the two header edits should not be needed with that version.

Build scribe

Download scribe 2.1 from http://github.com/facebook/scribe/downloads or clone the git repository (git://github.com/facebook/scribe.git). If you download the distribution, unpack it into the scribe-build directory, yielding a scribe-build/scribe- folder. cd to the scribe folder and then run:

cd scribe-2.1
export LD_LIBRARY_PATH="`pwd`/../dist/lib:"\
"/usr/lib/jvm/java-6-sun/jre/lib/amd64:"\
"/usr/lib/jvm/java-6-sun/jre/lib/amd64/server"
export CFLAGS="-I/usr/lib/jvm/java-6-sun/include/ "\
"-I/usr/lib/jvm/java-6-sun/include/linux/"
export LDFLAGS="-L`pwd`/../dist/lib "\
"-L/usr/lib/jvm/java-6-sun/jre/lib/amd64 "\
"-L/usr/lib/jvm/java-6-sun/jre/lib/amd64/server"
export LIBS="-lhdfs -ljvm"
./bootstrap.sh --enable-hdfs \
    --with-hadooppath=`pwd`/../dist \
    --with-boost=`pwd`/../dist \
    --with-thriftpath=`pwd`/../dist \
    --with-fb303path=`pwd`/../dist \
    --prefix=`pwd`/../dist
make
make install
cd ..

Adjust the jre/lib paths in LD_LIBRARY_PATH and LDFLAGS to match your environment (e.g. 32bit vs. 64bit). The HDFS/Hadoop path is optional: it is only required if you want HDFS support, which is enabled via the --enable-hdfs option.
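As a rough illustration of that adjustment, the sketch below derives the JRE library folders from the machine type instead of hard-coding amd64. The helper names are hypothetical, and the mapping covers only x86 machines: amd64 for 64bit and i386 for 32bit, which is the folder naming used by the Sun JDK. Confirm that the folders actually exist on your system before using them.

```shell
# Map a machine name (as printed by `uname -m`) to the JRE lib subfolder.
# Only x86 is handled; anything else returns an error.
jre_arch() {
    case "${1:-$(uname -m)}" in
        x86_64|amd64) echo amd64 ;;
        i?86)         echo i386 ;;
        *)            return 1 ;;
    esac
}

# Print the two JRE library folders below a given Java installation root.
jre_lib_dirs() {
    java_root="$1"
    arch=$(jre_arch "$2") || return 1
    echo "$java_root/jre/lib/$arch"
    echo "$java_root/jre/lib/$arch/server"
}

# Usage:
#   jre_lib_dirs /usr/lib/jvm/java-6-sun
```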

Test that it works

Simply start scribe with the library path set correctly:

cd dist
export LD_LIBRARY_PATH="`pwd`/lib"
./bin/scribed ../scribe-2.1/examples/example1.conf

This should generate output like this:

[Tue Jan 19 00:31:07 2010] "STATUS: STARTING"
[Tue Jan 19 00:31:07 2010] "STATUS: configuring"
[Tue Jan 19 00:31:07 2010] "got configuration data from file "
[Tue Jan 19 00:31:07 2010] "CATEGORY : default"
[Tue Jan 19 00:31:07 2010] "Creating default store"
[Tue Jan 19 00:31:07 2010] "configured  stores"
[Tue Jan 19 00:31:07 2010] "STATUS: "
[Tue Jan 19 00:31:07 2010] "STATUS: ALIVE"
[Tue Jan 19 00:31:07 2010] "Starting scribe server on port 1463"
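
For an unattended check, the same output can be captured to a file and searched for the ALIVE status line. This is a small sketch with a hypothetical helper name (scribe_alive); the log file name and the two-second wait in the usage comment are arbitrary choices, and both output streams are redirected because I have not checked which one scribed writes to.

```shell
# Succeed if the captured scribed output contains the ALIVE status line.
scribe_alive() {
    grep -q 'STATUS: ALIVE' "$1"
}

# Usage, from the dist folder (file name and delay are arbitrary):
#   ./bin/scribed ../scribe-2.1/examples/example1.conf > scribed.log 2>&1 &
#   sleep 2
#   scribe_alive scribed.log && echo "scribe is up"
```
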
Written by tomdzk

January 19, 2010 at 12:32 am

Posted in Uncategorized

9 Responses

  1. […] gathered that building Scribe is notoriously difficult and I’ve found a few installation guides, but mostly for less package-conservative linux distributions than CentOS.  The steps I outline […]

  2. What is the version of your Ubuntu? Is it 64bit or 32bit? I am trying to install Scribe on my Ubuntu 10.04.1 32bit machine, and I don’t know whether there is a limitation that means it can only be installed on a 64bit OS.

    panfei

    September 18, 2010 at 4:31 pm

    • I’ve only tried this on Ubuntu Lucid 64bit, though none of the steps seem to be 64bit-specific. What error are you getting?

      tomdzk

      September 18, 2010 at 5:45 pm

      • Thanks ! I’ve updated the post.

        tomdzk

        October 9, 2010 at 4:51 am

  3. configure: creating ./config.status
    config.status: creating Makefile
    config.status: creating src/Makefile
    config.status: creating lib/py/Makefile
    config.status: executing depfiles commands
    EXTERNAL_PATH /home/flight/DataCenter/facebook-scribe-3f14e93
    flight@flight-laptop:~/DataCenter/facebook-scribe-3f14e93$ make
    make all-recursive
    make[1]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93′
    Making all in .
    make[2]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93′
    make[2]: Nothing to be done for `all-am’.
    make[2]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93′
    Making all in src
    make[2]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src’
    make all-am
    make[3]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src’
    g++ -Wall -O3 -L/usr/lib -lboost_system-mt -lboost_filesystem-mt -o scribed store.o store_queue.o conf.o file.o conn_pool.o scribe_server.o network_dynamic_config.o dynamic_bucket_updater.o env_default.o HdfsFile.o -L/usr/local/lib//lib -L/usr/local/lib -L/home/flight/DataCenter/hadoop-0.20.2+320//lib -lfb303 -lthrift -lthriftnb -levent -lpthread -lhdfs -ljvm libscribe.a libdynamicbucketupdater.a
    /usr/bin/ld: cannot find -lthriftnb
    collect2: ld returned 1 exit status
    make[3]: *** [scribed] Error 1
    make[3]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src’
    make[2]: *** [all] Error 2
    make[2]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src’
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93′
    make: *** [all] Error 2
    flight@flight-laptop:~/DataCenter/facebook-scribe-3f14e93$

    panfei

    September 19, 2010 at 4:59 pm

  4. Hi, thank you for the useful guide. I want to notify you that there is an error in copy and paste of this guide.
    In fact the lines to be added as a patch to hadoop hdfslib files are
    #include
    and
    #include
    I think the line was recognised as a tag.

    Meanwhile the newest version of hadoop (http://archive.cloudera.com/cdh/testing/hadoop-0.20.1+169.89.tar.gz) contains the patch.

    Wildnove

    October 5, 2010 at 7:09 am

    • So

      #include <stdint.h>

      Wildnove

      October 5, 2010 at 7:10 am

  5. Oh man, thank you so much. I was going crazy without your hints 🙂 Really, thank you. Cheers!

    Ste

    May 20, 2011 at 3:08 pm

  6. What a joke. Why can’t the linux world move beyond this kind of bullshit and just give people ready to use binaries?

    d

    June 24, 2011 at 5:38 pm

