How to install Scribe with HDFS support on Ubuntu Karmic
Prerequisites
Install some pre-requisites (more might be needed, my system had a bunch of things already):
sudo apt-get install bison flex sun-java6-jdk ruby1.8-dev ant
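A quick way to see which of these tools are still missing is sketched below. The helper name missing_tools is made up for this example, and the binary names (javac for sun-java6-jdk in particular) are assumptions about what the packages put on the PATH.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on the current PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed binary names for the packages installed above.
missing_tools bison flex javac ant
```

If nothing is printed, all four commands were found.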
Create a build folder
We won’t install scribe or thrift on the machine itself; instead we keep everything confined to a single folder. To set that up, run:
mkdir scribe-build
cd scribe-build
mkdir dist
The dist folder will contain the binary distribution of scribe once we’re done, including all libraries.
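The steps in this guide refer to the dist folder through `pwd`-relative paths. If you prefer an absolute path, something like the following might help. The DIST variable is purely a convenience introduced here; the original commands do not use it.

```shell
# Hypothetical convenience variable, not part of the original steps.
# Run this from inside scribe-build.
mkdir -p dist
DIST="$(cd dist && pwd)"
export DIST
echo "Using prefix: $DIST"
```

With this set, an option such as --prefix=`pwd`/../dist could be written as --prefix="$DIST" regardless of the current directory.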
Install Boost
On Ubuntu, you can simply install boost via the package manager:
sudo apt-get install libboost1.40-dev libboost-filesystem1.40-dev
These are the only two parts of boost that are needed. Also, please make sure to get at least version 1.40.
If you want to install from source instead, download boost version 1.40 or newer from http://www.boost.org/ (current version is 1.41.0) and then unpack it into the scribe-build folder. After that, cd to the created folder and build it:
cd boost_1_41_0
./bootstrap.sh --prefix=`pwd`/../dist
./bjam install
cd ..
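To confirm that the boost you ended up with is at least 1.40, one option is to read the BOOST_VERSION macro from boost/version.hpp. This is only a sketch: the helper name boost_version_ok and the header path are assumptions, and a source build will have the header under the dist folder instead.

```shell
# Hypothetical helper: succeed if the given boost/version.hpp reports
# Boost 1.40.0 or newer. BOOST_VERSION encodes the release as
# major * 100000 + minor * 100 + patch, so 1.40.0 is 104000.
boost_version_ok() {
  ver=$(awk '$1 == "#define" && $2 == "BOOST_VERSION" { print $3 }' "$1")
  [ -n "$ver" ] && [ "$ver" -ge 104000 ]
}

# Assumed header location for the packaged install.
header=/usr/include/boost/version.hpp
if [ -f "$header" ] && boost_version_ok "$header"; then
  echo "boost is new enough"
else
  echo "boost 1.40+ not found at $header"
fi
```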
Install Libevent
Again, libevent can simply be installed via the package manager:
sudo apt-get install libevent-dev
On Karmic, this will install libevent (if not installed already) and libevent development files for version 1.4.11 or newer. If you want to install it from source, download the 1.4.x source distribution from http://www.monkey.org/~provos/libevent/ (1.4.13 is the current version) and unpack it into the scribe-build folder. Then cd into the generated folder and build it:
cd libevent-1.4.13-stable
./configure --prefix=`pwd`/../dist
make
make install
cd ..
Thrift and FB303
Download version 0.2.0-incubating from http://incubator.apache.org/thrift/download and unpack it into scribe-build. This should generate a folder scribe-build/thrift-0.2.0. To build it, run:
cd thrift-0.2.0
export PY_PREFIX=`pwd`/../dist
export JAVA_PREFIX=`pwd`/../dist
./configure --prefix=`pwd`/../dist \
--with-boost=`pwd`/../dist \
--with-libevent=`pwd`/../dist
make
make install
cd ..
This will most likely throw an error when trying to set up the ruby binding, since it won’t be allowed to write into the system directory. This is due to a bug in the thrift build scripts: I could not find a way to tell it to install the ruby bindings locally. However, the things that we want will have been installed successfully, so let’s move on.
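Since the install step ends with an error, it may be worth checking that the pieces scribe needs actually landed in the dist folder. The sketch below assumes you are in the thrift-0.2.0 folder and that the install produces bin/thrift plus libthrift and libthriftnb libraries; the helper name missing_artifacts is made up. Note that thrift only builds libthriftnb when its configure script finds libevent, and scribe links against it.

```shell
# Hypothetical helper: for a given prefix, print each glob pattern
# that matches no file below that prefix.
missing_artifacts() {
  prefix="$1"
  shift
  for pattern in "$@"; do
    # ls succeeds only if at least one file matches the pattern.
    ls "$prefix"/$pattern >/dev/null 2>&1 || echo "$pattern"
  done
}

# Assumed artifact names; run from inside the thrift-0.2.0 folder.
missing_artifacts "`pwd`/../dist" 'bin/thrift' 'lib/libthrift.*' 'lib/libthriftnb.*'
```

Any pattern that is printed points at something that was not installed.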
Next build the FB303 project:
cd contrib/fb303
export PY_PREFIX=`pwd`/../../../dist
./bootstrap.sh \
--with-thriftpath=`pwd`/../../../dist \
--with-boost=`pwd`/../../../dist \
--prefix=`pwd`/../../../dist
make
make install
cd ../../..
Libhdfs
Scribe currently requires libhdfs 0.20.1 with patches applied – the stock version from the Hadoop 0.20.1 distribution won’t work. You can either use the Cloudera 0.20.1 distribution which has these patches applied, or use a newer version – presumably 0.21 works, but I haven’t tried it.
On Ubuntu, you can either install the Cloudera Hadoop distribution via debian packages, or you can compile it from source. The Debian/Ubuntu setup steps are described here:
http://archive.cloudera.com/docs/_apt.html.
We however are going to compile libhdfs from source to get an independent library. Download from
http://archive.cloudera.com/cdh/testing/hadoop-0.20.1+152.tar.gz
and unpack it into the scribe-build folder. This will create a hadoop-0.20.1+152 folder, so let’s go there:
cd hadoop-0.20.1+152
Unfortunately, we also need to tweak two files by adding this line
#include <stdint.h>
right under the existing
#include <string>
in these two files:
src/c++/utils/api/hadoop/SerialUtils.hh
src/c++/pipes/api/hadoop/Pipes.hh
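If you would rather script this edit than do it by hand, a sketch along these lines might work. It assumes that the existing include in both headers is a line starting with #include <string>, and it relies on GNU sed's -i option as shipped with Ubuntu. The helper name add_stdint_include is made up.

```shell
# Hypothetical helper: add '#include <stdint.h>' right after
# '#include <string>' unless the file already includes stdint.h.
add_stdint_include() {
  if grep -q '^#include <stdint.h>' "$1"; then
    return 0
  fi
  sed -i '/^#include <string>/a\
#include <stdint.h>' "$1"
}

# Run from the top of the unpacked hadoop folder.
for f in src/c++/utils/api/hadoop/SerialUtils.hh \
         src/c++/pipes/api/hadoop/Pipes.hh; do
  if [ -f "$f" ]; then
    add_stdint_include "$f"
  else
    echo "not found: $f"
  fi
done
```

The guard at the top makes the helper safe to run more than once on the same file.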
Once you’ve done that, run:
cd src/c++/libhdfs
./configure --enable-shared \
JVM_ARCH=tune=k8 \
--prefix=`pwd`/../../../../dist
make
make install
cd ../../../..
Note that the missing #include described above seems to have been fixed in the 0.20.1+169.89 Cloudera release.
Build scribe
Download scribe 2.1 from http://github.com/facebook/scribe/downloads or clone the git repository (git://github.com/facebook/scribe.git). If you download the distribution, unpack it into the scribe-build directory, yielding a scribe-build/scribe- folder. cd to the scribe folder and then run:
cd scribe-2.1
export LD_LIBRARY_PATH="`pwd`/../dist/lib:"\
"/usr/lib/jvm/java-6-sun/jre/lib/amd64:"\
"/usr/lib/jvm/java-6-sun/jre/lib/amd64/server"
export CFLAGS="-I/usr/lib/jvm/java-6-sun/include/ "\
"-I/usr/lib/jvm/java-6-sun/include/linux/"
export LDFLAGS="-L`pwd`/../dist/lib "\
"-L/usr/lib/jvm/java-6-sun/jre/lib/amd64 "\
"-L/usr/lib/jvm/java-6-sun/jre/lib/amd64/server"
export LIBS="-lhdfs -ljvm"
./bootstrap.sh --enable-hdfs \
--with-hadooppath=`pwd`/../dist \
--with-boost=`pwd`/../dist \
--with-thriftpath=`pwd`/../dist \
--with-fb303path=`pwd`/../dist \
--prefix=`pwd`/../dist
make
make install
cd ..
Adjust the jre/lib paths in LD_LIBRARY_PATH and LDFLAGS to match your environment (e.g. 32bit vs. 64bit). The HDFS/Hadoop path in there is optional (it is enabled via the --enable-hdfs option) and only required if you want hdfs support.
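As a rough illustration of that adjustment, the JRE library folder can be derived from the machine architecture. The mapping below (amd64 for 64bit, i386 for 32bit) reflects the usual layout of the Sun JDK 6 packages, but treat it as an assumption and check your own /usr/lib/jvm folder. The helper name jre_arch_dir is made up.

```shell
# Hypothetical helper: map the output of `uname -m` to the folder
# name used under jre/lib. Unknown values are passed through as-is.
jre_arch_dir() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    i?86) echo i386 ;;
    *) echo "$1" ;;
  esac
}

# Assumed JDK location, taken from the paths used above.
JAVA_DIR=/usr/lib/jvm/java-6-sun
JRE_LIB="$JAVA_DIR/jre/lib/$(jre_arch_dir "$(uname -m)")"
echo "JRE library folder: $JRE_LIB"
echo "JVM library folder: $JRE_LIB/server"
```

The two printed folders are the ones to use in place of the amd64 paths in LD_LIBRARY_PATH and LDFLAGS.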
Test that it works
Simply start scribe with the library path set correctly:
cd dist
export LD_LIBRARY_PATH="`pwd`/lib"
./bin/scribed ../scribe-2.1/examples/example1.conf
This should generate output like this:
[Tue Jan 19 00:31:07 2010] "STATUS: STARTING"
[Tue Jan 19 00:31:07 2010] "STATUS: configuring"
[Tue Jan 19 00:31:07 2010] "got configuration data from file "
[Tue Jan 19 00:31:07 2010] "CATEGORY : default"
[Tue Jan 19 00:31:07 2010] "Creating default store"
[Tue Jan 19 00:31:07 2010] "configured stores"
[Tue Jan 19 00:31:07 2010] "STATUS: "
[Tue Jan 19 00:31:07 2010] "STATUS: ALIVE"
[Tue Jan 19 00:31:07 2010] "Starting scribe server on port 1463"
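If you start scribe from a script, you could check for a successful start by looking for the last two status lines in the captured output. This is a minimal sketch: the helper name scribe_started and the log file name scribed.log are made up, and it assumes the output was redirected to that file.

```shell
# Hypothetical helper: succeed if a captured scribed log contains the
# two lines that indicate a successful start in the sample output.
scribe_started() {
  grep -q 'STATUS: ALIVE' "$1" &&
    grep -q 'Starting scribe server on port' "$1"
}

# Example use (assumes the output was redirected to scribed.log):
#   ./bin/scribed ../scribe-2.1/examples/example1.conf > scribed.log 2>&1 &
#   sleep 2
#   scribe_started scribed.log && echo "scribe is up"
```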
[...] gathered that building Scribe is notoriously difficult and I’ve found a few installation guides, but mostly for less package-conservative linux distributions than CentOS. The steps I outline [...]
Building Facebook Scribe 2.1 on CentOS 5.5 | blog.milford.io
June 18, 2010 at 2:43 pm
What is the version of your Ubuntu? Is it 64bit or 32bit? I am trying to install scribe on my Ubuntu 10.04.1 32bit machine, and I don’t know whether there is a limitation that only a 64bit OS can install it.
panfei
September 18, 2010 at 4:31 pm
I’ve only tried this on Ubuntu Lucid 64bit, though none of the steps seem to be 64bit specific. What is the error you’re getting?
tomdzk
September 18, 2010 at 5:45 pm
Thanks ! I’ve updated the post.
tomdzk
October 9, 2010 at 4:51 am
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating lib/py/Makefile
config.status: executing depfiles commands
EXTERNAL_PATH /home/flight/DataCenter/facebook-scribe-3f14e93
flight@flight-laptop:~/DataCenter/facebook-scribe-3f14e93$ make
make all-recursive
make[1]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93'
Making all in .
make[2]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93'
make[2]: Nothing to be done for `all-am'.
make[2]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93'
Making all in src
make[2]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src'
make all-am
make[3]: Entering directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src'
g++ -Wall -O3 -L/usr/lib -lboost_system-mt -lboost_filesystem-mt -o scribed store.o store_queue.o conf.o file.o conn_pool.o scribe_server.o network_dynamic_config.o dynamic_bucket_updater.o env_default.o HdfsFile.o -L/usr/local/lib//lib -L/usr/local/lib -L/home/flight/DataCenter/hadoop-0.20.2+320//lib -lfb303 -lthrift -lthriftnb -levent -lpthread -lhdfs -ljvm libscribe.a libdynamicbucketupdater.a
/usr/bin/ld: cannot find -lthriftnb
collect2: ld returned 1 exit status
make[3]: *** [scribed] Error 1
make[3]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/flight/DataCenter/facebook-scribe-3f14e93'
make: *** [all] Error 2
flight@flight-laptop:~/DataCenter/facebook-scribe-3f14e93$
panfei
September 19, 2010 at 4:59 pm
Hi, thank you for the useful guide. I want to notify you that there is an error in copy and paste of this guide.
In fact the lines to be added as a patch to hadoop hdfslib files are
#include
and
#include
I think the line was recognised as a tag.
Meanwhile the newest version of hadoop (http://archive.cloudera.com/cdh/testing/hadoop-0.20.1+169.89.tar.gz) contains the patch.
Wildnove
October 5, 2010 at 7:09 am
So
#include <stdint.h>
Wildnove
October 5, 2010 at 7:10 am
Oh man, thank you soooo much. I was going crazy without your hints.
Really, thank you. Cheers!
Ste
May 20, 2011 at 3:08 pm
What a joke. Why can’t the linux world move beyond this kind of bullshit and just give people ready to use binaries?
d
June 24, 2011 at 5:38 pm