Updated Sep 19, 2009 by vicaya
Labels: Featured
UpAndRunningWithHadoop  
Instructions on how to get Hypertable up and running with Hadoop (HDFS)

Starting the Servers

Step 1. Synchronize clocks on all machines

The system cannot operate correctly unless the clocks on all machines are synchronized. Use the Network Time Protocol (NTP) to synchronize the clocks and keep them in sync. Run the 'date' command on all machines to verify that they agree. The following Capistrano shell session shows the output of a cluster with properly synchronized clocks.

cap> date
[establishing connection(s) to motherlode000, motherlode001, motherlode002, motherlode003, motherlode004, motherlode005, motherlode006, motherlode007, motherlode008]
 ** [out :: motherlode001] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode002] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode003] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode004] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode005] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode007] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode006] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode000] Sat Jan  3 18:05:33 PST 2009
 ** [out :: motherlode008] Sat Jan  3 18:05:33 PST 2009
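
Comparing 'date' output by eye gets harder as the cluster grows. The following is a minimal sketch of an automated check, assuming each host can report its time as seconds since the epoch with 'date +%s'. The function name max_skew and the "host epoch" input format are illustrative and not part of Hypertable or Capistrano.

```shell
#!/bin/sh
# max_skew: read "host epoch_seconds" lines on stdin and print the
# difference, in seconds, between the largest and smallest timestamp.
# (Hypothetical helper; the input format is an assumption.)
max_skew() {
    awk '
        NF >= 2 {
            t = $2 + 0
            if (!seen || t < min) min = t
            if (!seen || t > max) max = t
            seen = 1
        }
        END { if (seen) print max - min; else print 0 }
    '
}

# Possible usage, assuming password-less ssh to each host:
#   for h in motherlode000 motherlode001 motherlode002; do
#       echo "$h $(ssh "$h" date +%s)"
#   done | max_skew
```

Because the hosts are polled one after another, a result of a second or two may just reflect ssh latency; a larger value suggests that NTP is not keeping the clocks in sync.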

Step 2. Install and start Hadoop.

Step 3. Create the directory /hypertable in HDFS and make it writeable by all.

$ hadoop fs -mkdir /hypertable
$ hadoop fs -chmod 777 /hypertable

Step 4. Edit the config file conf/hypertable.cfg. Change the following property to point to the Hadoop filesystem that you brought up in step 2 (assuming hdfs://motherlode001:9000):

HdfsBroker.fs.default.name=hdfs://motherlode001:9000

Change the following two properties to point to the location of the Hypertable Master and Hyperspace (assuming motherlode001):

Hyperspace.Master.Host=motherlode001
Hypertable.Master.Host=motherlode001
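
Before distributing the configuration, it can help to print the three properties back and confirm they hold the values you expect. The following is a minimal sketch, assuming the file consists of plain key=value lines as shown above. The helper name cfg_get is hypothetical and not a Hypertable tool.

```shell
#!/bin/sh
# cfg_get FILE KEY: print the value of the last "KEY=value" line in FILE.
# Exits with a non-zero status if KEY is not present.
# (Hypothetical helper; assumes no spaces around the "=" sign.)
cfg_get() {
    awk -F= -v key="$2" '
        $1 == key { val = substr($0, length(key) + 2); found = 1 }
        END { if (found) print val; else exit 1 }
    ' "$1"
}

# Possible usage:
#   for k in HdfsBroker.fs.default.name Hyperspace.Master.Host Hypertable.Master.Host; do
#       echo "$k -> $(cfg_get conf/hypertable.cfg "$k" || echo MISSING)"
#   done
```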

Step 5. Configure Capistrano for your specific cluster and HDFS. See How to Deploy Hypertable for details. The following is an example of how the variables at the top of the Capfile might be changed for HDFS.

------------- Capfile ----------------
set :source_machine, "motherlode000"
set :install_dir,    "/opt/hypertable" 
set :hypertable_version, "0.9.2.7"
set :default_dfs, "hadoop"
set :default_config, "/opt/hypertable/cluster1-standard.cfg"

role :master, "motherlode001"
role :slave,  "motherlode001", "motherlode002", "motherlode003", "motherlode004", "motherlode005", "motherlode006", "motherlode007", "motherlode008"

Step 6. Compile the Hypertable code and install under the installation directory (e.g. /data1/doug/hypertable)

Step 7. Distribute the installation

$ cap dist

Step 8. Start the servers

$ cap start

Now you should be able to run the ~/hypertable/bin/ht shell HQL command interpreter and start playing around.

Stopping the System

$ cap stop

Comment by get2srinath, Feb 07, 2008

how to for windows?

Comment by dr.chamberlain, Feb 14, 2008

hypertable?$ bin/start-master.sh hadoop
DfsBroker (hadoop) hasn't come up yet, trying again in 5 seconds ...
Num CPUs=2
HdfsBroker.Port=38030
HdfsBroker.Reactors=2
HdfsBroker.Workers=20
HdfsBroker.Server.fs.default.name=hdfs://localhost:9000
org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.dfs.ClientProtocol version mismatch. (client = 14, server = 20)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:253)
        at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:141)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:153)
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
        at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
        at org.hypertable.DfsBroker.hadoop.HdfsBroker.<init>(HdfsBroker.java:71)
        at org.hypertable.DfsBroker.hadoop.main.main(main.java:136)
Problem statring DfsBroker (hadoop)

Does this mean I have a Hadoop release that's too new? Could you add to this wiki the version of Hadoop that the current Hypertable release supports?

Thanks a lot, Can't wait to get this thing up and running to migrate all our "vintage" systems.

Comment by dr.chamberlain, Feb 14, 2008

Found the fix. Replace your hypertable_installation_dir/lib/java/hadoop-core-x-x-x.jar file with the one that comes in your hadoop_installation_dir/hadoop-core-x-x-x.jar.
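
The jar swap described above can be sketched as a small shell function. This is only an illustration: the function name swap_hadoop_jar is hypothetical, both jar paths are passed in explicitly because the exact file names depend on the installed versions, and keeping the old jar with an .orig suffix assumes the start scripts only pick up files ending in .jar.

```shell
#!/bin/sh
# swap_hadoop_jar HYPERTABLE_JAR HADOOP_JAR
# Set aside the Hadoop jar bundled with Hypertable and copy in the jar
# from the running Hadoop installation. (Hypothetical helper.)
swap_hadoop_jar() {
    ht_jar=$1
    hadoop_jar=$2
    [ -f "$ht_jar" ] && [ -f "$hadoop_jar" ] || {
        echo "usage: swap_hadoop_jar HYPERTABLE_JAR HADOOP_JAR" >&2
        return 1
    }
    # Keep the bundled jar under a name that no longer ends in .jar
    # (assumption: only *.jar files are put on the classpath).
    mv "$ht_jar" "$ht_jar.orig" &&
    cp "$hadoop_jar" "$(dirname "$ht_jar")/"
}
```

The servers need to be restarted afterwards so that the broker loads the copied jar.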

Comment by chrulle, Apr 08, 2008

Is it necessary to turn on extended attributes on all nodes?

Comment by vicaya, Apr 08, 2008

@chrulle, Only the node running hyperspace needs it.

Comment by sydney.pang, Jul 28, 2008

I believe the kill-servers.sh script is now changed to stop-servers.sh

Comment by sydney.pang, Aug 15, 2008

I'm having difficulty with these instructions running Ubuntu (Hardy). First, Rubygems does not correctly install Capistrano. I get a command not found error after 'sudo gem install capistrano'. I was able to find a fix for this, but that is not related to Hypertable so I will leave out the details.

Once Capistrano is correctly installed, I can get 'cap dist' to work properly and distribute files across my cluster, but I get errors when I try to do 'cap start'. The error messages read 'error while loading shared libraries: libHyperDfsBroker.so: cannot open shared object file: No such file or directory.' In the past, I have gotten past this problem by setting the LD_LIBRARY_PATH environment variable locally on each machine, so what I did to fix this was to add a line in the start_master task, just before the line in which start-dfsbroker.sh is called, which reads 'export LD_LIBRARY_PATH=#{install_dir}/#{hypertable_version}/lib/ &&'.

Now, my problem is that I am getting timeout errors from the DfsBroker, which read:

Waiting for DfsBroker (hadoop) to come up...
Waiting for DfsBroker (hadoop) to come up...
Waiting for DfsBroker (hadoop) to come up...
ERROR: DfsBroker (hadoop) did not come up

I have hadoop up and running on all of the machines that I am working with, so I'm just a little confused right now because this worked with the previous version of Hypertable that I used (0.9.0.7).

Any thoughts?

Comment by jamesserver, Dec 02, 2008

Hello,

When I run "cap dist" I get the following output:

hadoop@hypertable1:~/hypertable/conf$ cap dist

  • executing `dist'
  • transaction: start
    • executing `copy_config'
    • executing "cp /home/hadoop/hypertable/conf/cluster1-standard.cfg /home/hadoop/hypertable/0.9.0.12/conf"
    • servers: ["localhost"]
connection failed for: localhost (TypeError: no implicit conversion from nil to integer)

Does anyone know what this error is? Are there instructions available that do not use cap?

Comment by trungntbk, Dec 08, 2008

Hi all,

When I run "cap dist", I don't know what format "cluster1-standard.cfg" should have.

Could you show me how to configure this file?

Thanks a lot!

Trung

Comment by trungntbk, Dec 29, 2008

Hi, when I ran Hypertable under Hadoop on a single machine, I received this error:

Listening for transport dt_socket at address: 8000
Num CPUs=2
HdfsBroker.Port=38030
HdfsBroker.Reactors=10
HdfsBroker.Workers=10
HdfsBroker.Server.fs.default.name=hdfs://localhost:16000
Dec 30, 2008 10:43:03 AM org.hypertable.DfsBroker.hadoop.HdfsBroker <init>
SEVERE: ERROR: Unable to establish connection to HDFS.
ShutdownHook called
Exception in thread "Thread-1" java.lang.NullPointerException
        at org.hypertable.DfsBroker.hadoop.main$ShutdownHook.run(main.java:69)

Does anyone know what this error is? How can I fix it?

Thanks a lot! Trung

Comment by sydney.pang, Jan 26, 2009

UPDATE: For 'cap start' failure with error that reads "cannot open shared object file ... libHdfsBroker.so". The following fix worked for me.

Add <HYPERTABLE>/<VERSION>/lib to /etc/ld.so.conf.d/hypertable.conf

Then, run 'sudo ldconfig'

Comment by sydney.pang, Jan 26, 2009

For those who are getting errors on 'cap start' that read:

sh: let: not found

Check your default shell using 'which sh', then 'ls -l <answer from previous command>' to see which shell it links to.

If you are on Ubuntu, chances are it is 'dash'. To change the default shell from 'dash' to 'bash', do the following:

'sudo update-alternatives --install /bin/sh sh /bin/dash 1', then 'sudo update-alternatives --install /bin/sh sh /bin/bash 1', then 'sudo update-alternatives --config sh' and choose 'bash'.
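
A quick way to confirm whether a machine is affected is to ask 'sh' directly whether it provides the 'let' builtin. The following is a minimal sketch; the function name sh_has_let is hypothetical.

```shell
#!/bin/sh
# sh_has_let: report whether the shell that "sh" resolves to provides
# the `let` builtin (bash does; dash does not).
sh_has_let() {
    if sh -c 'let "x = 1 + 1"' >/dev/null 2>&1; then
        echo "let: supported"
    else
        echo "let: missing"
    fi
}

# Possible usage:
#   sh_has_let
```

If it reports "let: missing", the start scripts will likely fail with the 'sh: let: not found' error quoted above until /bin/sh points at bash.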

Comment by vicaya, Jan 26, 2009

Thanks for the comment, Sydney. We'll put the random sleep stuff in a script for the next release.

Comment by michael.armbrust, Jan 27, 2009

Hey Trung,

I was having a similar problem, and it was because I was using an older version of Hadoop. Check your Hadoop logs for messages like "Incorrect header or version mismatch"; if you see them, upgrade to Hadoop 0.19.0.

Michael

Comment by dlogothetis, Feb 20, 2009

Hi, do the different instances of the servers write their logs to some directory? I only see a single log file under $HYPERTABLE_HOME/log on a cluster with 10 nodes.

Thanks.

Comment by burcsahinoglu, Mar 01, 2009

I have just tried to set up Hypertable in conjunction with Hadoop and I am having a problem with its startup. I have a setup with two machines (VMware instances), and Hadoop is running fine. At startup everything seems to go fine until I start seeing:

[out::machinename] Waiting for Hypertable.RangeServer to come up ...

When I take a look at the logs I see the following lines in DfsBroker.hadoop.log:

INFO: [/192.168.223.139:50655 ; Sun Mar 01 10:42:54 EET 2009] Connection Established
Mar 1, 2009 10:42:54 AM org.hypertable.DfsBroker.hadoop.ConnectionHandler handle
INFO: [/192.168.223.139:50655 ; Sun Mar 01 10:42:54 EET 2009] Disconnect - COMM broken connection : Closing all open handles from /192.168.223.139:50655
Closed 0 input streams and 0 output streams for client connection /192.168.223.139:50655

Note: is this the best place to post installation issues?

