UpAndRunningWithHadoop
Instructions on how to get Hypertable up and running with Hadoop (HDFS)
Starting the Servers

Step 1. Synchronize clocks on all machines

The system cannot operate correctly unless the clocks on all machines are synchronized. Use the Network Time Protocol (ntp) to ensure that the clocks get synchronized and remain in sync. Run the 'date' command on all machines to make sure they are in sync. The following Capistrano shell session shows the output of a cluster with properly synchronized clocks.

  cap> date
  [establishing connection(s) to motherlode000, motherlode001, motherlode002, motherlode003, motherlode004, motherlode005, motherlode006, motherlode007, motherlode008]
  ** [out :: motherlode001] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode002] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode003] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode004] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode005] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode007] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode006] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode000] Sat Jan 3 18:05:33 PST 2009
  ** [out :: motherlode008] Sat Jan 3 18:05:33 PST 2009

Step 2. Install and start Hadoop.

Step 3. Create the directory /hypertable in HDFS and make it writeable by all.

  $ hadoop fs -mkdir /hypertable
  $ hadoop fs -chmod 777 /hypertable

Step 4. Edit the config file conf/hypertable.cfg

Change the following property to point to the Hadoop filesystem that got up and running in step 2 (assuming hdfs://motherlode001:9000):

  HdfsBroker.fs.default.name=hdfs://motherlode001:9000

Change the following two properties to point to the location of the Hypertable Master and Hyperspace (assuming motherlode001):

  Hyperspace.Master.Host=motherlode001
  Hypertable.Master.Host=motherlode001

Step 5. Configure Capistrano for your specific cluster and HDFS.

See How to Deploy Hypertable for details. The following is an example of how the variables at the top of the Capfile might be changed for HDFS.

  ------------- Capfile ----------------
  set :source_machine, "motherlode000"
  set :install_dir, "/opt/hypertable"
  set :hypertable_version, "0.9.2.7"
  set :default_dfs, "hadoop"
  set :default_config, "/opt/hypertable/cluster1-standard.cfg"
  role :master, "motherlode001"
  role :slave, "motherlode001", "motherlode002", "motherlode003", "motherlode004", "motherlode005", "motherlode006", "motherlode007", "motherlode008"

Step 6. Compile the Hypertable code and install under the installation directory (e.g. /data1/doug/hypertable)

Step 7. Distribute the installation

  $ cap dist

Step 8. Start the servers

  $ cap start

Now you should be able to run the ~/hypertable/bin/ht shell HQL command interpreter and start playing around.

Stopping the System

  $ cap stop
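For Step 1, one way to confirm that the clocks agree is to compare epoch timestamps gathered from every machine. The following is a minimal sketch and not part of Hypertable: the function names max_clock_skew and collect_clocks are made up for illustration, and it assumes password-less ssh to each host and a 'date' command that accepts +%s.

```shell
#!/bin/sh
# max_clock_skew: read "host epoch_seconds" lines on stdin and print the
# difference, in seconds, between the largest and smallest timestamp.
max_clock_skew() {
    awk 'NF >= 2 {
             if (n == 0 || $2 < min) min = $2
             if (n == 0 || $2 > max) max = $2
             n++
         }
         END { if (n > 0) print max - min; else print 0 }'
}

# collect_clocks HOST...: hypothetical helper that asks each host for its
# clock over ssh and prints one "host epoch_seconds" line per host.
collect_clocks() {
    for host in "$@"; do
        # date +%s prints seconds since the epoch (GNU and BSD date).
        printf '%s %s\n' "$host" "$(ssh "$host" date +%s)"
    done
}

# Example (host names taken from the session above):
#   collect_clocks motherlode000 motherlode001 | max_clock_skew
```

Because the hosts are queried one after another, the reported skew also includes the time each ssh connection takes, so treat a result of a second or two as noise rather than as proof of drift.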
Is there a how-to for Windows?
hypertable?$ bin/start-master.sh hadoop
DfsBroker (hadoop) hasn't come up yet, trying again in 5 seconds ...
Num CPUs=2
HdfsBroker.Port=38030
HdfsBroker.Reactors=2
HdfsBroker.Workers=20
HdfsBroker.Server.fs.default.name=hdfs://localhost:9000
org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.dfs.ClientProtocol version mismatch. (client = 14, server = 20)
Problem statring DfsBroker (hadoop)

Does this mean I have a Hadoop release that's too new? Could you add to this wiki the version of Hadoop that the current Hypertable release supports?
Thanks a lot, Can't wait to get this thing up and running to migrate all our "vintage" systems.
Found the fix. Replace your hypertable_installation_dir/lib/java/hadoop-core-x-x-x.jar file with the one that comes in your hadoop_installation_dir/hadoop-core-x-x-x.jar.
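The jar swap described above could be scripted roughly as follows. This is only a sketch under assumptions: replace_hadoop_jar is a made-up helper name, both directory arguments are placeholders for your own installation paths, and the glob hadoop-core-*.jar simply follows the file name pattern given in this comment, so adjust it if your jar is named differently.

```shell
#!/bin/sh
# replace_hadoop_jar HYPERTABLE_DIR HADOOP_DIR
# Swap the hadoop-core jar bundled with Hypertable for the one shipped
# with the Hadoop installation that the cluster is actually running.
replace_hadoop_jar() {
    ht_lib="$1/lib/java"
    hadoop_dir="$2"

    # Refuse to touch anything unless Hadoop really provides a core jar.
    set -- "$hadoop_dir"/hadoop-core-*.jar
    if [ ! -e "$1" ]; then
        echo "no hadoop-core jar found in $hadoop_dir" >&2
        return 1
    fi

    # Move the bundled jar(s) aside instead of deleting them.
    for old in "$ht_lib"/hadoop-core-*.jar; do
        if [ -e "$old" ]; then
            mv "$old" "$old.bak"
        fi
    done

    # Copy in the jar(s) that match the running Hadoop version.
    cp "$@" "$ht_lib"/
}
```

Keeping the old jar as a .bak file makes it easy to roll back if the broker still refuses to start.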
Is it necessary to turn on extended attributes on all nodes?
@chrulle, only the node running Hyperspace needs it.
I believe the kill-servers.sh script has been renamed to stop-servers.sh.
I'm having difficulty with these instructions running Ubuntu (Hardy). First, Rubygems does not correctly install Capistrano. I get a command not found error after 'sudo gem install capistrano'. I was able to find a fix for this, but that is not related to Hypertable so I will leave out the details.
Once Capistrano is correctly installed, I can get 'cap dist' to work properly and distribute files across my cluster, but I get errors when I try to do 'cap start'. The error messages read 'error while loading shared libraries: libHyperDfsBroker.so: cannot open shared object file: No such file or directory.' In the past, I have gotten past this problem by setting the LD_LIBRARY_PATH environment variable locally on each machine, so what I did to fix this was to add a line in the start_master task, just before the line in which start-dfsbroker.sh is called, which reads 'export LD_LIBRARY_PATH=#{install_dir}/#{hypertable_version}/lib/ &&'.
Now, my problem is that I am getting timeout errors from the DfsBroker, which read:
I have hadoop up and running on all of the machines that I am working with, so I'm just a little confused right now because this worked with the previous version of Hypertable that I used (0.9.0.7).
Any thoughts?
Hello,
When I run "cap dist" I get the following output:

hadoop@hypertable1:~/hypertable/conf$ cap dist
- transaction: start
- executing `copy_config'
- executing "cp /home/hadoop/hypertable/conf/cluster1-standard.cfg /home/hadoop/hypertable/0.9.0.12/conf"
connection failed for: localhost (TypeError: no implicit conversion from nil to integer)

Does anyone know what this error is? Are there instructions available that do not use cap?
Hi all,
When I ran "cap dist", I realized I don't know the format of "cluster1-standard.cfg".
Could you show me how to configure this file?
Thanks a lot!
Trung
Hi, when I ran Hypertable under Hadoop on a single machine, I received this error:

Listening for transport dt_socket at address: 8000
Num CPUs=2
HdfsBroker.Port=38030
HdfsBroker.Reactors=10
HdfsBroker.Workers=10
HdfsBroker.Server.fs.default.name=hdfs://localhost:16000
Dec 30, 2008 10:43:03 AM org.hypertable.DfsBroker.hadoop.HdfsBroker <init>
SEVERE: ERROR: Unable to establish connection to HDFS.
ShutdownHook called
Exception in thread "Thread-1" java.lang.NullPointerException

Does anyone know what this error is? How can I fix it?
Thanks a lot!
Trung
UPDATE: For the 'cap start' failure with an error that reads "cannot open shared object file ... libHdfsBroker.so", the following fix worked for me.
Add <HYPERTABLE>/<VERSION>/lib to /etc/ld.so.conf.d/hypertable.conf
Then, run 'sudo ldconfig'
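As a rough sketch, the two steps above could be wrapped in a small helper. The name write_ld_conf is made up for illustration, and the optional second argument exists only so the function can be tried against a scratch directory before touching /etc. The example path uses the install_dir and hypertable_version values from the Capfile on this page; substitute your own.

```shell
#!/bin/sh
# write_ld_conf LIB_DIR [CONF_DIR]
# Write LIB_DIR into CONF_DIR/hypertable.conf so the dynamic linker
# searches it. CONF_DIR defaults to /etc/ld.so.conf.d.
write_ld_conf() {
    lib_dir="$1"
    conf_dir="${2:-/etc/ld.so.conf.d}"
    printf '%s\n' "$lib_dir" > "$conf_dir/hypertable.conf"
}

# Typical use, run as root so that /etc is writable, followed by
# ldconfig to rebuild the linker cache:
#   write_ld_conf /opt/hypertable/0.9.2.7/lib && ldconfig
```

This has to be repeated on every machine that runs a Hypertable server, since the linker cache is local to each host.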
For those who are getting errors on 'cap start' that read:
sh: let: not found
Check your default shell using 'which sh', then 'ls -l <answer from previous command>'
If you are on Ubuntu, chances are it is 'dash'. To change the default shell from 'dash' to 'bash', do the following:
'sudo update-alternatives --install /bin/sh sh /bin/dash 1', then 'sudo update-alternatives --install /bin/sh sh /bin/bash 1', then 'sudo update-alternatives --config sh' and choose 'bash'.
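To check what sh resolves to before and after the change, something like the following may help. It is a sketch, not an official tool: sh_target and check_sh are made-up names, and it assumes GNU readlink (standard on Ubuntu), whose -f option follows the whole symlink chain, including the /etc/alternatives hop that update-alternatives creates.

```shell
#!/bin/sh
# sh_target [PATH]
# Print the name of the program that PATH (default: /bin/sh) finally
# resolves to after following all symlinks. Assumes GNU readlink.
sh_target() {
    basename "$(readlink -f "${1:-/bin/sh}")"
}

# check_sh [PATH]
# Report whether scripts run through sh can rely on the 'let' builtin,
# which bash provides and dash does not.
check_sh() {
    target=$(sh_target "$1")
    if [ "$target" = "bash" ]; then
        echo "sh is bash: 'let' is available"
    else
        echo "sh is $target: 'let' may be missing"
    fi
}
```

Running check_sh with no argument inspects /bin/sh on the current machine; run it on every node, since 'cap start' executes the scripts remotely.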
Thanks for the comment, Sydney. We'll put the random sleep stuff in a script for the next release.
Hey Trung,
I was having a similar problem, and it was due to the fact that I was using an older version of Hadoop. Check your Hadoop logs for messages like "Incorrect header or version mismatch", and if you see them, upgrade to 0.19.0.
Michael
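A quick way to scan the Hadoop logs for that message is sketched below. The helper name find_version_mismatch is made up, and the log directory is whatever your Hadoop installation uses; pass it as the argument.

```shell
#!/bin/sh
# find_version_mismatch LOG_DIR
# List every file under LOG_DIR that contains the version mismatch
# message. Exits non-zero when no file matches (grep's usual behaviour).
find_version_mismatch() {
    grep -rl "Incorrect header or version mismatch" "$1"
}

# Example, with a placeholder path:
#   find_version_mismatch /path/to/hadoop/logs
```

If any file is listed, the Hadoop client library used by the broker and the Hadoop servers are probably speaking different protocol versions.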
Hi, do the different server instances write their logs to some directory? I only see a single log file under $HYPERTABLE_HOME/log on a cluster with 10 nodes.
Thanks.
I have just tried to set up Hypertable in conjunction with Hadoop and I am having a problem with its startup. I have a setup with two machines (VMware instances), and Hadoop is running fine. At startup everything seems to go fine until I start seeing:
When I take a look at the logs I see the following line in DfsBroker.hadoop.log
Note: is this the best place to post installation issues?