What steps will reproduce the problem? * start boot.sh * startup a trivial java interface that accepts HTTP PUTs and turns them into scalaris OtpErlangString writes * nail that web interface with many synthetic 1k PUTs * non-deterministic crash - happened after a few hours of developing the interface
What is the expected output? * value is committed despite heavy load
What do you see instead? * crashed boot.sh
What version of the product are you using? On what operating system? * SVN http://scalaris.googlecode.com/svn/trunk revision 121 * Mac OS X 10.5.6 * Erlang (BEAM) emulator version 5.6.5 [source] [smp:2] [async-threads:0] [kernel-poll:false]
Please provide any additional information below. * code for REST interface available upon request * this was after experimenting with trying to do erlang type conversion in java - ala
OtpOutputStream oos = new OtpOutputStream(otpValue);
OtpInputStream ois = new OtpInputStream(oos.toByteArray());
OtpErlangBinary bin;
String answer = "";
try {
bin = new OtpErlangBinary(ois);
answer = new String(bin.binaryValue());
} catch (OtpErlangDecodeException e) {
// TODO Auto-generated catch block
status = HttpServletResponse.SC_INTERNAL_SERVER_ERROR; //500
error = e.toString();
is_error = true;
}
response.getOutputStream().print(answer);
(because i wanted a byte-for-byte reproduction of what was in the store put into the OutputStream.)
Comment #1
Posted on Jan 15, 2009 by Helpful Monkeypoppy@levitate:~/Source/scalaris-read-only/bin$ ./boot.sh Erlang (BEAM) emulator version 5.6.5 [source] [smp:2] [async-threads:4] [kernel-poll:false]
Eshell V5.6.5 (abort with ^G) (boot@levitate)1> Config files: ["scalaris.cfg","scalaris.local.cfg"] Can't load config file "scalaris.local.cfg": enoent. Ignoring. listening on {0,0,0,0}:14195
=INFO REPORT==== 14-Jan-2009::18:58:46 === Yaws: Listening to 0.0.0.0:8000 for servers - http://localhost:8000 under ../docroot [ I | RM | <0.92.0> ] starting ring maintainer [ I | RT | <0.94.0> ] starting ringtable this() == {{127,0,0,1},14195} [ I | Node | <0.97.0> ] joining 135547905503803006856817392082617988910 [ I | Node | <0.97.0> ] join as first 135547905503803006856817392082617988910 [ I | FD | <0.69.0> ] starting pinger for {{127,0,0,1},14195,<0.97.0>} [ I | Node | <0.97.0> ] joined
=ERROR REPORT==== 14-Jan-2009::20:38:39 === Error in process <0.85.0> on node 'boot@levitate' with exit value: {{badmatch,undefined},[{proc_lib,exit_p,2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:29 === yaws: Failed to accept - terminating: {error,emfile}
=ERROR REPORT==== 14-Jan-2009::23:03:29 === Accept proc died, terminate gserv =ERROR REPORT==== 14-Jan-2009::23:03:29 === yaws: FATAL gserv died shutdown
=ERROR REPORT==== 14-Jan-2009::23:03:29 === ** Generic server yaws_server terminating ** Last message in was {'EXIT',<0.84.0>,shutdown} ** When Server state == {state, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000, 30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default", false}, [{<0.84.0>, [{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",40985,undefined,[],nolimit,[], yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]}], 14,true} ** Reason for termination == ** {restartme,[{yaws_server,handle_info,2}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:29 === Error in process <0.8016.6805> on node 'boot@levitate' with exit value: {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:29 === Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission errs: emfile =ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server yaws_server terminating ** Last message in was {setconf, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000,30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default",false}, [[{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",undefined,undefined,[],nolimit, [],yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]]} ** When Server state == {state,undefined,[],0,true} ** Reason for termination == ** {emfile,[{yaws_server,setup_dirs,1}, {yaws_server,init2,5}, {yaws_server,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Error in process <0.9982.6805> on node 'boot@levitate' with exit value: {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission errs: emfile =ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server yaws_server terminating ** Last message in was {setconf, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000,30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default",false}, [[{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",undefined,undefined,[],nolimit, [],yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]]} ** When Server state == {state,undefined,[],0,true} ** Reason for termination == ** {emfile,[{yaws_server,setup_dirs,1}, {yaws_server,init2,5}, {yaws_server,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Error in process <0.12243.6805> on node 'boot@levitate' with exit value: {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission errs: emfile =ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server yaws_server terminating ** Last message in was {setconf, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000,30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default",false}, [[{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",undefined,undefined,[],nolimit, [],yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]]} ** When Server state == {state,undefined,[],0,true} ** Reason for termination == ** {emfile,[{yaws_server,setup_dirs,1}, {yaws_server,init2,5}, {yaws_server,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Error in process <0.13534.6805> on node 'boot@levitate' with exit value: {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission errs: emfile =ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server yaws_server terminating ** Last message in was {setconf, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000,30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default",false}, [[{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",undefined,undefined,[],nolimit, [],yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]]} ** When Server state == {state,undefined,[],0,true} ** Reason for termination == ** {emfile,[{yaws_server,setup_dirs,1}, {yaws_server,init2,5}, {yaws_server,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Error in process <0.15175.6805> on node 'boot@levitate' with exit value: {emfile,[{erlang,open_port,[{spawn,"/bin/sh -s unix:cmd 2>&1"},[stream]]},{os,start_port,0},{os,'-unix_cmd/1-fun-0-',2}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === Failed to list "/Users/poppy/.yaws/yaws/default" probably due to permission errs: emfile =ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server yaws_server terminating ** Last message in was {setconf, {gconf, "/Volumes/Data/Source/scalaris-read-only/contrib/yaws", false,204,"../log",
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/ebin"], [],15000,400,1000000,8000,10240,[],10000000,30,
["/Volumes/Data/Source/scalaris-read-only/contrib/yaws/examples/include"], "/usr/bin/php-cgi","Yaws 1.77","default",false}, [[{sconf,8000,3,[],undefined,undefined, "../docroot",[], {0,0,0,0}, "localhost",undefined,undefined,[],nolimit, [],yaws_404,yaws_404,yaws, "boot_server_1995904074",undefined, [yaws,php,cgi], [],[]}]]} ** When Server state == {state,undefined,[],0,true} ** Reason for termination == ** {emfile,[{yaws_server,setup_dirs,1}, {yaws_server,init2,5}, {yaws_server,handle_call,3}, {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}
=ERROR REPORT==== 14-Jan-2009::23:03:30 === ** Generic server <0.95.0> terminating ** Last message in was {'EXIT',<0.90.0>,killed} ** When Server state == {state, {<0.95.0>,cs_sup_and}, one_for_all, [{child,<0.97.0>,cs_node, {cs_node,start_link,["cs_node_2615678616"]}, permanent,brutal_kill,worker,[]}, {child,<0.96.0>,cs_db_otp, {cs_db_otp,start_link,["cs_node_2615678616"]}, permanent,brutal_kill,worker,[]}], {dict,0,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[], []}, {{[],[],[],[],[],[],[],[],[],[],[],[],[],[], [],[]}}}, 10,1,[],cs_sup_and, ["cs_node_2615678616"]} ** Reason for termination == ** killed
=INFO REPORT==== 14-Jan-2009::23:03:30 === application: boot_cs exited: shutdown type: temporary
(boot@levitate)1>
Comment #2
Posted on Jan 16, 2009 by Quick HorseHi,
thanks for your issue report on scalaris. Here is what I found out:
The first error message at 20:38 seems to be related to yaws (the builtin webserver). They released a new version of yaws a week ago, which gives a more meaningful error message in this case or fixes the problem. However, they pulled the new version yesterday, because of some undisclosed bugs. As soon as they re-release the code, I will update scalaris to the new yaws version. Either the problem goes away or we get a better error message.
The second bunch of error messages is around 23:03. These error messages hint at scalaris running out of file descriptors (emfile). Could you check with lsof what kind of files scalaris is opening? My best bet at the moment is that there are a lot of TCP connections between scalaris and your Java server.
Each Scalaris object will create its own TCP connection. I don't know whether you are reusing Scalaris objects on the server or not. With Scalaris.closeConnection() you can explicitly close the TCP connection once you don't need the Scalaris object anymore.
Thorsten
Comment #3
Posted on Mar 27, 2009 by Quick Horse(No comment was entered for this change.)
Status: Done
Labels:
Type-Defect
Priority-Medium