
molniya - issue #22

molniya terminates on parser error


Posted on Oct 30, 2010 by Happy Kangaroo

What steps will reproduce the problem?
1. Leave molniya running on my Debian sid system running Icinga 1.2.1.
2. Molniya terminates at a random time with the trace found below.

What is the expected output? What do you see instead?
Software should be robust and not bomb out. Yeah, I could run it in a while loop, but it could just be robust in the first place. Why not back off for $time and retry parsing the file or something like that?
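To illustrate the back-off-and-retry idea, here is a minimal sketch in Ruby. It is not molniya code: with_retry, its parameters and the doubling delay are all hypothetical, and it assumes the parse failure surfaces as a RuntimeError, as in the trace below.

```ruby
# Hypothetical helper, not part of molniya: run a block and, if it raises
# RuntimeError, wait and try again with a doubling delay.
# `sleeper` is injectable so the wait can be stubbed out in tests.
def with_retry(attempts: 5, delay: 1.0, sleeper: Kernel.method(:sleep))
  tries = 0
  begin
    tries += 1
    yield
  rescue RuntimeError => e
    raise if tries >= attempts            # give up: re-raise the last error
    warn "parse failed (#{e.message}), retrying in #{delay}s"
    sleeper.call(delay)
    delay *= 2                            # back off a little longer each time
    retry                                 # re-run the begin block
  end
end
```

The call that re-reads status.dat could then be wrapped as `with_retry { ... }`, so a status file that is unparseable for a moment delays the next report instead of killing the process.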

What version of the product are you using? On what operating system?
trunk, r47
debian sid, icinga 1.2.1

Please provide any additional information below.

/usr/src/molniya-trunk/nagios.rb:75:in `parse_object': unexpected line: (RuntimeError)
	from /usr/src/molniya-trunk/nagios.rb:55:in `parse_status'
	from /usr/src/molniya-trunk/nagios.rb:496:in `parse'
	from /usr/lib/ruby/1.8/pathname.rb:812:in `open'
	from /usr/lib/ruby/1.8/pathname.rb:812:in `open'
	from /usr/src/molniya-trunk/nagios.rb:496:in `parse'
	from /usr/src/molniya-trunk/nagios.rb:159:in `_refresh'
	from /usr/src/molniya-trunk/nagios.rb:127:in `refresh_if_needed'
	from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
	from /usr/src/molniya-trunk/nagios.rb:124:in `refresh_if_needed'
	from /usr/src/molniya-trunk/nagios.rb:442:in `refresh_if_needed'
	from /usr/src/molniya-trunk/nagios.rb:164:in `contents'
	from /usr/src/molniya-trunk/molniya.rb:445:in `status_report'
	from /usr/src/molniya-trunk/molniya.rb:623:in `update_status_msg'
	from /usr/src/molniya-trunk/molniya.rb:568:in `run'
	from /usr/src/molniya-trunk/molniya.rb:801:in `launch'
	from -e:1

Comment #1

Posted on Nov 2, 2010 by Happy Kangaroo

Right now I seem to have icinga in a state which makes this problem reproducible. I've saved both status.dat and . Let me know if you want those as I'm not going to attach them to a public bug.

In addition here's the relevant part of a strace -f. Maybe that gives you an idea of what's happening.

[pid 10688] gettimeofday({1288680165, 578538}, NULL) = 0
[pid 10688] stat64("/var/lib/icinga/status.dat", {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] stat64("/var/lib/icinga/status.dat", {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] open("/var/lib/icinga/status.dat", O_RDONLY|O_LARGEFILE) = 3
[pid 10688] fstat64(3, {st_mode=S_IFREG|0664, st_size=53296, ...}) = 0
[pid 10688] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb776c000
[pid 10688] read(3, "################################"..., 4096) = 4096
[pid 10688] read(3, "apping=0\n\tpercent_state_change=0"..., 4096) = 4096
[pid 10688] read(3, "=0\n\tcheck_command=check-host-ali"..., 4096) = 4096
[pid 10688] read(3, "0\n\tretry_interval=1.000000\n\teven"..., 4096) = 4096
[pid 10688] rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
[pid 10688] close(3) = 0
[pid 10688] munmap(0xb776c000, 4096) = 0
[pid 10688] write(2, "/usr/src/molniya-trunk/nagios.rb"..., 53/usr/src/molniya-trunk/nagios.rb:75:in `parse_object') = 53
[pid 10688] write(2, ": ", 2: ) = 2
[pid 10688] write(2, "unexpected line: ", 17unexpected line: ) = 17
[pid 10688] write(2, " (", 2 () = 2
[pid 10688] write(2, "RuntimeError", 12RuntimeError) = 12
[pid 10688] write(2, ")\n", 2)
) = 2

Hope that helps.

Comment #2

Posted on Nov 3, 2010 by Happy Kangaroo

When one specific host came back online I was able to start molniya again. After looking at both status files, I see two things that look like possible causes:
- there is an empty line in that host's status block when it is down
- the plugin_output field contains an IPv6 address

See also attached snippets.

PS: I'm not convinced that this is also the source of the random termination during parsing. More likely it fails parsing something else too.
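For what it's worth, here is a rough sketch of a parser that tolerates both suspected inputs. It is not molniya's parse_object: the function name, the returned structure and the "type { key=value ... }" block layout are assumptions based on the usual status.dat format. Blank lines are skipped, values are taken verbatim after the first "=", and anything else is logged rather than raised.

```ruby
# Hypothetical tolerant parser for status.dat-style blocks; not molniya's code.
# Returns an array of { type: "...", attrs: { "key" => "value" } } hashes.
def parse_status_blocks(text)
  blocks = []
  current = nil
  text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("#")   # skip blank lines and comments
    if current.nil?
      if line =~ /\A(\w+)\s*\{\z/                  # e.g. "hoststatus {"
        current = { type: $1, attrs: {} }
      else
        warn "ignoring line outside a block: #{line.inspect}"
      end
    elsif line == "}"
      blocks << current
      current = nil
    else
      key, value = line.split("=", 2)              # split on the first "=" only
      if value.nil?
        warn "ignoring line without '=': #{line.inspect}"
      else
        current[:attrs][key] = value               # value kept as-is, colons included
      end
    end
  end
  blocks
end
```

Logging and skipping an odd line means one strange host entry costs at most that one attribute, not the whole process. Whether silently dropping data is acceptable here is a design choice for the maintainer.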


Status: New

Labels:
Type-Defect Priority-Medium