CREDITS
-------
Paul Kendall (testing, code review, bug fixes)
Samant Maharaj (testing, code review, bug fixes)
EVS4J is a pure-Java(tm) implementation of the Totem single ring protocol:
"The Totem Single-Ring Ordering and Membership Protocol",
Y. Amir, L. E. Moser, P. M. Melliar-Smith, D. A. Agarwal, and P. Ciarfella,
ACM Transactions on Computer Systems 13, 4 (November 1995), 311-342.
NOTE: The flow control algorithm in this article uses a fixed window size.
I found that this would make it impossible to use the same
network for anything else, so I implemented congestion control.
Now the window backs off nicely when needed. A maximum window size is
still required.
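The back-off behaviour described in the note can be pictured with a small sketch. This is not the EVS4J implementation: the class name, the method names, the additive-increase/multiplicative-decrease rule, and the floor of 1 are all assumptions chosen for illustration. Only the idea of a window that shrinks under congestion and never exceeds a configured maximum comes from the text above.

```java
/** Hypothetical illustration of a congestion-controlled window; not EVS4J code. */
final class CongestionWindow {
    private final int maxWindow; // the configured maximum window size, in messages
    private int window;          // the current window size, in messages

    CongestionWindow(int maxWindow) {
        if (maxWindow < 1) {
            throw new IllegalArgumentException("maxWindow must be >= 1");
        }
        this.maxWindow = maxWindow;
        this.window = 1; // assumption: start small and probe upward
    }

    /** Additive increase: grow by one message, never past the maximum. */
    void onSuccess() {
        window = Math.min(maxWindow, window + 1);
    }

    /** Multiplicative decrease: halve the window when congestion is observed. */
    void onCongestion() {
        window = Math.max(1, window / 2); // assumption: keep at least one message in flight
    }

    int current() {
        return window;
    }
}
```

In such a scheme a caller would invoke onSuccess() after an uneventful token rotation and onCongestion() when loss or retransmission is detected; how EVS4J itself detects congestion is not described here.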
Features:
- Group membership (configuration) service
- Reliable multicast
- Total ordering
- Flow control
- Congestion control
- Recovery of messages when a processor fails or joins
WARNING: Multicasting on your LAN can take precious bandwidth away from
others on the network and create huge delays. Do not try this code on your LAN
unless you know that the packets won't end up on some production network (this
sometimes happens by mistake). Talk to your system administrator first.
Usage
-----
To use this protocol you need to do the following:
1. Pick integer ids for the nodes in your cluster, e.g. 1,2,3.
2. Pick a port number to send multicast packets on.
3. Pick a multicast address.
4. If a node has more than one network adapter, pick the subnet you want to use.
5. Get an instance of a Connection object.
6. (Probably) create a Listener object.
7. Call open() on the connection.
Some time after open() returns, the method Listener.onConfiguration() is called.
This is to notify the application that a configuration (a ring, or group) has been created.
After this you can expect to receive messages through the onMessage() method.
See the API documentation and src/Example.java for an example.
Configuration parameters
------------------------
The class which implements evs4j.Connection is:
evs4j.impl.SRPConnection
The constructor has the signature:
SRPConnection(long, Processor, String)
The first parameter can be 0 when running a test, but in a real application it should
be the id of the last configuration used before the system was shut down. This id should be
forced to disk in Listener.onConfiguration() and read back from disk when creating
a new connection.
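One possible way to persist the configuration id is sketched below. The class ConfigurationIdStore and its file layout (a single 8-byte long) are hypothetical and not part of EVS4J; the sketch only uses standard java.nio calls. How the id is obtained inside Listener.onConfiguration() is not shown here, so check the API documentation for that.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper that stores the last configuration id; not EVS4J code. */
final class ConfigurationIdStore {
    private final Path file;

    ConfigurationIdStore(Path file) {
        this.file = file;
    }

    /** Writes the id and forces it to the storage device before returning. */
    void save(long id) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).putLong(id);
        buf.flip();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush file content and metadata
        }
    }

    /** Returns the stored id, or 0 if no complete id has been stored yet. */
    long load() throws IOException {
        if (!Files.exists(file)) {
            return 0L;
        }
        byte[] bytes = Files.readAllBytes(file);
        if (bytes.length < Long.BYTES) {
            return 0L;
        }
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

With a helper like this, the application would call save() from its onConfiguration() handler and pass the result of load() as the first constructor argument at startup. Truncating and rewriting in place is not atomic; writing to a temporary file and renaming it is a common refinement.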
The second parameter is a Processor object with an arbitrary integer id which must be
unique across the cluster. This id could be generated from the IP address of each node,
but the right scheme will depend on each cluster.
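As one illustration of deriving an id from an IP address, the sketch below packs the four bytes of an IPv4 address into an int. The class and method names are hypothetical, and whether the Processor class accepts negative ids is not stated here; addresses whose first octet is 128 or higher produce a negative int with this scheme.

```java
import java.net.InetAddress;

/** Hypothetical helper for deriving processor ids; not EVS4J code. */
final class ProcessorIds {

    /** Packs the four bytes of an IPv4 address into one int, most significant byte first. */
    static int fromIPv4(InetAddress addr) {
        byte[] b = addr.getAddress();
        if (b.length != 4) {
            throw new IllegalArgumentException("IPv4 address expected");
        }
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }
}
```

If all nodes share one subnet, using only the host part of the address (for example the last octet on a /24 network) gives small positive ids instead.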
The third parameter is a string containing name-value pairs, using '=' between name and
value and '&' between pairs, for example:
"port=9100&ip=224.0.0.1&nic=192.168.254.0/255.255.255.0"
This is the complete list of properties:
port                 The port number used on multicast packets. Required.
ip                   The _multicast_ IP address used, e.g. 224.0.0.1. Required.
nic                  The network interface to be used. This can be either of the form
                     ###.###.###.###/###.###.###.###, e.g. 192.168.254.0/255.255.255.0,
                     or the 'name' of the interface, e.g. eth0. This property is required
                     only when the machine has more than one network adapter installed.
                     Requiring it in that case is deliberate: the EVS4J benchmark creates
                     a multicast storm and we don't want to do that to the wrong network.
windowSize           The _maximum_ window size (number of messages) to be used for
                     flow control. The window size at any given moment varies between
                     0 and this value. The window size will decrease automatically
                     (following the Van Jacobson et al. protocol) if you start ftp
                     file transfers etc. on the same network. In a production
                     environment you might consider using a dedicated network.
                     You can use this parameter and the size of the ring to tune
                     latency and throughput.
                     Optional. The default is 30.
tokenDroppedTimeout  A timeout in milliseconds used to determine if the token
                     was dropped by the network or by the receiver's buffer and
                     needs to be re-sent. See the Totem article for details.
                     Optional. The default is 3.
tokenLossTimeout     A timeout in milliseconds used to determine if the token
                     was dropped because one of the processors _failed_. When
                     this timeout expires the remaining processors attempt to
                     form a new configuration. See the Totem article for details.
                     Optional. The default is 1000.
joinTimeout          Analogous to tokenDroppedTimeout but applies to the membership
                     protocol. See the Totem article for details.
                     Optional. The default is 3.
consensusTimeout     Analogous to tokenLossTimeout but applies to the membership
                     protocol. See the Totem article for details.
                     Optional. The default is 1000.
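The property string can be assembled and sanity-checked before it is passed to the constructor. The helper below is a hypothetical sketch, not part of EVS4J: it joins name-value pairs with '=' and '&', checks that an address is in the IPv4 multicast range (224.0.0.0 to 239.255.255.255), and tests whether an address belongs to a subnet written in the network/netmask form used by the nic property. It does no escaping of '=' or '&' inside values, since no escaping rule is described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for building and checking connection properties; not EVS4J code. */
final class ConnectionProps {

    /** Joins the entries as name=value pairs separated by '&', in map iteration order. */
    static String format(Map<String, String> props) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : props.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    /** Parses a dotted-quad IPv4 address into a 32-bit value. */
    static int parseIPv4(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** True if the address is in 224.0.0.0/4, the IPv4 multicast range. */
    static boolean isMulticast(String dotted) {
        return (parseIPv4(dotted) >>> 28) == 0xE;
    }

    /** True if the address lies in the subnet given as "network/netmask". */
    static boolean inSubnet(String nicSpec, String address) {
        String[] spec = nicSpec.split("/");
        if (spec.length != 2) {
            throw new IllegalArgumentException("expected network/netmask: " + nicSpec);
        }
        int mask = parseIPv4(spec[1]);
        return (parseIPv4(spec[0]) & mask) == (parseIPv4(address) & mask);
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("port", "9100");
        props.put("ip", "224.0.0.1");
        props.put("nic", "192.168.254.0/255.255.255.0");
        System.out.println(format(props));
    }
}
```

A LinkedHashMap keeps the properties in insertion order, which makes the resulting string predictable; whether EVS4J cares about the order of the pairs is not stated here.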
Known issues
------------
Under certain conditions on Linux the network card (or cards) have to be
configured specifically to support certain multicast addresses. If you are
not receiving any packets this may be the reason.
Support
-------
There is no commercial support for this code; however, you can try me at
akiva dot lichtner at gmail dot com if you like.
Benchmark
---------
The following command runs 2 processors both sending and receiving 1450-byte packets in one JVM:
java -classpath evs4j.jar \
-Xincgc -client \
evs4j.tool.benchmark.Main \
-props "port=9100&ip=224.0.0.1&nic=192.168.254.0/255.255.255.0" \
-procs 1 2
In a recent test on 4 dual-core desktop machines evs4j achieved 25,000 messages per
second (270 Mbps), with a token rotation time of 2ms. We had to change the maximum
message size in the code from 1450 to 1423, as it turns out that this fits exactly into a
UDP packet (thanks to Paul Kendall for discovering that the maximum message size was too large).
When the switch supports jumbo frames you can change the maximum packet size and the
maximum message size to 9000 and 8923, respectively (presently these have to be changed
in the code), and achieve about 6000 messages per second (400 Mbps).
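The throughput figures above follow from message rate times message size. The sketch below redoes that arithmetic for the two quoted configurations; the class name is hypothetical, and the calculation counts message payload only, ignoring protocol headers. It prints each result both in units of 10^6 bits and in units of 2^20 bits, since the text does not say which convention the quoted figures use.

```java
/** Hypothetical throughput calculator for the benchmark figures; not EVS4J code. */
final class Throughput {

    /** Payload bits per second for a given message rate and message size in bytes. */
    static double bitsPerSecond(long messagesPerSecond, int messageBytes) {
        return messagesPerSecond * (double) messageBytes * 8;
    }

    public static void main(String[] args) {
        double standard = bitsPerSecond(25_000, 1423);
        double jumbo = bitsPerSecond(6_000, 8923);
        System.out.printf("1423-byte messages: %.1f (10^6 bits/s), %.1f (2^20 bits/s)%n",
                standard / 1e6, standard / (1 << 20));
        System.out.printf("8923-byte messages: %.1f (10^6 bits/s), %.1f (2^20 bits/s)%n",
                jumbo / 1e6, jumbo / (1 << 20));
    }
}
```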
EVS4J 1.0b3
-----------
Incorrectly handled the case of 2 network adapters. Should have forced the user to
specify one, but didn't. Fixed.
Paul Kendall and Samant Maharaj sent in a patch for two bugs, one being a bug in
the code that collected free nodes in the message buffer and put them back in the
free list (a 6-line change) and the other a bug in updating the last safe message id.
Legal note: on 3/2/2006 Paul assigned the copyright to me (Guglielmo Lichtner) on
behalf of himself and Samant Maharaj.
EVS4J 1.0b2
-----------
On Linux 2.6.x with ipv6 support the nic-finding code threw an exception. Fixed.
On Windows XP when you bring an adapter down it seems to disappear entirely.
Added code to report this case more informatively.
In SRPRecovery.java there was a FIXME about delivering messages from the
transitional processors _only_. Fixed.
EVS4J 1.0b1
-----------
First release since around January 2004 (0.9.1).
Deprecated EVS4J project in SourceForge because it uses CVS which is
hard to administer. I am using my own Subversion repository now. I have
removed a lot of code and done some refactorings to simplify the system
as much as possible.
Changed the license. This code is now licensed under the Apache License,
version 2.0.