BenchmarkingV2  

Updated Jan 25, 2011 by david.yu...@gmail.com

Benchmarks (2011-01-25)

WARNING: Benchmarks can be misleading.

  • These tests use a specific data value (DataStructuresV2). A different data value will yield different results.
  • The tools have different sets of features (BeyondNumbers). Some of these features make things safer or easier, but come with a performance cost.
  • Different hardware and software environments will yield different results.
  • We don't take memory usage into account.

In short, before you make a decision on which tool to use, make sure you try it out in an environment you care about. To start, download the benchmark code and run it on your hardware with data values you care about.

Setup

Hardware: Intel Core 2 Quad 2.66GHz

Software: Sun JRE 1.6.0_22 (64-bit server VM), Ubuntu 10.04

JVM options: -Xmx16m -server

Data value being tested: DataStructuresV2.

Version of the benchmarking code: Git tree

Methodology:

  • Before taking measurements, we warm things up by running the test several times.
  • For each test, measure the time taken to perform 2000 operations (serialization, deserialization, etc.), then divide the result by 2000 to get the time per operation.
  • Run each test 500 times and report the best result.
  • See the code (BenchmarkRunner.java) for more details.
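
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's BenchmarkRunner.java: the class name BestOfTimer, the Supplier-based workload, and the warm-up count are assumptions made for the example (the page only says "several times"), and the lambda syntax needs Java 8 or later, whereas the results on this page were taken on Java 6.

```java
import java.util.function.Supplier;

/** Illustrative timing loop; names and structure are hypothetical. */
final class BestOfTimer {
    // Writing results to a volatile field keeps the JIT from discarding the timed work.
    static volatile Object sink;

    /**
     * Warms up, then runs `trials` trials of `iterations` operations each and
     * returns the best (lowest) average time per operation in nanoseconds.
     */
    static double bestNanosPerOp(Supplier<?> op, int warmupRounds, int iterations, int trials) {
        // Warm-up: let the JIT compile the hot path before any measurement.
        for (int i = 0; i < warmupRounds; i++) {
            sink = op.get();
        }
        double best = Double.MAX_VALUE;
        for (int t = 0; t < trials; t++) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                sink = op.get();
            }
            long elapsed = System.nanoTime() - start;
            // Divide by the operation count to get a per-operation figure.
            best = Math.min(best, (double) elapsed / iterations);
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical workload standing in for "serialize one object".
        double ns = bestNanosPerOp(() -> "media-" + System.nanoTime(), 2000, 2000, 500);
        System.out.println("best ns/op: " + ns);
    }
}
```

Reporting the best of many trials, rather than the mean, filters out pauses caused by garbage collection or other processes, at the cost of showing each tool in its most favorable light.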

Tool Versions (lib/):

Charts

Total Time ("total")

Create an object, serialize it to a byte array, then deserialize it back to an object.

Serialization Time ("ser")

Create an object, serialize it to a byte array.

  • Java's built-in serializer faithfully represents arbitrary object graphs, which hurts performance. All the other serializers flatten the structure out to a tree.

Deserialization Time ("deser+deep")

Often the most expensive operation. To make a fair comparison, all fields of the deserialized instances are accessed - this forces lazy deserializers to really do their work. The raw data below shows additional measurements for deserialization.
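
To illustrate why field access matters, here is a minimal sketch of a lazily decoded field. It is a hypothetical class written for this page, not code from any of the tools measured; it only shows how "deserialization" can appear almost free until a field is actually read.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical lazily decoded string field. */
final class LazyField {
    private final byte[] encoded;   // raw bytes kept from the input buffer
    private String decoded;         // filled in on first access
    int decodeCount;                // counts how often real decoding happened

    LazyField(byte[] encoded) {
        // "Deserializing" only stores the bytes; no decoding work is done yet.
        this.encoded = encoded;
    }

    String get() {
        if (decoded == null) {
            // The real cost is paid here, on the first read of the field.
            decoded = new String(encoded, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return decoded;
    }

    public static void main(String[] args) {
        LazyField title = new LazyField("Javaone Keynote".getBytes(StandardCharsets.UTF_8));
        System.out.println("decodes before access: " + title.decodeCount);
        System.out.println("value: " + title.get());
        System.out.println("decodes after access: " + title.decodeCount);
    }
}
```

A benchmark that stopped at construction would time only the cheap constructor, which is why the "+shal" and "+deep" columns read the fields back.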

Serialized Size ("size")

The size of the serialized data. These numbers may vary depending on the exact data value being used.

  • Java's built-in serializer stores the full class name in serialized form. So you don't need to know ahead of time what kind of object you're reading in.
  • The 'scala' test, which uses Java's built-in serialization, yields a larger serialized representation because Scala usually generates more Java classes under the hood.
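
The effect of storing class names can be seen with a small round trip through Java's built-in serialization. This is a sketch under stated assumptions: the Image class below is a hypothetical stand-in, not one of the DataStructuresV2 types, and the helper names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BuiltInSizeDemo {
    /** Hypothetical value class used only for this example. */
    static final class Image implements Serializable {
        private static final long serialVersionUID = 1L;
        final String uri;
        final int width;
        final int height;

        Image(String uri, int width, int height) {
            this.uri = uri;
            this.width = width;
            this.height = height;
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            // No target class is passed in; the stream itself names the class.
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Image("http://example.com/a.jpg", 640, 480));
        // ISO-8859-1 maps every byte to one char, so the raw stream can be searched as text.
        String raw = new String(data, StandardCharsets.ISO_8859_1);
        System.out.println("size: " + data.length + " bytes");
        System.out.println("contains class name: " + raw.contains(Image.class.getName()));
        System.out.println("read back: " + deserialize(data).getClass().getName());
    }
}
```

Every distinct class in the object graph contributes its name and field descriptors to the stream once, so a graph built from many small classes pays this overhead many times.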

Serialization Compressed Size ("size+dfl")

The size of the serialized data compressed with Java's built-in implementation of DEFLATE (zlib).
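
A compressed size of this kind can be computed with java.util.zip.Deflater, as in the sketch below. The class and method names are hypothetical, and the default compression level with zlib wrapping is an assumption; the page does not state which settings the benchmark uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class DeflatedSize {
    /** Returns the number of bytes `data` occupies after zlib-wrapped DEFLATE compression. */
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(); // assumption: default level, zlib header and checksum
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[256];
        int total = 0;
        // Drain the compressor; only the byte count is needed, not the output itself.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end(); // release native resources
        return total;
    }

    public static void main(String[] args) {
        byte[] sample = "{\"title\":\"Javaone Keynote\",\"title2\":\"Javaone Keynote\"}"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("raw: " + sample.length + " bytes, deflated: " + deflatedSize(sample) + " bytes");
    }
}
```

Text-heavy formats such as JSON and XML repeat field names, so they tend to shrink more under compression than the compact binary formats do.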

Object Creation Time ("create")

Object creation time is not very meaningful on its own, since creating an object takes on average about 100 nanoseconds. However, the tools vary in how "fancy" their objects are: some create a plain Java class and let you access fields directly, some provide set/get methods, and others use the "builder" pattern.

  • Protobuf and Thrift use the "builder" pattern to create objects, which makes the operation more expensive.
  • Avro stores strings in UTF-8 form. The time taken to convert Java "String" values to and from UTF-8 is included under "create", "ser", "deser+shal", and "deser+deep", which isn't quite representative of real-world usage. Real code that uses Avro might be able to keep strings in UTF-8 form and avoid converting back and forth, in which case the "ser+same" and "deser" results might reflect Avro's performance more accurately.
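
The conversion in question is the ordinary String-to-UTF-8 round trip shown below. This is a plain-JDK sketch with hypothetical helper names, not Avro's API; it only illustrates the extra work that code holding java.lang.String values has to do on each side.

```java
import java.nio.charset.StandardCharsets;

final class Utf8RoundTrip {
    /** Encodes a Java String (UTF-16 internally) to UTF-8 bytes. */
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back to a Java String. */
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";
        byte[] encoded = toUtf8(original);          // allocation plus an encoding pass
        String decoded = fromUtf8(encoded);         // allocation plus a decoding pass
        System.out.println(encoded.length + " bytes, round trip ok: " + original.equals(decoded));
    }
}
```

Each direction allocates a new array or String and scans every character, which is the cost that an application keeping its strings in UTF-8 form would avoid.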

Numbers

Times are in nanoseconds, sizes are in bytes.

                                 create     ser   +same   deser   +shal   +deep   total   size  +dfl
java-built-in                       217   13413   12065   60842   61022   61848   75262    889   517
java-manual                         219    1907    1693    1298    1378    1489    3396    255   147
scala/java-built-in                 663   21049   18217   89160   90141   91449  112498   1312   700
scala/sbinary                       676    4128    3246    2871    3027    3354    7481    255   147
hessian                             219   11201   10327   11996   12144   12267   23468    501   313
kryo                                226    2545    2626    2483    2533    2628    5173    233   147
kryo-opt                            226    2185    2390    2138    2192    2287    4472    219   135
kryo-manual                         221    1777    1587    1838    1908    2044    3822    219   132
protobuf                            501    4161    2024    2181    2306    2479    6640    239   149
protobuf/activemq+alt               385    4289      10      17    1375    2745    7035    239   149
protostuff                          343    1444    1191    1975    2059    2163    3607    239   150
protostuff-manual                   223    1430    1208    1688    1905    2047    3477    239   150
protostuff-runtime                  222    1861    1658    2197    2405    2558    4419    241   151
protobuf/protostuff                 344    1512    1285    2054    2105    2202    3714    239   149
thrift                              416    4472    4080    2448    2509    2623    7095    349   197
thrift-compact                      417    4196    3676    2376    2457    2622    6817    240   148
avro                               1790    4497    2696    6018    6959    7772   12269    221   133
avro-generic                       2417    5070    2412    5518    6752    7936   13006    221   133
json/jackson-manual                 229    2718    2530    3978    4268    4624    7343    468   253
json/jackson-databind               221    3728    3470    5393    5514    5862    9590    503   270
json/protostuff-manual              222    3243    2970    3995    4236    4475    7718    449   233
json/protostuff-runtime             223    3815    3460    4905    4999    5356    9171    469   243
json/protobuf                       501   23598   22241  115652  116043  116419  140018    488   253
json/google-gson                    222   97269   96780   74066   74849   76006  173275    486   259
bson/jackson-manual                 223    9759    9516   12219   12449   12678   22436    495   278
bson/jackson-databind               229   11783   11416   14277   14506   14619   26403    519   293
bson/mongodb                        223    8923    8592   36600   35848   36001   44924    495   278
smile/jackson-manual                223    2418    2244    2679    2881    3016    5434    341   244
smile/jackson-databind              223    3574    3286    4283    4426    4565    8139    364   260
smile/protostuff-manual             230    2915    2748    2983    3271    3430    6345    325   233
smile/protostuff-runtime            224    3562    3288    3967    4336    4440    8003    339   240
xml/manual-woodstox                 223    7613    7256   11013   11156   11469   19082    653   304
xml/manual-aalto                    223    5004    4881    6512    6689    6888   11893    653   304
xml/manual-fastinfo                 229   14879   14558   15548   16242   16460   31339    377   284
xml/xstream+c                       230   15810   14567   46866   43983   47458   63268    487   244
xml/xstream+c-woodstox              229   13918   12543   26628   27342   27604   41522    525   273
xml/xstream+c-aalto                 230   11664   10202   19946   20570   20816   32481    525   273
xml/xstream+c-fastinfo              230   20343   18762   22019   22284   22607   42950    345   264
xml/jackson-databind/aalto          223    8175    7775   11367   11746   11863   20038    712   297
xml/javolution                      229   10445    9766   16627   16882   17195   27639    504   263

Columns:

  • create: create an object (using the classes specified by the serialization tool)
  • ser: create an object and serialize it
  • +same: serialize the same object (i.e. doesn't include creation time)
  • deser: deserialize an object
  • +shal: deserialize an object and access the top-level fields
  • +deep: deserialize an object and access all the fields
  • total: create + serialize + deserialize and access all fields
  • size: the size of the serialized data
  • +dfl: the size of the serialized data compressed with Java's built-in implementation of DEFLATE (zlib)
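
Because "ser" already includes object creation, these definitions imply that "total" should equal "ser" plus "+deep". The sketch below checks that against a few rows copied from the table; the class and method names are hypothetical, and a residual of 1 ns is consistent with rounding in the reported values.

```java
final class TotalColumnCheck {
    /** Difference between the reported total and ser + (+deep), in nanoseconds. */
    static long residual(long ser, long deserDeep, long total) {
        return total - (ser + deserDeep);
    }

    public static void main(String[] args) {
        // Rows copied from the table: {ser, +deep, total}.
        String[] names = {"java-built-in", "java-manual", "kryo", "protobuf"};
        long[][] rows = {
            {13413, 61848, 75262},
            {1907, 1489, 3396},
            {2545, 2628, 5173},
            {4161, 2479, 6640},
        };
        for (int i = 0; i < rows.length; i++) {
            long r = residual(rows[i][0], rows[i][1], rows[i][2]);
            System.out.println(names[i] + ": residual " + r + " ns");
        }
    }
}
```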

Comment by bryan.du...@gmail.com, Apr 22, 2010

Is there any way I could convince you to use the (soon to be released) Thrift 0.3 in this benchmark instead of 0.2? There are a lot of performance improvements in 0.3.

Additionally, you should probably use the TCompactProtocol instead of TBinaryProtocol if you're going to have a column for serialized size, since it's designed specifically to reduce serialized size.

Comment by project member kannan%c...@gtempaccount.com, Apr 22, 2010

I think our intention is to use the latest available version of every tool. Once Thrift 0.3 is released, file a bug or something to let us know we should update the results.

The benchmarking code now includes both TBinaryProtocol and TCompactProtocol ("thrift-compact"). I tried it out locally and, for the data value we use in the benchmark, TCompact seems to run just as fast as TBinary. Maybe you guys should change the sample code on the wiki to use TCompact?

While we're making requests (assuming you're involved with Thrift), could you fix the lib/java/build.xml to not include the source files in libthrift.jar? :-)

Comment by eg...@google.com, May 30, 2010

Do you think it would be hard to measure memory footprint as well as CPU time? I'm thinking of the static footprint of the library, the footprint of the (unserialized) objects, and the allocations thrown off during serialization/deserialization. (You already have the size of the serialized object, yay!)

In mobile scenarios this is often just as important (if not more so) as CPU consumption.

Comment by project member eis...@gmail.com, Jun 1, 2010

Please use the java-serialization-benchmarking google group: http://groups.google.com/group/java-serialization-benchmarking?pli=1

