
thrift-protobuf-compare - Benchmarking.wiki


We moved to a new home

The wiki moved to http://wiki.github.com/eishay/jvm-serializers/

For discussions please use http://groups.google.com/group/java-serialization-benchmarking

Intro

This project started with a few blog posts (http://www.eishay.com/search/label/protobuf) and, with the help of many contributors, now benchmarks much more than just protobuf and thrift. Thanks to all who looked at the code, contributed, made suggestions, and pointed out bugs. Three major contributions came from cowtowncoder (http://www.cowtowncoder.com/blog/blog.html), who fixed the stax (http://stax.codehaus.org/) code; Chris Pettitt (http://www.samsarin.com), who added the json (http://www.json.org/) code; and David Bernard (http://github.com/davidB), who added xstream and Java externalizable (http://java.sun.com/j2se/1.3/docs/api/java/io/Externalizable.html). The charts below display the latest results. Note that the charts are scaled to best fit the results, so they might be misleading in some cases. If you wish to see the numbers, scroll down to the Numbers section at the end of the page. Overall we have benchmarks for protobuf, thrift, java, scala, a few implementations of stax, binaryxml, json, xstream, javolution, hessian, avro, sbinary, JSON Marshaller, and Kryo.

Numbers are not everything

Benchmarks can be very misleading. Different hardware, use cases, and datasets will produce different results, and sometimes a marginal performance boost is eclipsed by other features such as forward and backward compatibility (http://www.eishay.com/2009/04/protocol-buffers-forward-backward.html), cross-language support, a simpler API, and more.

Charts

Setup

The following measurements were performed with revision r128 (https://code.google.com/p/thrift-protobuf-compare/source/detail?r=128) on Windows 7 64-bit using Sun's 32-bit JRE 1.6.0_15, with an Intel Core i7 920 CPU. Note that the tests were run with a JVM heap size of 16MB and the server HotSpot compiler.

Omitted from the first three charts: json/google-gson and scala. These serializers are so slow that they would break the scale of our charts. See the Numbers section below for the raw data.

Total Time

Including creating an object, serializing and deserializing:

http://chart.apis.google.com/chart?chtt=totalTime&chf=c||lg||0||FFFFFF||1||76A4FB||0|bg||s||EFEFEF&chs=689x430&chd=t:5917.121999999999,6911.219,7746.0375,10105.623,11598.1135,11702.6325,12798.176,13289.6525,13308.289,14413.6315,15197.342,15251.716,16006.7985,18152.156499999997,20703.491,21125.9925,29864.972999999998,34437.3555,42352.662,65733.528,97347.103&chds=0,107081.81330000001&chxt=y&chxl=0:|java|JsonMarshaller|xstream (stax with conv)|hessian|binaryxml/FI|json/jackson-databind|stax/woodstox|javolution xmlformat|protostuff-json|thrift|protostuff-numeric-json|stax/aalto|json (jackson)|sbinary|avro-generic|activemq protobuf|protobuf|avro-specific|kryo|kryo-optimized|java (externalizable)&chm=N *f*,000000,0,-1,10&lklk&chdlp=t&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666&cht=bhg&chbh=10&nonsense=aaa.png
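Judging by the figures in the Numbers section, the total time appears to be the Serialize figure (which already includes object creation) plus the deserialize-and-check-all figure. The following is a minimal sketch that checks this reading against three rows copied from that section. The class and method names are hypothetical and not part of the benchmark code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the benchmark code base.
class TotalTimeCheck {
    // Assumption: total = serialize (with object creation) + deserialize-and-check-all.
    static double total(double serializeNanos, double deserializeCheckAllNanos) {
        return serializeNanos + deserializeCheckAllNanos;
    }

    public static void main(String[] args) {
        // Each entry: {serialize, deserialize and check all, reported total},
        // copied from the Numbers section.
        Map<String, double[]> rows = new LinkedHashMap<>();
        rows.put("kryo", new double[] {3437.83750, 4308.20000, 7746.03750});
        rows.put("protobuf", new double[] {7226.50900, 4371.60450, 11598.11350});
        rows.put("thrift", new double[] {7302.97850, 7948.73750, 15251.71600});

        for (Map.Entry<String, double[]> e : rows.entrySet()) {
            double[] r = e.getValue();
            double computed = total(r[0], r[1]);
            System.out.printf("%-10s computed=%.4f reported=%.4f%n",
                    e.getKey(), computed, r[2]);
        }
    }
}
```

For the three rows above the computed and reported totals agree, which suggests the object creation column is not added a second time.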

Serialization Time

Serializing with a new object each time (object creation time included):

http://chart.apis.google.com/chart?chtt=timeSerializeDifferentObjects&chf=c||lg||0||FFFFFF||1||76A4FB||0|bg||s||EFEFEF&chs=689x430&chd=t:2874.1185,2912.7375,3437.8375,5749.9665,6330.785,6777.6865,7223.819,7226.509,7302.9785,7584.8375,8011.9495,8018.866,8542.4285,8639.84,8704.781,10550.4115,13354.7855,15336.6385,16116.891,24618.395,25773.3065&chds=0,28350.637150000002&chxt=y&chxl=0:|java|JsonMarshaller|xstream (stax with conv)|binaryxml/FI|hessian|json/jackson-databind|sbinary|protostuff-json|stax/woodstox|avro-generic|protostuff-numeric-json|javolution xmlformat|thrift|protobuf|json (jackson)|activemq protobuf|stax/aalto|avro-specific|kryo|kryo-optimized|java (externalizable)&chm=N *f*,000000,0,-1,10&lklk&chdlp=t&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666&cht=bhg&chbh=10&nonsense=aaa.png
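To illustrate what "a new object each time" means for the measurement, here is a minimal timing sketch using plain Java serialization. It is not the project's harness: the `Sample` class, the iteration counts, and the method names are assumptions made for illustration, and a real benchmark needs more careful warm-up and repetition.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the benchmark's test object.
class Sample implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final int width;

    Sample(String title, int width) {
        this.title = title;
        this.width = width;
    }
}

class SerializeTiming {
    // Serializes one object with java.io serialization and returns the bytes.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Average nanoseconds per iteration; a fresh object is created inside
    // the timed loop, so creation time is part of the result.
    static double averageNanos(int iterations) {
        long sink = 0; // consume the results so the work is not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Sample fresh = new Sample("title-" + i, i);
            sink += serialize(fresh).length;
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 0) {
            throw new IllegalStateException("nothing was serialized");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        averageNanos(20_000); // warm-up pass, result discarded
        System.out.printf("avg serialize (new object each time): %.1f ns%n",
                averageNanos(20_000));
    }
}
```

Moving the `new Sample(...)` call out of the loop would give the "same object" variant that the Numbers section lists separately.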

Deserialization Time

Often the most expensive operation. To make a fair comparison, all fields of the deserialized instances are accessed - this forces lazy deserializers to really do their work. The raw data below shows additional measurements for deserialization.

http://chart.apis.google.com/chart?chtt=timeDeserializeAndCheckAllFields&chf=c||lg||0||FFFFFF||1||76A4FB||0|bg||s||EFEFEF&chs=689x430&chd=t:3043.0035,3998.4815,4308.2,4355.6565,4371.6045,4584.8715,4779.31,4924.946,6084.47,7185.3925,7366.9585,7948.7375,8082.8465,10567.319,10575.581,12161.0625,14528.3345,21082.57,26235.771,41115.133,71573.7965&chds=0,78731.17615&chxt=y&chxl=0:|java|JsonMarshaller|xstream (stax with conv)|hessian|binaryxml/FI|stax/woodstox|json/jackson-databind|javolution xmlformat|stax/aalto|thrift|protostuff-json|protostuff-numeric-json|json (jackson)|activemq protobuf|avro-generic|sbinary|protobuf|avro-specific|kryo|kryo-optimized|java (externalizable)&chm=N *f*,000000,0,-1,10&lklk&chdlp=t&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666&cht=bhg&chbh=10&nonsense=aaa.png
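The effect of lazy deserialization shows up in the activemq protobuf row of the Numbers section: 14.6 ns for deserialization alone versus 4924.9 ns once all fields are checked. The sketch below illustrates the idea with a made-up lazy record that only decodes its bytes on first field access. `LazyRecord`, its comma-separated encoding, and `checkAll` are hypothetical and unrelated to any of the benchmarked libraries.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical lazy record: keeps the raw bytes and decodes on first access.
class LazyRecord {
    private final byte[] raw;
    private String[] fields; // stays null until a field is first read
    int decodeCount = 0;     // how many times the bytes were actually decoded

    LazyRecord(byte[] raw) {
        this.raw = raw;
    }

    private String[] fields() {
        if (fields == null) {
            fields = new String(raw, StandardCharsets.UTF_8).split(",", -1);
            decodeCount++;
        }
        return fields;
    }

    String title() { return fields()[0]; }
    int width()    { return Integer.parseInt(fields()[1]); }
    int height()   { return Integer.parseInt(fields()[2]); }
}

class CheckAllFields {
    // "Deserialization" that does almost no work: it only wraps the bytes.
    static LazyRecord deserialize(byte[] bytes) {
        return new LazyRecord(bytes);
    }

    // Touches every field, which forces the lazy record to decode itself.
    static int checkAll(LazyRecord r) {
        return r.title().length() + r.width() + r.height();
    }

    public static void main(String[] args) {
        byte[] bytes = "Keynote,640,480".getBytes(StandardCharsets.UTF_8);
        LazyRecord r = deserialize(bytes);
        System.out.println("decoded after deserialize: " + r.decodeCount);
        checkAll(r);
        System.out.println("decoded after checkAll:    " + r.decodeCount);
    }
}
```

Timing only `deserialize` would flatter such a library, which is why the chart uses the variant that checks all fields.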

Serialization Size

Size may vary a lot depending on the number of repetitions in lists, the use of number compacting in protobuf, strings vs. numerics, assumptions that can be made about the object graph, and more. An interesting point is that Scala and Java hold the names of the classes in the serialized form, i.e. longer class names = larger serialized form. In Scala it's worse, since the Scala compiler creates more implicit classes than Java.

http://chart.apis.google.com/chart?chtt=length&chf=c||lg||0||FFFFFF||1||76A4FB||0|bg||s||EFEFEF&chs=689x430&chd=t:207.0,211.0,211.0,226.0,231.0,231.0,264.0,264.0,300.0,353.0,359.0,370.0,378.0,399.0,419.0,448.0,465.0,470.0,475.0,475.0,526.0,919.0,2024.0&chds=0,2226.4&chxt=y&chxl=0:|scala|java|hessian|stax/woodstox|stax/aalto|json/google-gson|json/jackson-databind|protostuff-json|javolution xmlformat|xstream (stax with conv)|json (jackson)|JsonMarshaller|protostuff-numeric-json|thrift|binaryxml/FI|sbinary|java (externalizable)|protobuf|activemq protobuf|kryo|avro-specific|avro-generic|kryo-optimized&chm=N *f*,000000,0,-1,10&lklk&chdlp=t&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666&cht=bhg&chbh=10&nonsense=aaa.png
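The class-name effect for default Java serialization can be seen with a small experiment: two classes with identical fields, differing only in the length of their names. This is a sketch with made-up classes (`P` and `PayloadWithAMuchLongerClassName`), not the benchmark's data model.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Two hypothetical classes with the same single field; only the names differ.
class P implements Serializable {
    private static final long serialVersionUID = 1L;
    int value = 42;
}

class PayloadWithAMuchLongerClassName implements Serializable {
    private static final long serialVersionUID = 1L;
    int value = 42;
}

class ClassNameSize {
    // Number of bytes produced by default Java serialization for one object.
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int shortName = serializedSize(new P());
        int longName = serializedSize(new PayloadWithAMuchLongerClassName());
        System.out.println("short class name: " + shortName + " bytes");
        System.out.println("long class name:  " + longName + " bytes");
        System.out.println("difference:       " + (longName - shortName) + " bytes");
    }
}
```

Since the class name is written into the stream's class descriptor, the difference in size should track the difference in name length. Formats that identify fields by numeric tags or by a schema known to both sides avoid this overhead.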

Object Creation Time

Object creation is not so meaningful, since it takes on average about 100 nanoseconds to create an object. The surprise comes from protobuf (http://code.google.com/p/protobuf/), which takes a very long time to create an object. It's the only point in this set of benchmarks where it didn't perform as well as thrift (http://incubator.apache.org/thrift/). Scala (and, to a lesser extent, Java), on the other hand, is fast; it seems to be a good language for handling in-memory data structures, but when it comes to serialization you might want to check the alternatives.

http://chart.apis.google.com/chart?chtt=timeCreate&chf=c||lg||0||FFFFFF||1||76A4FB||0|bg||s||EFEFEF&chs=689x430&chd=t:125.51285,126.19877,168.21827,168.80043,169.048285,170.108855,170.740975,171.880325,173.08692,173.28482,173.569175,173.82471,174.35308,174.39919,175.258025,175.75949,227.9755,254.11328,470.51276,475.759915,479.06845,2643.18237,4024.12346&chds=0,4426.535806&chxt=y&chxl=0:|avro-generic|avro-specific|protostuff-numeric-json|protostuff-json|protobuf|activemq protobuf|thrift|json (jackson)|javolution xmlformat|binaryxml/FI|xstream (stax with conv)|stax/aalto|json/jackson-databind|json/google-gson|stax/woodstox|JsonMarshaller|java (externalizable)|kryo-optimized|kryo|java|hessian|sbinary|scala&chm=N *f*,000000,0,-1,10&lklk&chdlp=t&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666&cht=bhg&chbh=10&nonsense=aaa.png

Numbers

Times are in nanoseconds, sizes are in bytes.

Serializer, Object create, Serialize, Serialize w/ same object, Deserialize, Deserialize and check media, Deserialize and check all, Total time, Serialized size
avro-generic, 4024.12346, 8018.86600, 4030.37150, 4779.31000, 4779.31000, 4779.31000, 12798.17600, 211
avro-specific, 2643.18237, 5749.96650, 3204.01250, 4355.65650, 4355.65650, 4355.65650, 10105.62300, 211
activemq protobuf, 254.11328, 6777.68650, 71.48850, 14.60200, 2574.77900, 4924.94600, 11702.63250, 231
protobuf, 470.51276, 7226.50900, 3698.94400, 3478.95350, 3826.13550, 4371.60450, 11598.11350, 231
thrift, 227.97550, 7302.97850, 7165.79000, 7948.73750, 7948.73750, 7948.73750, 15251.71600, 353
hessian, 168.21827, 13354.78550, 12849.28050, 21082.57000, 21082.57000, 21082.57000, 34437.35550, 526
kryo, 169.04829, 3437.83750, 3297.00450, 4308.20000, 4308.20000, 4308.20000, 7746.03750, 226
kryo-optimized, 170.10886, 2912.73750, 2813.59000, 3998.48150, 3998.48150, 3998.48150, 6911.21900, 207
java, 168.80043, 25773.30650, 25136.38350, 71573.79650, 71573.79650, 71573.79650, 97347.10300, 919
java (externalizable), 170.74098, 2874.11850, 2674.87900, 3043.00350, 3043.00350, 3043.00350, 5917.12200, 264
scala, 125.51285, 62838.27450, 61814.38950, 194495.92550, 194495.92550, 194495.92550, 257334.20000, 2024
json (jackson), 175.75949, 7223.81900, 7113.92050, 6084.47000, 6084.47000, 6084.47000, 13308.28900, 378
json/jackson-databind, 173.56918, 10550.41150, 10443.20400, 10575.58100, 10575.58100, 10575.58100, 21125.99250, 465
JsonMarshaller, 171.88033, 24618.39500, 24488.50800, 41115.13300, 41115.13300, 41115.13300, 65733.52800, 370
protostuff-json, 475.75992, 8639.84000, 8069.39750, 7366.95850, 7366.95850, 7366.95850, 16006.79850, 448
protostuff-numeric-json, 479.06845, 8011.94950, 7405.96250, 7185.39250, 7185.39250, 7185.39250, 15197.34200, 359
json/google-gson, 173.28482, 449118.35900, 449995.44750, 491268.32050, 491268.32050, 491268.32050, 940386.67950, 470
stax/woodstox, 173.08692, 8542.42850, 8408.70150, 12161.06250, 12161.06250, 12161.06250, 20703.49100, 475
stax/aalto, 173.82471, 6330.78500, 6159.79150, 8082.84650, 8082.84650, 8082.84650, 14413.63150, 475
binaryxml/FI, 174.39919, 15336.63850, 15210.98450, 14528.33450, 14528.33450, 14528.33450, 29864.97300, 300
xstream (stax with conv), 174.35308, 16116.89100, 15564.12000, 26235.77100, 26235.77100, 26235.77100, 42352.66200, 399
javolution xmlformat, 175.25803, 7584.83750, 7446.11950, 10567.31900, 10567.31900, 10567.31900, 18152.15650, 419
sbinary, 126.19877, 8704.78100, 8743.20950, 4584.87150, 4584.87150, 4584.87150, 13289.65250, 264
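Since each result is a serializer name followed by eight comma-separated numbers, the raw data is easy to load for your own comparisons. Below is a minimal sketch that parses such rows and ranks them by total time. `ResultRow` and `RankByTotal` are hypothetical names, and the column positions (total time seventh, serialized size eighth) are an assumption based on the header order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for one result row; not part of the benchmark code.
class ResultRow {
    final String name;
    final double[] values; // the eight numeric columns, in header order

    ResultRow(String name, double[] values) {
        this.name = name;
        this.values = values;
    }

    double totalTimeNanos()   { return values[6]; }
    int serializedSizeBytes() { return (int) values[7]; }

    // Parses "name , v1, v2, ..., v8"; serializer names contain no commas.
    static ResultRow parse(String line) {
        String[] parts = line.split(",");
        if (parts.length != 9) {
            throw new IllegalArgumentException("expected 9 columns: " + line);
        }
        double[] values = new double[8];
        for (int i = 0; i < values.length; i++) {
            values[i] = Double.parseDouble(parts[i + 1].trim());
        }
        return new ResultRow(parts[0].trim(), values);
    }
}

class RankByTotal {
    // Returns the parsed rows ordered from fastest to slowest total time.
    static List<ResultRow> sortedByTotal(List<String> lines) {
        List<ResultRow> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(ResultRow.parse(line));
        }
        rows.sort(Comparator.comparingDouble(ResultRow::totalTimeNanos));
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("thrift , 227.97550, 7302.97850, 7165.79000, 7948.73750, 7948.73750, 7948.73750, 15251.71600, 353");
        lines.add("kryo , 169.04829, 3437.83750, 3297.00450, 4308.20000, 4308.20000, 4308.20000, 7746.03750, 226");
        lines.add("protobuf , 470.51276, 7226.50900, 3698.94400, 3478.95350, 3826.13550, 4371.60450, 11598.11350, 231");
        for (ResultRow row : sortedByTotal(lines)) {
            System.out.println(row.name + ": " + row.totalTimeNanos() + " ns, "
                    + row.serializedSizeBytes() + " bytes");
        }
    }
}
```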