READ-ONLY: This project has been archived. For more information see this post.

Graph Serializers

This project aims to provide fast Java serialization by analyzing classes and generating efficient serializers for them.

It uses techniques similar to Kryo's (http://code.google.com/p/kryo/) for number packing, but a different mechanism for code generation.

This library is intended for use as a serializer for (rich) domain models, although it can be extended into a full-fledged serialization solution.

Originally created to handle a deep graph of objects that caused Hibernate/Oracle to choke, it has consistently been used to deserialize => process => serialize ~1M objects while meeting an SLA of 20 seconds.

Simple Usage

In non-frozen mode the API usage is very simple, assuming your domain objects aren't too exotic:

Serializer s = GraphSerializerFactory.serializer(MyObject.class);

ByteBuffer dest = ByteBuffer.allocateDirect(1024*10);

s.write(dest);

Usage for persistence

For efficiency and space, the serializers usually do not write any object "headers" (class/field names). Instead they write packed positive integers to represent them; these integers can be auto-generated or pre-supplied (which makes the engine frozen).
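To illustrate what "packed positive integers" can look like, here is a minimal sketch of a common variable-length encoding (7 payload bits per byte, high bit as a continuation flag). The class and method names are hypothetical, and this page does not state the exact scheme the library uses, so treat it as an illustration of the idea rather than the library's wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the library.
final class PackedInts {

    /** Writes a non-negative int using 7 bits per byte; the high bit means "more bytes follow". */
    static void writePacked(ByteBuffer dest, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        while ((value & ~0x7F) != 0) {
            dest.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        dest.put((byte) value);
    }

    /** Reads an int written by writePacked. */
    static int readPacked(ByteBuffer src) {
        int result = 0;
        int shift = 0;
        while (true) {
            byte b = src.get();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        writePacked(buf, 5);   // fits in 1 byte
        writePacked(buf, 300); // needs 2 bytes
        int written = buf.position();
        buf.flip();
        // prints "3 bytes: 5, 300"
        System.out.println(written + " bytes: " + readPacked(buf) + ", " + readPacked(buf));
    }
}
```

Small ids, which are the common case for class and field identifiers, take a single byte instead of the four bytes of a fixed-width int.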

Non-frozen means we entrust the generation of the int <=> Class mappings to the engine. These mappings are transient and depend on the order in which objects are found in a graph, which makes them unreliable for persistence/networking.
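The ordering problem can be shown with a small sketch. It assumes ids are handed out in first-encounter order, which is how the paragraph above describes the non-frozen engine; the class below is hypothetical and does not use the library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the library's implementation.
final class EncounterOrderIds {

    /** Assigns the next free id to each class the first time it is seen. */
    static Map<Class<?>, Integer> assignIds(List<?> graph) {
        Map<Class<?>, Integer> ids = new LinkedHashMap<>();
        for (Object o : graph) {
            ids.putIfAbsent(o.getClass(), ids.size());
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Class<?>, Integer> first = assignIds(List.of("a", 1));
        Map<Class<?>, Integer> second = assignIds(List.of(1, "a"));
        // Same classes, different traversal order, different ids: prints "0 vs 1"
        System.out.println(first.get(String.class) + " vs " + second.get(String.class));
    }
}
```

A stream written with the first mapping cannot be read back with the second, which is why persisted or networked data needs pre-supplied mappings.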

These mappings can, however, be provided at bootstrapping, via the ServiceLoader API or by explicitly passing a list of ProvisionServices, which, among other things, contain the declaration:

Iterable<Map.Entry<Integer, Class<?>>> installableMappings();
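As a rough sketch, a method with that signature could be filled in as follows. The surrounding provisioning interface is not shown on this page, so the class below deliberately stands alone instead of implementing it; `DomainMappings`, `Customer` and `Order` are hypothetical names, and the ids are arbitrary.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical standalone class; it only mirrors the method signature quoted above.
final class DomainMappings {

    static final class Customer {}
    static final class Order {}

    /** Fixed id <=> class pairs; ids must stay stable once data has been persisted. */
    Iterable<Map.Entry<Integer, Class<?>>> installableMappings() {
        List<Map.Entry<Integer, Class<?>>> mappings = new ArrayList<>();
        mappings.add(new SimpleImmutableEntry<Integer, Class<?>>(1, Customer.class));
        mappings.add(new SimpleImmutableEntry<Integer, Class<?>>(2, Order.class));
        return mappings;
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, Class<?>> e : new DomainMappings().installableMappings()) {
            System.out.println(e.getKey() + " -> " + e.getValue().getSimpleName());
        }
    }
}
```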

If any mapping is detected at bootstrapping, the engine will be frozen. Once frozen, no mappings will be auto-generated, and if a class is found to be unmapped, one of the following happens:

# (Default behavior) It falls back to CompatibleSerializer, which has extra overhead because it writes fully qualified names instead of ids.

# An exception is thrown (API booted with System Property: c.n.m.s.g.c.SR.compatibleFallback=false). This setting should be used to enforce compatibility and performance.
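The two outcomes can be sketched with a tiny lookup class. Everything here is hypothetical (the class, the constructor flag and the string "headers" are stand-ins); it only models the decision described in the list, not the library's CompatibleSerializer or its actual exception type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the frozen-mode decision, not the library's code.
final class FrozenRegistry {

    private final Map<Class<?>, Integer> ids;
    private final boolean compatibleFallback;

    FrozenRegistry(Map<Class<?>, Integer> ids, boolean compatibleFallback) {
        this.ids = new HashMap<>(ids);
        this.compatibleFallback = compatibleFallback;
    }

    /** Describes what would identify the class in the stream. */
    String headerFor(Class<?> type) {
        Integer id = ids.get(type);
        if (id != null) {
            return "id:" + id;                 // mapped: compact id
        }
        if (compatibleFallback) {
            return "name:" + type.getName();   // unmapped: fully qualified name
        }
        throw new IllegalStateException("Unmapped class in frozen mode: " + type.getName());
    }

    public static void main(String[] args) {
        Map<Class<?>, Integer> ids = new HashMap<>();
        ids.put(String.class, 1);
        FrozenRegistry lenient = new FrozenRegistry(ids, true);
        System.out.println(lenient.headerFor(String.class));  // prints "id:1"
        System.out.println(lenient.headerFor(Integer.class)); // prints "name:java.lang.Integer"
    }
}
```

Failing fast makes a missing mapping visible at development time, whereas the fallback silently trades stream size for convenience.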

Overhead

For every class in your domain model, a corresponding serializer class is generated as a singleton. Collections and maps all share the same serializer, but an Instantiator is created for each of them in order to speed up the creation of array-based data structures or to properly create sorted data structures by passing a Comparator as a parameter.
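A minimal sketch of the instantiator idea follows. The library's real Instantiator type is not shown on this page, so the `CollectionInstantiator` interface and its `create` signature are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical shapes; the library's own Instantiator may look quite different.
final class Instantiators {

    interface CollectionInstantiator {
        Collection<Object> create(int expectedSize, Comparator<Object> comparator);
    }

    // Array-based structure: pre-size the backing array, ignore the comparator.
    static final CollectionInstantiator ARRAY_LIST = (size, cmp) -> new ArrayList<>(size);

    // Sorted structure: the comparator must be supplied at construction time.
    static final CollectionInstantiator TREE_SET = (size, cmp) -> new TreeSet<>(cmp);

    public static void main(String[] args) {
        Comparator<Object> byString = Comparator.comparing(Object::toString);
        Collection<Object> sorted = TREE_SET.create(0, byString);
        sorted.add("b");
        sorted.add("a");
        System.out.println(sorted); // prints "[a, b]"
    }
}
```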

There's also a ThreadLocal data structure used as a book-keeper of the objects being serialized/deserialized. It contains two Trove (http://trove4j.sourceforge.net/) maps, one used for serialization (objects to ids) and the other for deserialization (ids to objects), which are cleared at the end of the graph stream.
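The book-keeping can be sketched with plain JDK collections. This is an assumption-laden illustration: it swaps Trove's primitive maps for `IdentityHashMap` and `HashMap`, and `GraphContext` and its methods are hypothetical names rather than the library's types.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical per-thread book-keeper using JDK maps in place of Trove.
final class GraphContext {

    private static final ThreadLocal<GraphContext> CURRENT =
            ThreadLocal.withInitial(GraphContext::new);

    // Serialization side: identity-based, so equal but distinct objects get distinct ids.
    private final Map<Object, Integer> objectToId = new IdentityHashMap<>();
    // Deserialization side: resolves back-references by id.
    private final Map<Integer, Object> idToObject = new HashMap<>();

    static GraphContext current() {
        return CURRENT.get();
    }

    /** Returns the id of an already-seen object, or -1 after registering a new one. */
    int idOrRegister(Object o) {
        Integer id = objectToId.get(o);
        if (id != null) {
            return id;
        }
        objectToId.put(o, objectToId.size());
        return -1;
    }

    void registerRead(Object o) {
        idToObject.put(idToObject.size(), o);
    }

    Object byId(int id) {
        return idToObject.get(id);
    }

    /** Called at the end of a graph stream so no references leak between graphs. */
    void clear() {
        objectToId.clear();
        idToObject.clear();
    }

    public static void main(String[] args) {
        GraphContext ctx = GraphContext.current();
        Object shared = new Object();
        System.out.println(ctx.idOrRegister(shared)); // prints "-1" (first visit)
        System.out.println(ctx.idOrRegister(shared)); // prints "0" (back-reference)
        ctx.clear();
    }
}
```

A writer would emit the full object when it sees -1 and only the packed id otherwise, which is what allows shared and cyclic references to round-trip.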

One can customize the serialization to avoid this book-keeping for large collections, if they are known in advance to be non-polymorphic and to have no cyclic relationships (e.g. a Collection which references an element which in turn references the Collection itself). This keeps Trove's array-based hashtables reasonably small.
