BestPractices  
Patterns and best practices for using Objectify
Updated May 3, 2012 by lhori...@gmail.com

Registering Your Entities

The first question you will have is "when and how should I register my entity classes?" The obvious answer is to do it at application startup in a servlet context listener or an init servlet - wherever your application starts running. However, there is an easier way:

Use a DAO

By accessing Objectify through your own DAO class, you can register your entities in a static initializer and also add domain-specific helper methods. Create a DAO class extending DAOBase:

public class DAO extends DAOBase
{
    static {
        ObjectifyService.register(YourEntity.class);
        ObjectifyService.register(YourOtherEntity.class);
    }

    /** Your DAO can have your own useful methods */
    public MyThing getOrCreateMyThing(long id)
    {
        MyThing found = ofy().find(MyThing.class, id);
        if (found == null)
            return new MyThing(id);
        else
            return found;
    }
}

Now you can use your DAO and any higher-level, application-specific methods:

DAO dao = new DAO();

MyThing thing = dao.getOrCreateMyThing(123);
thing.incrementUseCount();

dao.ofy().put(thing);

Access the factory by calling dao.fact().

How NOT To Register Entities

You might think that you could register an entity as a static initializer for the entity class itself:

public class ThingA
{
    static { ObjectifyService.factory().register(ThingA.class); }
    // ... the rest of the entity definition
}

This is dangerous! Because Java loads (and initializes) classes on-demand, Objectify cannot guarantee that your class will be registered at the time that it is fetched from the database. For example, suppose you execute a query that might return several different kinds of entities:

Query<Object> lotsOfThingsQuery = ObjectifyService.begin().query();
lotsOfThingsQuery.ancestor(someParent);    // could find both ThingA and ThingB entities
lotsOfThingsQuery.get();    // throws IllegalStateException!

When Objectify tries to reconstitute an object of type ThingA, it won't be able to because the ThingA class will not yet have been loaded and the static initializer will not have been called. If your application actually does use a ThingA before this query is executed, it will work - and in fact, it may work 99.99% of the time. But do you really want to hunt down mysterious IllegalStateExceptions 0.01% of the time?
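The lazy-initialization behavior behind this problem can be reproduced with plain Java and no Objectify at all. The following is only a sketch: Registry is a hypothetical stand-in for ObjectifyService, and SelfRegisteringThing plays the role of ThingA. It records whether the class is registered before and after its first use.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for ObjectifyService: it only remembers registered kinds. */
final class Registry {
    static final Set<String> KINDS = new LinkedHashSet<>();

    static void register(Class<?> type) {
        KINDS.add(type.getSimpleName());
    }
}

/** Registers itself in its own static initializer, like ThingA above. */
class SelfRegisteringThing {
    static { Registry.register(SelfRegisteringThing.class); }
}

final class LazyInitDemo {
    // Both observations are captured once, when this demo class is initialized.
    static final boolean BEFORE_FIRST_USE;
    static final boolean AFTER_FIRST_USE;

    static {
        // Nothing has used SelfRegisteringThing yet, so its static block has not run.
        BEFORE_FIRST_USE = Registry.KINDS.contains("SelfRegisteringThing");
        // Creating an instance is the first active use; it triggers class initialization.
        new SelfRegisteringThing();
        AFTER_FIRST_USE = Registry.KINDS.contains("SelfRegisteringThing");
    }

    public static void main(String[] args) {
        System.out.println("registered before first use: " + BEFORE_FIRST_USE);
        System.out.println("registered after first use:  " + AFTER_FIRST_USE);
    }
}
```

Running LazyInitDemo reports false before the first use and true after it. A query that returns a ThingA before any of your code has touched the ThingA class is in the "before" situation.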

Automatic Scanning

Most J2EE-style frameworks, including Appengine's JDO/JPA system, do classpath scanning and can automatically register classes that have @Entity or other relevant annotations. This is convenient and could easily be added to Objectify without changing a single source file. There are, however, several reasons why this isn't part of the core:

  1. This feature requires either Scannotation or Reflections, bringing in 5-6 dependency jars. Objectify requires zero dependency jars, and we are loath to change that.
  2. You would need to add a startup hook to your web.xml (a ServletContextListener) in order to trigger this scanning. Objectify currently requires zero changes to web.xml.
  3. Classpath scanning is slow because it opens each .class and .jar file in your project and processes every single class file with a bytecode manipulator. For a moderately sized project this easily adds 3-5 seconds to your application initialization time. That's 3-5 additional seconds that real-world users must sit waiting while your application cold-starts.

Of these issues, the last is the most serious. If you think "My application gets a lot of traffic! I don't need to worry about cold starts!", you are overlooking the fact that App Engine starts and stops instances to meet demand all the time - at least one user somewhere is going to be affected on every spinup. Plus this happens every time you redeploy your application! There is no escaping cold-start time.

Furthermore, classpath scanning costs accumulate. If you use other tools that perform classpath scanning (Weld, Spring, JAX-RS, etc), they each will also spend 3-5s scanning your jars. It isn't hard to push your cold-start time into the tens of seconds.

That said, 3-5s might be reasonable for your specific project. It should be very easy to add your own ServletContextListener that calls Reflections and registers the @Entity classes. Spring and other framework users should examine the Extensions.

Use Batch Gets Instead of Queries

In SQL, all data lives in tables and is accessed through queries. It is best not to imagine the Appengine datastore this way - conceptually shift to thinking of the datastore as a key-value store that happens to also let you index and query some values.

This shift is important because your most effective tools when working with chunks of data are the batch get() and put(). A batch get() by key will quickly fetch thousands of entities in parallel; running thousands of queries would take a relative eternity. Asynchronous queries can only provide limited help because GAE limits you to 10 concurrent requests.

Furthermore, a batch get() can be efficiently cached. Use Objectify's @Cached annotation and your get() may never need a trip to the datastore.

Of course, batch gets and queries are not necessarily fungible operations - but when they are, use a batch get.
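To make the difference concrete, here is a toy model in plain Java. It is only an illustration: ToyStore is a hypothetical in-memory map that counts simulated round trips, not the datastore or the Objectify API, and it says nothing about real latencies. It shows only that a batch get by key is one request no matter how many keys it carries.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical key-value store that counts how many requests it receives. */
final class ToyStore {
    private final Map<Long, String> rows = new HashMap<>();
    int roundTrips = 0;

    void put(long id, String value) {
        rows.put(id, value);
    }

    /** One request per entity. */
    String get(long id) {
        roundTrips++;
        return rows.get(id);
    }

    /** One request for the whole batch; ids that are not found are simply omitted. */
    Map<Long, String> getAll(Collection<Long> ids) {
        roundTrips++;
        Map<Long, String> found = new LinkedHashMap<>();
        for (Long id : ids) {
            String value = rows.get(id);
            if (value != null)
                found.put(id, value);
        }
        return found;
    }
}

final class BatchGetDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 1000; id++) {
            store.put(id, "thing-" + id);
            ids.add(id);
        }

        for (Long id : ids)
            store.get(id);
        System.out.println("one at a time: " + store.roundTrips + " round trips");

        store.roundTrips = 0;
        Map<Long, String> all = store.getAll(ids);
        System.out.println("batch: " + store.roundTrips + " round trip for " + all.size() + " entities");
    }
}
```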

Use Indexes Sparingly

If you make no explicit decision, all fields of your objects will be indexed. This might be convenient if you're not sure what you will need to query on later, but comes at a high computational price. Every single indexed property requires a separate write into a BigTable tablet. Unindexed properties are almost "free".

Consider using @Unindexed on your entities and then only @Indexed the fields you specifically need to query.

Avoid @Parent

Note: This was written before the advent of the high-replication datastore. @Parent is important now to ensure strongly consistent queries, and so this section needs to be revised. The general advice is still good but there are lots of places you will now need @Parent. TBD.

New appengine developers tend to overuse the @Parent annotation feature because at first it seems conceptually similar to owned relationships in the world of JPA. However, they are quite different concepts.

In JPA, an "owned" entity relationship provides referential integrity checking and cascading deletes/saves. The appengine datastore does neither of these things! All entities are simply values in a key/value store. Relationships are simply fields of type Key.

@Parent defines something different - a relationship that is embedded within an entity's Key. This means that part of what identifies the entity is the parent lineage. Why would you do this?

  • Transactions only work on entity groups, and entity groups are defined by the root parent. Transactions can only be used across entities with a common root parent.

However, using @Parent has several undesirable effects:

  • Creating key objects is cumbersome: new Key<Comment>(Comment.class, 123) is a lot easier to type than new Key<Comment>(new Key<Blog>(Blog.class, 456), Comment.class, 123)
  • Generated ids are only unique within a parent. Your entities will have duplicate ids (but always unique keys!)
  • You cannot reparent an entity without deleting it and recreating it. If the entity you wish to reparent itself has children, all those must be deleted and recreated.
  • Using @Parent can diminish the performance of your application when there is contention.
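The points about keys and ids can be illustrated with a small model. This is a sketch only: ToyKey is a hypothetical class written for this example, not Objectify's Key. It captures the idea that the parent is part of a key's identity, so two comments can share an id and still have different keys, and that the entity group is determined by the topmost ancestor.

```java
import java.util.Objects;

/** Toy model of a datastore key: optional parent + kind + numeric id. */
final class ToyKey {
    final ToyKey parent;   // null for a root entity
    final String kind;
    final long id;

    ToyKey(ToyKey parent, String kind, long id) {
        this.parent = parent;
        this.kind = kind;
        this.id = id;
    }

    /** The root of this key's entity group: its topmost ancestor. */
    ToyKey root() {
        return parent == null ? this : parent.root();
    }

    /** Two keys are equal only if id, kind and the whole parent chain match. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ToyKey))
            return false;
        ToyKey other = (ToyKey) o;
        return id == other.id && kind.equals(other.kind) && Objects.equals(parent, other.parent);
    }

    @Override
    public int hashCode() {
        return Objects.hash(parent, kind, id);
    }
}

final class ToyKeyDemo {
    public static void main(String[] args) {
        ToyKey blogA = new ToyKey(null, "Blog", 456);
        ToyKey blogB = new ToyKey(null, "Blog", 789);
        ToyKey commentOnA = new ToyKey(blogA, "Comment", 1);
        ToyKey commentOnB = new ToyKey(blogB, "Comment", 1);

        System.out.println("same id:   " + (commentOnA.id == commentOnB.id));
        System.out.println("same key:  " + commentOnA.equals(commentOnB));
        System.out.println("same root: " + commentOnA.root().equals(commentOnB.root()));
    }
}
```

In this model the two comments have the same id but different keys and different roots. It also suggests why reparenting is expensive: changing the parent changes the key, so the entity has to be stored again under a new identity.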

This last point is worth explaining in detail. All datastore writes are transactional - even if you have not explicitly defined a transaction. When an entity is put(), a transaction timestamp journal at the root of an entity group is updated, and if a collision is detected, the put() is retried (without an explicit transaction) or rolled back (with an explicit transaction). This means that writes to all entities in an entity group contend for the same transaction journal, potentially causing numerous retries.

Unless you specifically require transactional behavior, you should probably avoid using @Parent even when entities have a conceptual parent/child relationship (eg Blog/Comment).

Use Pythonic Transactions

Transactions on Appengine are limited to a single entity group, but that doesn't make them useless. In fact, transactions are critical for updating entities when there is possible write contention.

Let's take the example of a counter. For simplicity's sake, we will ignore the fact that you need to shard counters for performance. The logic for a counter is basically "read value, add one, write value" - leaving a critical section in which two threads can read the same value, add one, and write the same value, missing out on a count.

The solution is to use transactions. When the transaction commits, Appengine will check the timestamp of the counter entity and throw a ConcurrentModificationException if the data was updated by a different thread. This is all fine and good and prevents data corruption, but it leaves our app dealing with pesky exceptions during periods of contention.

The solution is to retry those transactions. This is where it helps to build your transactions as small, self-contained units of business logic that a retry loop can execute again from the start. In Python you use a lambda function; in Java you use an inner class.

DAOT.repeatInTransaction(new Transactable() {
	@Override
	public void run(DAOT daot)
	{
		Counter count = daot.ofy().find(Counter.class, COUNTER_ID);
		count.increment();
		daot.ofy().put(count);
	}
});

Simple, clean, without endless lines of boilerplate dealing with transaction setup and retries.

All the boilerplate goes into the DAOT class:

package com.similarity.queen;

import java.util.ConcurrentModificationException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.googlecode.objectify.ObjectifyOpts;

/**
 * DAO that encapsulates a single transaction.  Create it and forget about it.
 * Also provides very convenient static methods for making GAE/Python-like transactions.
 * 
 * @author Jeff Schnitzer
 */
public class DAOT extends DAO  // DAO is your class derived from DAOBase as described above
{
	/** */
	private static final Logger log = LoggerFactory.getLogger(DAOT.class);
	
	/** Alternate interface to Runnable for executing transactions */
	public static interface Transactable
	{
		void run(DAOT daot);
	}
	
	/**
	 * Provides a place to put the result too.  Note that the result
	 * is only valid if the transaction completes successfully; otherwise
	 * it should be ignored because it is not necessarily valid.
	 */
	abstract public static class Transact<T> implements Transactable
	{
		protected T result;
		public T getResult() { return this.result; }
	}
	
	/** Create a default DAOT and run the transaction through it */
	public static void runInTransaction(Transactable t)
	{
		DAOT daot = new DAOT();
		daot.doTransaction(t);
	}
	
	/**
	 * Run this task through transactions until it succeeds without an optimistic
	 * concurrency failure.
	 */
	public static void repeatInTransaction(Transactable t)
	{
		while (true)
		{
			try
			{
				runInTransaction(t);
				break;
			}
			catch (ConcurrentModificationException ex)
			{
				if (log.isWarnEnabled())
					log.warn("Optimistic concurrency failure for " + t + ": " + ex);
			}
		}
	}
	
	/** Starts out with a transaction and session cache */
	public DAOT()
	{
		super(new ObjectifyOpts().setSessionCache(true).setBeginTransaction(true));
	}
	
	/** Adds transaction to whatever you pass in */
	public DAOT(ObjectifyOpts opts)
	{
		super(opts.setBeginTransaction(true));
	}
	
	/**
	 * Executes the task in the transactional context of this DAO/ofy.
	 */
	public void doTransaction(final Runnable task)
	{
		this.doTransaction(new Transactable() {
			@Override
			public void run(DAOT daot)
			{
				task.run();
			}
		});
	}

	/**
	 * Executes the task in the transactional context of this DAO/ofy.
	 */
	public void doTransaction(Transactable task)
	{
		try
		{
			task.run(this);
			ofy().getTxn().commit();
		}
		finally
		{
			if (ofy().getTxn().isActive())
				ofy().getTxn().rollback();
		}
 	}
}

This is the actual code used in Similarity to run all transactions.
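The while (true) loop in repeatInTransaction keeps retrying for as long as contention lasts. If you would rather give up after a fixed number of attempts, one possible variation is sketched below using only JDK types. BoundedRetry, its repeat method and the maxAttempts parameter are hypothetical names invented for this example; they are not part of DAOT or Objectify, and the Supplier stands in for "begin a fresh transaction, do the work, commit".

```java
import java.util.ConcurrentModificationException;
import java.util.function.Supplier;

final class BoundedRetry {
    /**
     * Runs the work until it completes without a ConcurrentModificationException,
     * giving up after maxAttempts tries. The attempt limit is an assumption of
     * this sketch. If every attempt fails, the last exception is rethrown.
     */
    static <T> T repeat(int maxAttempts, Supplier<T> work) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be at least 1");

        ConcurrentModificationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (ConcurrentModificationException ex) {
                last = ex;   // optimistic concurrency failure: try again
            }
        }
        throw last;
    }
}

final class BoundedRetryDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated contention: the first two attempts fail, the third succeeds.
        String result = BoundedRetry.repeat(5, () -> {
            if (++calls[0] < 3)
                throw new ConcurrentModificationException("simulated contention");
            return "committed";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

As with repeatInTransaction, the work must be safe to run more than once, and each attempt has to start from a fresh transaction and re-read its data rather than reuse values from a failed attempt.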

Interesting discussions related to Objectify

  • IBM developerWorks' Twitter Mining with Objectify-Appengine, part 1 and part 2
Comment by chemamol...@gmail.com, Jan 10, 2011

Clazz stands for the class name of the Class/Kind of the entity you are looking for. If you are looking for a Car instance identified by id, you would write:

Car found = ofy().find(Car.class, id);
The signature of the method is:
<T> T find(Class<? extends T> clazz, long id);

Comment by CharmsSt...@gmail.com, Jan 13, 2011

Hi,would be grateful if you could include some examples on handling multiple entity transactions.

Comment by project member lhori...@gmail.com, Jan 13, 2011

Look at the second example here:

http://code.google.com/p/objectify-appengine/wiki/IntroductionToObjectify#Transactions

Just substitute a second beginTransaction() for the begin() call and you have two separate transactions.

Comment by project member lhori...@gmail.com, Mar 30, 2011

This is not a good place for support requests - please post that question to the Google Group.

Comment by m...@tthew.org, Aug 22, 2011

Hmm, this article suggests "Avoid @Parent", but if you're using High Replication Datastore (which you will do by default) then non-ancestor queries may return stale results.

So instead of "Avoid @Parent", I'd say "Use @Parent if you want strongly consistent data".

What gives?

Comment by project member lhori...@gmail.com, Aug 23, 2011

You can get a strongly-consistent read by doing a get(). It's a bit of apples-and-oranges, but in most cases where you can design for one or the other, you'll be better off using get() and without using @Parent and ancestor queries.

Comment by Ib.Ros...@gmail.com, Aug 26, 2011

Ye I don't get the aversion to @Parent either. if we are not using parent, doesn't this mean that we are also not assigning our entities to entity groups. And doesn't this mean that we forgo the ability to do transactions?

I would have thought that transactions are pretty critical in all but the most trivial applications.

cheers

Comment by project member lhori...@gmail.com, Aug 27, 2011

Any single entity group can handle one write per second. If you can live with that kind of throughput, by all means go ahead and group your entities together.

Comment by theb...@emanueleziglioli.it, Nov 11, 2011

About 'Reflections', now they can scan classes offline and load that data at startup from a file: http://code.google.com/p/reflections/wiki/UseCases That's interesting because it should eliminate all the cold start impact due to classpath scanning, leaving only the problem of the dependency on additional jars.

Comment by Ib.Ros...@gmail.com, May 17, 2012

I had some major problems getting your transactions class above to work. i eventually created my own DAO implementation from the ground up. works now.

cheers

