PagedQuery  
PagedQuery homepage
Updated Feb 18, 2011 by bendavie...@gmail.com

About

PagedQuery is a paging abstraction that can be applied to App Engine Query or GqlQuery objects.

PagedQuery consists of a single class. This class is a facade over a db.Query object that adds paging operations on query result sets. It uses the cursor functionality introduced in Google App Engine SDK 1.3.1 to provide a full paging abstraction. All Query and GqlQuery methods are supported, although calling a method that GqlQuery does not support will raise an error on a PagedQuery object instantiated with a GqlQuery object.

The cursor() and with_cursor() methods should rarely be needed, since most uses of cursors duplicate the functionality (and defeat the purpose) of this facade. They are provided for completeness.

Note: There is some indication that PagedQuery does not work with the SDK when using SQLite as a backend. This is related to open issues with that implementation regarding cursors.

Where

You can find the PagedQuery class in he3/db/tower/paging.py

Usage

Creating the PagedQuery

Instantiate a PagedQuery with an existing db.Query or db.GqlQuery and a page size:

myPagedQuery = PagedQuery(myEntity.all(), 10)

PagedQuery supports the filter and ordering methods of db.Query if you instantiate the object with a db.Query (not a db.GqlQuery). You can apply these methods before or after instantiating the PagedQuery. For example:

myQuery = myEntity.all().filter('myPropName >', my_prop_value)
myPagedQuery = PagedQuery(myQuery, 10)
myPagedQuery.order('-myPropName')

Here the filter is applied before the query is wrapped and the ordering is applied afterwards. Both are valid.

Fetching Pages

To fetch the first page of the results:

myResults = myPagedQuery.fetch_page()

To fetch any particular page, use a page number:

myResults = myPagedQuery.fetch_page(3)

On a subsequent request, recreate the same query and PagedQuery object, and request another page:

myResults = myPagedQuery.fetch_page(4)

Getting Other Information

To determine whether a particular page exists:

nextPageExists = myPagedQuery.has_page(5)

To get a count of the number of pages available with the dataset:

num_pages = myPagedQuery.page_count()
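
The relationship between page size, page numbers and the two methods above can be illustrated with a small list-backed stand-in. This is a sketch only: ListPager is a hypothetical class, not part of the library, and it assumes that page numbers start at 1, which the examples above suggest but do not state outright.

```python
class ListPager(object):
    """Hypothetical stand-in that pages over a plain list (no datastore)."""

    def __init__(self, items, page_size):
        if page_size < 1:
            raise ValueError("page_size must be at least 1")
        self.items = list(items)
        self.page_size = page_size

    def page_count(self):
        # Ceiling division: a partly filled last page still counts as a page.
        return (len(self.items) + self.page_size - 1) // self.page_size

    def has_page(self, page_number):
        return 1 <= page_number <= self.page_count()

    def fetch_page(self, page_number=1):
        # Assumption: pages are numbered from 1.
        if not self.has_page(page_number):
            return []
        start = (page_number - 1) * self.page_size
        return self.items[start:start + self.page_size]


pager = ListPager(range(25), 10)
print(pager.page_count())   # 3
print(pager.has_page(3))    # True
print(pager.fetch_page(3))  # [20, 21, 22, 23, 24]
```

With 25 items and a page size of 10, the third page exists but holds only the five remaining items. The real class answers the same questions through datastore cursors rather than list slicing.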

Some Necessary Implementation Details

Cursor Limits

This class works using the cursor features introduced in the Google App Engine SDK 1.3.1. All cursor restrictions apply. In particular, pages will not re-order if changes are made to the query results prior to the current page. Some query features (IN and != filters) will not work, and sorting on multi-value fields will be unreliable.

See the cursor documentation for more information.

Efficient Use

The most efficient way to use PagedQuery is to retrieve pages in sequence, one after another. Returning to any previously retrieved page is just as efficient. Avoid calling the page_count() method, and avoid requesting a page more than one beyond the highest page requested so far.
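
To see why sequential access is cheap and skipping ahead is not, consider the toy model below. It is not the library's implementation: ToyCursorPager, its integer "cursors" and the rows_scanned counter are all invented for illustration, and a plain list stands in for the datastore.

```python
class ToyCursorPager(object):
    """Toy model: remembers where each page starts, as a cursor would."""

    def __init__(self, rows, page_size):
        self.rows = list(rows)
        self.page_size = page_size
        self.cursors = {1: 0}    # page number -> start position ("cursor")
        self.rows_scanned = 0    # rough cost measure

    def fetch_page(self, page_number=1):
        # Start from the nearest known cursor at or below the target page.
        known = max(p for p in self.cursors if p <= page_number)
        position = self.cursors[known]
        # Walk over any pages in between, recording their cursors on the way.
        for page in range(known, page_number):
            next_position = min(position + self.page_size, len(self.rows))
            self.rows_scanned += next_position - position
            position = next_position
            self.cursors[page + 1] = position
        # Read the requested page and remember where the next one starts.
        results = self.rows[position:position + self.page_size]
        self.rows_scanned += len(results)
        self.cursors[page_number + 1] = position + len(results)
        return results


sequential = ToyCursorPager(range(100), 10)
for page in (1, 2, 3):
    sequential.fetch_page(page)
print(sequential.rows_scanned)  # 30

jumping = ToyCursorPager(range(100), 10)
jumping.fetch_page(8)
print(jumping.rows_scanned)     # 80
```

In this model each sequential page costs one page of reads, while jumping straight to page 8 costs eight pages of reads because the seven intervening pages must be walked first. Once a page has been visited, its cursor is known and returning to it is cheap again.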

Memcache

Internally, PagedQuery persists information to memcache. The cached information includes a query identifier and a mapping of pages to cursors. Because memcache entries can be evicted at any time, persistence cannot be guaranteed. PagedQuery handles memcache misses, but with reduced performance.
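
A rough sketch of this kind of cache handling follows. All names here are hypothetical, a plain dict stands in for memcache, and the key scheme (an MD5 digest of a textual query description) is an assumption rather than the library's actual format.

```python
import hashlib


def query_id(description):
    """Derive a stable cache key from a textual description of the query."""
    digest = hashlib.md5(description.encode("utf-8")).hexdigest()
    return "paged_query_" + digest


def load_cursors(cache, key):
    """Return the cached page -> cursor mapping, or an empty one on a miss."""
    cursors = cache.get(key)
    if cursors is None:
        # Cache miss: nothing is lost except speed, because the cursors
        # can be rebuilt by paging through the query again.
        return {}
    return cursors


cache = {}  # stand-in for memcache; real entries may be evicted at any time
key = query_id("SELECT * FROM myEntity ORDER BY myPropName DESC")

cursors = load_cursors(cache, key)  # miss: starts empty
cursors[2] = "cursor-for-page-2"    # placeholder cursor string
cache[key] = cursors

cache.clear()                       # simulate eviction
print(load_cursors(cache, key))     # {}
```

The point of the sketch is that a miss is treated as "no cursors known yet" rather than as an error, which is why an eviction costs performance but not correctness.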

Data Updates

Because the internal cursors are cached, results can become stale. If you need to ensure that the most up-to-date data is retrieved, clear all cached data:

myPagedQuery.clear()

myPagedQuery.fetch_page() (which returns the first page) also clears the cached data.

Mutating the query in any way (using filter(), order() or similar) also clears the cache.

Note that when retrieving a page for a second time, the internal cursors are checked for changes. If changes exist, the cursors corresponding to all subsequent pages are cleared from the cache.
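
The invalidation rule described above might look roughly like the following. This is a sketch under stated assumptions, not the library's code: the function names are hypothetical, cursors are modelled as plain strings in a dict keyed by page number, and a "change" is taken to mean that the cursor obtained on the second retrieval differs from the cached one.

```python
def drop_later_cursors(cursors, page_number):
    """Return a copy of cursors keeping only entries up to page_number."""
    return dict((page, cursor) for page, cursor in cursors.items()
                if page <= page_number)


def refresh_cursor(cursors, page_number, fresh_cursor):
    """Record fresh_cursor for page_number; on a change, forget later pages."""
    if cursors.get(page_number) == fresh_cursor:
        return cursors  # unchanged: every cached cursor is still trusted
    updated = drop_later_cursors(cursors, page_number)
    updated[page_number] = fresh_cursor
    return updated


cached = {1: "a", 2: "b", 3: "c", 4: "d"}
print(refresh_cursor(cached, 2, "b") == cached)        # True
print(refresh_cursor(cached, 2, "B") == {1: "a", 2: "B"})  # True
```

Cursors for earlier pages are kept because a change detected at one page says nothing about the pages before it.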

Unit Tests

PagedQuery has unit tests, located in the repository at trunk/src/test/test_paging.py.

Comment by redliner...@gmail.com, Aug 12, 2010

Will this work on really large data sets? For example thousands of results in possible data set? Is it possible to use it together with PageLinks if I do not know the total number of results exactly? For example 1000000 ? And I want to avoid calling the .count on the dataset because it could be very large.

Comment by project member bendavie...@gmail.com, Aug 14, 2010

Hi Redliner.cz. No, it is probably not something you would use for large datasets, unless you wanted to force the users to travel through the recordset sequentially.

In the worst case PageLinks performs like a simple query with an offset to find the correct page. In large datasets, this could be bad. In the best case (users only ever travel to sequential or previous pages) performance should be reasonable depending on your page size.

FYI PageLinks will always perform at least one .count() on your dataset. If a single count() must be avoided at all cost, do not use PageLinks. Assuming memcache is not flushed, however, all subsequent calls are avoided.

Hope this helps.

Comment by ramsys, Sep 16, 2010

Hi, I reported an issue some days ago: http://code.google.com/p/he3-appengine-lib/issues/detail?id=10

Do you think there's any chance to correct it shortly? Any advice about how to avoid it will be welcome too.

Thanks.

Comment by ramsys, Sep 23, 2010

Hi! I guess you're pretty busy, but... do you think it's possible for you to work on this issue? Otherwise I will modify the code by myself

Thanks.

Comment by project member bendavie...@gmail.com, Sep 23, 2010

Sorry for the delay in responding - please see my comment in the issue log - Thanks!

