
scouchdb has moved to GitHub: http://github.com/debasishg/scouchdb/tree/master

0.4.1 is the latest version hosted here. Version 0.5, which is hosted on GitHub, contains some incompatible changes.


scouchdb : Persisting Scala Objects in CouchDB

scouchdb offers a Scala interface to CouchDB. Scala offers objects and classes as the natural way to abstract entities, while CouchDB stores artifacts as JSON documents. scouchdb makes it easy to use Scala's object interface to persist and manage Scala objects as JSON documents.

Motivation

The primary motivation for making scouchdb is to offer a CouchDB driver that manipulates objects in a completely non-intrusive manner. The Scala objects are not CouchDB aware and remain free of any CouchDB dependency. Incorporating CouchDB specific attributes like id and rev takes away a lot of reusability goodness from domain objects and ties them to one specific platform.

Sample Session

Suppose I have a Scala class used to record item prices in various stores ..

case class ItemPrice(store: String, item: String, price: Number)

and I would like to store it in CouchDB through an API that converts it to JSON under the covers and issues a PUT/POST to the CouchDB server.

Here is a sample session that does this for a local CouchDB server running on localhost and port 5984 ..

// specification of the db server running
val couch = Couch("127.0.0.1")
val test = Db("test_db")

// create the database
couch(test create)

// create the Scala object
val s = ItemPrice("Best Buy", "mac book pro", 3000)

// create a document for the database with an id
val doc = Doc(test, "best_buy")

// add
couch(doc add s)

// query by id to get the id and revision of the document
val id_rev = couch(test by_id "best_buy")

// query by id to get back the object
// returns a tuple3 of (id, rev, object)
val sh = couch(test by_id("best_buy", classOf[ItemPrice]))

// got back the original object
sh._3.item should equal(s.item)
sh._3.price should equal(s.price)

Suppose the price of a mac book pro has changed in Best Buy and I get a new ItemPrice. I need to update the document that I have in CouchDB with the new ItemPrice object. For updates, I need to pass in the original revision that I would like to update ..

val new_itemPrice = //..
couch(doc update(new_itemPrice, sh._2))
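The reason the update needs the original revision is that CouchDB rejects a write whose revision does not match the one currently stored. Here is a minimal sketch of that check as a toy in-memory model. It is not scouchdb code: the names RevisionSketch, Store and Stored are made up for illustration, and revisions are plain integers here, whereas real CouchDB revisions are strings ..

```scala
// Toy in-memory model of a revision check; NOT part of scouchdb.
// Assumption: revisions are simplified to integers.
object RevisionSketch {
  final case class Stored(rev: Int, body: String)

  // Immutable store: document id -> current revision and body.
  final case class Store(docs: Map[String, Stored] = Map.empty) {
    // Returns the updated store and the new revision, or an error when
    // the caller's revision does not match the stored one.
    def put(id: String, body: String, rev: Option[Int]): Either[String, (Store, Int)] =
      (docs.get(id), rev) match {
        case (None, None) =>
          // new document: starts at revision 1
          Right((copy(docs = docs + (id -> Stored(1, body))), 1))
        case (Some(cur), Some(r)) if cur.rev == r =>
          // matching revision: accept and bump
          val next = r + 1
          Right((copy(docs = docs + (id -> Stored(next, body))), next))
        case _ =>
          // missing or stale revision
          Left("conflict")
      }
  }

  def main(args: Array[String]): Unit = {
    val (s1, rev1) = Store().put("best_buy", """{"price":3000}""", None).toOption.get
    val (s2, rev2) = s1.put("best_buy", """{"price":2800}""", Some(rev1)).toOption.get
    println(rev2)
    // writing again with the old revision is rejected
    println(s2.put("best_buy", """{"price":2500}""", Some(rev1)))
  }
}
```

This is why the sample above reads the document back first and passes sh._2 to update.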

The Scala client is at a very early stage. Everything shown above works now; a lot more has been planned and is on the roadmap. The main focus has been on keeping the framework non-intrusive, so that the Scala objects remain pure and can be used freely in other contexts of the application. The library builds on Nathan Hamblen's dispatch (http://databinder.net/dispatch) library, which provides elegant Scala wrappers over the Apache Commons HttpClient and a great JSON parser with a set of extractors.

Property filtering through Annotations

Very often we need to have different property names in the JSON document than what is present in the Scala object. Sometimes we may also want to filter out some properties while persisting in the data store. The framework uses annotations to achieve these functionalities (much like the ones used by jcouchdb (http://code.google.com/p/jcouchdb/), the Java client of CouchDB) ..

case class Trade(
  @JSONProperty("Reference No")
  val ref: String,

  @JSONProperty("Instrument"){val ignoreIfNull = true}
  val ins: Instrument,
  val amount: Number
)

When this class is serialized to JSON and stored in CouchDB, the properties are renamed as specified by the annotations. Selective filtering is also possible through additional annotation properties, such as ignoreIfNull shown above.

Handling aggregate data members for JSON serialization is tricky, since type erasure removes the information about the element types contained in the aggregates. For example ..

case class Person(
  lastName: String,
  firstName: String,

  @JSONTypeHint(classOf[Address])
  addresses: List[Address]
)

Using the annotation makes it possible to get the proper types during runtime and generate the proper serialization format.
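To see why the hint is needed, here is a small self-contained sketch of erasure on the JVM. It does not use scouchdb; ErasureSketch and its helper functions are made-up names, and Address is a stand-in class for illustration only ..

```scala
// Illustration of type erasure; NOT part of scouchdb.
object ErasureSketch {
  // stand-in domain class, assumed for this example
  final case class Address(street: String, city: String, zip: String)

  // Both lists are instances of the same runtime class: the element
  // type is not recorded on the list itself.
  def sameRuntimeClass(a: List[Any], b: List[Any]): Boolean =
    a.getClass == b.getClass

  // A class token, like the argument of @JSONTypeHint, survives at
  // runtime and can be used to check (or construct) the elements.
  def allInstancesOf(xs: List[Any], hint: Class[_]): Boolean =
    xs.forall(x => hint.isInstance(x))

  def main(args: Array[String]): Unit = {
    val addresses: List[Address] =
      List(Address("Monroe Street", "Denver, CO", "987651"))
    val names: List[String] = List("Monroe")

    println(sameRuntimeClass(addresses, names))
    println(allInstancesOf(addresses, classOf[Address]))
    println(allInstancesOf(names, classOf[Address]))
  }
}
```

A serializer that only sees a List at runtime is in the position of sameRuntimeClass; passing classOf[Address] explicitly gives it the information that erasure removed.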

CouchDB Views

One of the biggest draws of CouchDB is the view engine, which uses the power of MapReduce to fetch data for users. The current version of the framework does not offer much in terms of view creation apart from basic abstractions that allow plugging "map" and "reduce" functions written in JavaScript into the design document. There are some plans to make this more Scala-ish with little languages that will enable map and reduce function generation from Scala objects.

But what it offers today is a small DSL that enables building up view queries along with the sea of options that CouchDB server offers ..

// fetches records from the view named least_cost_lunch
couch(test view(Views.builder("lunch/least_cost_lunch").build))

// fetches records from the view named least_cost_lunch 
// using key and limit options
couch(test view(
  Views.builder("lunch/least_cost_lunch")
       .options(optionBuilder key(List("apple", 0.79)) limit(10) build)
       .build))

// fetches records from the view named least_cost_lunch 
// using specific keys and other options for deciding output filters
couch(test view(
  Views.builder("lunch/least_cost_lunch")
       .options(optionBuilder descending(true) limit(10) build)
       .keys(List(List("apple", 0.79), List("banana", 0.79)))
       .build))

// temporary views
val mf = 
  """function(doc) {
       var store, price;
       if (doc.item && doc.prices) {
         for (store in doc.prices) {
           price = doc.prices[store];
           emit(doc.item, price);
         }
       }
  }"""
      
val rf = 
  """function(key, values, rereduce) {
       return(sum(values))
  }"""
      
// with grouping
val aq = 
  Views.adhocBuilder(View(mf, rf))
       .options(optionBuilder group(true) build)
       .build
val s = couch(test adhocView(aq))
s.size should equal(3)
      
// without grouping
val aq_1 = 
  Views.adhocBuilder(View(mf, rf))
       .build
val s_1 = couch(test adhocView(aq_1))
s_1.size should equal(1)

Attachment Handling

val att = "The quick brown fox jumps over the lazy dog."
    
val s = Shop("Sears", "refrigerator", 12500)
val d = Doc(test, "sears")
var ir:(String, String) = null
var ii:(String, String) = null

// create a document from an object    
couch(d add s)
ir = couch(d ># %(Id._id, Id._rev))
ir._1 should equal("sears")

// query by id should fetch a row
ii = couch(test by_id ir._1)
ii._1 should equal("sears")

// sticking an attachment should be successful
couch(d attach("foo", "text/plain", att.getBytes, Some(ii._2)))

// the retrieved attachment should equal att
val air = couch(d ># %(Id._id, Id._rev))
air._1 should equal("sears")
couch(d.getAttachment("foo") as_str) should equal(att)

Attachments can also be created for non-existing documents. In that case, the document also gets created along with the attachment. Have a look at the test suite for the spec.

Handling Bulk Inserts / Updates / Deletes

Documents can be manipulated in bulk. Through a single POST, new documents can be added and existing ones updated or deleted at the same time. Here is a sample session ..

// should insert 2 new documents, update 1 existing document and delete 1 

// a scala object
val s = Shop("Shoppers Stop", "refrigerator", 12500)
val d = Doc(test, "ss")

// another scala object      
val t = Address("Monroe Street", "Denver, CO", "987651")
val ad = Doc(test, "add1")
      
var ir:(String, String) = null
var ir1:(String, String) = null

// create a new document    
couch(d add s)
ir = couch(d ># %(Id._id, Id._rev))
ir._1 should equal("ss")

// create another new document      
couch(ad add t)
ir1 = couch(ad ># %(Id._id, Id._rev))
ir1._1 should equal("add1")
 
// new scala objects     
val s1 = Shop("cc", "refrigerator", 12500)
val s2 = Shop("best buy", "macpro", 1500)
val a1 = Address("Survey Park", "Kolkata", "700075")

// a dsl that adds s1 and s2, updates s and deletes t      
val d1 = bulkBuilder(Some(s1)).id("a").build 
val d2 = bulkBuilder(Some(s2)).id("b").build
val d3 = bulkBuilder(Some(s)).id("ss").rev(ir._2).build
val d4 = bulkBuilder(None).id("add1").rev(ir1._2).deleted(true).build

couch(test bulkDocs(List(d1, d2, d3, d4), false)).size should equal(4)
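For orientation, CouchDB's bulk endpoint takes a JSON body of the form {"docs":[...]}, where an update carries _rev and a delete carries _deleted. The following sketch renders such a body by hand. It is not how scouchdb builds the request: BulkDocsSketch, BulkDoc, render and payload are hypothetical names, the revision string is a placeholder, and the quoting helper does no escaping ..

```scala
// Hand-rolled rendering of a bulk request body; NOT part of scouchdb.
object BulkDocsSketch {
  // One entry of a bulk request. `fields` holds already-encoded JSON
  // members of the document body, e.g. "\"item\":\"macpro\"".
  final case class BulkDoc(
    id: String,
    rev: Option[String] = None,
    deleted: Boolean = false,
    fields: List[String] = Nil
  )

  // Sketch only: assumes the string needs no JSON escaping.
  private def quote(s: String): String = "\"" + s + "\""

  private def member(name: String, jsonValue: String): String =
    quote(name) + ":" + jsonValue

  def render(doc: BulkDoc): String = {
    val members =
      List(member("_id", quote(doc.id))) ++
        doc.rev.toList.map(r => member("_rev", quote(r))) ++
        (if (doc.deleted) List(member("_deleted", "true")) else Nil) ++
        doc.fields
    members.mkString("{", ",", "}")
  }

  def payload(docs: List[BulkDoc]): String =
    docs.map(render).mkString("{\"docs\":[", ",", "]}")

  def main(args: Array[String]): Unit = {
    val docs = List(
      // insert: no revision
      BulkDoc("b", fields = List(member("item", quote("macpro")))),
      // delete: needs the current revision ("1-abc" is a placeholder)
      BulkDoc("add1", rev = Some("1-abc"), deleted = true)
    )
    println(payload(docs))
  }
}
```

The four builder calls in the session above correspond to two entries without a revision, one with a revision and a body, and one with a revision and the deleted flag.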

Scala View Server

The default implementation of the query server in CouchDB uses JavaScript running via Mozilla SpiderMonkey. However, language aficionados always find a way to push their own favorite into any accessible option. People have developed query servers for Ruby, PHP, Python and Common Lisp.

scouchdb gives one for Scala. You can now write map and reduce scripts for CouchDB views in Scala. Here is a typical session using ScalaTest ..

// create some records in the store
couch(test doc Js("""{"item":"banana","prices":{"Fresh Mart":1.99,"Price Max":0.79,"Banana Montana":4.22}}"""))
couch(test doc Js("""{"item":"apple","prices":{"Fresh Mart":1.59,"Price Max":5.99,"Apples Express":0.79}}"""))
couch(test doc Js("""{"item":"orange","prices":{"Fresh Mart":1.99,"Price Max":3.19,"Citrus Circus":1.09}}"""))

// create a design document
val d = DesignDocument("power", null, Map[String, View]())
d.language = "scala"

// a sample map function in Scala
val mapfn1 = 
  """(doc: dispatch.json.JsValue) => {
    val it = couch.json.JsBean.toBean(doc, classOf[couch.json.TestBeans.Item_1])._3; 
    for (st <- it.prices)
      yield(List(it.item, st._2))
  }"""
    
// another map function
val mapfn2 = """(doc: dispatch.json.JsValue) => {
    import dispatch.json.Js._; 
    val x = Symbol("item") ? dispatch.json.Js.str;
    val x(x_) = doc; 
    val i = Symbol("_id") ? dispatch.json.Js.str;
    val i(i_) = doc;
    List(List(i_, x_)) ;
  }"""

The protocol works as follows: once the view functions are registered with the view server, CouchDB sends it the documents one by one, and every function is invoked on every document. So once we create a design document and attach views with the above map functions, the view server starts processing the documents over the line based protocol it shares with the main server. If we then invoke the views using the scouchdb API as ..

couch(test view(
  Views builder("power/power_lunch") build))

and

couch(test view(
  Views builder("power/mega_lunch") build))

we get back the results based on the queries defined in the map functions. Have a look at the project home page for a complete description of the sample session that works with Scala view functions.
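The "every function on every document" part of the protocol can be sketched with a toy model. This is not the couch.db.VS implementation: the real protocol exchanges JSON lines (commands such as reset, add_fun and map_doc), while this sketch uses case classes instead, and all names in it (ViewServerSketch, Item, MapFn, step, run) are made up for illustration ..

```scala
// Toy model of the map side of a view server; NOT couch.db.VS.
object ViewServerSketch {
  // hypothetical stand-in for a JSON document
  final case class Item(item: String, prices: Map[String, BigDecimal])

  // a map function emits zero or more (key, value) pairs per document
  type MapFn = Item => List[(String, BigDecimal)]

  sealed trait Command
  case object Reset extends Command                   // forget all functions
  final case class AddFun(fn: MapFn) extends Command  // register a function
  final case class MapDoc(doc: Item) extends Command  // run all functions on doc

  sealed trait Reply
  case object Ok extends Reply
  // one list of emitted rows per registered function, in registration order
  final case class Rows(perFunction: List[List[(String, BigDecimal)]]) extends Reply

  // Process one command: returns the new function list and the reply.
  def step(fns: List[MapFn], cmd: Command): (List[MapFn], Reply) = cmd match {
    case Reset       => (Nil, Ok)
    case AddFun(fn)  => (fns :+ fn, Ok)
    case MapDoc(doc) => (fns, Rows(fns.map(fn => fn(doc))))
  }

  // Feed a sequence of commands through step and collect the replies.
  def run(cmds: List[Command]): List[Reply] = {
    val start = (List.empty[MapFn], List.empty[Reply])
    val result = cmds.foldLeft(start) { case ((fns, replies), cmd) =>
      val (next, reply) = step(fns, cmd)
      (next, replies :+ reply)
    }
    result._2
  }

  def main(args: Array[String]): Unit = {
    val byPrice: MapFn = d => d.prices.toList.map { case (_, p) => (d.item, p) }
    val doc = Item("banana", Map("Price Max" -> BigDecimal("0.79")))
    run(List(Reset, AddFun(byPrice), MapDoc(doc))).foreach(println)
  }
}
```

Each MapDoc reply holds one list of rows per registered function, which is why a design document with two views gets both of them updated from a single pass over the documents.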

Reduce functions work similarly in Scala. The current implementation does not support rereduce .. watch out for more updates .. Here is a sample view query using map and reduce functions in Scala ..

// create the design document
val d = DesignDocument("big", null, Map[String, View]())
d.language = "scala"

// map function in Scala
val mapfn1 = """(doc: dispatch.json.JsValue) => {
  val it = couch.json.JsBean.toBean(doc, classOf[couch.json.TestBeans.Item_1])._3; 
  for (st <- it.prices)
    yield(List(it.item, st._2))
}"""

// reduce function in Scala   
val redfn1 = """(key: List[(String, String)], values: List[dispatch.json.JsNumber], rereduce: Boolean) => {
  values.foldLeft(BigDecimal(0.00))((s, f) => s + (f match { case dispatch.json.JsNumber(n) => n }))
}"""

// create the view    
val vi_1 = new View(mapfn1, redfn1)

Create the document as shown earlier. Then use the view to do the query ..

val ls1 = 
  couch(test view(
    Views.builder("big/big_lunch")
         .build))
ls1.size should equal(1)

By default, reduce returns only one row, computed over the whole result set returned by map. The above query does not use grouping and so returns 1 row as the result. You can also group the view results and get back one row per key ..

val ls1 = 
  couch(test view(
    Views.builder("big/big_lunch")
         .options(optionBuilder group(true) build) // with grouping
         .build))
ls1.size should equal(3)

Have a look at the test suite ScalaViewServerSpec for more details.
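The difference between the two row counts can be reproduced with plain Scala collections. The sketch below does not talk to CouchDB; it only mimics a summing reduce over the (item, price) rows that the map function would emit for the three sample documents, and GroupingSketch, reduceAll and reduceGrouped are made-up names ..

```scala
// Mimics reduce with and without grouping; NOT part of scouchdb.
object GroupingSketch {
  // rows emitted by the map step: (item, price) for every store
  val emitted: List[(String, BigDecimal)] = List(
    "banana" -> BigDecimal("1.99"),
    "banana" -> BigDecimal("0.79"),
    "banana" -> BigDecimal("4.22"),
    "apple"  -> BigDecimal("1.59"),
    "apple"  -> BigDecimal("5.99"),
    "apple"  -> BigDecimal("0.79"),
    "orange" -> BigDecimal("1.99"),
    "orange" -> BigDecimal("3.19"),
    "orange" -> BigDecimal("1.09")
  )

  // without grouping: all values are folded into a single row
  def reduceAll(rows: List[(String, BigDecimal)]): List[BigDecimal] =
    List(rows.map(_._2).sum)

  // with grouping: one reduced row per distinct key
  def reduceGrouped(rows: List[(String, BigDecimal)]): Map[String, BigDecimal] =
    rows.groupBy(_._1).map { case (key, vs) => key -> vs.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    println(reduceAll(emitted).size)
    println(reduceGrouped(emitted).size)
  }
}
```

With three distinct items the grouped result has three rows and the ungrouped one has a single row, matching the sizes asserted in the queries above.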

Views and Scala Objects

scouchdb views are now seamlessly interoperable with Scala objects. The view API offers capabilities to construct Scala objects either as a post-process on the JSON results of the view or as a direct output from the view query itself.

Here is an example ..

The following map function returns the car make as the key and the car object as the value ..

// map function
val redCars =
  """(doc: dispatch.json.JsValue) => {
        val (id, rev, car) = couch.json.JsBean.toBean(doc, classOf[couch.db.CarSaleItem]);
        if (car.color.contains("Red")) List(List(car.make, car)) else Nil
  }"""

The following query returns JSON corresponding to the car objects being returned from the view ..

val ls1 = couch(carDb view(
  Views builder("car_views/red_cars") build))

On the client side, we can have a simple map function that converts the returned collection into a collection of the specific class objects .. Here we have a collection of CarSaleItem objects ..

val objs =
  ls1.map { car =>
    val x = Symbol("value") ? obj
    val x(x_) = car
    JsBean.toBean(x_, classOf[CarSaleItem])._3
  }

But it gets better than this .. we can also have direct Scala objects being fetched from the view query directly through scouchdb API ..

// ls1 is now a list of CarSaleItem objects
val ls1 = couch(carDb view(
  Views builder("car_views/red_cars") build, classOf[CarSaleItem]))

Note the class being passed as an additional parameter in the view API. Views that have reduce functions are supported in the same way. This makes scouchdb interoperate more seamlessly between the JSON storage layer and the object based application layer.

Have a look at the test cases for details.

Setting up the View Server

The view server is an external program that communicates with the CouchDB server. In order to set up our scouchdb query server, we need to have the appropriate configuration in the initialization files.

  • The usual place for custom CouchDB settings is local.ini, which can usually be found under the /usr/local/etc/couchdb folder. There have been some changes in the configuration files since CouchDB 0.9 - check out the wiki for them. On my system, I set the view server path as follows in local.ini ..

    [query_servers]
    scala=$SCALA_HOME/bin/scala -classpath <path to scouchdb jar> -DCDB_VIEW_CLASSPATH=<path to scouchdb jar> couch.db.VS "/tmp/view.txt"

    1. scala is the language of the query server that needs to be registered with CouchDB. Once you start Futon after registering scala as the language, you should be able to see "scala" registered as a view query language for writing map functions.
    2. The classpath points to the jar where you deploy scouchdb.
    3. couch.db.VS is the main program that interacts with the CouchDB server. Currently it takes one file name as its argument and logs to that file all statements that it exchanges with the CouchDB server. If the file name is not supplied, all interactions are routed to stderr.
    4. Another change that I needed to make was to the os_process_timeout value. The default is 5000 (5 seconds). I made the following change in local.ini ..

       [couchdb]
       os_process_timeout=20000

    5. The setting passed in for CDB_VIEW_CLASSPATH should point to the scouchdb jar as well as any additional jars that need to be passed to the Scala interpreter in order to execute the map/reduce functions.

Dependencies

As mentioned above, the framework uses the goodness of Nathan Hamblen's dispatch APIs for JSON manipulation and HTTP wrapping. It also uses the ideas of annotation based property filtering from jcouchdb. Additionally the project has the following dependencies of jars:

Changelog and Version Info

  • Initial version 0.1 (available as tags/release-0.1)
  • Version 0.2 (available as tags/release-0.2)
    1. added support for bulk document uploads
    2. added support for temporary views
    3. added support for attachments
    4. enhanced test cases
  • Version 0.3 (available as tags/release-0.3)
    1. added support for Scala View Servers (map only)
    2. enhanced test cases
  • Version 0.3.1 (available as tags/release-0.3.1)
    1. added support for reduce in Scala View Servers
    2. rereduce not yet implemented
    3. map in View Server optimized for performance
    4. enhanced test cases
  • Version 0.3.2 (available as tags/release-0.3.2)
    1. fixed bug in aggregation handling of reduce in View Server
    2. map functions can now output Scala objects
    3. added test case for getting map results as Scala objects
  • Version 0.4 (available as tags/release-0.4)
    1. Views are now seamlessly interoperable with Scala objects
    2. added view api for building Scala objects directly from couchdb views
  • Version 0.4.1 (available as current trunk)
    1. Fixed bug related to Boolean serialization