We talked a couple of days ago about writing a flat XML file connector for Jangle as the first demo to go out. I (townxelliot) think that an XML connector isn't the way to go. Here are the different options for types of connector we could go with, and my thoughts on them:
1. XML connector
Scenario:
We write a connector which accepts some generic XML; a P-dev can write an export script to write an XML file in the same format, push it to Jangle, and see their LMS data via REST services.
Imagine we do this for borrower info. first. We create an XML reader for that, which reads from the XML file and presents it via REST. Easy.
Pros:
- Simple to use, as you just put the files in the right place and away you go.
- Works well for read situations, as you can easily parse the XML document to pull out the element(s) you want.
- Works well if you have a single XML source you want to parse.
Cons:
- Beyond a single source file, things get tricky. How do you relate two XML documents together?
  - Using another document which imports them as namespaces? I'd argue P-devs don't care about XML Schema and don't want to learn it. I would say they are typically happier with SQL.
  - Have separate source XML documents (e.g. borrowers in one file, loans in another) and write some application code to relate them together. At this point, you are starting to write an ILS.
  - Use an XML database to manage the relationships (e.g. something like eXist). This makes managing relationships easier, and also provides for writes to the database. However, you lose the benefit of flat XML files, as the data exported from the ILS needs to go into the XML database.
- Flat file XML doesn't scale well. I imagine it's going to be fairly slow if you have a huge number of records in the file (unless you're using an XML database). The alternative (writing stream-based parsers) makes the application code more complicated; DOM is probably not practical for large files.
- XML makes writes harder (unless you have an XML-native database). We would need to manipulate the DOM, then write the whole document back to the file. This could cause IO problems if you're writing a lot, which again means it won't scale.
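To give a feel for the stream-based alternative mentioned in the cons above, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree.iterparse`. The export format (a `borrowers` root with `borrower` elements carrying an `id` attribute and a `name` child) and the function name `find_borrower` are assumptions made up for illustration; Jangle does not define them.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical export format: element and attribute names are illustrative only.
SAMPLE = """<borrowers>
  <borrower id="b1"><name>Ada</name></borrower>
  <borrower id="b2"><name>Grace</name></borrower>
</borrowers>"""


def find_borrower(source, borrower_id):
    """Stream through the XML and return the matching borrower's name, or None."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "borrower":
            if elem.get("id") == borrower_id:
                return elem.findtext("name")
            # Drop the contents of records we have finished with, so memory
            # use does not grow with every record read.
            elem.clear()
    return None


if __name__ == "__main__":
    stream = io.BytesIO(SAMPLE.encode("utf-8"))
    print(find_borrower(stream, "b2"))
```

Even for a single lookup this needs more bookkeeping than a DOM query would, and every lookup still reads the file from the top, which is the scaling concern raised above.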
2. Relational database connector
Scenario:
We write an adapter which works with a generic back-end database; a P-dev can write an export script to write SQL statements into that database, and see their LMS data via REST services.
Imagine we do this for borrower info. first. The P-dev just has to work out how to translate their ILS to our generic database schema.
Pros:
- Developing a connector requires SQL knowledge, rather than XML. For P-devs, I'd say this is more natural. Rather than export their ILS data to XML, they export it into an embedded Jangle database (e.g. Derby, SQLite, HSQLDB). Doing this is as easy as exporting to XML (I'd argue).
- Relationships are supported. We don't have to work out how to relate tables together, while we would have to work out how to relate XML files.
- We have excellent tools to abstract the database tables and relationships (e.g. Hibernate). If we wanted to relate XML files, we'd have to find an alternative or write our own.
- We can easily support write operations in the demonstrator, e.g. placing reservations.
- It scales well. The demonstrator could handle thousands of records without noticeable loss of speed.
Cons:
- If we go beyond read operations, we're writing an ILS again.
- We need to bundle the database with Jangle - but with embedded dbs I'd say this is not a problem, as there's no need to get a server running.
- Attractiveness to developers - if I'm a developer trying to use Jangle with my ILS, I might have a lot of work to do to get the data into the right format to go into this connector's database, unless the connector is really simple. (Though you have the same issue with XML.)
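As a rough illustration of the embedded-database approach, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The two-table schema (`borrowers`, `loans`), the sample rows, and the function names are hypothetical; they stand in for whatever generic schema the connector would actually define.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical borrowers/loans schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE borrowers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE loans (
            id          INTEGER PRIMARY KEY,
            borrower_id INTEGER NOT NULL REFERENCES borrowers(id),
            item_title  TEXT NOT NULL
        );
        -- Rows like these are what a P-dev's export script would write.
        INSERT INTO borrowers (id, name) VALUES (1, 'Ada');
        INSERT INTO borrowers (id, name) VALUES (2, 'Grace');
        INSERT INTO loans (borrower_id, item_title) VALUES (1, 'Book A');
        INSERT INTO loans (borrower_id, item_title) VALUES (1, 'Book B');
        INSERT INTO loans (borrower_id, item_title) VALUES (2, 'Book C');
        """
    )
    return conn


def loans_for(conn, borrower_id):
    """Return the titles on loan to one borrower, relating the tables with a join."""
    rows = conn.execute(
        "SELECT l.item_title "
        "FROM loans AS l JOIN borrowers AS b ON b.id = l.borrower_id "
        "WHERE b.id = ? ORDER BY l.id",
        (borrower_id,),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    print(loans_for(build_demo_db(), 1))
```

The relationship between borrowers and their loans is expressed once, in the schema and the join, and there is no server to install, which is the point made about embedded databases above.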
3. Existing ILS database connector
Scenario:
We write a connector which works with an existing open source ILS (e.g. Koha); a P-dev with that ILS can connect Jangle onto the front of it with a few configuration switches.
We could even write this in collaboration with LibLime from the word go.
Pros:
- We're working with a real ILS, rather than our platonic ideal ILS. This could tease out Talis-specific issues which have sneaked into Jangle.
- We can work up to a full set of features. If using our own generic database back-end, we would have to add a lot of logic to support this; if using an existing ILS, much of the logic is already there.
Cons:
- These ILSs need a lot of infrastructure around them. We'd have to distribute this with Jangle somehow to make it easy to run from a download. But something like [XAMPP] could make this possible.
- We would have to learn the ILS pretty thoroughly.
4. Open source Talis ILS
Scenario:
Talis ILS database is open sourced, and ported to something like MySQL (I think some of this work is already done).
(This one is pie in the sky, but I thought it was worth mentioning.)
Pros:
- We develop an open source connector which will directly feed into our commercial products.
- We're working with an ILS which has business logic already (e.g. in Alto). We're not re-inventing the wheel and coding another ILS database.
Cons:
- Getting data into the Talis database (so they can use Jangle) doesn't make sense for a developer with a different ILS. Or does it?
- Potential revenue stream loss - but Talis makes money from solutions, not software licences, anyway.
Making the choice
So, the choice of XML or database boils down to: what do we want the Jangle demonstrator to do?
- Read data from a single source (e.g. borrower info.). In this situation, XML is probably easiest. But that's the point: this task is so trivial that there's no value in showing Jangle doing it. I could write a Rails application (as could any P-dev) to do this in literally an hour.
- Read from multiple sources (e.g. borrower and their loans). Here, you run into the issue that relating two XML files together is harder than relating two SQL tables together. If we were using SQL, we could leverage existing tools we know.
- Read and write (e.g. place reservations). XML would be bad for this; for our generic embedded database, we'd have to add logic to support this and end up writing an ILS; for an existing ILS, we could leverage the workflows already in place.
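To make the "multiple sources" case concrete, here is a sketch of the application code needed to relate a borrowers file to a loans file by hand. The XML layout (a `borrower` attribute on each `loan` pointing at a borrower `id`) and the function name are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical export files: the structure is illustrative, not a Jangle format.
BORROWERS_XML = """<borrowers>
  <borrower id="b1"><name>Ada</name></borrower>
  <borrower id="b2"><name>Grace</name></borrower>
</borrowers>"""

LOANS_XML = """<loans>
  <loan borrower="b1"><title>Book A</title></loan>
  <loan borrower="b2"><title>Book C</title></loan>
  <loan borrower="b1"><title>Book B</title></loan>
</loans>"""


def loans_by_borrower_name(borrowers_xml, loans_xml):
    """Relate the two documents by hand: borrower name -> list of loan titles."""
    # Build an index of borrower id -> name from the first document.
    names = {
        b.get("id"): b.findtext("name")
        for b in ET.fromstring(borrowers_xml).iter("borrower")
    }
    result = {name: [] for name in names.values()}
    # Walk the second document and look each loan's borrower up in the index.
    for loan in ET.fromstring(loans_xml).iter("loan"):
        name = names.get(loan.get("borrower"))
        if name is not None:  # skip loans that point at an unknown borrower
            result[name].append(loan.findtext("title"))
    return result


if __name__ == "__main__":
    print(loans_by_borrower_name(BORROWERS_XML, LOANS_XML))
```

The index, the lookup, and the handling of dangling references are all things a relational database would provide through a foreign key and a join; here they have to be written and maintained for every pair of files.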
Developers who want to pump data into Jangle would be writing exports from their ILS data to either XML or SQL. It doesn't make much difference to them. But it could make a significant difference to us, and to the flexibility and performance of a Jangle demonstrator.
(townxelliot): We should start with option 2: write a connector for a generic ILS database back-end which P-devs export their data into. This would provide the capability to do more than just read from a single source (e.g. we could eventually do reservations and link tables together). It would scale. It would be easy to deploy and get up and running.
