DISCLAIMER: This site is NOT an official JISC site but rather the notebook-wiki for the jiscEXPO Programme Manager (David) to keep track of the various projects taking place within this programme. As this is a notebook it is only intended to make sense to its author and should be read with the proviso that it is in notes and therefore incomplete. Please contact d.flanders@jisc.ac.uk to get context for the various notes.

luceroproject  
establishing the procedures to expose linked educational and research data into university's practice & policies.
Progress-Archived, User-Teacher, User-Researcher, Theme-Arts, Theme-Humanities, Theme-Administration, User-Administrator, Theme-Library, User-Librarian, Interoperabilitiy, OpenTechnologies, ResourceDiscovery, ToolsandTechniques
Updated Oct 31, 2011 by luceropr...@gmail.com

This page was archived on 31st October 2011. This project is complete.

Project Overview

  • Full Name of Project: LUCERO – Linking University Content for Education and Research Online
    • Project Tag: luceroproject
  • Project Descriptions:
    • short: LUCERO will create the procedures to include the exposure of linked educational, research and administrative data in the university's practices.
    • long: Working with groups of learners, researchers and administrative practitioners based at the Open University, LUCERO will scope, prototype, pilot and evaluate reusable, cost-effective solutions relying on the 'linked data' principles and technologies for exposing and connecting educational and research content. Core sets of resources considered within LUCERO are institutional repositories of research and educational material, including collaborations with the Faculty of Arts to scope, pilot, prototype and evaluate specific content exposure and linked data requirements of researchers working within the Arts and Arts History domains, providing experience on the exposure and connection of research data outputs, and demonstrating their concrete benefits. On the basis of such concrete experience, the project will aim to document business process changes required to achieve successful integrated institutional approaches to exposing educational, research and central administrative content as linked data.
  • Project Outputs/Products/Deliverables (what thing are you producing?):
    1. A "Toolkit ABout Linked Open Institutional Data" (TABLOID). TABLOID is an evolving toolkit made of code, documentation and examples in various places, and trying to address the people with various roles involved in the deployment of linked data: from managers who want to quickly understand the benefits, to developers who are expected to work with it, develop applications and integrate it into their technical workflow.
    2. technical infrastructure via data.open.ac.uk to facilitate the creation, exposure and use of linked data in a HE/FE organisation
    3. experience reports and guidelines on the processes necessary to integrate linked data in the University’s practices and workflows
    4. demonstrators of the benefits of exposing educational and research data as linked data through applications improving access to such data in the domain of Arts.

Project Details

  • Name of Host Institution: The Open University
    • Department: Knowledge Media Institute, postcode = MK7 6AA
  • JISC Programme/Strand: INF11 / jiscEXPO
  • Length of Project: 12 Months
    • Project Start Date: 01 June 2010
    • Project End Date: 31 May 2011
  • Grant Awarded to Project: £100000.00

Project Team

Please see "about" page for project: http://lucero-project.info/lb/about/

  • Project Director: Mathieu d'Aquin / skype: dacmat / +44 (0)1908 655526
  • Project Manager: Owen Stephens / o.stephens@open.ac.uk / skype: owen.stephens / +44 (0)1908 658701
  • Developers: Fouad Zablith, Salman Elahi, Carlo Allocca
    • Marketing Manager: Stuart Brown s.brown@open.ac.uk
  • Account Manager: Jane Whild - KMi Administration Manager
  • Partners: Academics in the Faculty of Arts, The Open University / Library Specialist at the Open University's Library
    • Consultants: n/a
    • Users: School of Fine Arts, Library Repository
  • OTHER (please describe their function and role in the project)
    • Enrico Motta, Professor (Steering committee chair)
  • Users: At least 100 users (including staff, students, researchers, developers at the Open University)

Project Team Emails: Mathieu d'Aquin <m.daquin@open.ac.uk>, Owen Stephens <owen@ostephens.com>, Fouad Zablith <f.zablith@open.ac.uk>, Enrico Motta <e.motta@open.ac.uk>, Non Scantlebury - Contact in Library <n.l.scantlebury@open.ac.uk>, luceroproject.daquin@gmail.com

Project Documentation

Licensing

  • All code within the project (unless otherwise stated) is licensed as: LGPL v3 - EPL v1
  • All project documentation including presentations are licensed as: Creative Commons Attribution UK 2.0: England and Wales
  • All data produced by the project are licensed as: Creative Commons Attribution UK 2.0: England and Wales

Review of Final Product

http://lucero-project.info/lb/2011/07/final-product-post-tabloid/

  • Final Project Product: A Toolkit (aka Cookbook) for How Universities and Colleges can Manage, Develop and Support their Institutional Data as Linked Data.
  • People who can use the Final Product: IT Managers, Developers, and Data Managers at Universities and Colleges
  • Rating from Review Panel of Final Product: 3 of 5 Stars

Quotes from Review Panel on the Value of the Final Product

  • "This project was new to me and I was very pleased to see that it was both funded and expertly carried out. The final project post points to a number of rich and informative posts that contextualized the use of Linked Data within the HE sector, and offer some useful examples of the benefits of the LD approach. Having read the final project post, as well as the 'tabloid page' and the 'what is linked data?' page and a number of other posts, I am keen to start a serious discussion around the advocacy of Linked Data at our institution and feel like the toolkit provides good support for doing so. I was also pleased to see the spin-off 'Linked Universities Portal', which promises to be a useful resource (although may be deprecated in favour of offering such information on data.ac.uk?)"
  • "One aspect that might be improved upon is guidance on achieving overall institutional buy-in. The 'What is Linked Data?' page is a very good overview, but as the project makes clear, many of the problems that Linked Data presents relate to obtaining the data in the first place (i.e. they are people/organisational problems, not technical). It would be useful if a section of the website addressed this more clearly, perhaps offering an FAQ for Registrars, an FAQ for ICT Managers, an FAQ for Developers, an FAQ for IP Managers, etc. - much of this is addressed by the project website, but could be more usefully presented, because as it stands, the project provides a compelling case for going down the LD route, but could improve on offering support to enthusiasts who are hitting institutional brick walls."
  • "Tabloid proposes a useful framework (multi-tiered matrix) of components and activities aimed at providing managers, those who create/provision data, and technology types with a guide to the whys, whats / hows, plus potential benefits of applying linked data to their environs and needs. The web site provides a valuable mixture of models for analysis of workflows and data transformation coupled with access to experience gained over the last several months. Also included are examples of applications that help illustrate the creation, use, and benefits of various flavors of linked data."
  • "Clear product strategy and selective in terms of the target audience. Comprehensive coverage of introductory resources for developers and managers, although not for end-users."
  • "As with each Linked Data project, the most interesting and re-useable outcome of Lucero is the data itself. I can already see many interesting links with other datasets, even beyond the ones the team has already linked into."

How this Final Product Might be Further Developed

  • "As stated above, a more clearly defined set of FAQs aimed at managers, a set of key re-usable slides - summary info (extracted from the longer pages/posts) which clearly points to the value that Linked Data brings to the running of the institution. Did the OU create a formal business case in order to embark on LD? If so, this should contain much of the information that other institutions would benefit from. Please share it."
  • "The 'matrix graphic' raises great expectations. I wanted to click “here” (e.g., Vocabularies and Ontologies) and find information about schema in general as well as recommendations for vocabularies that would be useful in one or another setting in Open University environs. Understanding full well the magnitude of the effort required to populate the graphic doesn’t lessen the value of having documentation, reference implementations, and code examples available in the well ordered form hinted at in the Tabloid matrix. Too, one wonders if the OU crew has a sense of how much use is being made of their work at other academic institutions. Could, for example, work with a few well chosen colleagues increase the density of information and examples that populate the matrix while improving the useability of resources by rounding off any OU-only edges and corners?"
  • "Despite omitting end users as the target audience, the project developed demonstrators and tools that may entail new and enhance user experience. For example, the wayOU app attempted to source and connect use (e.g. "check-in") context and building data. These tools could be developed further so that they could be useful to the end users. The toolkit may also be useful as learning resources for students, researchers and courses related to linked data."
  • "Perhaps the toolkit as a whole could have been better broken into a few different products, to be fair. The reasons for collating it into one product is clear, but it does make for a slightly disparate experience."

Correspondence (below as comments)

Please see the comments section below for any and all correspondence by the Programme Manager with the Projects. Please also place any edits that need to be made to this page in the comments section, and the author will correct them.

  • Email, Phone calls, twitters, links sent, etc.

Comment by project member owen.patel, Aug 12, 2010

I (Owen Stephens) have started as Project Manager (taking over from Richard Nurse who was covering while I finished my previous project)

Comment by project member dff.j...@gmail.com, Sep 23, 2010

DFF requested SiteVisit

Comment by project member dff.j...@gmail.com, Oct 25, 2010

DFF did SiteVisit on 22 October 2010

Comment by project member dff.j...@gmail.com, Oct 31, 2010

Executive Summary

"Every time we talk with people who have data around the institution we find very exciting possibilities for what can be achieved..." - Mathieu d'Aquin (Principal Investigator of the luceroproject)

The above quote is the first thing I wrote down during my SiteVisit to the luceroproject at the Open University, and it turned out to be spot on as the entire site visit was exciting. I don't often call meetings 'exciting', but this project is exciting, and if you are working in an IT department within a University you had better read this report, as your department will soon need to be making these same decisions.

So what is the problem that the #luceroproject is going to solve? Almost too simply, it is making the central data of the institution (research, learning, teaching and administration) available for anyone to use. Of course, there is nothing new about this, as all the data is already published on the Web in hundreds of different sites and with thousands of different editors. The difference (thanks to the #luceroproject) is that you can now go to a central site where all the data has been collected, cleaned and listed so anyone can reuse the data right then and there (without spending days, if not weeks, searching for it and copying and pasting it into a spreadsheet in a comprehensible structure). This new centralised-data-watering-pump is the first of its kind to be launched in UK Universities (and should be celebrated accordingly!):

http://data.open.ac.uk <-- this one link makes my job all worth it - well done guys!

So yes this is a very pretty URL, but why would you want to reuse the data being made available at this central place? Well, here are just a couple of end user perspectives from around the OU that might be applicable to your institutions as well:

  • Pro-Vice Chancellor: "Can someone please give me a list of all the researchers we have working in the area of Green Science?"
  • Estates Department: "Can I get a list of all the equipment we have placed around the campus and in which rooms this equipment is located?"
  • Marketing Department: "We need to advertise a new course that is related to latest hot news topic, can you tell me all the previous lecturers, events and resources we have on this topic?"
  • Course pack publishing: "We need to print a new course pack with all the library resources we own that are related and available"
  • Teacher: "I'd like to know how many students I have taught over the last ten years and what their grades have been"
  • Learner: "I've got a visual impairment and I'd like to know all courses you offer that have course material that meets my accessibility needs?"
  • Librarian: "I'd like to list all the research that has been published over the past 4 years so I can submit it to the upcoming UK's Research Excellence Framework (REF)"
  • Researcher: "I've had dozens of smaller research council funded projects that do relate but I can't easily link all of the data together can someone please put these disparate pieces of knowledge together so I can reuse for future funded projects".

What I find amazing is that while Universities in this country have achieved a great deal in terms of putting their content publicly online, they are not able to answer any one of the above questions easily. And yet, these are essential (and often legally obligatory) questions!
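To make this concrete, here is a rough sketch of how the Pro-Vice Chancellor's question could be phrased once the data is exposed as linked data. Everything specific in it is an assumption for illustration only: the endpoint URL, the `ex:researchInterest` property and the topic URI are hypothetical and are not taken from data.open.ac.uk. The only parts that are standard are the FOAF namespace and the SPARQL Protocol convention of sending the query text in a `query` parameter. The sketch only builds the request URL; it does not contact any server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint, used only to show the shape of the request.
ENDPOINT = "http://data.example.ac.uk/sparql"


def researchers_by_topic_query(topic_uri):
    """Build a SPARQL query listing people linked to one research topic."""
    # Refuse characters that could break out of the <...> URI brackets.
    if any(ch in topic_uri for ch in '<>" '):
        raise ValueError("topic_uri contains characters not allowed in a URI")
    # ex:researchInterest is a made-up property for this sketch.
    return f"""PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://data.example.ac.uk/ontology/>
SELECT DISTINCT ?person ?name WHERE {{
  ?person a foaf:Person ;
          foaf:name ?name ;
          ex:researchInterest <{topic_uri}> .
}} ORDER BY ?name"""


def query_url(endpoint, sparql):
    """SPARQL Protocol: a query can be sent via GET in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": sparql})


sparql = researchers_by_topic_query("http://data.example.ac.uk/topic/green-science")
url = query_url(ENDPOINT, sparql)

# Decode the URL again to confirm the query text survives the round trip.
sent = parse_qs(urlparse(url).query)["query"][0]
print(url)
```

The point of the sketch is that the question becomes one short, repeatable query rather than a week of copying and pasting. The real vocabulary would of course depend on how an institution models its researchers and topics.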

How do I get a "central data watering pump" at my institution ?

So are you convinced yet? Ready to get your University to launch a centralised data.foo.ac.uk data watering pump yet? If so, here are a couple of arguments that you might want to present to your senior management:

Internal argument: We have all this data and we can't easily use it. If we want to launch a new website for a course or a research project, we always have to start from scratch and do all this copying and pasting before we have a site with enough content to retain someone's interest for any period of time once they have found us, let alone make ourselves discoverable by having enough content to appear high in the Google search results (more structured content equals better search results). If we had a central data watering hole we could just download all the people, events, courses, research and resources that the University has and then import that data into the website, thereby making it immediately usable as well as linked to all the other websites across our .ac.uk domain. Your SEO experts can tell you what this will achieve for your website.

The OU is taking this internal argument very seriously, especially since it relates to accessibility and their ability to fulfil their mission statement of "open to people, places, methods and ideas". What is your institution's "vision statement" and how can you tie #linkeddata to it? Or rather, more importantly, how can you tie #linkeddata to the core business argument of getting more students in seats?

External argument: If the BBC, Gov't and Universities like the OU and Oxford are starting to do this stuff so as to enable their own websites to be better linked, then perhaps we should consider competing for our users' interests as well, especially since linking our data in this way not only enables more efficient internal data sharing but also lets Universities import data into new sites more easily, thereby making them better linked with the wider web.

Again the OU is looking to incorporate this external argument into the workflow of what they can achieve as they currently use Drupal which via the http://semantic-drupal.com/ project is only a step away from being able to seamlessly move data in and out of websites so they can automatically aggregate the most valuable content to where it is needed. Again the OU is used to having to re-invent themselves to present new and interesting courses to their students. Being able to shift the data graph to fit the perspective of the end user is a very strong argument for making websites that are compelling enough to get users to buy a course from the OU.

The OU cited the recent efforts by the BBC Wildlife website and its ability to enable users to "follow their nose" as they move from page to page, with new links automatically making themselves available to suit the users' perspective. Keeping people on your .ac.uk website is a valuable argument that any senior manager will understand.

Yet the #luceroproject doesn't stop there, the next set of workpackages for the #luceroproject will be to incorporate the data that exists out on the wide Web to help enrich and keep users on their sites longer. By incorporating linkeddata from the BBC, Wikipedia (DBpedia) and data.gov.uk the OU will be enriching their data to an extent that no single university no matter how large could ever achieve.

This "One Web" view is a powerful one as more content will be integrated that will assure that the OU is taking advantage of the most popular websites across the Web.

The Lucero Project Part Deux

The #luceroproject has achieved a great deal already by winning over the hearts and minds of multiple central departments (library, marketing, estates, research, etc). However, having data available in easy-to-use lists is not enough for most; after all, spreadsheets can be passed around to each department to achieve the same kind of sharing. What spreadsheets cannot easily accomplish is enabling reusable "apps" that will actually improve the lives of end users.

Over the next six months the #luceroproject will be making several key decisions about which app they are going to produce and how that will convince senior managers that centralising and linking data is a good thing to invest in (especially when we need to do 'more with less'!).

The options discussed for various potential prototype apps (for specific end users) included:

  • An app for the Pro-Vice Chancellor that would list key figures and statistics as they are made live on data.open.ac.uk. One suggestion was that a permanently installed flatscreen could go into the PvC's office, listing the total numbers of staff and students, the total number of courses, the total number of podcasts and so forth. These are the kinds of numbers that would directly benefit the PvC in making decisions as well as giving speeches (to all those benefactors we are going to need in the future).
  • An app for teachers that helps produce courses: an app for pulling together the resources, past events, teachers, lecturers, podcasts, research and TV shows that relate to a certain topic, so that those trying to build up a new course can quickly create a website that demonstrates what would be possible given the current resources that the OU owns. This would also give an immediate vision to central departments like marketing and the library (let alone the PvC) of what the course might look like, whom they might sell the course to and for how much.
  • Perhaps the most significant opportunity is the OU's efforts around the "social learning platform" which would focus on the specific viewpoints and voices of teachers and learners. This social learning platform could be priceless alongside the OU course materials, resources and people as it would provide the third leg to the teaching and learning that the OU can provide online anytime-anywhere and as a participatory social experience.

I think the leading candidate for the final prototype demonstrator app of the project will work around the social learning efforts that the OU has begun to invest in. However, time will tell, as the #luceroproject team will need both to manage the additional exposing of data via data.open.ac.uk and to brainstorm how the data will make a real difference to the lives of end users.

Also of significance is the sustainability of data.open.ac.uk after the end of the project. Will the University see this site as a core tool across the institution and be willing to invest accordingly? Time will tell.

Technical

Perhaps the most enjoyable part of the day (IMHO) was discussing the technical architecture with Mathieu. Their use of the OWLIM triplestore by Ontotext was the first I've seen in a live University production environment. OWLIM comes in two flavours: swiftOWLIM and bigOWLIM (though if you are interested in how big bigOWLIM is, then do check out the OWLIM benchmark tests). The OU is using swiftOWLIM, which claims to be the fastest triplestore in the world. There is still further discussion to be had around the value of indexing the triplestore to achieve the same speed as an SQL database, but I'm hoping to pull Mathieu along with some other gurus down to JISC HQ for our informal test of triplestores including Talis' platform, Neo4J, 4Store, Jena, Mulgara, OWLIM and some others, as I feel speed is not the primary issue, but rather the breadth of "play" that each triplestore offers (e.g. features baked in, ease and variation of installation, community friendliness, reputation, etc). I've seen several architectures now that use triplestores as a low-level layer which is then indexed on a nightly basis to expose indices (Redis, key-value stores) that are as fast as, and often faster than, SQL databases.
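As a rough illustration of that last pattern (a triplestore as the low-level layer, with key-value indices rebuilt nightly for fast lookups), here is a minimal sketch. It is not the OU's architecture or any particular product's API: the triples, the `ex:` names and the plain Python dictionaries standing in for a key-value store are all made up for the example.

```python
from collections import defaultdict

# Hypothetical triples, standing in for a nightly dump of the triplestore.
TRIPLES = [
    ("ex:course/c1", "ex:title", "Introduction to Art History"),
    ("ex:course/c1", "ex:subject", "Arts"),
    ("ex:podcast/p1", "ex:title", "Reading a painting"),
    ("ex:podcast/p1", "ex:subject", "Arts"),
    ("ex:podcast/p2", "ex:subject", "Science"),
]


def build_indices(triples):
    """Build two key-value indices in a single pass over the triples."""
    by_subject = defaultdict(dict)  # resource -> {predicate: [values]}
    by_value = defaultdict(set)     # (predicate, value) -> {resources}
    for s, p, o in triples:
        by_subject[s].setdefault(p, []).append(o)
        by_value[(p, o)].add(s)
    return dict(by_subject), dict(by_value)


# In a real deployment this would run as the nightly job and the results
# would be written to the key-value store rather than kept in memory.
by_subject, by_value = build_indices(TRIPLES)

# Both lookups are now single dictionary reads, with no query planning.
arts_resources = sorted(by_value[("ex:subject", "Arts")])
podcast_titles = by_subject["ex:podcast/p1"]["ex:title"]
print(arts_resources, podcast_titles)
```

The trade-off is the usual one for precomputed indices: reads are cheap, but the indices are only as fresh as the last nightly build, and every new kind of question needs its own index.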

The above is the machine workflow for getting data into the system. Moving from left to right, the frontend of the system ingests data via RSS/XML that is exposed by systems such as the library's repository (ORO) and the iTunes podcast feed. They are also ingesting data from spreadsheets as it is provided by centralised departments. Once the data is "Collected", a series of data triage filters is run over it to clean it, assign URIs to the various vocabulary items and ensure the data matches the core data architecture within the triplestore. Once the data has been "Extracted" in this way it is ready for dispatch, where a nightly build process adds the new URI "Links" to the triplestore. From there the data is indexed and made available via the data.open.ac.uk website, including a SPARQL query endpoint (baked into OWLIM) and a general search engine (courtesy of Lucene).
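The three stages described above can be sketched in a few lines, purely to show the shape of the workflow. This is not the project's code: the base URI, the sample feed, the URI-minting rule and the choice of predicates are assumptions for illustration (the Dublin Core `title` and RDFS `seeAlso` URIs are real, but nothing here says the OU used them), and the output is plain N-Triples text rather than a load into OWLIM.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical base URI for minted resources.
BASE = "http://data.example.ac.uk/"
DC_TITLE = "http://purl.org/dc/terms/title"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Made-up feed: one usable item and one with an empty title.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>  Linked Data in Practice </title><link>http://example.org/podcast/1</link></item>
  <item><title></title><link>http://example.org/podcast/2</link></item>
</channel></rss>"""


def collect(feed_xml):
    """Collect: pull raw records out of an RSS feed."""
    root = ET.fromstring(feed_xml)
    return [
        {"title": item.findtext("title") or "", "link": item.findtext("link") or ""}
        for item in root.iter("item")
    ]


def extract(records, dataset):
    """Extract: clean each record, drop unusable ones, and mint a URI for it."""
    cleaned = []
    for record in records:
        title = " ".join(record["title"].split())  # normalise whitespace
        if not title or not record["link"]:
            continue  # triage filter: skip records missing required fields
        slug = quote(title.lower().replace(" ", "-"))
        cleaned.append(
            {"uri": f"{BASE}{dataset}/{slug}", "title": title, "link": record["link"]}
        )
    return cleaned


def link(cleaned):
    """Link: turn cleaned records into N-Triples lines ready for the nightly load."""
    lines = []
    for rec in cleaned:
        escaped = rec["title"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{rec["uri"]}> <{DC_TITLE}> "{escaped}" .')
        lines.append(f'<{rec["uri"]}> <{RDFS_SEEALSO}> <{rec["link"]}> .')
    return lines


triples = link(extract(collect(SAMPLE_FEED), "podcast"))
print("\n".join(triples))
```

Keeping the stages as separate functions mirrors the workflow: a new source (say, a spreadsheet from Estates) only needs its own `collect` step, while the cleaning and linking stages can be shared.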

Perhaps most significantly, the OU has invested in training their developers and other staff in how easy linkeddata can be. Talis has been paid to provide a couple of training days that have better enabled all to understand what can be achieved by using linkeddata. This is of significance as can be seen by the number of people involved in the exposing of local institutional data via data.open.ac.uk:

Summary

The achievement of data.open.ac.uk is a mixture of both technical and process change, in that a new technology has not only been adopted, but been widely adopted by multiple central departments who see the potential value in exposing their data to everyone (both internally and externally).

Going forward, it will be interesting to see if the business process of linkeddata is "baked in" enough for the OU to continue to support the central aggregation of data for reuse. I think this will primarily depend on the app that is created on top of the exposed data. If this app is able to demonstrate even a tenth of the potential in structuring data this way for reuse, then I have confidence that data.open.ac.uk will survive (and that hopefully this is the first of many data.foo.ac.uk sites to come!).

Comment by project member dff.j...@gmail.com, Nov 4, 2010
Comment by project member dff.j...@gmail.com, Jan 14, 2011

Presentation by Carlo???

  • podcast browser across faceted terms (easy to produce whereas taken much longer before), facebook course profiles app (displaying and linking students with the courses they are taking and the related courses that other students are on...
  • suggest a friend aka suggest another student taking a similar course to you)... showing the local groupings and informal research groups that form across the university (able to show tag cloud to PvC of whom is working with whom - that are not defined via formal departments structures: informal helps define formal departments (a notorious problem for PvCs? trying to decide how to organise their organisation from the top down)...
  • wayOU: an iPhone app to track what you do during your life of being an OU student... going to class today -> learned this from the class -> a kind of learning diary (portfolio)...
  • integrating course RDFa data (google rich snippets and products - good relations) - the course splash page is drupal where additional data about the course is provided from across the institution...

Comment by project member dff.j...@gmail.com, Jan 14, 2011

More money for teaching and research... sure (in the medium term) in the short term...reducing the cost of development... running costs... such as the above.

Comment by project member dff.j...@gmail.com, Jan 18, 2011

not Carlo above, was Fouad Zablith

