Updated Jun 09, 2008 by ewlarson
CitationImportProcess  
Describes the Citation Import Process for BibApp 1.0 (Not fully implemented)

Introduction

Citation management starts with the import of new citations. This page details how new citations are parsed, processed and loaded into BibApp.

Import Formats

BibApp currently accepts the following citation import formats:

BibApp also supports manual entry of citations through a Citation web form. There are already plans to support further formats, so feel free to make suggestions!

The citation parsers and importers for the above formats were developed specifically for BibApp. They are available in the /vendor/plugins/ directory when you download BibApp. Once they are more stable, they will likely also be available as Rails Plugins.

Citation Status

The status of a citation determines where it is in its lifecycle and whether it is visible in BibApp. A citation currently has one of the following status values:

Status | Description | Visibility
processing | Citation has been successfully parsed, but is waiting on further processing. | Citation is not visible.
duplicate | The system has reason to suspect this citation is a duplicate. It will be removed from the system during the next cleanup. | Citation is not visible.
accepted | Citation is complete and has been accepted into BibApp. | Citation is visible as normal.
incomplete | Citation is missing information and requires review. | Citation is visible, but marked in some way as incomplete.
deleted | Citation has been marked for deletion, and will be removed during the next cleanup. | Citation is not visible.
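
As a rough illustration, the table above could be captured in a small Ruby lookup. This is only a sketch: the module name CitationStatus, its constants, and its methods are hypothetical and are not taken from the BibApp source.

```ruby
# Hypothetical sketch: names here are illustrative, not BibApp's actual code.
module CitationStatus
  # Maps each status value from the table to whether the citation is visible.
  VISIBILITY = {
    "processing" => false,
    "duplicate"  => false,
    "accepted"   => true,
    "incomplete" => true,  # visible, but flagged as incomplete
    "deleted"    => false
  }.freeze

  # Statuses that the table says are removed during the next cleanup.
  REMOVED_ON_CLEANUP = %w[duplicate deleted].freeze

  # Returns true or false; raises ArgumentError for an unknown status.
  def self.visible?(status)
    VISIBILITY.fetch(status) { raise ArgumentError, "unknown status: #{status}" }
  end

  def self.removed_on_cleanup?(status)
    REMOVED_ON_CLEANUP.include?(status)
  end
end
```

For example, CitationStatus.visible?("incomplete") returns true, while CitationStatus.removed_on_cleanup?("duplicate") returns true because duplicates are dropped at the next cleanup.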

Import Process

The citation import process is shown graphically in the Citation Import Flow Chart below. This section describes each step of the process in more detail.

  1. Citation Parsing - Based on the import format (see above), the citation is parsed into its various components. The parsed citation is only available in system memory.
  2. Completion Check - The parsed citation is checked for completeness. For BibApp, "completeness" is defined as having all the necessary fields which BibApp uses to check for duplicates.
  3. Determine Contributorships - As one of the more complex processes within BibApp, the decision process of how we determine or "guess" contributorships is detailed on the AuthorAuthorities page.
  4. Extract External Identifiers - Citations which are pulled from external systems often come with unique system identifiers (e.g. a PubMed ID, OCLC #, LSSN, etc.). Rather than discard these external system identifiers, we save them for later processing. The hope is that "incomplete" (or potentially incorrect) citations can be filled out in more detail by querying these external systems using the unique identifier(s).
    • For example, if we have an "incomplete" citation but we were given its PubMed ID, we can potentially kick off a behind-the-scenes process to pull down any missing citation information from PubMed using that unique PubMed ID.
  5. Extract Publication Information - Handling Publication (e.g. Journal Title) and Publisher information is also a complex problem in BibApp. As anyone who has seen many citations can attest, publication and publisher names never seem to appear the same way twice! Dealing with "Authorities" for Publication and Publisher names is detailed on the PublicationPublisherAuthorities page.
  6. Save Citation - Assuming all of the above steps have completed without errors, the new citation (and related extracted information) is saved into the BibApp database. If any processing resulted in an error, the user who entered the citation is informed and the citation is not saved.
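
The steps above can be sketched in Ruby as a single import function. This is a minimal illustration under stated assumptions, not BibApp's implementation: every name here (ImportError, REQUIRED_FIELDS, IDENTIFIER_KEYS, import_citation, and so on) is hypothetical, the list of required fields is invented for the example, and the choice to label complete citations "accepted" and others "incomplete" is an assumption. Steps 3 and 5 are left out because they are described on their own pages.

```ruby
# Hypothetical sketch of the import steps; none of these names come from BibApp.
class ImportError < StandardError; end

# Assumed fields used for duplicate detection (illustrative only).
REQUIRED_FIELDS = [:title, :authors, :year].freeze
# Assumed keys under which external identifiers arrive (illustrative only).
IDENTIFIER_KEYS = [:pubmed_id, :oclc_number].freeze

# Step 1 stand-in: a real parser would depend on the import format.
# Here the "parsed" citation is just an in-memory copy of a Hash.
def parse_citation(raw)
  raise ImportError, "could not parse citation" unless raw.is_a?(Hash)
  raw.dup
end

# Step 2: complete means every duplicate-detection field is present and non-empty.
def complete?(citation)
  REQUIRED_FIELDS.all? do |field|
    value = citation[field]
    !(value.nil? || (value.respond_to?(:empty?) && value.empty?))
  end
end

# Step 4: keep any external identifiers for later processing.
def extract_identifiers(citation)
  citation.select { |key, value| IDENTIFIER_KEYS.include?(key) && !value.nil? }
end

# Runs the steps in order. `store` is any object responding to <<,
# standing in for the database. On error nothing is saved and the
# message is returned so the user can be informed.
def import_citation(raw, store)
  citation = parse_citation(raw)                                        # step 1
  citation[:status] = complete?(citation) ? "accepted" : "incomplete"  # step 2 (assumed mapping)
  # Steps 3 and 5 (contributorships, publication/publisher authorities)
  # are covered on other pages and are omitted from this sketch.
  citation[:external_ids] = extract_identifiers(citation)               # step 4
  store << citation                                                     # step 6
  { saved: true, citation: citation }
rescue ImportError => e
  { saved: false, error: e.message }
end
```

Calling import_citation({ title: "T", pubmed_id: "9" }, []) would save the citation with an "incomplete" status and keep its PubMed ID, while passing something that cannot be parsed returns an error and leaves the store untouched.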

Citation Import Flow Chart

The diagram below lays out the processing that occurs whenever a new citation is imported into BibApp.

