Announcement: BagItPHP Library

The Scholars’ Lab is pleased to announce the initial release of a PHP library implementing BagIt 0.96. BagIt is a specification from the Library of Congress for bundling and transmitting multiple files along with their metadata. You can check out the project page at http://github.com/scholarslab/BagItPHP/.

Our work on BagItPHP stems from the open source “Omeka + Neatline” project, a collaboration of the Scholars’ Lab with the Roy Rosenzweig Center for History and New Media. “Omeka + Neatline” is supported by the Library of Congress.

Downloads

You can download the library either as a ZIP or tarball, or you can clone the repo with git:

git clone git://github.com/scholarslab/BagItPHP

Use: Creating Bags

To create a bag, simply instantiate a new BagIt object with the name of a directory that doesn’t exist, add files to it, and package it into a tarball with the name of the bag:

require_once 'lib/bagit.php';

$bag = new BagIt('./new-directory');

$bag->addFile('./exhibit/index.html', 'index.html');
$bag->addFile('./exhibit/imgs/1.png', 'imgs/1.png');
$bag->addFile('./exhibit/imgs/2.png', 'imgs/2.png');

$bag->package('./new-directory');
// This creates the bag package as ./new-directory.tgz.
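
For readers new to BagIt, it may help to see roughly what a bag looks like on disk. The following is a minimal sketch written without the library, based on the layout the BagIt spec describes (a bagit.txt declaration, a data/ payload directory, and a checksum manifest). The function name sketchBag is hypothetical, SHA-1 is chosen only for illustration, and none of this should be read as how BagItPHP is implemented internally.

```php
<?php
// Hypothetical helper, NOT part of BagItPHP: lays out a minimal bag by hand
// to illustrate the structure described in the BagIt spec.
function sketchBag(string $bagDir, array $files): void
{
    // Payload files live under data/ inside the bag directory.
    // As in the example above, $bagDir is assumed not to exist yet.
    mkdir($bagDir . '/data', 0777, true);

    // bagit.txt declares the spec version and the tag-file encoding.
    file_put_contents(
        $bagDir . '/bagit.txt',
        "BagIt-Version: 0.96\nTag-File-Character-Encoding: UTF-8\n"
    );

    $manifest = '';
    foreach ($files as $source => $dest) {
        $target = $bagDir . '/data/' . $dest;
        if (!is_dir(dirname($target))) {
            mkdir(dirname($target), 0777, true);
        }
        copy($source, $target);
        // Each manifest line pairs a checksum with a path relative to the bag root.
        $manifest .= sha1_file($target) . ' data/' . $dest . "\n";
    }
    file_put_contents($bagDir . '/manifest-sha1.txt', $manifest);
}

// Usage: build a throwaway bag from one temporary source file.
$src = sys_get_temp_dir() . '/bag-src-' . uniqid();
mkdir($src);
file_put_contents($src . '/index.html', '<p>hello</p>');

$bagDir = sys_get_temp_dir() . '/bag-sketch-' . uniqid();
sketchBag($bagDir, [$src . '/index.html' => 'index.html']);

// Prints the single manifest line: "<sha1 checksum> data/index.html".
echo file_get_contents($bagDir . '/manifest-sha1.txt');
```

The library takes care of these details for you; the sketch is only meant to make the terms "payload" and "manifest" concrete.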

Use: Reading Bags

To read a bag, simply open an existing bag, validate it (optional), fetch remote resources, and iterate over the files, copying them or processing them in some other way:

require_once 'lib/bagit.php';

$bag = new BagIt('./existing-bag.zip');

$bag->validate();
if (count($bag->getBagErrors()) == 0) {
    $bag->fetch->download();

    foreach ($bag->getBagContents() as $filename) {
        copy($filename, 'final/destination/' . basename($filename));
    }
}
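
To give a sense of what validating a bag involves, here is a minimal sketch that checks payload files against a SHA-1 manifest by hand. It is based on the manifest format in the BagIt spec, not on the library's code: the function name checkManifest and the shape of the returned error list are assumptions made for this illustration, and real validation covers more than checksums.

```php
<?php
// Hypothetical helper, NOT part of BagItPHP: compares each payload file
// with the checksum recorded for it in manifest-sha1.txt.
function checkManifest(string $bagDir): array
{
    $manifestPath = $bagDir . '/manifest-sha1.txt';
    if (!is_file($manifestPath)) {
        return [['manifest-sha1.txt', 'Missing manifest.']];
    }

    $errors = [];
    $lines = file($manifestPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        // A manifest line is "<checksum> <path>", split on the first run of whitespace.
        $parts = preg_split('/\s+/', trim($line), 2);
        if (count($parts) !== 2) {
            $errors[] = ['manifest-sha1.txt', 'Malformed line: ' . $line];
            continue;
        }
        [$expected, $path] = $parts;
        $full = $bagDir . '/' . $path;
        if (!is_file($full)) {
            $errors[] = [$path, 'Missing payload file.'];
        } elseif (sha1_file($full) !== strtolower($expected)) {
            $errors[] = [$path, 'Checksum mismatch.'];
        }
    }
    return $errors;
}

// Usage: build a tiny bag in a temporary directory, check it, then tamper with it.
$bagDir = sys_get_temp_dir() . '/bag-check-' . uniqid();
mkdir($bagDir . '/data', 0777, true);
file_put_contents($bagDir . '/data/a.txt', 'alpha');
file_put_contents($bagDir . '/manifest-sha1.txt', sha1('alpha') . " data/a.txt\n");

var_dump(count(checkManifest($bagDir)) === 0); // intact bag: no errors

file_put_contents($bagDir . '/data/a.txt', 'tampered');
print_r(checkManifest($bagDir)); // now reports a checksum mismatch for data/a.txt
```

In practice you would rely on the library's own validate() and getBagErrors() calls shown above; the sketch only illustrates the kind of check that sits behind them.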

For more information about the methods that are available, please see the documentation.

Let Us Hear from You

If you’re using this library or have any feedback on it, we’d love to hear from you! We are relying on the GitHub issues tracker for code feedback, so you can file bugs or other issues there. If you have a more general question, feel free to post here.

My interests include text processing, text mining, and natural language processing, as well as web-development and general programming. Studied medieval English literature and linguistics at UGA. Dissertated on lexicography. Now I program in Haskell and write when I'm not building cool stuff for the Scholars' Lab. Also, husband and parent. Do you notice that sleep…

4 Comments

  1. Hi Sarah,

    We wrote this library and an accompanying plugin (https://github.com/scholarslab/BagItPlugin) to help people pull data into Omeka (http://omeka.org). I don’t know offhand of anyone who’s using it here at UVa.

    Of course, there could easily be someone who is using it — I just don’t know of them.

    Sorry I wasn’t more helpful,
    Eric

  2. I’m working on a project to determine the level of BagIt adoption within the library community at-large. I was wondering what level of usage you’ve seen so far within your institution? Any other comments you may have about who is using it and for what types of projects would be a huge help. Thanks so much! -Sarah

  3. Hi Joris,

    Glad to hear you’re using this.

    Currently, according to the spec, the bag-info.txt file is optional, and to keep the API simple, it doesn’t contain a way to set the fields in this file (source-organization, organization-address, contact-name, etc.). Because of this, the library just creates the file, but doesn’t output any content to it.

    If there’s enough interest, in a future version of the library we may add an API for providing this information and writing it out to the bag-info.txt file. Please file an issue on GitHub about this to cast your vote.

    Thanks again,
    Eric

  4. Thanks for this great library. Is there a reason why in the following code bag-info.txt is empty? Thanks!

    Joris

    $bag->setHashEncoding($encoding);
    $bag->addFile('./marc.xml', 'marc.xml');
    $bag->addFile('./vrouw_jg1-01_18930715.pdf', 'vrouw_jg1-01_18930715.pdf');
    $bag->addFile('./vrouw_jg1-02_18930815.pdf', 'vrouw_jg1-02_18930815.pdf');

    $bag->update();

    $bag->package('./AMS_de_vrouw');

    ?>
