Digital Library Center Blog | UF

Chronicling work on the UF Digital Collections, SobekCM, & the Digital Humanities

Archive for the ‘SobekCM’ Category

NewspaperCat

without comments

The story below is from the most recent Library News from the UF Libraries for UF Faculty. NewspaperCat is powered by and hosted within the UF Digital Collections (powered by SobekCM). This has been a great project for the UF Digital Collections to support because it is an active, valid use case for a record-only collection portal, an option often requested for research, teaching, and public service needs. There’s more in the story below, and check out NewspaperCat to see it in action!

Libraries create catalog of digital historical newspapers

The Catalog of Digital Historical Newspapers (NewspaperCat) is available at www.newspapercat.org. NewspaperCat is an online database providing links to over 1,000 full-text digital newspapers in the United States and Caribbean. The project’s current coverage, which began with the Southeastern United States, is growing rapidly and will soon cover all fifty states.

The purpose of NewspaperCat is to improve access to historical newspapers digitized by libraries, archives, historical societies and other non-profit organizations that remain buried within search engine results because of low rankings, such as a low Google PageRank. These newspapers represent a rich source of primary research material for researchers, students and the general public. The project to build NewspaperCat was funded by the George A. Smathers Libraries and developed with the cooperation of the Digital Library Center of the University of Florida.

As a free-standing online resource with a unique web address, NewspaperCat will, it is hoped, improve access to these important primary resource materials by collectively improving their Google PageRank for online researchers seeking historical newspaper content.

Over the summer semester the project has benefited from the work of master of library and information science graduate students from Florida State University and their coordinator, Dana Loving, who have added material and improved access to even more online digital newspaper content. Through their contributions, in the weeks ahead, NewspaperCat will become the first index of its kind to ingest and provide access to the Google News Archive as well as newspaper titles from across the United States and Canada.

– Matthew Loving, Romance Languages/Area Studies Librarian

Written by Laurie N. Taylor

September 15th, 2011 at 10:32 am

Posted in SobekCM,UF,UFDC

UFDC/SobekCM Tracking System

without comments

The UF Digital Collections System, SobekCM, is always being enhanced to better meet user and internal needs.

Normally the vast majority of time is spent on the user side because user support is the priority. With dozens of partners who use the online and locally installed tools to manage their digitization work and to contribute digitized items to the collaborative digital collections hosted on SobekCM, user support also includes many of the internal tools.

Most recently, however, the very-internal users received a major boost in support through the addition of a tracking system within SobekCM. Before, we had a legacy tracking system that was riddled with problems: among other things, it couldn’t generate reports and wouldn’t track the location of physical materials. Now that legacy system is gone, replaced with tracking functionality within SobekCM. This tracking functionality includes tracking milestones, a work log for all work, reports, private/public flags, born digital/analog flags, internal notes, ticklers, internal fields on physical box location for item tracking during production, and more. It’s fabulous and there’s more on it here: http://ufdc.ufl.edu/sobekcm/tracking

Ideas, feedback, and suggestions are always welcome.

Written by Laurie N. Taylor

March 13th, 2011 at 2:53 am

SobekCM supports Flipbooks, active for Baldwin Collection in the UF Digital Collections

with one comment

We’ve frequently heard requests from internal users and patrons for a flipbook-style view. In researching options, GnuBook’s JavaScript-based page turner looked best, and the Library of Congress came out with a clean implementation in October that was easy to emulate. Mark Sullivan, programmer for SobekCM, reviewed it and added the view in under a day. The new view is built into SobekCM, so it is available to all collections, including the UF Digital Collections and the Digital Library of the Caribbean.

The flipbook view is active for the entire Baldwin Library of Historical Children’s Literature Digital Collection and this is an example: http://ufdc.ufl.edu/UF00089012/00001/pageturner

The view is accessible from the “page turner” tab on all Baldwin items. This page shows the tab: http://ufdc.ufl.edu/UF00088831/00001
This is the direct link format for the page turner: http://ufdc.ufl.edu/UF00088831/00001/pageturner Notice the wonderfully simplified URL structure remains even with the newly added feature.
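The direct-link pattern shown above is regular enough to sketch in a few lines of Python. This is only an illustration: the function name and the `bib_id` and `vid` parameter names are assumptions made here, and only the URL shape itself comes from the examples in this post.

```python
BASE_URL = "http://ufdc.ufl.edu"


def page_turner_url(bib_id: str, vid: str = "00001", base: str = BASE_URL) -> str:
    """Build a link of the form <base>/<bib_id>/<vid>/pageturner.

    bib_id and vid are illustrative names for the two path segments
    seen in the example links (e.g. UF00088831 and 00001).
    """
    return f"{base.rstrip('/')}/{bib_id}/{vid}/pageturner"


print(page_turner_url("UF00088831"))
```

Dropping the final `pageturner` segment from the result gives the item link format shown for the tab example.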

The flipbook view will expand to other collections later, but it was most requested for, and is the best fit for, the Baldwin books (because of the haptic nature of children’s books, beautiful images, and lack of small print overall). This sort of view is also very important to ensure that SobekCM easily supports the next generation of touch interfaces.

Written by Laurie N. Taylor

November 13th, 2010 at 6:26 pm

Finding Guides in SobekCM

without comments

SobekCM – the system powering the UF Digital Collections, the Digital Library of the Caribbean, and many other rich collections – will soon have advanced support for finding guides in EAD. This has been in process as a complete solution for the full workflow and it’s nearing completion. Check out the EADs we’re testing with here.

The benefits from fully supporting EAD within the same digital library system that supports digital objects are enormous:

  • Finding guides can be displayed, searched, and used within the same system as the digital objects they reference (increases usability from consistent navigation, ease of searching a single system, additional benefits from any applicable system enhancements)
  • Finding guides benefit from existing automation. For SobekCM this includes the automatic creation of MARC records from the EADs and the automatic record feed of MARC records into the library catalog.

SobekCM’s support for EADs has been enabled through programming by Mark Sullivan. The programming created an EAD reader for importing the data into the standard SobekCM digital resource object and then reading the description and container list and importing as much information as possible into the digital resource object. Sections in the EAD are autodetected to create the table of contents.
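To make the section autodetection concrete, here is a minimal Python sketch of the general idea. It is not SobekCM’s reader (that code is .NET and is not shown in this post). It assumes a small, un-namespaced sample EAD in which each descriptive section under `archdesc` carries a `head` element; real EAD files may use an XML namespace and deeper nesting. The `build_toc` name and the sample record are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny, invented EAD fragment used only for this sketch.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Sample Papers</unittitle></did>
    <bioghist><head>Biographical Note</head><p>Text.</p></bioghist>
    <scopecontent><head>Scope and Content</head><p>Text.</p></scopecontent>
    <dsc><head>Container List</head></dsc>
  </archdesc>
</ead>"""


def build_toc(ead_xml: str) -> list[tuple[str, str]]:
    """Return (element name, heading text) for each archdesc section that has a <head>."""
    root = ET.fromstring(ead_xml)
    archdesc = root.find("archdesc")
    if archdesc is None:
        return []
    toc = []
    for section in archdesc:
        head = section.find("head")
        if head is not None and head.text:
            toc.append((section.tag, head.text.strip()))
    return toc


for tag, title in build_toc(SAMPLE_EAD):
    print(f"{title} ({tag})")
```

Sections without a heading, such as the `did` block in the sample, are simply skipped, so the table of contents only lists parts a reader can jump to by name.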

With the support for importing, SobekCM supports the EADs as digital resources that can be searched for within the digital collections. When a user selects any digital resource to view in SobekCM, the METS file is read. This provides some basic information like wordmarks and the type of digital resource it is. If the digital resource is a finding guide (defined by being Archival/collection and having an EAD listed as one of the downloads), the EAD is then read into the SobekCM digital resource object. While the container list will be read identically, the top portion of the EAD is pulled into the display and stored as one large block of text/xml with the XSL transform applied to display the description.
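The finding-guide test described above can be sketched as a simple check. Everything named here is hypothetical: the `DigitalResource` and `Download` classes, their fields, and the reading of “Archival/collection” as a resource type of either “archival” or “collection” are assumptions for illustration, not SobekCM’s actual object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Download:
    filename: str
    kind: str  # hypothetical label for the download's format, e.g. "EAD" or "PDF"


@dataclass
class DigitalResource:
    resource_type: str  # hypothetical: the type read from the METS file
    downloads: List[Download] = field(default_factory=list)


def is_finding_guide(resource: DigitalResource) -> bool:
    """True when the resource is archival/collection AND lists an EAD download."""
    is_archival = resource.resource_type.strip().lower() in {"archival", "collection"}
    has_ead = any(d.kind.strip().upper() == "EAD" for d in resource.downloads)
    return is_archival and has_ead


guide = DigitalResource("Archival", [Download("guide.xml", "EAD")])
print(is_finding_guide(guide))
```

Both conditions have to hold, which matches the two-part definition in the paragraph above: an archival item with no EAD download, or a book that happens to list one, would be displayed as an ordinary digital resource.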

The auto-created table of contents differs from the existing tables of contents because it constantly floats on the left (as the reader scrolls down, it moves down to stay onscreen at all times). This is needed for reading longer HTML-style documents that involve a lot of scrolling, as opposed to our normal page-turner model.

When EAD results show after a search, the search terms are highlighted. This is still being refined, but it’s active in test already and will soon be fully active. After that, the final steps are for handling the container list.

To see it in test (which will only be active for a while, since this will soon be live):

Written by Laurie N. Taylor

October 31st, 2010 at 10:16 pm

New UFDC Features: Browse By, Admin Header, and Export to Excel

without comments

Browse By Metadata (e.g., list of all publishers in an item aggregation)

The UF Digital Collections (UFDC) have more new features. These are all in progress, as is the norm with the perpetual beta of growing and evolving systems, but the “Browse By” feature is already publicly viewable here for the Baldwin Library of Historical Children’s Literature Digital Collection.

This is still in process as we test how best to display so much rich metadata with significant distinctions, as when an author is also an editor and printer: should it all be collapsed into one; if so, should all types be listed at the end; should they all remain; and what about when there are multiple types and then another unclear note on the affiliation? We’ll be working through these questions and more, and we’ll be doing so with abundant feedback from users.

Further Simplification for Simplified URLs

UFDC’s already simplified URLs are even simpler, with the base now http://ufdc.ufl.edu instead of http://ufdc.uflib.ufl.edu. The longer version is still fully supported.

Features for Internal Users

Internal Header

UFDC now has an internal header (internal meaning it’s only for internal folks who are logged in). It allows internal users to easily search by BibID. Right now, this can also be done using the main search box, but the internal header will eventually allow for specialized internal searches and for searching the records for items in process that are not yet online. This is part of the merging of offline workflows into the SobekCM system.

Export to XLS or CSV

This is also internal-only and it allows internal users to export a list of items directly from SobekCM/UFDC. This complements an update to the UFDC_CM (currently an offline-only tool) which can now pull MARC records for items online. Both of these changes are part of the work to add reporting to SobekCM and the work to integrate existing tools into one system (for greater efficiency for supporting and using the tools).

Related

Like these, other seemingly internal-only enhancements also benefit external users by increasing SobekCM’s capabilities as a system and the Digital Library Center’s ability to work more effectively in digitizing materials and adding them to the UF Digital Collections.

Written by Laurie N. Taylor

October 3rd, 2010 at 7:07 pm

Super Secret, or infrastructure is awesome with directory views

with one comment

In addition to great added functionality like item-level statistics for external users, the UF Digital Collections’ underlying SobekCM system is always improving in terms of internal infrastructure. One recent major enhancement is the “Directory” view.

The Directory View is very important to ensure we can easily and quickly check and verify all files, including all metadata files, and locate, copy, and send any files per patron request. This was a very small technical change with significant day-to-day operational benefits. In keeping with principles for smart design, the internal Directory View is built within the same external user views, ensuring that we troubleshoot and verify that all systems are operating properly simply as part of doing our work.

Plus, it’s neat to see the different files that enable each item to be displayed in so many ways and to be interoperable with so many other systems.

Written by Laurie N. Taylor

September 22nd, 2010 at 1:34 am

SobekCM, weighing in at 113,643 lines of code (plus comments)

without comments

Mark Sullivan, the UFDC/DLC/dLOC programmer, recently shared this information. It’s exciting to see that SobekCM (our digital asset management system, digital library system, and digital production tool set) is such a streamlined solution with so much functionality. There are seven projects which make up the SobekCM solution. In those projects, there are:

113,643 lines of code ( not comments or empty lines )
23,452 lines of comments
420 files
60 folders
544 classes ( 55 abstract classes, 1 windows form, 5 ASPX pages )
14 interfaces
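Counts like these can be approximated with a small script. The sketch below is not the tool used to produce the numbers above (the post does not say how they were generated). It assumes C#-style source and handles only `//` line comments and simple `/* ... */` blocks that start at the beginning of a line; the `count_lines` name and the sample source are invented for illustration.

```python
def count_lines(source: str) -> dict:
    """Classify each line of C#-style source as code, comment, or blank."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            counts["comment"] += 1
            if "*/" in line:
                in_block = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith("//"):
            counts["comment"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            # Stay in block mode unless the comment closes on the same line.
            in_block = "*/" not in line[2:]
        else:
            # Lines with trailing comments are counted as code.
            counts["code"] += 1
    return counts


SAMPLE = """// Entry point
/* multi-line
   comment */
public class Demo
{

    int x = 1; // trailing comment
}
"""

print(count_lines(SAMPLE))
```

By the totals listed above, SobekCM has roughly one comment line for every five lines of code (23,452 / 113,643 ≈ 0.21).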

The main two projects are:

1) SobekCM_Bib_Package which has all the code to represent digital objects, read metadata, write metadata, etc. This is used throughout all the DLC/UFDC/dLOC applications.

36,554 lines of code
5,796 lines of comments
121 files
19 folders
165 classes ( 3 abstract classes, 1 windows form )

2) SobekCM_Library which does all the rendering, navigation, authentication, etc., for the SobekCM library. This relies heavily upon the above library for reading and displaying of digital resources and is utilized by both the builder and the customization manager.

68,803 lines of code
13,825 lines of comments
251 files
30 folders
328 classes ( 52 abstract classes )
11 interfaces

This does not include the 22 separate JavaScript files, of which eight are written by me and include 3,951 new lines of code and 702 lines of comments.

3) While the main SobekCM web project is not strictly a library, it is the third project in the SobekCM solution. It is the first project which a user interacts with when entering the library. This project is actually very small, containing only about 1300 .NET lines. It does house the five web forms used in the application, although these forms are quite small and are just basically skeletons into which the SobekCM_Library renders HTML or controls.

4) SobekCM_URL_Rewriter is a tiny library which is essentially just an HttpModule for rewriting and translating the URL to allow for cleaner URLs.

5) SobekCM_Tools is a small library ( about 4000 code lines ) which contains additional classes for logging, interacting with the tracking database, and interacting with the Florida Dark Archive (DAITSS). This is kept separate from the general library since this is not strictly involved in rendering the HTML but is used by some modules and is used with the SobekCM_Library and SobekCM_Builder libraries for building collection text indexes and loading new items through the Builder.

6) FileUploadLibrary ( written by Darren Johnstone ) is about 3000 .NET code lines and 5500 lines of JavaScript used for uploading data via HTTP with a real-time upload progress bar. A quite useful and cool library, which was adopted with very few changes and worked quite simply. Highly recommended… ( http://darrenjohnstone.net/ )

7) SobekCM_Builder. In addition to these libraries/projects used by the digital library, this library is employed (along with the SobekCM_Bib_Package, SobekCM_Library, and SobekCM_Tools) for the builder software which runs constantly in the background on another server, loading new items which are deposited into network folders or FTP folders. It also updates and builds all static pages, OAI feeds, RSS feeds, and builds the text indexes. Additionally, it reads and loads all of the FDA ingest reports from DAITSS.

6793 lines of code
858 lines of comments
26 files
3 folders
37 classes
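As an illustration of what the SobekCM_URL_Rewriter project (item 4 above) does conceptually, here is a small Python sketch of translating a clean URL path into a query-string request. The real module is a .NET HttpModule whose rules are not given in this post, so the details here are assumptions: the identifier pattern (two letters plus eight digits, modeled on identifiers like UF00088831), the `/default.aspx` target, and the `bib`, `vid`, and `view` parameter names are all hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Hypothetical pattern: /<bib>[/<vid>][/<view>], e.g. /UF00088831/00001/pageturner
CLEAN_PATH = re.compile(
    r"^/(?P<bib>[A-Za-z]{2}\d{8})(?:/(?P<vid>\d{5}))?(?:/(?P<view>[a-z]+))?/?$"
)


def rewrite(path: str) -> Optional[str]:
    """Translate a clean path into a query-string URL, or None if it does not match."""
    match = CLEAN_PATH.match(path)
    if match is None:
        return None  # leave other requests untouched
    params = {key: value for key, value in match.groupdict().items() if value is not None}
    return "/default.aspx?" + urlencode(params)


print(rewrite("/UF00088831/00001/pageturner"))
```

The point of the design is that users and search engines only ever see the short path, while the application behind it can keep working with explicit parameters.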

Written by Laurie N. Taylor

August 16th, 2010 at 12:48 pm

UFDC facets, citation links, & descriptions! More coming soon!

without comments

The University of Florida Digital Collections (UFDC) are always improving. Most of our current improvements come from moving servers to newer, more stable equipment (and making updates required by the server move). Despite the time that the server move requires, listed below are some of our recent and particularly great new enhancements to share!

Facets

The University of Florida Digital Collections (UFDC) now have facets to help refine searches and browses by language, subject terms, and more. See this page for an example: http://ufdcweb1.uflib.ufl.edu/ufdc/?a=fdnl1&m=lbball

Citation Links

Key components of the citations for each item are now also linked for easy searching, as in this example.

User Contributed Descriptions

UFDC now allows for user-contributed descriptions. The descriptions can be activated for any collection (they’re turned off by default). When activated, this allows any logged-in user to add a description to an item. The description is added in a note field, and the username and the date that the description was added are automatically recorded as well.
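The behavior described above can be sketched as a small data model. This is an illustration only, not UFDC’s implementation: the `Note` and `Item` classes, the `add_user_description` function, and the `enabled` flag are hypothetical names chosen to mirror the rules in the paragraph (off by default, logged-in users only, username and date recorded automatically).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Note:
    text: str
    username: str   # recorded automatically from the logged-in user
    added_on: date  # recorded automatically when the note is added


@dataclass
class Item:
    notes: List[Note] = field(default_factory=list)


def add_user_description(item: Item, text: str, username: str, *,
                         enabled: bool = False,
                         today: Optional[date] = None) -> Note:
    """Append a user-contributed description to the item's notes."""
    if not enabled:
        raise PermissionError("user descriptions are turned off for this collection")
    if not username:
        raise PermissionError("only logged-in users may add descriptions")
    note = Note(text=text, username=username, added_on=today or date.today())
    item.notes.append(note)
    return note


item = Item()
add_user_description(item, "Sample description.", "jdoe", enabled=True)
print(item.notes)
```

Because `enabled` defaults to `False`, a collection has to opt in before any description can be stored, matching the off-by-default setting.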

Coming Soon

The next project (aside from the server move and related updates) is EAC/EAD integration. That’s expected soon and more details will be available as it gets closer to implementation.

Written by Laurie N. Taylor

June 1st, 2010 at 1:55 am

Posted in SobekCM,UFDC

iPhone Statistics

without comments

With three iPhone apps out, downloads have increased, with 45 downloads of the main SobekPH App from 3/1-3/7/2010, 14 downloads of the Baldwin SobekPH app, and 5 downloads of the UF Archives SobekPH App. Given that the Baldwin and UF Archives apps were only out for half of the week, their 19 combined downloads in just a few days mean we’re already showing great results for sharing the UF Digital Collections more widely.

Hopefully all of the folks downloading the apps are also showing the apps and sharing with friends!

Written by Laurie N. Taylor

March 11th, 2010 at 6:59 pm

Posted in app,iphone,SobekCM,UFDC

UF Digital Collections, system improvements

without comments

As usage of the self-submittal and online metadata editing systems for the UF Digital Collections has continued to increase, new tools were needed to support the additional users. To provide that support, the former UFDC_CM application has been integrated into UFDC/SobekCM and additional functionality has been added.

These improvements will be released next week, but most users won’t notice any changes. For internal users these are immensely helpful, and worthy of announcing and celebrating.

With this upgrade, UFDC will now include administrator options so that:

  • Admin users can adjust permissions on existing UFDC users (help page)
  • Admin users can add new aggregation aliases for forwarding purposes
  • Admin users can add new item aggregations (collections, subcollections, institutions) and edit basic information on existing aggregations. (help page)
  • Admin users can add new HTML interfaces and edit existing interfaces (help page)
  • Admin users can add new projects and edit the complete metadata for project METS files online (help page)
  • Admin users can add/edit wordmarks and delete wordmarks not linked to any digital resources (help page)

Thanks once again to Mark Sullivan for designing and programming these enhancements, and for writing the supporting help pages as well!

Written by Laurie N. Taylor

February 20th, 2010 at 6:53 pm