Archive for the ‘statistics’ Category
UF Digital Collections, in November, over 4 million hits!
Just one month after announcing the highest ever usage for the UF Digital Collections (UFDC) with 3.2 million hits in October 2011, we saw another dramatic increase with 4 million hits for November 2011!
The UF Digital Collections (UFDC) have seen continuous, steady increases in usage thanks to the abundance of amazing content and ongoing search engine optimization work. November was another milestone with nearly 4.1 million human hits to the UF Digital Collections (UFDC) and associated collections and libraries, such as the Digital Library of the Caribbean (dLOC).
- October usage: 3,196,063 views
- November usage: 4,076,673 views
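For a sense of scale, the month-over-month change can be worked out from the two figures above. This is a minimal sketch in Python; the function and variable names are illustrative only and are not part of UFDC's statistics tools.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the percentage change from previous to current."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100.0


# View counts taken from the October and November figures listed above.
october_views = 3_196_063
november_views = 4_076_673

increase = november_views - october_views
growth = percent_change(october_views, november_views)
print(f"{increase:,} more views, roughly {growth:.1f}% growth")
```

By this arithmetic, November added about 880,000 views over October, a gain of a little more than a quarter.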
Here’s to upcoming months of increased exposure, usage, and impact for the UF Digital Collections and for all those who work with and support open access to digitized materials as well as to digital scholarship!
UF Digital Collections, in October, over 3 million human hits!
The UF Digital Collections have seen continuous, steady usage thanks to loads and loads (and loads) of wonderful content and ongoing search engine optimization work. October was another milestone with nearly 3.2 million human hits to the UF Digital Collections (UFDC) and associated collections and libraries, such as the Digital Library of the Caribbean (dLOC).
October usage: 3,196,063 views. Here’s to the coming months with increased exposure, usage, and impact for the UF Digital Collections and all who work with and support open access to digitized cultural and historical materials as well as to digital scholarship.
Announcement: DataCite Summer Meeting – Data and the Scholarly Record: the Changing Landscape
DataCite will hold its second Summer Meeting on August 24th and 25th at the historic Shattuck Plaza Hotel in Berkeley, California. The Summer Meeting will be a 1.5-day event, and you can register at: http://datacite2011.eventbrite.com/ .
The Summer Meeting brings together people from research organisations, data centers, government, and information service providers to hear about the latest developments in data science, data citation, discovery, and reuse. It also provides opportunities to exchange experience and influence the next generation of data citation services.
This year’s program will include sessions on data citation, data publishing, and discussions on the new challenges that come with increased access to scientific data.
The 2010 DataCite summer meeting brought together a strong programme of speakers and participants (http://www.datacite.org/datacite_summer_meeting_2010). Highlights were published in D-Lib (http://dx.doi.org/doi:10.1045/january2011-contents).
DataCite helps researchers find, access, and reuse data. It is an international not-for-profit association founded in 2009 with members across the globe.
CFP: DDI Workshop: Managing Metadata for Longitudinal Data – Best Practices
DDI Workshop: Managing Metadata for Longitudinal Data – Best Practices
September 19-23, 2011
Leibniz Center for Informatics, Schloss Dagstuhl, Wadern, Germany
Goals
This symposium-style workshop will bring together representatives from major longitudinal data collection efforts to share expertise and to explore the use of the DDI metadata standard as a means of managing and structuring longitudinal study documentation. Participants will work collaboratively to create best practices for documenting longitudinal data in its various forms, including panel data and repeated cross-sections.
Description of the workshop
Longitudinal survey data carry special challenges related to documenting and managing data over time, over geography, and across multiple languages. This complexity is often a barrier to building efficient systems for data access and analysis. DDI (Data Documentation Initiative) Lifecycle, a metadata standard that addresses the full life cycle of social science research data (formerly referred to as DDI 3), is designed to provide an efficient structure for the documentation of complex longitudinal data. In this workshop, participants involved in longitudinal data projects around the world will work together on issues involved in documenting longitudinal data.
Intended audience: Individuals with expertise in longitudinal social science data; knowledge of DDI is desired but not required. The intent is to have a mix of participants with substantive and technical skills. Participants should provide access to materials describing their projects, which can serve as use cases in applying DDI. The workshop is in English. This is the second Dagstuhl workshop on the topic; the first took place in October 2010. The upcoming workshop will continue the in-depth discussion begun last year, expanding into additional topics.
Expected Results
Participants will write best practice papers, to be published in the DDI Working Paper Series. Last year’s workshop produced a series of best practice papers on longitudinal data.
Possible Topics
Documenting comparison, harmonization, and the relationship among concepts, questions, and variables over time, as well as the relationship of respondent types (person, household) are typical issues for longitudinal data. Other topics not specific to longitudinal data:
- Classifications (e.g., ISCO, ISCED)
- Data collection details
- Qualitative data, other types of data sources beyond surveys
- Quality of metadata and data
- Data management planning
- Relationship to the Open Archival Information System (OAIS)
- Extension of DDI for specific needs
These topics are often more salient for longitudinal data, making it even more critical to manage these metadata in a structured form across time and countries. The current possibilities of DDI Lifecycle will be explored and areas for future extensions identified. Additionally, participants can suggest their own areas of interest.
Venue
The workshop will take place at the Leibniz Center for Informatics, Schloss Dagstuhl, Wadern, Germany. The non-profit center is a member of the Leibniz Association and is funded jointly by the German federal government and a number of state governments. The venue provides an intense working atmosphere in a pleasant, remote region. Several seminar rooms and a cafeteria during the day, and leisure rooms such as a wine bar and billiard room in the evening, promote intense discussion and communication. Accommodation at Dagstuhl, including full board, costs 60 Euro/day/person (subsidized rate).
Sponsors
This workshop is sponsored by the DDI Alliance, GESIS – Leibniz Institute for the Social Sciences, Minnesota Population Center (MPC), and Open Data Foundation (ODaF).
Contact
The names of interested organizations and individuals should be sent to ddi-expert-workshop@icpsr.umich.edu. Please provide contact information, area of interest, and area of expertise for each individual, information regarding DDI Lifecycle implementation, and a statement of what each individual can contribute to the workshop. Direct questions to ddi-expert-workshop@icpsr.umich.edu. Twenty-one participants will be accepted.
Links
Related Web page: http://www.dagstuhl.de/11382
Best practice papers on longitudinal data: http://www.ddialliance.org/resources/publications/working/BestPractices/LongitudinalData
DDI Working Paper Series: http://www.ddialliance.org/resources/publications/working
Further information on “How to get to Dagstuhl”: http://www.dagstuhl.de/en/about-dagstuhl/arrival/
Pictures of Dagstuhl: http://www.dagstuhl.de/en/about-dagstuhl/press/downloads/
DDI Alliance: http://www.ddialliance.org/
GESIS – Leibniz Institute for the Social Sciences: http://www.gesis.org/
Minnesota Population Center (MPC): http://www.pop.umn.edu/
Open Data Foundation (ODaF): http://www.opendatafoundation.org/
The organizers would appreciate hearing soon from interested people.
Mary Vardigan, Director DDI Alliance
Wendy Thomas, Chair DDI Technical Implementation Committee
Joachim Wackerow, Vice Chair DDI Technical Implementation Committee
Arofan Gregory, Technical Consultant
(Organizers)
GESIS – Leibniz Institute for the Social Sciences
Department: Monitoring Society and Social Change
Unit: Social Science Metadata Standards
Visiting address: B2 1, 68159 Mannheim, Germany
Postal address: P.O. Box 122155, 68072 Mannheim, Germany
Phone: +49 (0)621 1246 262
Fax: +49 (0)621 1246 100
E-mail: joachim.wackerow@gesis.org
www.gesis.org/en/institute/
UFDC/SobekCM Tracking System
The UF Digital Collections System, SobekCM, is always being enhanced to better meet user and internal needs.
Normally the vast majority of time is spent on the user side because user support is the priority. With dozens of partners who use the online and locally installed tools to manage their digitization work and to contribute digitized items to the collaborative digital collections hosted on SobekCM, user support also includes many of the internal tools.
Most recently, however, the very-internal users received a major boost in support through the addition of a tracking system within SobekCM. Before, we had a legacy tracking system that was riddled with problems, couldn’t generate reports, and wouldn’t track the location of physical materials among other problems. Now, that legacy system is gone and it’s been replaced with tracking functionality within SobekCM. This tracking functionality includes tracking milestones, a work log for all work, reports, private/public flags, born digital/analog flags, internal notes, ticklers, internal fields on physical box location for item tracking during production, and more. It’s fabulous and there’s more on it here: http://ufdc.ufl.edu/sobekcm/tracking
Ideas, feedback, and suggestions are always welcome.
Item Level Usage Statistics
The UF Digital Collections (UFDC) greatly benefit from the statistical tracking of items, pages, and usage. Most recently, those statistics were augmented at the item level for easy item-level statistics for each and every item. The item-level statistics, like all of the usage statistics for UFDC, are cleaned for robots and other automated systems to ensure that the usage is actual usage by humans and not simply machine checks.
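The actual cleaning rules used for UFDC are not described here, but the general idea can be sketched. The following Python example is a minimal illustration only: it assumes each log entry carries an item ID and a user-agent string, and that automated clients can be spotted by a few common substrings. The marker list, the record layout, and the function names are all hypothetical and are not SobekCM's implementation.

```python
from collections import Counter

# Hypothetical substrings that commonly appear in crawler user agents.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent: str) -> bool:
    """Return True if the user agent contains any known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def human_hits_per_item(log_entries):
    """Count hits per item ID, skipping entries that look automated.

    Each entry is an (item_id, user_agent) pair (an assumed layout).
    """
    counts = Counter()
    for item_id, user_agent in log_entries:
        if not is_robot(user_agent):
            counts[item_id] += 1
    return counts


# Made-up sample entries for illustration.
sample_log = [
    ("item-1", "Mozilla/5.0 (Windows NT 6.1) Firefox/8.0"),
    ("item-1", "Googlebot/2.1 (+http://www.google.com/bot.html)"),
    ("item-2", "Mozilla/5.0 (Macintosh) Safari/534.50"),
    ("item-1", "Mozilla/5.0 (compatible; bingbot/2.0)"),
]
print(human_hits_per_item(sample_log))
```

Substring matching on user agents is only a first pass; a production system would likely combine it with other signals, such as known crawler address lists or request patterns.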
One of our favorite examples is a particular version of The 3 Little Kittens, which has been viewed over 15,000 times since it was loaded in 2008. The full item-level statistics for the item are shown in the screenshot below and online in UFDC here.
“On the Cost of Keeping a Book”
A new CLIR Report, The Idea of Order: Transforming Research Collections for 21st Century Scholarship, includes a report, “On the Cost of Keeping a Book,” by Paul Courant and Matthew “Buzzy” Nielsen. This report examines the costs of keeping physical books (pbooks) and electronic books (ebooks) and finds significant cost savings in ebooks over print-based libraries.
Particularly worth noting is the statement on the overall cost savings when digitizing pbooks and then storing them as ebooks:
If the cost of digitization is less than the difference in present value between print storage and digital storage, adding back in the cost of maintaining a shared print archive, there will be a net gain to the university sector of digitizing print collections and using the digitized versions for access. For most of our estimates of the cost of ebook and pbook storage, these conditions would hold. If another party, for example, Google or the Internet Archive, undertakes the digitization and provides the access, the argument becomes all the stronger.
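One way to read the condition in the passage above is as a simple inequality: digitizing pays off when the present-value savings from digital storage, less the cost of the shared print archive, exceed the cost of digitization. The Python sketch below encodes that reading. It is our interpretation, not a formula from the report, and the figures are made-up placeholders in arbitrary units rather than the authors' estimates.

```python
def net_gain_from_digitizing(pv_print_storage: float,
                             pv_digital_storage: float,
                             shared_print_archive_cost: float,
                             digitization_cost: float) -> float:
    """Net gain under our reading of the quoted condition.

    A positive result means digitizing and serving the digital copy is
    cheaper overall than continuing to store the print copy locally.
    """
    storage_savings = pv_print_storage - pv_digital_storage
    return storage_savings - shared_print_archive_cost - digitization_cost


# Placeholder values in arbitrary units, for illustration only.
gain = net_gain_from_digitizing(
    pv_print_storage=100.0,
    pv_digital_storage=20.0,
    shared_print_archive_cost=10.0,
    digitization_cost=30.0,
)
print(f"net gain: {gain:.1f}")

# If another party pays for digitization, that cost drops to zero here.
gain_third_party = net_gain_from_digitizing(100.0, 20.0, 10.0, 0.0)
print(f"net gain with third-party digitization: {gain_third_party:.1f}")
```

Setting the digitization cost to zero mirrors the report's closing point: when another party does the scanning, the case for digital access only gets stronger.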
Over 5 million pages!
The University of Florida Digital Collections (UFDC) now have over 5 million pages!
The more than 5 million pages – maps, aerials, audio, video, books, historic documents, museum objects, herbarium specimens, photographs, newspapers, oral histories, and more – are all openly and freely online for the world!
Check out the collections: www.uflib.ufl.edu/ufdc
UF Digital Collections: Usage Statistics Online
The usage statistics for January 2010 for the UF Digital Collections are now online here: http://www.uflib.ufl.edu/ufdc/?m=htu
The top collections continue to be the Digital Library of the Caribbean, the Florida Digital Newspaper Library, and the Baldwin Library of Historical Children’s Literature, with nearly 100,000 hits each in January alone.
The most used collections, and the total numbers with over 10 million hits to the UF Digital Collections since March of 2006, are always impressive. However, my favorite statistics are the most popular items by collection (available for all collections here). For instance, the Digital Library of the Caribbean’s most popular item is Sus mejores poemas by Rubén Darío. It’s been online since April 2008. In that time, it’s had over 30,000 hits. Similarly, An A B C, for baby patriots has been online only since September of 2008 and it’s already had over 47,000 hits.
The usage statistics for January 2010 are posted alongside all of the prior usage statistics, back to when the UF Digital Collections began in March 2006. The statistics provide a nice quantification of the extensive usage already known from increasingly frequent patron emails, requests, and compliments. It’s great to see exactly how many more people the UF Digital Collections are reaching, and how much more the UF Digital Collections are assisting with research and creative inquiry.
Appropriate Metaphors for Collection Scopes and Sizes?
The University of Florida Digital Collections (UFDC) has grown from September 2007’s 1 million pages (pages of books, newspapers, archival materials, maps, posters, audio, video, photos, and more) to 2 million in July 2008, 3 million in December 2008 (thanks to ingesting microfilm digitized by a vendor) and then to 4 million in July 2009. Right now – and UFDC is loading so this will be higher by morning – UFDC has 4,134,392 pages.
Four million, one hundred and thirty-four thousand, three hundred and ninety-two pages.
It sounds impressive because it is. Yet it’s so much more than that, even on quantity alone. Page counts are helpful for a general sense of “big-ness” because they prove critical mass. It’s a way of saying “if you’re not sure this is the digital collection you’re looking for, this collection is big enough to have something you’re interested in”. Page counts aren’t helpful in dealing with multiple formats. For instance, right now 1 page =
- 1 page of a journal article, born digital and submitted electronically
- 1 10 ft. x 12 ft. blueprint from 1905
- 1 video, one hour long, digitized from VHS
- 1 audio interview, one hour long, converted from reel to reel tapes
These aren’t equivalent in terms of the work to create them, the interface variety and sophistication to present them, or the use-value to patrons and for preservation.
Page counts aren’t perfect, and neither are item counts, but is there an easy and accurate way to explain any complex, diverse, and varied collections with 4 million + pieces?
The value created from having critical mass makes the entire scenario more complicated. There’s no good way to explain the value of being able to search for an illustrator and seeing examples of the work in multiple books, finding reviews of the illustrator’s work in a literary journal, seeing articles by the illustrator’s peers in newspapers from the same time period, and more without a narrative-style example, and that’s not short or easy.
Given the size, scope, and wealth contained in UFDC, I’m at a loss for words to explain just how wonderful UFDC is. For now, I plan to focus on adding more materials and adding more contextual guides (exhibits, highlighted items of the week, guides, and then on to authority records). I’d still like to have something short and quick to explain UFDC like the page counts, but perhaps those were never good enough explanations and I was just more comfortable with them when they were less blatantly inaccurate. Whatever the case, UFDC continues to grow at an astonishing rate by any measure I can imagine.
