Digital Library Center Blog | UF

Chronicling work on the UF Digital Collections, SobekCM, & the Digital Humanities

Archive for the ‘data sets’ Category

Announcement: DataCite Summer Meeting – Data and the Scholarly Record: the Changing Landscape

DataCite will hold its second Summer Meeting on August 24th and 25th at the historic Shattuck Plaza Hotel in Berkeley, California. The Summer Meeting will be a 1.5-day event and you can register at: http://datacite2011.eventbrite.com/

The Summer Meeting brings together people from research organisations, data centers, government, and information service providers to hear about the latest developments in data science, data citation, discovery, and reuse. It also provides opportunities to exchange experience and influence the next generation of data citation services.

This year’s program will include sessions on data citation, data publishing, and discussions on the new challenges that come with increased access to scientific data.

The 2010 DataCite summer meeting brought together a strong programme of speakers and participants (http://www.datacite.org/datacite_summer_meeting_2010). Highlights were published in D-Lib (http://dx.doi.org/doi:10.1045/january2011-contents).

DataCite helps researchers find, access, and reuse data. It is an international not-for-profit association founded in 2009 with members across the globe.

Written by Laurie N. Taylor

June 30th, 2011 at 9:28 pm

CFP: DDI Workshop: Managing Metadata for Longitudinal Data – Best Practices

DDI Workshop: Managing Metadata for Longitudinal Data – Best Practices
September 19-23, 2011
Leibniz Center for Informatics, Schloss Dagstuhl, Wadern, Germany

Goals

This symposium-style workshop will bring together representatives from major longitudinal data collection efforts to share expertise and to explore the use of the DDI metadata standard as a means of managing and structuring longitudinal study documentation. Participants will work collaboratively to create best practices for documenting longitudinal data in its various forms, including panel data and repeated cross-sections.

Description of the workshop

Longitudinal survey data carry special challenges related to documenting and managing data over time, over geography, and across multiple languages. This complexity is often a barrier to building efficient systems for data access and analysis. DDI (Data Documentation Initiative) Lifecycle, a metadata standard that addresses the full life cycle of social science research data (formerly referred to as DDI 3), is designed to provide an efficient structure for the documentation of complex longitudinal data. In this workshop, participants involved in longitudinal data projects around the world will work together on issues involved in documenting longitudinal data.

Intended audience: Individuals with expertise in longitudinal social science data; knowledge of DDI is desired but not required. The intent is to have a mix of participants with substantive and technical skills. Participants should provide access to materials describing their projects, which can serve as use cases in applying DDI. The workshop is in English. This is the second Dagstuhl workshop on the topic; the first took place in October 2010. The upcoming workshop will continue the in-depth discussion begun last year, expanding into additional topics.

Expected Results

Participants will write best practice papers, to be published in the DDI Working Paper Series. Last year’s workshop produced a series of best practice papers on longitudinal data.

Possible Topics

Documenting comparison, harmonization, and the relationship among concepts, questions, and variables over time, as well as the relationship of respondent types (person, household) are typical issues for longitudinal data. Other topics not specific to longitudinal data:
- Classifications (e.g., ISCO, ISCED)
- Data collection details
- Qualitative data, other types of data sources beyond surveys
- Quality of metadata and data
- Data management planning
- Relationship to the Open Archival Information System (OAIS)
- Extension of DDI for specific needs

These topics are often more salient for longitudinal data, making it even more critical to manage these metadata in a structured form over time and across countries. The current possibilities of DDI Lifecycle will be explored and areas for future extensions identified. Additionally, participants can suggest their own areas of interest.

Venue

The workshop will take place at the Leibniz Center for Informatics, Schloss Dagstuhl, Wadern, Germany. The non-profit center is a member of the Leibniz Association and is funded jointly by the German federal government and a number of state governments. The venue provides an intense working atmosphere in a pleasant, remote region. Several seminar rooms and a cafeteria during the day, and leisure rooms such as a wine bar and a billiard room in the evening, promote intense discussion and communication. The accommodation cost at Dagstuhl, including full board, is 60 Euro/day/person (subsidized rate).

Sponsors

This workshop is sponsored by the DDI Alliance, GESIS – Leibniz Institute for the Social Sciences, Minnesota Population Center (MPC), and Open Data Foundation (ODaF).

Contact

The names of interested organizations and individuals should be sent to ddi-expert-workshop@icpsr.umich.edu. Please provide contact information, area of interest, and area of expertise for each individual, information regarding DDI Lifecycle implementation, and a statement of what each individual can contribute to the workshop. Direct questions to ddi-expert-workshop@icpsr.umich.edu. Twenty-one participants will be accepted.

Links

Related Web page: http://www.dagstuhl.de/11382
Best practice papers on longitudinal data: http://www.ddialliance.org/resources/publications/working/BestPractices/LongitudinalData
DDI Working Paper Series: http://www.ddialliance.org/resources/publications/working
Further information on “How to get to Dagstuhl”: http://www.dagstuhl.de/en/about-dagstuhl/arrival/
Pictures of Dagstuhl: http://www.dagstuhl.de/en/about-dagstuhl/press/downloads/
DDI Alliance: http://www.ddialliance.org/
GESIS – Leibniz Institute for the Social Sciences: http://www.gesis.org/
Minnesota Population Center (MPC): http://www.pop.umn.edu/
Open Data Foundation (ODaF): http://www.opendatafoundation.org/

The organizers would appreciate hearing soon from interested people.

Mary Vardigan, Director DDI Alliance
Wendy Thomas, Chair DDI Technical Implementation Committee
Joachim Wackerow, Vice Chair DDI Technical Implementation Committee
Arofan Gregory, Technical Consultant
(Organizers)

GESIS – Leibniz Institute for the Social Sciences
Department: Monitoring Society and Social Change
Unit: Social Science Metadata Standards
Visiting address: B2 1, 68159 Mannheim, Germany
Postal address: P.O. Box 122155, 68072 Mannheim, Germany
Phone: +49 (0)621 1246 262
Fax: +49 (0)621 1246 100
E-mail: joachim.wackerow@gesis.org
www.gesis.org/en/institute/

Written by Laurie N. Taylor

June 25th, 2011 at 2:03 pm

Data Documentation Initiative 3 (DDI 3) Data Extraction Tools from Colectica Awarded an NIH Grant

The Data Documentation Initiative 3 (DDI 3) standard is a simply fabulous and full standard for metadata (data about data) as well as for the data contents, making it a full payload standard.

DDI 3 is such an exciting standard because it allows for the possibility of true and full computational support for data harmonization and for really working with longitudinal data. It’s the type of data standard I’d been waiting for because it gets it. Data standards need to be able to support documenting, containing, expressing, and computing (analysis, harmonization, limitations on disclosure, everything we now do with less than ideal systems and methods). DDI 3 does this and that’s why groups like ICPSR are already using it. DDI 3 is already on its way to becoming ubiquitous, but more tools for it are needed.

News of others using and supporting DDI 3 is always good. Thus, it’s wonderful news that Colectica has been awarded an NIH Grant for DDI 3-based data extraction tools. From the Colectica website:

The award is a Phase I grant that provides supplemental support of Algenta’s research on an “Open Standards-Based Data Extraction Web Tool for Complex Longitudinal Datasets”. This Phase I feasibility study aims to analyze the data preparation and metadata creation workflow needed to prepare a study for online data extraction, to validate the use of the Data Documentation Initiative’s DDI 3 standard for the basis of such a tool, and to create prototype web-based data extraction software. While the focus is on longitudinal surveys, the proposed system would also handle cross-sectional, time-series, and non-repeated studies. The aim is to improve research methodologies through a simplification of the process used for discovering, retrieving, and analyzing data relevant to a researcher’s investigation and to improve data citations, aiding in reproducible research. The research includes consultation with researchers from ICPSR at the University of Michigan-Ann Arbor and the Mid-Life in the United States Longitudinal Study at the University of Wisconsin-Madison.

Written by Laurie N. Taylor

April 5th, 2011 at 5:18 pm

Data Sets in the UF Digital Collections

The UF Digital Collections has many data sets, but most are historical - as in pre-digital historical, printed on paper, and then digitized. In digitizing the materials, text is created by OCR so the data is all available, but it’s not in an exciting, born-digital format.*

Recently, born-digital data sets for energy consumption in Gainesville were added; this is exciting data in an exciting format.

Check it out here: http://ufdc.ufl.edu/IR00000242/00001

Also, check out the Gainesville-Green.com site, which uses the data set in a visual comparison tool that lets you compare your energy consumption over time, with neighbors, and more!

* At least on a dealing-with-files level; data-as-content is always exciting for what it contains.

Written by Laurie N. Taylor

October 30th, 2010 at 6:43 pm

Posted in data sets

“Linked Data is Not Enough!”

“Why Linked Data is Not Enough for Scientists” is an excellent article dealing with the very real and very complicated factors, over and above access, that impact data reuse.

“Abstract—Scientific data stands to represent a significant portion of the linked open data cloud and science itself stands to benefit from the data fusion capability that this will afford. However, simply publishing linked data into the cloud does not necessarily meet the requirements of reuse. Publishing has requirements of provenance, quality, credit, attribution, methods in order to provide the reproducibility that allows validation of results. In this paper we make the case for a scientific data publication model on top of linked data and introduce the notion of Research Objects as first class citizens for sharing and publishing.”

Read the whole article here.

Written by Laurie N. Taylor

September 30th, 2010 at 12:42 am

Posted in access,data sets

Word of the Day (or maybe even year): autotechnogeoglyphics

I’m not sure how I came across the “Pruned” blog’s post on autotechnogeoglyphics, but it’s the most wonderful word I’ve seen in some time. auto-techno-geo-glyphics sounds of steampunk, science fiction, fantasy, epic world building and world altering technology, histories of giants, and it holds so much promise, so much potential for exploration. While the definition speaks more to reality, the word speaks to fantasy worlds of stone like Shadow of the Colossus, science-fiction worlds of steel, and ancient worlds of myth and reality, of stone, sediment, and things long lost.

“Pruned” explains autotechnogeoglyphics from the CLUI newsletter as:

Among the many wonderful things worth noting, there is their aerial photographs of automotive test tracks — those concrete hieroglyphs, in the fringes of urban sprawls, recording “the condition of America, land of the automobile, a syndrome that transformed the landscape of the nation, and the world, more than any other.”

As an information addict, I normally value words by utility. However, there are those words that go beyond the possible into the impossible, seeking for more than they can possibly find and finding all that they can in the process. autotechnogeoglyphics is one of those; it speaks to what it is and what it could be, helping to define studies of large-scale, made-designs in the Earth, made only over time with parts intentional and parts their sum unforeseeable in their planning, and all seen only with enough correct distance. It only seems right in all lowercase, perhaps because weighting the first letter seems to give priority to the auto over the rest, or perhaps the font isn’t right for a word of this magnitude. Hopefully autotechnogeoglyphics will appear enough to find its fit for font and scale, and hopefully it will also find and share new words that similarly sing.

Written by Laurie N. Taylor

April 27th, 2008 at 7:27 pm

Codework : Opening Keynote Ted Nelson

I’m currently at the Center for Literary Computing (CLC) workshop Codework: Exploring relations between creative writing practices and software engineering, sponsored by the National Science Foundation and held at West Virginia University (it runs April 3-6, 2008, and there’s more on it here). Ted Nelson, coiner of the word hypertext and media studies visionary, spoke. Sandy Baldwin opened by introducing Nelson – describing Nelson as a luminary, and having him speak as astronomical – and then describing how Nelson influenced his own English practice and work.

Nelson began by explaining his preference for open-ended speaking, and then introduced his new book-in-progress “geeks bearing gifts” on the false rhetoric surrounding current software. He explained that current software and applications aren’t about technology, but are really packages of conventions selected by someone with an agenda, and mentioned OOXML as an example. He said that he’s been fascinated with making a document system rather than the fake paper simulators we have now, and he showed the latest version of the Xanadu Project (xanarama.net). Nelson’s reputation as a visionary and a great speaker is well earned, so well earned that I stopped taking notes after realizing that my notes would not do his presentation justice in the slightest. I believe the presentation was recorded, though, so once that’s posted I’ll add a link to it.

Written by Laurie N. Taylor

April 4th, 2008 at 4:07 am