University of Oregon
UO Libraries

Doing Digital Projects in the Open Workshop Series

This guide is a companion to the Doing Digital Projects in the Open Workshop Series

Doing Digital Projects in the Open Workshop Series - Upcoming Dates


Workshop 1: What are Open Digital Projects?
Instructors: Jonathan Cain, Gabriele Hayden, UO Libraries Data Services, and Kate Thornhill, UO Libraries Digital Scholarship Services
What will be covered? - Defining and developing your open digital project
Monday, April 1st 2pm-4pm
Knight Library - DREAM (Digital Research, Education, and Media) Lab

Workshop 2: Creating your Data Management Plan
Instructors: Jonathan Cain, Gabriele Hayden, UO Libraries Data Services, and Kate Thornhill, UO Libraries Digital Scholarship Services
What will be covered? – Creating and sharing a Data Management Plan using DMPTool
Monday, April 8th 2pm-4pm
Knight Library - DREAM (Digital Research, Education, and Media) Lab

Workshop 3: Using OSF to Manage your Projects
Instructors: Jonathan Cain, Gabriele Hayden, UO Libraries Data Services, and Kate Thornhill, UO Libraries Digital Scholarship Services
What will be covered? – Using the Open Science Framework (OSF) to Manage your Projects
Monday, April 15th 2pm-4pm
Knight Library - DREAM (Digital Research, Education, and Media) Lab

Workshop 2: Creating your Data Management Plan

DMP Tool: Build your Data Management Plan

This page, which accompanies the second workshop in the Doing Digital Projects in the Open series, teaches you to use the DMPTool to create a Data Management Plan that describes to funders how you will steward project data throughout the data lifecycle. Along the way, it offers tools and best practices for storing, backing up, organizing, and describing your data. On this page you'll learn:

  1. Backup and file-naming best practices
  2. How to use README files to document your data
  3. How to use the DMPTool to create Data Management Plans that meet funder requirements

For more detailed information on how to create a Data Management Plan, see the UO Guide to Data Management Best Practices.

Definitions, FAQs, How-tos, and Documentation

File Naming

1. Be consistent.

  • Have conventions for naming (1) Directory structure, (2) Folder names, (3) File names
  • Always include the same information (e.g. date and time)
  • Retain the order of information (e.g. YYYYMMDD, not MMDDYYYY)

2. Be descriptive so others can understand your meaning.

  • Try to keep file and folder names under 32 characters
  • Within reason, include relevant information such as:
    • Unique identifier (e.g. project name or grant number in the folder name)
    • Project or research data name
    • Conditions (lab instrument, solvent, temperature, etc.)
    • Run of experiment (sequential)
    • Date (in file properties too)
  • Use the application-specific three-letter file extension, in lowercase: mov, tif, wrl
  • When using sequential numbering, use leading zeros to allow for multi-digit versions. For example, a sequence of 1-10 should be numbered 01-10; a sequence of 1-100 should be numbered 001-100 (001, 010, 100).
  • No special characters: & , * % # ; * ( ) ! @$ ^ ~ ' { } [ ] ? < > -
  • Use only one period, placed before the file extension (e.g. name_paper.doc, NOT name.paper.doc OR name_paper..doc)

Example: Project_instrument_location_YYYYMMDD[hh][mm][ss][_extra].ext
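The conventions above could also be applied in a script. The Python sketch below is one possible approach, not part of the original guide: the function names (build_filename, numbered, is_valid_name) and the sample project values are illustrative assumptions. It treats the 32-character limit as a simple length check and allows only letters, digits, and underscores in names.

```python
import re
from datetime import datetime

# Hypothetical helpers; the names and sample values are assumptions for illustration.
SAFE_PART = re.compile(r"^[A-Za-z0-9]+$")           # letters and digits only
SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+\.[a-z0-9]+$")  # exactly one period, lowercase extension


def build_filename(project, instrument, location, when, ext, extra=None, with_time=False):
    """Build Project_instrument_location_YYYYMMDD[hhmmss][_extra].ext."""
    parts = [project, instrument, location]
    if extra:
        parts_to_check = parts + [extra]
    else:
        parts_to_check = parts
    for part in parts_to_check:
        if not SAFE_PART.match(part):
            raise ValueError(f"use only letters and digits in name parts: {part!r}")
    # Year-month-day order keeps files sorted chronologically.
    parts.append(when.strftime("%Y%m%d%H%M%S" if with_time else "%Y%m%d"))
    if extra:
        parts.append(extra)
    return "_".join(parts) + "." + ext.lower().lstrip(".")


def numbered(index, total):
    """Zero-pad a sequence number so that 1-100 becomes 001-100."""
    return str(index).zfill(len(str(total)))


def is_valid_name(filename, max_length=32):
    """Check for one period, no special characters, and fewer than max_length characters."""
    return SAFE_NAME.match(filename) is not None and len(filename) < max_length


name = build_filename("Salmon", "CTD", "Umpqua", datetime(2019, 4, 8), "CSV")
print(name)                    # Salmon_CTD_Umpqua_20190408.csv
print(numbered(7, 100))        # 007
print(is_valid_name("name.paper.doc"))  # False
```

Building names with a function, rather than typing them by hand, is one way to keep the order of information consistent across a project.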

The UK Data Archive has developed a Data Costing Tool, available here as a pdf.

The EPFL (École polytechnique fédérale de Lausanne) writes that "according to the High Level Expert Group on the European Open Science Cloud, on average about 5% of research expenditure should be spent to ensure a proper data management and stewardship." They offer an interactive costing tool that allows you to specify hosting platform, currency, and other considerations.

The UK data provider Jisc also has a site discussing data costing methods with links to other resources.

A README is one of a number of ways that you can add metadata (contextualizing information) to your project. Whether you use a README or some other metadata tool, you will want to think of what information would be needed to understand and analyze your data, and/or replicate your results, 20 years from now.

For a given research project, metadata are generally created at two levels: project- and data-level. Project-level metadata describes the “who, what, where, when, how and why” of the dataset, which provides context for understanding why the data were collected and how they were used.

Examples of project-level metadata are: 

  1. Name of the project
  2. Dataset title
  3. Project description
  4. Dataset abstract
  5. Principal investigator and collaborators
  6. Contact information
  7. Dataset handle (DOI or URL)
  8. Dataset citation
  9. Data publication date
  10. Geographic description
  11. Time period of data collection
  12. Subject/keywords
  13. Project sponsor
  14. Dataset usage rights

Here is a template for a project-level README.
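If you prefer to generate a project-level README from a script, the following Python sketch shows one way to do it. It is not the linked template: the function name render_project_readme, the "[TO DO]" placeholder, and the sample values are assumptions made for illustration. The field names are taken from the list above.

```python
# Field names follow the project-level metadata list above.
PROJECT_FIELDS = [
    "Name of the project",
    "Dataset title",
    "Project description",
    "Dataset abstract",
    "Principal investigator and collaborators",
    "Contact information",
    "Dataset handle (DOI or URL)",
    "Dataset citation",
    "Data publication date",
    "Geographic description",
    "Time period of data collection",
    "Subject/keywords",
    "Project sponsor",
    "Dataset usage rights",
]


def render_project_readme(metadata):
    """Return (readme_text, missing_fields) for a dict of project-level metadata."""
    lines = []
    missing = []
    for field in PROJECT_FIELDS:
        value = metadata.get(field, "").strip()
        if not value:
            # Keep the field visible so gaps are obvious to the next reader.
            missing.append(field)
            value = "[TO DO]"
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n", missing


# Invented sample values, for illustration only.
example = {
    "Name of the project": "Example Stream Survey",
    "Dataset title": "Example stream temperature readings",
    "Time period of data collection": "20190401 to 20190415",
}
readme_text, missing_fields = render_project_readme(example)
print(readme_text)
print(f"{len(missing_fields)} fields still need to be filled in")
```

The returned text can then be saved as a plain-text README file alongside the data.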
Data-level metadata are more granular: they describe the data and the dataset in much greater detail.

Data-level metadata might include: 

  1. Data origin: experimental, observational, raw or derived, physical collections, models, images, etc.
  2. Data type: integer, Boolean, character, floating point, etc.
  3. Instrument(s) used
  4. Data acquisition details: sensor deployment methods, experimental design, sensor calibration methods, etc.
  5. File type: CSV, mat, xlsx, tiff, HDF, NetCDF, etc.
  6. Data processing methods, software used
  7. Data processing scripts or codes
  8. Dataset parameter list, including
    • Variable names
    • Description of each variable
    • Units

Here is a template for a data-level README.
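A first draft of the dataset parameter list (variable names, descriptions, units) could be produced from a CSV file with a short script. The Python sketch below is one possible approach under stated assumptions: the function names infer_type and parameter_list, the type-guessing rules, and the sample data are all hypothetical. Descriptions and units cannot be inferred from the file, so they still have to be supplied by the researcher.

```python
import csv
import io


def infer_type(values):
    """Guess a data type from text values: Boolean, integer, floating point, or character."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "unknown"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "Boolean"
    try:
        for v in non_empty:
            int(v)
        return "integer"
    except ValueError:
        pass
    try:
        for v in non_empty:
            float(v)
        return "floating point"
    except ValueError:
        return "character"


def parameter_list(csv_text, descriptions=None, units=None):
    """Return one entry per column: name, guessed type, description, units."""
    descriptions = descriptions or {}
    units = units or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        column = [row[name] or "" for row in rows]
        entries.append({
            "variable": name,
            "type": infer_type(column),
            "description": descriptions.get(name, "[TO DO]"),
            "units": units.get(name, "[TO DO]"),
        })
    return entries


# Invented sample data, for illustration only.
SAMPLE = "site_id,water_temp,is_flagged,notes\n1,11.5,false,clear\n2,12.0,true,\n"
for entry in parameter_list(SAMPLE, units={"water_temp": "degrees Celsius"}):
    print(entry)
```

Treat the guessed types as a starting point to review by hand, not as authoritative documentation.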

Information in this section comes from the Oregon State University libraries' guide to metadata, which is licensed under a Creative Commons Attribution NonCommercial 4.0 International License. 

What is version control?

Version control, put simply, involves keeping copies of a project at various stages of completion, allowing you to roll back to a previous version of a project. 

The simplest way to keep versions of a project is to make manual copies of your project files at key points in the process. Name your files using best practices and append a sequential version number (e.g. v01, v02). This works best for projects managed by only one person, where there will not be too many versions, and where the files are always accessed from only one location. Under some circumstances, you may find it helpful to use a file name manager such as Bulk Rename Utility, Renames, PSRenamer, or WildRename to update or modify names in bulk.
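The manual approach described above can be partly automated. The following Python sketch copies a working file into an archive folder under the next free version number. It is a minimal illustration, not a tool recommended by this guide: the function names next_version_name and save_version, the "_v01" suffix pattern, and the assumption that every file has an extension are choices made for the example.

```python
import re
import shutil
from pathlib import Path


def next_version_name(existing_names, stem, ext, width=2):
    """Return the next name in the sequence stem_v01.ext, stem_v02.ext, ..."""
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+)\.{re.escape(ext)}$")
    versions = []
    for name in existing_names:
        match = pattern.match(name)
        if match:
            versions.append(int(match.group(1)))
    next_number = max(versions, default=0) + 1
    # Leading zeros keep v02 sorted before v10.
    return f"{stem}_v{next_number:0{width}d}.{ext}"


def save_version(working_file, archive_dir):
    """Copy working_file into archive_dir under the next version number."""
    working_file = Path(working_file)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    existing = [p.name for p in archive_dir.iterdir()]
    ext = working_file.suffix.lstrip(".")  # assumes the file has an extension
    target = archive_dir / next_version_name(existing, working_file.stem, ext)
    shutil.copy2(working_file, target)  # copy2 also preserves file timestamps
    return target


print(next_version_name(["survey_v01.csv", "survey_v02.csv"], "survey", "csv"))
# survey_v03.csv
```

Calling save_version("survey.csv", "versions") at each key point in the project would leave the working file untouched and add one numbered copy to the versions folder.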

Simple Software Options

Google Drive will automatically create new versions as you make changes to your document, spreadsheet, or presentation. Google keeps track of who made what changes and when, and allows you to see the changes and revert to a previous version at any time.

Microsoft's OneDrive, if you have access to it, also offers simple version control.

The OSF platform discussed in the third part of this guide offers version control on projects, as well as integration with Google Drive, OneDrive, GitLab, and GitHub.

Version Control Software


If you are working with a group of people or with a more complex project that involves code and/or relatively small data sets that can be stored in text formats, you may wish to use the Git version control system. Git lets us compare, restore, and merge changes to our "stuff," where stuff is any plain text file. You can use Git to version CSV files, for example, but not XLSX files: CSV is plain text, while XLSX is a compressed (zipped) package of XML files, so Git can't look through to see and version the actual content of the file well.
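To see why plain text formats suit this kind of version control, it can help to look at a line-by-line comparison. The Python sketch below does not use Git; it uses Python's standard difflib module to show the kind of line-based comparison that version control tools rely on for text files. The function name describe_changes and the two sample CSV versions are invented for illustration.

```python
import difflib

# Two invented versions of the same small CSV file.
old_version = "site,temp_c\nA,11.5\nB,12.0\n"
new_version = "site,temp_c\nA,11.5\nB,12.4\nC,10.9\n"


def describe_changes(old_text, new_text, name="data.csv"):
    """Return a unified diff (as a list of lines) between two text versions."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{name} (v01)",
        tofile=f"{name} (v02)",
        lineterm="",
    )
    return list(diff)


# Removed lines start with "-", added lines start with "+".
for line in describe_changes(old_version, new_version):
    print(line)
```

Because each CSV row is a readable line of text, the comparison shows exactly which rows changed. A zipped format such as XLSX offers no such readable lines to compare.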

If you want to use GitHub, GitLab, or any other hosting platform, then you'll first need to learn how to use Git. It's a command line utility, so if you're not comfortable using the terminal, then that's a good first step as well.

Portions of this section re-use content from the NYU Libraries guide to Version Control, which is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Intellectual property rights (IPR) include copyright, patents, trademarks and design rights. IPR grant creators or owners of a work certain controls over its use. Some rights such as patents require registration, while others, such as copyright, accrue automatically upon the work’s creation. IPR affect the way both you and others can use your research outputs.

Failure to clarify rights at the start of the research process can lead to unexpected limitations to:

  • Your research
  • Its dissemination
  • Future related research projects, and
  • Associated profit or credit

Research funders often expect you to clarify IPR at the grant proposal stage in your data management plan.

Some common open licenses include the following:

Intellectual Property Rights

Data Management Plans do not change the way intellectual property has been handled under federal awards. Universities will still be able to hold copyright in works created under the award and obtain title to patents conceived or reduced to practice under the award. What research projects will need to do is to articulate how they are providing permissions/licenses to the data and this may or may not involve intellectual property rights depending on the type of data.

Bringing Data Into Your Research Project

It is possible that your project may need to arrange for access to third party data or associated research artifacts that may have specific limitations in how they can be distributed (based on IP or the agreement by which your project obtained the data or artifact). The Office of Innovation Partnership Services (IPS, formerly Technology Transfer Services) can help your project obtain permissions. Your research project may also have received data under confidentiality or other restrictions that will need to be identified and explained in your data management plan. IPS is also happy to assist you in handling these issues.

Facts alone are not copyrighted, but their arrangement may be sufficiently original expression to merit copyright. For databases, there may be a mix of copyright and data for your project to consider. Some countries recognize certain rights in databases (e.g. Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases). If you receive data that holds no copyright, the agreement or permissions you obtain the data under may be a simple donation or a bailment, conditioned on your respecting certain rules called out in the permission.

Using Intellectual Property Rights To Make Your Data Available

Your research project should consider what permissions are appropriate for users when you make the data and/or copyrighted works from your research project available. You should consider potential users other than the federal government because it already receives a non-exclusive, royalty free license for government purposes to copyrighted works and data created under federal awards (2 CFR 15). There are a number of factors for you to consider (e.g., attribution, notification of use, redistribution, quality control/standards, and risk).

In some instances, your research project will not be concerned about any of these factors and may effectively donate the data to the public. In other instances, you will want to create some type of "commons" with respect to sharing and use of the data that sets expectations for what community members agree to for sharing. You may also need to restrict certain portions of the data (or types of data) based on restrictions you agreed to in receiving data from a third party. Intellectual property rights such as a copyright are simply tools that are used as part of the permissions (or license) your research project provides access to the data under. You can find more information on how to create a permissions statement in the Constructing Data Permissions section of this website.

Inventions

In the event that your project recognizes that an invention has resulted from federally funded research, contact the University of Oregon's Office of Innovation Partnership Services to complete an invention disclosure and discuss your goals for the work. IPS works with research projects at any stage to help them think creatively about dissemination and knowledge transfer strategies. Ideally, this discussion precedes the dissemination of data under the data management plan and allows your research project to execute on your plan for distribution of research artifacts created under your award. The university has an obligation to disclose each new invention created under the grant to the federal funding agency within two months after the inventor discloses it in writing to the university. It is not always clear when an "invention" has arisen, so feel free to contact IPS anytime you have a question.

 

The first part of this guide is excerpted from the University College Dublin guide to licensing.

 

FIND A DATA REPOSITORY

Repositories can help you:

  • manage your data
  • cite your data by supplying a persistent identifier
  • facilitate discovery of your data
  • preserve your data for the long run

Data repositories sponsored by the University of Oregon

Inter-university Consortium for Political and Social Research (ICPSR) – The world’s largest archive of digital social science data. ICPSR staff can guide you in preparing your data for archiving and distribution. See their Guide to Social Science Data Preparation and Archiving and their page on Depositing Data.

Other data repositories

The Registry of Research Data Repositories provides a search tool to help find an appropriate repository for your data. In addition, the Libraries’ Data Services department (datamanagement@uoregon.edu) can help you to select a data repository suitable to your needs. Not all of these repositories take researcher-produced datasets or ensure long-term preservation of your data, so contact them for more details. Want a tool for comparing features of different data repositories in detail? Try MIT's data repository comparison template (.rtf file).

Using data from a repository?

Cite the data to give credit to the data producer, enable others to use the data, and meet journal requirements.

This page modifies material from the MIT Libraries Data Management page on Data Repositories, licensed under a Creative Commons Attribution Non-Commercial License.

Humanities

Sample DMPs in the humanities and sciences from the Digital Curation Centre.

Social Sciences

Sample data management plan for a project with data to be submitted to ICPSR, the Inter-university Consortium for Political and Social Research.

Sciences

Sample NSF General and NSF Bio data management plans from DataONE.

ICPSR DMP resources, including links to sample DMPs in various scientific disciplines.

What Not to Do

A fictional AHRC technical plan, developed and commented on by a reviewer to highlight common pitfalls.

Ways a UO Digital Scholarship Services Librarian can help you manage data

  • Think through how Data Management can support your research success
  • Make technical recommendations
  • Consult at any stage of building your Data Management Plan
  • Provide consultations on choosing a Data Repository or temporary data storage
  • Strategize with faculty on how to integrate data best practices into the classroom
  • Bring presentations on data best practices to a classroom, lab, or working group