This page, which accompanies the second workshop in the Doing Digital in the Open series, teaches you to use the DMPTool to create a Data Management Plan describing to funders how you will steward project data throughout the data lifecycle. Along the way, this page offers tools and best practices for storing, backing up, organizing, and describing your data. On this page you'll learn:
For more detailed information on how to create a Data Management Plan, see the UO Guide to Data Management Best Practices.
1. Be consistent.
2. Be descriptive so others can understand your meaning.
The EPFL (École polytechnique fédérale de Lausanne) writes that "according to the High Level Expert Group on the European Open Science Cloud, on average about 5% of research expenditure should be spent to ensure a proper data management and stewardship." They offer an interactive costing tool that allows you to specify hosting platform, currency, and other considerations.
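As a rough illustration of the 5% guideline quoted above, the arithmetic can be sketched in a few lines of Python. The function name, the default share, and the award amount are assumptions made for this example only; this is not the EPFL costing tool, which takes hosting platform, currency, and other factors into account.

```python
def data_management_budget(total_expenditure: float, share: float = 0.05) -> float:
    """Estimate the amount to set aside for data management and stewardship.

    `share` defaults to 0.05, the "about 5%" average quoted above. It is a
    rule of thumb, not a funder requirement.
    """
    if total_expenditure < 0:
        raise ValueError("total_expenditure must be non-negative")
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    return round(total_expenditure * share, 2)


if __name__ == "__main__":
    # Hypothetical award total, in whatever currency the budget uses.
    print(data_management_budget(250_000))
```

Treat the result as a starting point for discussion with your funder or institution, not as a fixed figure.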
The UK data provider Jisc also has a site discussing data costing methods with links to other resources.
A README is one of a number of ways that you can add metadata (contextualizing information) to your project. Whether you use a README or some other metadata tool, you will want to think of what information would be needed to understand and analyze your data, and/or replicate your results, 20 years from now.
For a given research project, metadata are generally created at two levels: project-level and data-level. Project-level metadata describe the “who, what, where, when, how and why” of the dataset, which provides context for understanding why the data were collected and how they were used.
Examples of project-level metadata are:
Here is a template for a project-level README.
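If you would rather generate a starting file than copy one, the following is a minimal Python sketch that lays out the six "who, what, where, when, how and why" questions as a plain-text README. The function name, the layout, and the "TODO" placeholder are assumptions for illustration; they do not reproduce the linked template.

```python
# The six project-level questions named in the text above.
PROJECT_QUESTIONS = ("Who", "What", "Where", "When", "How", "Why")


def render_project_readme(title: str, answers: dict) -> str:
    """Return README text with one line per project-level question.

    Questions missing from `answers` are marked "TODO" so that gaps stay
    visible instead of being silently dropped.
    """
    lines = [title, "=" * len(title), ""]
    for question in PROJECT_QUESTIONS:
        lines.append(f"{question}: {answers.get(question, 'TODO')}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical project details, for illustration only.
    text = render_project_readme(
        "Wetland Bird Survey",
        {"Who": "A. Researcher", "When": "2020-2022"},
    )
    print(text)
```

Saving the output as README.txt in the top-level project folder keeps the description next to the data it describes.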
Data-level metadata are more granular. They describe the data and the dataset in much greater detail.
Data-level metadata might include:
Here is a template for a data-level README.
Information in this section comes from the Oregon State University libraries' guide to metadata, which is licensed under a Creative Commons Attribution NonCommercial 4.0 International License.
What is version control?
Version control, put simply, involves keeping copies of a project at various stages of completion, allowing you to roll back to a previous version of a project.
The simplest way to keep versions of a project is to make manual copies of your project files at key points in the process. Name your files using best practices and append a sequential version number by hand, e.g. v01, v02. This works best for projects managed by only one person, where there will not be too many versions, and where the files are always accessed from only one location. Under some circumstances, you may find it helpful to use a file name manager such as Bulk Rename Utility, Renames, PSRenamer, or WildRename to update or modify names in bulk.
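The manual numbering scheme described above can be sketched in Python as follows. The "_v" separator, the two-digit padding, and the function name are assumptions chosen for this example; adapt them to whatever naming convention your project already uses.

```python
import re
from pathlib import Path

# Matches a file stem that ends in "_v" plus at least two digits, e.g. "notes_v03".
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d{2,})$")


def next_version_name(filename: str) -> str:
    """Return the file name for the next sequential version.

    A name with no version suffix gets "_v01"; otherwise the number is
    incremented and zero-padded to at least two digits. Only the file name
    is returned; any directory part of `filename` is dropped.
    """
    path = Path(filename)
    match = VERSION_PATTERN.match(path.stem)
    if match is None:
        return f"{path.stem}_v01{path.suffix}"
    number = int(match.group("num")) + 1
    return f"{match.group('stem')}_v{number:02d}{path.suffix}"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(next_version_name("survey_results.csv"))
    print(next_version_name("survey_results_v01.csv"))
```

The function only computes a name. Copying the file under the new name (for example with shutil.copy2) is left to you, so nothing is overwritten by accident.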
Google Drive software will automatically create new versions as you make changes to your document, spreadsheet, or presentation. Google keeps track of who made what changes and when, and allows you to see the changes and revert to a previous version at any time.
Microsoft's OneDrive, if you have access to it, also offers simple version control.
The OSF platform discussed in the third part of this guide offers version control on projects, as well as integration with Google Drive, OneDrive, GitLab, and GitHub.
If you are working with a group of people, or on a more complex project that involves code and/or relatively small data sets that can be stored in text formats, you may wish to use the Git version control system. Git lets us compare, restore, and merge changes to our "stuff," where stuff is any plain text file. You can use Git to version CSV files, for example, but not XLSX files. CSV is plain text, while XLSX is a compressed binary format: the content is wrapped in other layers, so Git cannot look through them to see and version the actual content of the file well.
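To see why plain text matters, here is a small Python sketch that compares two versions of a CSV file line by line using the standard library's difflib module. It is not Git itself, only an illustration of the kind of line-based comparison that plain-text formats allow; the file labels and sample rows are made up for the example.

```python
import difflib


def diff_csv_text(old: str, new: str) -> list:
    """Return a unified, line-by-line diff of two CSV documents given as text."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="data_v01.csv",  # hypothetical labels shown in the diff header
            tofile="data_v02.csv",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Made-up sample data: one count changes between versions.
    old = "id,species,count\n1,heron,4\n2,egret,7\n"
    new = "id,species,count\n1,heron,5\n2,egret,7\n"
    for line in diff_csv_text(old, new):
        print(line)
```

Because every CSV row is one line of text, the diff isolates the single row that changed. The same comparison on an XLSX file would operate on compressed bytes and would not produce a readable result.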
If you want to use GitHub, GitLab, or any other hosting platform, then you'll first need to learn how to use Git. It's a command line utility, so if you're not comfortable using the terminal, then that's a good first step as well.
Portions of this section re-use content from the NYU Libraries guide to Version Control, which is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Intellectual property rights (IPR) include copyright, patents, trademarks and design rights. IPR grant creators or owners of a work certain controls over its use. Some rights such as patents require registration, while others, such as copyright, accrue automatically upon the work’s creation. IPR affect the way both you and others can use your research outputs.
Failure to clarify rights at the start of the research process can lead to unexpected limitations to:
Research funders often expect you to clarify IPR at the grant proposal stage in your data management plan.
Some common open licenses include the following:
Data Management Plans do not change the way intellectual property has been handled under federal awards. Universities will still be able to hold copyright in works created under the award and obtain title to patents conceived or reduced to practice under the award. What research projects will need to do is to articulate how they are providing permissions/licenses to the data and this may or may not involve intellectual property rights depending on the type of data.
It is possible that your project may need to arrange for access to third party data or associated research artifacts that may have specific limitations in how they can be distributed (based on IP or the agreement by which your project obtained the data or artifact). The Office of Innovation Partnership Services (IPS, formerly Technology Transfer Services) can help your project obtain permissions. Your research project may also have received data under confidentiality or other restrictions that will need to be identified and explained in your data management plan. IPS is also happy to assist you in handling these issues.
Facts alone are not copyrighted, but their arrangement may be sufficiently original expression to merit copyright. For databases, there may be a mix of copyright and data for your project to consider. Some countries recognize certain rights in databases (e.g. Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases). If you receive data that holds no copyright, the agreement or permissions under which you obtain the data may be a simple donation or a bailment, conditioned on your respecting certain rules called out in the permission.
Your research project should consider what permissions are appropriate for users when you make the data and/or copyrighted works from your research project available. You should consider potential users other than the federal government because it already receives a non-exclusive, royalty free license for government purposes to copyrighted works and data created under federal awards (2 CFR 15). There are a number of factors for you to consider (e.g., attribution, notification of use, redistribution, quality control/standards, and risk).
In some instances, your research project will not be concerned about any of these factors and may effectively donate the data to the public. In other instances, you will want to create some type of "commons" for sharing and using the data, one that sets out what community members agree to when they share. You may also need to restrict certain portions of the data (or types of data) based on restrictions you agreed to in receiving data from a third party. Intellectual property rights such as copyright are simply tools used as part of the permissions (or license) under which your research project provides access to the data. You can find more information on how to create a permissions statement in the Constructing Data Permissions section of this website.
In the event that your project recognizes that an invention has resulted from federally funded research, contact the University of Oregon's Office of Innovation Partnership Services to complete an invention disclosure and discuss your goals for the work. IPS works with research projects at any stage to help them think creatively about dissemination and knowledge transfer strategies. Ideally, this discussion precedes the dissemination of data under the data management plan and allows your research project to execute your plan for distributing research artifacts created under your award. The university has an obligation to disclose each new invention created under the grant to the federal funding agency within two months after the inventor discloses it in writing to the university. It is not always clear when an "invention" has arisen, so feel free to contact IPS anytime you have a question.
The first part of this guide is excerpted from the University College Dublin guide to licensing.
ICPSR DMP resources, including links to sample DMPs in various scientific disciplines.
A fictional AHRC technical plan developed and commented on by a reviewer to highlight common pitfalls.
Ways a UO Digital Scholarship Services Librarian can help you manage data