Disclaimer: Nothing contained in this post should be construed as legal advice.
Ok, now that we’ve gotten that out of the way, let’s get to the interesting part. The first post in this series arose from a simple question: Is it legal for a student to publish their dissertation without the consent of their advisor? As I mentioned previously, I am ignoring for the moment any ethical issues (important as they may be), and just focusing on the legal ones. This question boils down to an issue of data ownership. Are the data on which the dissertation is based subject to copyright and, if so, who holds the copyright? I am not an expert on copyright, but fortunately, there are others who are. Charles Oppenheim (@CharlesOppenh) is a former professor of Information Science at Loughborough University. He has published on copyright and intellectual property rights, and advises companies and government organizations. A huge thanks to Dr. Oppenheim, who has patiently and generously answered all my questions.* With his permission, I have selectively reproduced portions of his answers below.
Let’s start with the issue of data ownership. Suppose the following scenario: A student has collected a set of microscope images, or a set of recordings in the form of binary files. The student places all the files on a hard drive, but only organizes them with respect to the date on which they were obtained. Are these data subject to copyright? In response to my query, Dr. Oppenheim answered:
…there is no copyright in individual facts, though there is copyright (in the USA and EU) in a COLLECTION of facts if there has been creativity in the selection and arrangement of the facts.
For those interested in the case law, this comes from the U.S. Supreme Court ruling in Feist v. Rural. Let’s examine this further, because it seems to me there is a lot of confusion regarding how this applies to research data. I have heard people say that research data incur copyright just by being put in ‘fixed form’. This is not true: fixation alone does not meet the minimal creativity requirement established by Feist. The Copyright Office’s Guidelines for Registration of Fact-Based Compilations state that the arrangement of facts within a compilation must “go beyond the mere mechanical grouping of data as such, for example, the alphabetical, chronological, or sequential listings of data” (quoted in Patry, 1990).** The files in our scenario, organized only by date, are exactly this kind of chronological listing, which is why they do not qualify. There is a separate issue of physical ownership of the hard drive; the student (and the PI, in many cases) cannot take the hard drive from the laboratory without permission. However, there is nothing a priori preventing the student from making a DVD copy of the raw data files and sharing those with others, because there is no copyright to infringe upon. There is one potential caveat. As Ian Holmes (@ianholmes) pointed out on Twitter, there may be cases in which students have signed a non-disclosure agreement (NDA), or their contract has a specific clause saying they cannot take copies of data and share them outside the laboratory. If this is the case, then the student is legally bound by the terms of the contract or NDA.
So, now we know some creativity in the compilation or presentation of the data must exist in order for the data to incur copyright. With specific reference to a student dissertation and the data compilation on which it is based, Dr. Oppenheim goes on to say:
I assume there has been such creativity, since no doubt the student and/or advisor decided which facts to present and which were uninteresting/irrelevant from a range of facts collected.
In other words, at the moment the student or advisor chooses a set of criteria by which to filter, rank, or otherwise selectively present the data, the resulting collection becomes subject to copyright. For most dissertations, this means that the underlying data compilation is copyrightable, since it is rarely the case that every piece of data is presented in the final written document without some element of creativity applied to the original data set.
Having established that the data comprising the dissertation are subject to copyright, who owns the rights? Dr. Oppenheim writes:
There are two possibilities – either the creator (i.e., the student – I am assuming the advisor did not prepare the collection), or the employer if the work was “work for hire”. … “Work for hire” (or “employee-created works” in EU law) are those works created by an employee who was paid to create those works. Unless the student had a contract of employment with the University, or the advisor, which stated “we will pay you so much, to do this work”, it is not a work for hire. It may well be that such a contract is embedded in the University regulations which the student signed up to when they started their research. Was there such a contract, or was it simply a grant without any strings attached? If a contract, then this is indeed a work for hire, and the University/advisor owns copyright in the outputs; If it was a grant, the student owns it.
Therefore, to determine who owns the rights to data in any particular case, we must know the terms under which the collection was created. If the conditions do not meet the requirements of ‘work for hire’, then the student owns the rights to the data and there would be no legal problem in publishing the data without the permission of the advisor. Let’s look at the more complicated case. In many universities, students sign a notice of appointment (NOA), which serves as an employment contract between the student and the university. I am not sure whether these NOAs typically state explicitly that the student is being paid to produce a data collection, but I imagine that in some cases the contract can be interpreted in these terms. If the data were collected under these terms, then the advisor/university would hold the rights to the data, and the student would be liable for copyright infringement in publishing the data without permission.
But is this where the story ends? I mentioned in my last post that there are cases in which the student holds copyright on the written dissertation. For example, some universities post student dissertations in their online repositories. These records show the student as the sole author, and may indicate that the author holds copyright. How do we reconcile this with what we have learned about data ownership? Dr. Oppenheim writes:
Whoever the original owner [of the data] was, that owner can choose to assign or licence the copyright in the work to anyone else – either by contract, or by custom and practice. …irrespective of whether the University/advisor did indeed initially own the copyright, they have assigned the student the copyright in the work by placing the material on the repository with that notice. In effect, the University has shot itself in the foot by doing that. To sum the position up in a nutshell: it is likely the University/advisor did own the copyright in the first place. But even if it did, it chose to grant the student that copyright. Ergo, the student is entitled to use any or all of the materials, including the data, in their thesis in any way they like, including writing it up for a journal article.
There you have it. The moment a university posts a dissertation with a notice saying the student author holds copyright, it forfeits any future claim to ownership and cannot legally prevent the student from publishing the work.
So far, we have only considered the legality of publishing without permission. For the next post in the series, I’d like to discuss ethical considerations and non-legal ramifications. We’ll look at particular case studies with very different outcomes. Coming soon…
1. Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991).
2. Patry, W. (1990). Copyright in compilations of facts (or why the white pages are not copyrightable). Communications and the Law, 12: 37.