[Opengenalliance] old bailey proceedings

Ben Brumfield benwbrum at gmail.com
Fri May 13 23:08:24 BST 2011


On Fri, May 13, 2011 at 4:23 PM, Guy Etchells <guy.etchells at virgin.net> wrote:
> On 13/05/2011 16:54, Javier Ruiz wrote:
>>
>> Thanks for the replies. The reason why I asked the original question is
>> because it seemed that photographs are clear but scans not. It only gets
>> greyer and farcical, at least in UK.
>>
>> Would it be worth challenging the copyright of scans of historical
>> public domain documents and clarify it once for all?
>>
>> Could the public domain aspect be considered in parallel with the
>> question of whether a scan is an artful photograph or a humble photocopy?
>>
>> Legal arguments aside. If ancient documents are digitised and then
>> locked back in their respective archives and the results of the exercise
>> are copyrighted, this is a loss for the public domain. Maybe we need to
>> distinguish between copies where the original remains and when the
>> analog copy is destroyed. I spoke to someone recently about the TNA
>> trying to burn IWW medal records after making microfiches, this would
>> mean the only records available would have been the new format. This
>> surely is some copyright reset.
>>
>> Then the next thing to clarify is whether the transcription of the text
>> in scans will owe some cumulative copyright to the images.
>>
>> I.e. can I just transcribe any originally public domain text from scans
>> if I can physically see them online or otherwise or can I be prevented
>> from doing so?
>>
>> Must I pay author rights to the owner of a scan if I transcribe and sell
>> a book with public domain information from scans?
>>
>> Then is clarifying the copyright of transcriptions.
>>
>> The copyright of the transcriptions themselves seems to depend on the
>> skill required, so that in theory a transcription of mediaeval texts
>> would be closer to copyright than modern type.
>>
>> What happens if I make a transcription that is identical letter by
>> letter to another transcription? The PRS in relation to folk music
>> transcriptions claims that identical transcriptions would have
>> independent copyrights:
>>
>> http://www.prsformusic.com/creators/wanttojoin/how_it_works/arrangements/Pages/arrangements.aspx
>>
>> How can this work in practice? Anyone can then copy and claim they have
>> an identical transcription. Maybe before computers this could be proved,
>> but it seems nonsense to me to try to sustain this argument in the
>> digital milieu.
>>
>> On another possible option, what if we OCR the scans and then corrected
>> the computer generated texts by looking up the existing transcriptions
>> only to check for some words? Or even better get a computer to do it?
>>
>> Then, if the transcriptions are organised and indexed, there is the
>> database right.
>>
>> Ben Laurie was mentioning the other day that with computer power
>> nowadays it may be easier to put certain materials in a single text file
>> and use search tools, rather than databases. Could we just dump the text
>> into a file this way without replicating the database structure and
>> bypass the database right if the database contents are public domain
>> texts?
>>
>> TBH it seems apparent that copyright law may be a bit of a dead end in the UK
>> as a tool to fight for digital versions of public records to remain in
>> the public domain, and this is mostly an issue of policy,
>> particularly as we are dealing with publicly funded institutions. It
>> would be good to exhaust all possibilities though.
>>
>> best, Javier
>>
>
> As someone who has transcribed records and digitises records I can see the
> argument from both sides.
>
> First, unlike in the USA, a large number of records in the UK are not in the
> Public Domain.
>
> It seems to me that many who demand free access to records forget that to
> provide records online costs money and even groups such as FreeBMD have
> benefited from the fact that commercial concerns have allowed them access to
> more than a few digital copies of the records they transcribe.
> Though I would like to see more records available online with free access we
> must remember that someone has to pay for that facility.
>
> If commercial concerns cannot claim copyright or, as many do these days, place
> conditions of use on their images and database, they will not digitise the
> records and we the public will be the losers.
>
> Rather than fight these companies it would be better to try to find common
> ground where there is a balance between commercial access and free access.
> Cheers
> Guy
>

I disagree with your statement that commercial concerns will not
digitize if they cannot claim copyright.  Although I'm not a lawyer,
I'm pretty familiar with American law on the subject, and I believe it
is far friendlier to digitization efforts than (what I understand of)
UK law.

Digitization by private organizations and through public/private
partnerships happens quite successfully under US law, even when the
documents, images, and database are not copyrightable.

Imagine a Public Domain document that is scanned, transcribed, and
indexed by a private concern.  An end user (who is a member of the
public with some sort of relationship to the concern) searches the
resulting database for the document with the goal of publishing it in
facsimile and excerpt.  How does this work if the concern can't claim
copyright?

1) The physical document is public domain.
2) The resulting scan of the document is public domain (thanks to
Bridgeman v. Corel).
3) The transcription--being itself a reproduction of the public domain
document--is also public domain; however, any mark-up like TEI
annotation may be copyrighted by the transcribing entity.
4) The index/database is public domain, as databases are not
copyrightable in the US.

However, nothing obligates the private concern to allow access to any
of 2-4.  That data is hosted on its own servers, and in order for
anyone to see it, they must enter into some sort of relationship with
the concern.  In this case, our end user has registered for an account
(perhaps even a free account), and accepted terms of service as a
condition of getting that account.  This places the relationship
between the user and the concern under a legally enforceable contract.

Generally those contracts impose restrictions on republishing their
data; Ancestry.com's TOS allows users to republish 200 documents per
year, for example.  So long as our user complies with these (again due
to contract law, not copyright enforcement), he may re-publish the
public-domain transcript as well as the facsimile of the scanned
document.

The only "hole" I'm aware of from the digitizing company's perspective
might be if a third party were to re-assemble all documents
re-published from the database with the purpose of assembling a
competing database.  I suspect that this would be legal--the images
and data being public domain and the third party having entered no
contract with the digitizing company--but I don't know of any cases
where it's happened.  The source databases are simply too large, and
the re-publications too sparse and fragmented for this to make sense.

There are certain incentives to the companies as well -- as a user, I
know that I will be able to do what I want with up to 200 documents
per year.  I am confident that once they are in my hands, I may treat
them like any other public domain document, and no digitizing
intermediary is going to show up and claim copyright over some link in
the chain.  This confidence makes me much happier to use a commercial
provider like Ancestry, and less willing to investigate or invest in
non-commercial digitization efforts like the one OGA is trying to
start.  In addition to the end-user confidence, public domain scans
and transcripts simplify operations within the digitizing company,
since there's no need to go back and ask mother-may-I whenever
previously-digitized materials need to be converted to new image
formats or re-edited according to new transcription conventions.

If you look at their stock price, Ancestry.com has been quite
successful under the US copyright regime.

Ben


