
ILAW Conference > Life, the Internet and Everything > How Viral is the GPL? And How Clean is the Clean Room?


Now that you're back to your regularly scheduled life (and, no doubt, Internet), what if anything looks different in light of your participation last week?

After thinking about one particular discussion at the conference, I've become more convinced of the importance of introducing computer programmers and technical people to basic intellectual property concepts.  

The discussion about the "viral" nature of the GPL manifests a widespread misunderstanding about the nature and extent of copyright protection. I spoke briefly with Jason Matusow about his concern that working with GPL code "contaminates" one's own code. We agreed that the basis for the GPL provision that source code must be included with any distribution of modified GPL programs is that such a modification constitutes a "derivative work" as defined in the copyright law. I suggested that those concerned about overreaching of the GPL might have an overly broad interpretation of what constitutes a derivative work, and that the GPL could be circumvented in the following way.

Start with a GPL program.  Modify the source code to add some desired functionality.  Compile the code.  At this point, the GPL envisions that if you distribute your modified program, you must distribute your source code with it, because the compiled version of the modified code is a derivative work of the original GPL program.  Instead of distributing the compiled version of the modified code, however, you could write a patch program that would convert the compiled version of the original GPL program into the compiled version of the modified program.  The patch program, since it would contain no expression from the original GPL program and would thus not be a derivative work, could be distributed without source code.  
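
To make the patch idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names `make_patch` and `apply_patch`, the two-operation patch format, and the sample byte strings are all hypothetical, and real binary patch tools use more sophisticated formats. The sketch records "copy" operations as offsets into the original and "insert" operations as the new bytes themselves.

```python
from difflib import SequenceMatcher


def make_patch(original: bytes, modified: bytes) -> list:
    """Build a list of operations that turns `original` into `modified`."""
    ops = []
    matcher = SequenceMatcher(None, original, modified, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Refer to a span of the original by offset only; no bytes are stored.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Bytes that exist only in the modified build are stored verbatim.
            ops.append(("insert", modified[j1:j2]))
        # "delete": bytes of the original are skipped, so nothing is emitted.
    return ops


def apply_patch(original: bytes, ops: list) -> bytes:
    """Rebuild the modified program from the original plus the patch."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += original[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Stand-ins for the compiled original and the compiled modified program.
original = b"HEADER sort_list() FOOTER"
modified = b"HEADER sort_list() new_feature() FOOTER"

patch = make_patch(original, modified)
rebuilt = apply_patch(original, patch)
```

In this sketch the "copy" entries hold nothing but numbers, while the "insert" entries hold the bytes that are new in the modified build. Whether a patch of that shape counts as a derivative work is the legal question; the code only shows what such a patch would and would not physically contain.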

Of course, one could argue with this characterization of a derivative work, or with exactly what it means to include expression from a protected work. Still, Jason's response, "Isn't it a derivative work if you use the IDEA (emphasis added) from someone else's code?" manifests a fairly basic lack of understanding of copyright law. (See 17 USC 102(b): "In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.")

I'll grant that Jason, who is not a lawyer, can't be expected to understand all the nuances of the idea-expression dichotomy.  But as program manager of the Shared Source Initiative at Microsoft Corp., he should at least be aware that there is such a distinction in the statute.

The notion of "code contamination" is not unique to discussions of the GPL. The term "clean room" is used to describe a software development method for copying the functionality of proprietary software while avoiding infringement. (See, e.g., http://www.jli.com/services.htm#clean_room.) Technicians and programmers familiar with this approach may actually infer that it is the only way to avoid infringement. Clean room development may be cheaper in the long run than a protracted infringement suit, but an exclusive focus on it gives a distorted understanding of copyright law.

In effect, clean room development constitutes a safe harbor.  The ideas embodied in a computer program-- that is, the elements NOT covered by copyright-- include MORE than just the functionality of the program.  In the sense that software engineers comprehend functionality, function is WHAT a program does, without regard to HOW it is done.  For example, if a program's function is to sort a list, any of a number of algorithms could be used to achieve that end.  How the sorting algorithm works would not be considered part of the functional specification of the program.  Nevertheless, it is quite clear under 17 USC 102(b) that how an algorithm works is not copyrightable subject matter.
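
The sorting example can be sketched in Python as follows. This is an illustration only: the function names and the `meets_spec` check are hypothetical, and the two algorithms are simply textbook choices. Both functions satisfy the same functional specification (WHAT), "return the items in ascending order", while differing entirely in HOW they do it.

```python
def insertion_sort(items):
    """Sort by growing a sorted prefix one element at a time."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements right until the slot for `key` is found.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def merge_sort(items):
    """Sort by splitting in half, sorting each half, and merging."""
    items = list(items)
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def meets_spec(sort_fn, items):
    """The functional specification: output equals the items in ascending order."""
    return sort_fn(items) == sorted(items)


sample = [5, 2, 9, 1, 5, 6]
both_meet_spec = meets_spec(insertion_sort, sample) and meets_spec(merge_sort, sample)
```

A functional specification would be satisfied by either function; the choice between them is the "how" that the specification leaves open.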

Prof. Lessig suggested that the ambiguous reach of the GPL should be clarified.  In a broader vein, I think that technical education should include an introduction to intellectual property.  Policymakers would benefit greatly from the informed input of technically savvy individuals.

You raise several important issues:

First, you argue the "importance of introducing computer programmers and technical people to basic intellectual property concepts."  Although managers at some software companies have received training in this area, I agree that CS and MIS programs could and should include more of an introduction to this topic.  This may help, but it does not eliminate the lack of coherence in the law in this field.  The developers may become more knowledgeable...and more confused.

Second, you suggest that the fears over the "viral" nature of the GPL betray confusion about the "nature and extent of copyright protection." This may be true, and your reference to 17 USC 102(b) highlights some key exclusions from copyright protection, but I'm not sure your examples demonstrate the degree of freedom you suggest.

(a) Let's consider your scheme for circumventing the GPL's requirement to distribute source code. The idea of using a patch program (or its equivalent) has been proposed by others in discussions on how to break the GPL. See, for example, this excerpt from a March 2000 discussion on license-discuss@opensource.org (www.mail-archive.com/license-discuss@opensource.org/msg01488.html). It is not at all clear that the patch program scheme circumvents the requirements of the GPL (see the discussion in the above thread). Furthermore, unless your proprietary modifications or enhancements are trivial, almost certainly any patch executable (or patch file to be used with a patch program) would include evidence (once decompiled and/or decrypted) of data structures, calls, etc. that could clearly tie it to the derivative copy of the GPL-controlled software that you designed your patch program to recreate. At face value, I would argue that your patch program is a "work based on the program" (language of GPL) since it would almost certainly contain a portion of it, either verbatim or with modifications and/or translated into another language (language of GPL). In any case, since the GPL depends on the concept of derivative work in copyright law, the validity of your scheme under the GPL becomes an issue of whether the courts consider the patch program a derivative work. This is an interesting area that touches on a host of other concepts, such as software layering. Richard H. Stern has an interesting site on the "Mona Lisa With a Moustache" and derivative work issues (http://www.law.gwu.edu/facweb/claw/lhooq0.htm). I must also ask: why are we trying to circumvent the GPL? It seems a violation of the copyleft ideal even if it is not a technical violation of the GPL.

(b) Even the clean room may not always be the safe harbor you claim. The critical issue here is the specificity of the design document that the clean room programmers use. The safety of this approach would depend in part on applying (1) the test of Whelan v. Jaslow with its "gray area" between idea and expression and (2) the test of Computer Associates v. Altai with its abstraction and filtration phases. The key point here is that programmers don't go into the clean room without a specification. The desire for compatibility or interoperability with existing products may provide an incentive to replicate file layouts and screen designs, possibly resulting in a structure, sequence, and organization or "SSO" problem. (In Whelan, the two programs were written in different languages, so the dispute turned on structure rather than on literally copied code.) A specification with infringing specificity might taint the clean room work of the team.

Clarifying the "ambiguous reach of the GPL" sounds like a great topic for a session at a future iLaw conference!

Disclaimer: IANALawyer