INNOVATION POLICY FOR THE KNOWLEDGE SOCIETY
ROCHELLE COOPER DREYFUSS
DIANE LEENHEER ZIMMERMAN
 Part I explores claims for expanding the boundaries of intellectual property law. There are two ways in which such claims sound. One is as a demand to enlarge the scope of standard forms of protection, copyright, patent and trademark law. Courts, for example, have been asked to interpret such rights broadly: to construe patent claims to cover inventions that go beyond the inventor's exact embodiment to encompass other advances made with the same insight; to hold that a copyright is infringed by paraphrasing and other nonliteral copying; to find that even noncompetitive, nonconfusing uses of trademarked material violate a trademark holder's rights. The second kind of claim is for new rights, for rights that protect innovations that are not creative enough to be covered by the standard regimes. Both sorts of claims are explored here.
F.M. Scherer, of the Kennedy School, makes an empirical argument for expanding the traditional regimes of copyright and patent law. Those who oppose strengthening protection argue that declines in the cost of developing and distributing information products mean that investments in innovative activities can be fully recouped through first mover advantages and existing protection. Professor Scherer argues, however, that recoupment is not an adequate measure of the return necessary to spur the optimum level of creativity. His examination of data in both the technological and artistic sectors demonstrates that in the creative industries, the likelihood of commercial failure is extraordinarily high and difficult to predict. Furthermore, risks cannot easily be laid off through diversification. As a result, the contours of the law must insure that successful products generate high returns, returns sufficient to compensate for these effects.
The other chapters in this Part deal with the issue of protecting subpatentable and subcopyrightable works. J.H. Reichman uses a hypothetical research project (an attempt to produce new kinds of tulips) as a way of illustrating the problem of providing incentives to develop incremental innovations. He advocates a new form of protection which, he claims, will reward innovation without undermining the ability of developers to cumulate marginal developments and build on one another's insights. Jane Ginsburg focuses on the question of database protection. Made up principally of public-domain facts, databases represent the quintessential noncopyrightable (or weakly copyrighted) work that is costly to create, yet easy to copy. She uses current proposals to demonstrate how a regime to protect such works can provide innovators with the benefits they seek while preserving a creative environment that allows others to continue to push forward the frontiers of knowledge.
F. M. SCHERER[*]
During the past several years Dietmar Harhoff (University of Munich) and I have been collecting and analyzing quantitative data on the size distribution of rewards realized by individuals and organizations carrying out technological innovations. From our new evidence and insights by other scholars on the economics of individuals' gambling behavior, it is possible to propose a theory of incentives for innovation in technology and other creative endeavors. Our findings provide support for a half-century-old obiter dictum by the renowned Austrian (and later Harvard University) economist Joseph A. Schumpeter. Characterizing the admixture of chance, ability, and energy that sustains innovation and technological progress in a capitalistic economy, Schumpeter wrote:
Spectacular prizes much greater than would have been necessary to call forth the particular effort are thrown to a small minority of winners, thus propelling much more efficaciously than a more equal and more 'just' distribution would, the activity of that large majority of businessmen who receive in return very modest compensation or nothing or less than nothing, and yet do their utmost because they have the big prizes before their eyes and overrate their chances of doing equally well.
Schumpeter's conjecture was based upon little more than casual empiricism. In this chapter I summarize the relevant new evidence and tread, perhaps too incautiously, into the realm of theory-building.
During the early 1960s, while seeking to understand the links between market structure and technological innovation, I happened onto some data from a small survey asking holders of US invention patents about the profitability of their inventions. Analyzing the profit data, I found their statistical distribution to be 'skew' and indeed highly skew, with most of the observations lying in the range of low profit values but 'with a very long tail into the high value side'. A simple test suggested that the data might conform to what is called a Pareto distribution, to whose peculiar properties the mathematician Benoit Mandelbrot was at the time directing attention.
A statistical size distribution is essentially a systematic array of numerical observations drawn through sampling, conventionally shown by plotting on the horizontal axis of a graph the range of observation values (eg, profits) and on the vertical axis the relative frequency with which those values occur in the sample. Figures 1.1(a) and 1.1(b) illustrate two such distributions created using my computer's random number generating program. The array in Figure 1.1(a) is typical of a 'normal' distribution. It resembles a symmetric bell-shaped curve, with the largest and smallest observation values occurring with much lower frequency than those in the middle of the distribution (ie, near the sample mean, or average value). Figure 1.1(b), on the other hand, reflects a particular kind of skew distribution known as a log normal distribution. By far the greatest mass of observations occurs at relatively low values, but the distribution has a long thin right-hand 'tail' of large values occurring with very low frequency.
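The contrast between the two shapes can be reproduced with any random number generator. The following is a minimal sketch, not the program used to draw Figure 1.1: the sample size, the seed, and the distribution parameters (mean 100 and standard deviation 15 for the normal case, log-scale parameters 0 and 1.5 for the log normal case) are illustrative assumptions. It compares the mean and the median of each sample; in a symmetric distribution the two nearly coincide, while a long right-hand tail pulls the mean well above the median.

```python
import random
import statistics


def describe(sample):
    """Return the (mean, median) of a sample."""
    return statistics.fmean(sample), statistics.median(sample)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 10_000              # assumed sample size

# Symmetric, bell-shaped sample (parameters are arbitrary assumptions).
normal_sample = [rng.gauss(100, 15) for _ in range(n)]

# Log normal sample: the logarithm of each value is normally distributed.
lognormal_sample = [rng.lognormvariate(0, 1.5) for _ in range(n)]

normal_mean, normal_median = describe(normal_sample)
skew_mean, skew_median = describe(lognormal_sample)

print(f"normal:     mean {normal_mean:8.2f}   median {normal_median:8.2f}")
print(f"log normal: mean {skew_mean:8.2f}   median {skew_median:8.2f}")
```

The gap between mean and median in the second line of output is the numerical counterpart of the long thin tail in the skew case.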
The length and thickness of the tail determine the degree of skewness characterizing a distribution. The longer the tail, the more skew the distribution is. Some general classes of skew distributions are more skew than others. The Pareto distribution I believed I had found in the 1960s for patented invention profits is more skew, ie, with a longer and fatter tail, than the typical log normal distribution illustrated in Figure 1.1(b). The differences between skew distributions can be characterized through a simple graph (Figure 1.2) popularized by Benoit Mandelbrot. The coordinates are logarithmic. On the horizontal axis, one identifies the values of the observations (eg, profits). The observations are ranked from most valuable to least valuable, and on the vertical axis, one plots the number of observations with values equal to or greater than any given horizontal axis value.
FIG. 1.1: Comparison of normal and (skew) log normal distributions
FIG. 1.2: Pareto plot of simulated log normal and Pareto distributions
The solid curve in Figure 1.2 traces the log normal distribution plotted in Figure 1.1(b). The most valuable observation (number 1 on the vertical scale) has a value of 570; 18 observations have values of 100 or more; and 172 observations have values of 10 or more. The dotted straight line shows a Pareto distribution fitted to the data underlying Figure 1.1(b). The simplest Pareto distributions plot as straight lines on doubly logarithmic graphs, which is why such graphs are called Pareto graphs. The dotted Pareto distribution line conforms fairly closely to the solid log normal distribution curve over the less valuable observations ranked from 100 to 1,000. However, it diverges ever more sharply from the downward-bending log normal distribution in the right-hand tail of high-value observations. Indeed, the most valuable log normal observation of 570 is only the 30th most valuable observation on the Pareto line. If the Pareto line were extrapolated linearly, one would find the value of the most valuable observation (number 1 on the vertical axis) to be 2.94 million, and the second most valuable observation (number 2 on the vertical) 520,000. True Pareto distributions have extremely large 'outliers' relative to the mass of observations.
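The construction of a Pareto graph can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reconstruction of Figure 1.2: the samples are freshly simulated, the Pareto tail index of 1.0 and the log normal parameters are arbitrary, and the helper names `pareto_coordinates` and `ols_slope` are hypothetical. It ranks the observations from most to least valuable, takes logarithms of value and rank, and fits straight lines. A Pareto sample should yield one roughly constant slope, whereas a log normal sample should be steeper in its right-hand tail than in its body, which is the downward bend described above.

```python
import math
import random


def pareto_coordinates(values):
    """Return (log10 value, log10 rank) pairs; rank 1 is the most valuable."""
    ordered = sorted(values, reverse=True)
    return [(math.log10(v), math.log10(rank))
            for rank, v in enumerate(ordered, start=1)]


def ols_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


rng = random.Random(1)  # fixed seed for reproducibility
n = 2000                # assumed sample size

pareto_sample = [rng.paretovariate(1.0) for _ in range(n)]        # assumed tail index
lognormal_sample = [rng.lognormvariate(0, 1.5) for _ in range(n)]  # assumed parameters

pareto_points = pareto_coordinates(pareto_sample)
lognormal_points = pareto_coordinates(lognormal_sample)

# One slope for the whole Pareto sample: it should be near -1, the negative
# of the assumed tail index.
pareto_slope = ols_slope(pareto_points)

# Two local slopes for the log normal sample: the 100 most valuable
# observations (right-hand tail) and the less valuable half (body).
lognormal_tail_slope = ols_slope(lognormal_points[:100])
lognormal_body_slope = ols_slope(lognormal_points[n // 2:])

print(f"Pareto slope:          {pareto_slope:.2f}")
print(f"log normal tail slope: {lognormal_tail_slope:.2f}")
print(f"log normal body slope: {lognormal_body_slope:.2f}")
```

Plotting the coordinate pairs with any graphing tool would give the picture itself; the slopes are simply a compact way of reading straightness or curvature off the same coordinates.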
Skew distributions exhibit ill-behaved sampling properties. Statisticians' 'law of large numbers' states that the mean (average value) of a sample of observations should converge ever more closely to some 'true' average value as one draws ever larger samples. But with skew distributions, this convergence operates at best slowly, and it may not happen at all. Pareto distributions are extreme in their non-conformity to the law of large numbers. Indeed, Pareto distributions with high but plausible degrees of skewness do not conform at all. Rather, as one draws ever larger samples, there is a chance that one will come up with an extremely large observation value (such as the 2.94 million value extrapolated from the Pareto distribution of Figure 1.2) that is so large relative to all previous observations that it 'blows' the sample mean to a much higher value. For the Pareto distribution illustrated in Figure 1.2, the sample average is 1,004 if the largest observation is excluded but rises to 2,947 if it augments the other 999 observations. What this means in practical terms is that it is very hard or maybe even impossible to secure stable average profit returns by pursuing portfolio strategies, eg, pooling many research and development projects into a portfolio.
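The sensitivity of the sample mean to a single observation is easy to demonstrate by simulation. The sketch below rests on assumptions of my own choosing: a simulated Pareto sample with tail index 0.4 (any index below 1 implies an infinite theoretical mean), a normal comparison sample, a sample size of 1,000, and a hypothetical helper named `mean_with_and_without_largest`. It does not use the data behind Figure 1.2. For each sample it reports the mean of all observations, the mean once the largest is dropped, and the share of the sample total contributed by that one observation.

```python
import random


def mean_with_and_without_largest(sample):
    """Return (mean of all, mean excluding the largest, largest / total)."""
    total = sum(sample)
    largest = max(sample)
    n = len(sample)
    return total / n, (total - largest) / (n - 1), largest / total


rng = random.Random(2)  # fixed seed for reproducibility
n = 1000                # assumed sample size

# Assumed tail index of 0.4: a very skew Pareto case with no finite mean.
pareto_sample = [rng.paretovariate(0.4) for _ in range(n)]
# Well-behaved comparison sample (parameters are arbitrary assumptions).
normal_sample = [rng.gauss(100, 15) for _ in range(n)]

pareto_all, pareto_rest, pareto_share = mean_with_and_without_largest(pareto_sample)
normal_all, normal_rest, normal_share = mean_with_and_without_largest(normal_sample)

print(f"Pareto: mean {pareto_all:,.0f}, without largest {pareto_rest:,.0f}, "
      f"largest observation's share of total {pareto_share:.1%}")
print(f"normal: mean {normal_all:,.1f}, without largest {normal_rest:,.1f}, "
      f"largest observation's share of total {normal_share:.2%}")
```

In the normal sample the largest observation is a negligible part of the total, so pooling works; in the Pareto sample one draw can carry a large part of the entire portfolio.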
My discovery that the distribution of profits associated with patented inventions might exhibit sampling instabilities like those described in the previous paragraph has been gnawing at me for three decades. My interest was sustained in part by others' findings, from studies of the rates at which fees are paid to keep patents in force in some national jurisdictions, that the distribution of patent values is indeed highly skew. Returning to the scene of the crime, I joined Dietmar Harhoff to conduct surveys ascertaining through direct queries the estimated value of 776 German and 222 US inventions on which German patent protection was maintained for a full 18-year term following applications filed in 1977. We confirmed that even for patents considered valuable enough to warrant paying German renewal fees escalating over time to a cumulative total of DM 16,075 (roughly $6,825 at 1977 exchange rates), most of the inventions are of only modest value, while a few yield blockbuster rewards. Among the 776 German patents (already self-selected from the seven times larger cohort of contemporary patents not renewed to full term), the most valuable 78 (ie, the top ten percent) accounted for an estimated 88 percent of all 776 patents' economic value to their holders. For the smaller but even more select group of US inventions, which were patented both in Germany and the United States, the top 22 accounted for 81 to 85 percent of the 222 patents' value. The best of the best conferred economic prizes measured in the billions of dollars or Deutschmarks.
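The 'top ten percent' statistic used here and throughout the chapter is a simple calculation once individual values are in hand. The sketch below shows one way to compute it; the function names `top_count` and `top_share` and the list of 20 patent values are hypothetical, invented purely for illustration, and are not survey data.

```python
def top_count(sample_size, fraction=0.10):
    """Number of observations in the top fraction, rounded to a whole count."""
    return max(1, round(sample_size * fraction))


def top_share(values, fraction=0.10):
    """Share of the sample total contributed by the most valuable fraction."""
    ordered = sorted(values, reverse=True)
    k = top_count(len(ordered), fraction)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical values for 20 patents (arbitrary units, not survey data).
hypothetical_values = [1000, 400, 50, 40, 30, 30, 20, 20, 20, 10,
                       10, 10, 10, 10, 5, 5, 5, 5, 5, 5]

share = top_share(hypothetical_values)
print(f"top {top_count(len(hypothetical_values))} of "
      f"{len(hypothetical_values)} patents hold {share:.1%} of total value")

# The rounding rule reproduces the counts quoted in the text:
# 78 of the 776 German patents and 22 of the 222 US patents.
print(top_count(776), top_count(222))
```

With the hypothetical list, the two most valuable patents hold 1,400 of 1,690 units, or about 83 percent of the total.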
Figure 1.3 plots the German patent value distribution on doubly logarithmic (ie, Pareto) coordinates. If the size distribution were Paretian, it would plot as a straight line on the chosen coordinates. In fact, downward concavity reveals that there is less skewness than for the Pareto case. A statistical analysis provides the strongest support for a log normal distribution like the solid-line distribution in Figure 1.2.
FIG. 1.3: Plot of German renewed patent values on Pareto coordinates [NOT SHOWN]
In a parallel survey of the license royalties obtained by six research-oriented US universities on their patent portfolios during four years of the early 1990s, I found that a single bundle, comprising the three Cohen-Boyer gene splicing patents, yielded 24 to 33 percent of the total royalties obtained from 350 to 486 individual bundles of licensed technology. The top six bundles, ie, making up one to two percent of the sample members by number, generated from 66 to 76 percent of total sample patent royalties.
Confirmation of the rewards skewness phenomenon comes from information on the discounted present value of quasi-rents (ie, the discounted surplus of sales revenues over estimated production and marketing outlays) realized by companies from new pharmaceutical chemical entities marketed during the 1970s and early 1980s in the United States following Food and Drug Administration approval. The most profitable 10 percent of those product introductions contributed 48 to 55 percent of total sample quasi-rents. Among eight sets of data on which reward size distribution data were obtained, the two drug quasi-rent samples exhibited the least skewness. Their distributions were much less skew than Pareto distributions and somewhat less than the log normal distribution, but more than plausible alternative skew distributions such as the negative binomial or the Weibull.
Further insights emerge from evidence on the outcomes of investments in that most dynamic of US industrial sectors, new high-technology ventures. Surveys of 1,053 investments by venture capital funds in individual venture capital targets reveal a highly skew distribution of outcomes. Some 59 to 62 percent of the ultimately realized returns came from the most successful 10 percent of investments by number. A Federal Reserve Board staff study of 225 venture capital partnerships formed between 1980 and 1986 revealed a weighted-average internal rate of return on investment as of 1993 averaging 7.95 percent, with a weighted-average median of 5.58 percent. Finding the mean in excess of the median or mid-sample value, which occurred for every individual year of fund formation as well as in the aggregate, is another indication of distribution skewness. So also is the high variability over time of median rates of return, ranging from 1.6 percent (for partnerships organized in 1981) to 13.2 percent (for 1980-vintage funds). When the distribution of outcomes is skew, as we have seen, atrophied operation of the law of large numbers makes it hard to diversify away random sampling variability by forming portfolios of numerous investments.
To sharpen insights on the performance of new high-technology ventures, an exhaustive sample was drawn of venture fund-backed companies that floated initial public common stock offerings (IPOs) in the 1983-86 period and that operated in specified high-technology fields. Typically, new companies supported by venture capital partnerships attempt IPOs only after they have made appreciable progress toward achieving technical and market success, and so they tend to be more mature investments than those discussed in the previous paragraph. A hypothetical first-day investment of $1,000 was made in the common stock of the 110 sample companies whose stock continued to be traded on a public market after the IPO date. Dividends were reinvested and stock splits were tallied. By the end of 1995, 52 of the 110 companies remained as independent entities. The value of their stocks as of 31 December 1995, along with proceeds (reinvested temporarily in the NASDAQ index) from the liquidation of drop-out company stocks, was as follows:
52 surviving companies $417,002
23 acquired companies $96,400
35 delisted companies $21,178
Total terminal value $534,580
Had the same $110,000 been invested in the NASDAQ index in January 1983, investors would have had $501,908. Thus, investors in our sample of high-technology companies increased the value of their initial investment by a substantial multiple, earning on average a 12.2 percent return on their investment, but fared only slightly (and not statistically significantly) better than investors in a complete portfolio of NASDAQ company stocks.
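The arithmetic behind this comparison can be checked directly from the figures quoted above. The short sketch below only restates those figures; the variable names are my own, and nothing in it is new data.

```python
# Terminal values at the end of 1995, as quoted in the text (US dollars).
terminal_values = {
    "52 surviving companies": 417_002,
    "23 acquired companies": 96_400,
    "35 delisted companies": 21_178,
}

initial_investment = 110 * 1_000  # $1,000 in each of the 110 sample companies
nasdaq_terminal_value = 501_908   # same sum placed in the NASDAQ index

total_terminal_value = sum(terminal_values.values())
sample_multiple = total_terminal_value / initial_investment
nasdaq_multiple = nasdaq_terminal_value / initial_investment
relative_advantage = total_terminal_value / nasdaq_terminal_value

print(f"total terminal value:  ${total_terminal_value:,}")
print(f"sample multiple:       {sample_multiple:.2f}x")
print(f"NASDAQ multiple:       {nasdaq_multiple:.2f}x")
print(f"sample versus NASDAQ:  {relative_advantage - 1:.1%} higher")
```

The components sum to the stated total of $534,580, a multiple of roughly 4.9 on the initial stake against roughly 4.6 for the index, which is the 'only slightly better' outcome described in the text.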
For our present purposes, the key point is this: as the years progressed from the time of the initial investments analyzed here, the distribution of outcomes became more and more skew. By the end of 1995, the best-performing 11 companies (10 percent by number of the initial sample) contributed 62 percent of the investors' total portfolio value. Figure 1.4 shows how the value of individual company investments changed over time. It is limited to 10 companies, including the five most successful full-term survivors and five others selected randomly. The share values for the random choices cluster so tightly below $2,000 that they are largely indistinguishable. Had one invested $1,000 in Adobe Systems, Concord Computing, and Amgen, on the other hand, one would have had shares valued at $77,565, $74,130, and $55,980 respectively by the end of 1995. The cross-sectional distribution of full-term survivors as of December 1995, plotted in Figure 1.5 on doubly logarithmic coordinates like those used in Figures 1.2 and 1.3, exhibits a concave log normal scatter of values not unlike the one observed for German patents, except that there is no single extremely large outlying value. Skew-distributed samples commonly exhibit such erratic behavior in their right-hand tail.
 FIG. 1.4: Evolution of the value of $1,000 investments in 10 IPOs
FIG. 1.5: Pareto plot of 52 1983-86 IPO investments' value in December 1995
Thus, we conclude that there are striking regularities in the distribution of rewards to technological innovation. A minority of 'spectacular winners' appropriate the lion's share of total rewards, as Schumpeter predicted. The size distribution of rewards is highly skew, with a long right-hand tail. The distributions do not appear to be Paretian, contrary to my 1965 hypothesis; the log normal distribution characteristically provides a better fit. The rewards from individual patents exhibit more skewness, ie, the top 10 percent capture a higher total share of payoffs, than the rewards from innovations such as new drugs, which may be covered by numerous product and process patents, or investments in high-technology startup companies, which sometimes market multiple innovations and hence average out some of the variability associated with individual innovations.
Not all innovation is technological. Innovation also occurs in cultural domains, eg, in the composition or performing of music, the production of motion pictures, and the writing of books. We supplement our insights here by analyzing fragmentary data from the first two of these branches.
From Billboard magazine, data were obtained on the US sales of the 70 bestselling popular music albums and single recordings in 1997. The best-selling album, with sales of 5.3 million estimated by sampling, was by the Spice Girls; the best-selling single, with sales of 8.1 million units, was Elton John's 'Candle in the Wind', commemorating the death of Britain's Princess Diana. It is known (also from Billboard) that approximately 613 million music albums and 134 million singles records were sold in 1997. The best-selling album accounted for 0.86 percent of total album sales; the best-selling 70 albums (all with sales of 1 million or more) for 21.0 percent. The best-selling single recording gained 6.1 percent of singles sales; the 70 best-sellers (with sales of 500,000 or more) 55.0 percent of singles sales. It is not known how many albums and singles were offered in total during 1997. A crude extrapolation from the (nearly log normal) distribution of data on the top 70 albums suggests that 1,000 albums accounted for 60 percent of total album sales, with average sales for individual albums in the bottom decile of this best-selling group averaging approximately 123,000. A similar extrapolation reveals (with somewhat larger uncertainty) that 800 records sufficed to account for 98 percent of total singles sales, with the least successful hundred of those 800 averaging sales of roughly 10,000 each. Even though individual winners did not capture the lion's share of sales, the top 10 winners surpassed the sales of records ranked 800th in the distribution by approximately 25 times on average for albums and 250 times for singles. Clearly, it is much more lucrative to be a top winner.
Figure 1.6 plots on doubly logarithmic coordinates the size distribution of sales for the top 70 records of 1997. For both albums and singles, some concavity is evident, although the extreme 'Candle in the Wind' outlier reverses the curvature for singles, just as an extreme value does in the distribution of German patents (Figure 1.3).
 FIG. 1.6: Pareto plot of leading popular record sales in 1997
These statistics are for individual records. Another article in Billboard arrays the cumulative total record sales outside the United States over 50 years through December 1997 for the 48 most successful recording groups affiliated with one of the leading popular record companies, Atlantic Records. The highest-selling group, Led Zeppelin, achieved career sales of 29.6 million records, more than twice those of the second-ranked group and 34 times those of the 48th-ranked artist. Led Zeppelin's sales comprised 17.6 percent of the sales of all 48 leading groups. This and other evidence reveals more skewness in career record sales than for the distributions across individual records, but less than what was observed for any group of technological innovations.
The success of classical music composers, tallied not in lifetime financial rewards but in the attention posterity has paid them, also appears to follow a skew distribution. From the Schwann Reference Guide to Classical Music for Fall 1996, the number of records currently available was estimated for 686 composers born between 1650 and 1840. The measure used was linear centimeters of record listings, with five lines of type per centimeter on average and the typical record entry running from one to three lines. A list of the eight leaders yields no major surprises: 
W. A. Mozart 1,656 cm
L. Beethoven 1,262 cm
J. S. Bach 1,190 cm
J. Brahms 644 cm
P. I. Tchaikovsky 568 cm
F. Schubert 553 cm
F. Chopin 497 cm
F. J. Haydn 461 cm
Mozart alone commands 11.3 percent of the recordings; the top eight composers 46.8 percent, and the 69 composers comprising the top 10 percent by number 87.8 percent. These values, reflecting a high degree of skewness, are broadly consistent with those observed for the profits or royalties from technological innovations. Figure 1.7 plots the distribution function on doubly logarithmic coordinates. For the vast majority of composers, it is close to a straight line, consistent with a Pareto distribution, departing from the Pareto form and bending downward only for 13 leaders.
FIG. 1.7: Pareto plot of 686 composers' recordings available in 1996 (composers born between 1650 and 1840)
Arthur DeVany and W. D. Walls compiled statistics on the distribution of revenues for motion pictures (most of them readily forgotten) appearing among the Top 50 list in Variety magazine between May 1985 and January 1986. Over the entire nine-month period surveyed, the top 10 percent of market-leading films by number generated approximately 60 percent of total sample revenues. Again, a skew distribution is revealed. The distribution was strongly concave relative to doubly logarithmic coordinates and hence inconsistent with a Pareto law, but more skew than a log normal distribution.
From the compilation of evidence reviewed here, it would appear that Schumpeter was correct at least statistically: the big prizes from innovation are thrown to a small minority of winners, while the majority of innovative efforts confer only modest rewards. For new drug chemical entities and investments in high-technology startups, on which the evidence is most complete, the median project yields less than the capitalized cost of capital funds invested, but losses on the majority of projects are more than offset by gains from the most successful projects. We ask now whether Schumpeter might also have been correct in his conjecture that a skew distribution of rewards motivates innovative activity 'more efficaciously' than a more 'just' system of rewards, which presumably would bestow returns more or less closely proportional to the investments seeking them.
Skew reward distributions are particularly risky because it is difficult to make random deviations cancel each other out and converge on some stable average value by forming diversified portfolios containing numerous investments. Most of received investment theory assumes that investors are risk-averse. If innovation requires investment and investors are risk-averse, how can a highly skew distribution of rewards be conducive to innovation?
The assumption that investors are risk-averse is difficult to square with the widespread and long-standing popularity of sweepstakes lotteries in the United States and other nations. The larger in absolute value the prize, captured at infinitesimal odds, the more enthusiastic individuals' participation appears to be. With an actuarial value of payoffs well below the sum of players' bets (eg, for state sweepstakes lotteries, roughly half the stakes on average), such lotteries, like most other forms of gambling, are not 'fair' gambles. Yet people flock to play.
Investing in high-technology startup companies is similar to (but less extreme than) lotteries in the skewness of rewards and the relatively low probability (well below 0.1) of a really big payoff. But they are strikingly different in the sense that venture investment tends, given reasonable diligence, to yield discounted returns exceeding actuarially the value of the stakes invested. Thus, at least historically, they have been 'fair' gambles. Perhaps that is difference enough to distinguish between buying sweepstakes tickets and investing in the innovation lottery. (Whether high-technology stock returns will continue to exceed investments in the future as more and more naive investors plunge into the high-technology game remains to be seen.) But let us probe further to see whether more can be discerned.
That people embrace high-stakes gambles but engage more or less simultaneously in the risk-averse behavior associated with paying an administrative premium for insurance has long fascinated economists. In a seminal paper, Milton Friedman and L. J. Savage showed that a consumer may rationally choose both to buy insurance against risks that would reduce her wealth and accept unfair gambles on the off chance of a sizeable wealth increase. Their argument is illustrated in Figure 1.8, which assumes that 'utility' (whose measurement problems need not detain us) is derived from consumption and that consumption is facilitated by wealth. Friedman and Savage postulated that the utility function, ie, the mathematical relationship between an individual's utility and her wealth, has the peculiar ogive shape shown by the heavy solid line U(W) in Figure 1.8. Suppose the consumer's initial wealth position is W0, yielding (read horizontally over to the vertical axis) utility realization U0. Now suppose the consumer can invest in a 'fair' lottery ticket whose cost, if with relatively high probability she fails to draw the winning combination, leaves her with lower wealth WL. But if she wins the lottery (with low probability), she is propelled to much higher wealth Ww (yielding correspondingly higher utility Uw). The straight line segment ACB shows a locus of expected utility values given by the probability of winning Pw times the utility Uw from winning plus (1 - Pw) times the utility UL from being at the loser's wealth level WL. If the odds are such that the probability-weighted expected wealth outcome is W and the expected utility (read horizontally from point C on the expected value line ACB to the vertical axis) of that outcome is exactly the same as the utility U0 from not entering the lottery and remaining with certainty at wealth level W0, then our consumer will be indifferent between playing and not playing.
But if the 'cost' of a lottery ticket (eg, in the form of less than 'fair' odds) is less than W0 - W, so that the probability-weighted expected value of the consumer's post-lottery wealth is greater than W, the consumer will realize higher expected utility participating in the lottery than not participating. This happens because substantial increases in wealth yield disproportionately large increases in utility (ie, marginal utility is increasing with increases in wealth), outweighing the actuarial value of the modest decrease in utility associated with buying a ticket and losing. By similar reasoning, it can be shown that from an initial wealth position W0, the consumer faced with some small probability of losing much of her wealth (eg, because her house burns down) but able to insure against that contingency by paying a premium for insurance will, within some range of insurance costs, choose to buy the insurance. The reason is that, in the concave-downward (low-wealth) segment of the consumer's utility function, wealth losses entail disproportionately large sacrifices of utility, the avoidance of which is worth paying more than the actuarial value of insurance.
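A numerical illustration may make the argument concrete. The sketch below is built entirely on assumptions: the cubic utility function (concave below a wealth level of 100, convex above it), the initial wealth of 80, and every probability, premium, ticket price, and prize are invented for the example and are not taken from Friedman and Savage or from Figure 1.8. It evaluates expected utility, Pw times Uw plus (1 - Pw) times UL, for an actuarially unfavorable insurance policy and an actuarially unfavorable lottery ticket, and shows that the same consumer prefers to buy both.

```python
def utility(wealth, inflection=100.0, scale=10_000.0):
    """Assumed utility function: concave below `inflection`, convex above.

    Its slope, 1 + 3 * (wealth - inflection)**2 / scale, is always positive,
    so more wealth always means more utility.
    """
    return wealth + (wealth - inflection) ** 3 / scale


def expected_utility(outcomes):
    """Probability-weighted utility of a list of (probability, wealth) pairs."""
    return sum(p * utility(w) for p, w in outcomes)


def expected_wealth(outcomes):
    """Probability-weighted wealth of a list of (probability, wealth) pairs."""
    return sum(p * w for p, w in outcomes)


initial_wealth = 80.0  # assumed W0, inside the concave segment

# Insurance: a 5 percent chance of losing 60, or a certain premium of 4.
# The premium exceeds the actuarial value of the loss (0.05 * 60 = 3).
uninsured = [(0.95, initial_wealth), (0.05, initial_wealth - 60.0)]
insured = [(1.0, initial_wealth - 4.0)]

# Lottery: a ticket costing 5 wins 300 with probability 0.01.
# Expected winnings (3) are less than the ticket price, so the bet is unfair.
no_ticket = [(1.0, initial_wealth)]
ticket = [(0.99, initial_wealth - 5.0), (0.01, initial_wealth - 5.0 + 300.0)]

buys_insurance = expected_utility(insured) > expected_utility(uninsured)
plays_lottery = expected_utility(ticket) > expected_utility(no_ticket)

print(f"insured {expected_utility(insured):.4f} vs "
      f"uninsured {expected_utility(uninsured):.4f}: buys = {buys_insurance}")
print(f"ticket {expected_utility(ticket):.4f} vs "
      f"no ticket {expected_utility(no_ticket):.4f}: plays = {plays_lottery}")
```

Under these assumed numbers both choices lower expected wealth (from 77 to 76 for insurance, and from 80 to 78 for the lottery) yet raise expected utility, which is the coexistence of insuring and gambling that the ogive shape is meant to explain.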
FIG. 1.8: Utility function consistent with buying insurance and betting in lotteries
 Friedman and Savage speculate that real-world individuals' utility functions often have upward-bending curvature at wealth levels much higher than those currently attained, implying increasing marginal utility of wealth, because a big wealth increase moves the consumer to a new, much higher material and social status. Such a large change in effect fulfills their optimistic, or perhaps wildly optimistic, dreams. Certainly, something like that explains the propensity of lower-income citizens to spend disproportionate amounts of their income playing the lottery, in effect gambling on an escape from poverty. It is also consistent with anecdotal evidence about the motives of high-technology firm entrepreneurs, who see a successful startup as the principal means of becoming truly wealthy. In this respect, sweepstakes players and technological entrepreneurs may be more similar than their conventional demographic characteristics imply.
Lotteries, of course, are not the only form of gambling to offer the possibility of large payoffs at low odds. Horse racing provides a rich environment for analyzing risky choices. There is evidence that the actuarial returns from bets on 'long shots' tend to be lower than on better-rated horses, suggesting (given the way parimutuel odds are set) that some bettors have a positive preference for long shots. Noting that the same individuals place bets on multiple races and often bet on more than one horse per race (portfolio strategies that pool and hence reduce risks, conventionally defined), Golec and Tamarkin question previous studies' inference that such bettors are risk lovers. They argue that one must distinguish between two commonly confounded aspects of risk: the variance (technically, the second moment of a statistical distribution), which measures the average variability of sample observations around their mean, and the skewness (the third moment), which, as we have seen, measures asymmetries associated with very long tails on one side of the distribution. Golec and Tamarkin test the hypothesis that it is skewness (the third moment), not variance (the second moment, and the more conventional measure of risk) that the long-shot bettors embrace. In a statistical analysis of 2,309 races, they find persuasive evidence that bettors are simultaneously variance-averse and skewness-loving.
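The distinction between the second and third moments can be made concrete with two bets that share the same mean and the same variance but differ in skewness. The payoffs and probabilities below are hypothetical, chosen only so that the first two moments match exactly; they are not drawn from Golec and Tamarkin's racing data, and the function name `moments` is my own.

```python
def moments(outcomes):
    """Return (mean, variance, skewness) for (probability, payoff) pairs.

    Variance is the second central moment; skewness is the third central
    moment divided by the standard deviation cubed.
    """
    mean = sum(p * x for p, x in outcomes)
    variance = sum(p * (x - mean) ** 2 for p, x in outcomes)
    third = sum(p * (x - mean) ** 3 for p, x in outcomes)
    return mean, variance, third / variance ** 1.5


# Even-money bet: win 1 or lose 1 with equal probability.
even_bet = [(0.5, -1.0), (0.5, 1.0)]

# Long shot: lose 1/3 nine times in ten, win 3 one time in ten.
long_shot = [(0.9, -1.0 / 3.0), (0.1, 3.0)]

even_mean, even_variance, even_skewness = moments(even_bet)
long_mean, long_variance, long_skewness = moments(long_shot)

print(f"even bet:  mean {even_mean:.3f}, variance {even_variance:.3f}, "
      f"skewness {even_skewness:.3f}")
print(f"long shot: mean {long_mean:.3f}, variance {long_variance:.3f}, "
      f"skewness {long_skewness:.3f}")
```

A bettor who cared only about mean and variance would be indifferent between these two bets; a preference for the long shot must therefore reflect a taste for skewness rather than for risk as conventionally measured.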
This discovery, which is consistent with the utility function shape exhibited in Figure 1.8, might equally well explain the behavior of high-technology entrepreneurs, inventors, and (to the extent that financial motives play a role) creators of popular culture. In high technology, those who commit their fortunes and most of their waking hours to pioneering a new venture forego the option of placing multiple bets enjoyed by horse race bettors. But high technology 'angels', like horse players, simultaneously bet on long shots and pursue variance-reducing multi-investment portfolio strategies. At this stage, the proposition that those who invest in high technology derive positive utility from the skewness of rewards is offered as a hypothesis, not as a demonstrated phenomenon. The most one can say is that it is intuitively plausible. Testing it rigorously is likely to be more difficult than with horse racing, since all of the skewness evidence presented in this chapter was derived ex post, and it is unclear whether the ex ante skewness potential of specific high technology investments or classes of bets can be measured empirically.
In the evidence that the rewards from innovation are skew-distributed and that at least some risk-takers are skewness lovers there are implications for intellectual property policies. However, several caveats need to be recognized.
For one, especially in the arts but to some extent also in the realm of technology, creative activity is often driven by non-pecuniary motives. To be sure, an artist, author, or inventor must keep body and soul together, but as long as that condition is met and the incremental costs of creation are not large, the uncertain prospect of spectacular payoffs may be more of a 'nice to have' fringe benefit than a necessary incentive. On this, preferences undoubtedly differ. Franz Schubert first composed a work for money at the age of 19. Ten to 12 years later he was still offering publishers his work for 'a moderate remuneration'. Beethoven, on the other hand, was notoriously avaricious, playing publishers off against one another to gain desired publication fees for specific compositions and segmenting his markets geographically, with different publications for England, France, and the various German-speaking territories. Franz Joseph Haydn toiled most of his career on a handsome salary that provided no special incentive compensation for his prodigious creative output, but went for the big prizes when his patron Prince Nicolaus Esterhazy died and he was invited to London by impresario Johann Peter Salomon. Aaron Copland is said to have testified in a deposition on copyright royalties that he would pay people to listen to his music. As an author who has often purchased copies of my work at appreciable out-of-pocket cost and distributed them free to colleagues, I empathize. My daughter helps support her six children with the modest fees she receives playing bluegrass fiddle professionally, but dreams of the affluence a hit record would bring.
Although individual high-technology entrepreneurs may, like racetrack bettors and sweepstakes lottery players, be skewness-lovers, it seems less likely that decision-making in well-established corporate organizations conforms to the hypothesis. For the employed inventor, a particularly successful contribution can bring a sizeable bonus and promotion to a higher income bracket, but nothing remotely approximating the rewards populating the right-hand tail of the profit distributions Dietmar Harhoff and I have been surveying. For the research and development manager who drives an invention to commercial success, the fraction of benefits appropriated personally is similarly modest. When corporate hierarchies must decide whether or not to invest substantial sums on the commercial development of an invention, risk aversion, not skewness affinity, is almost surely the behavioral norm. Such motivational differences may have much to do with the propensity observed during the first half of the twentieth century for a disproportionate share of the boldest technological innovations to originate outside the laboratories of large corporations, and, more recently, for high-technology startups to be among the most prolific contributors to US technological dynamism.
To the extent that investments in technological and artistic creation are motivated by the long-shot hope of a very large reward, intellectual property policies should sustain and reinforce that incentive system, not undermine it. This implies a role for strong patents and copyrights. But from that basic precept two more nuanced implications can be drawn.
First and most obvious, patents and copyrights ought not to be revoked or weakened simply because an innovator has made 'too much' money from his creation, for the prospect of a large reward is a crucial feature of the skewness-based incentive system. This does not mean that any weakening of intellectual property rights is necessarily counter-productive, for an expectation that patents or copyrights can be enforced fully has more relevance for some actors (notably, those who conform to the skewness-loving paradigm) than for others. There is compelling evidence that patent rights are not very important as a prerequisite for research and development investments in many (but not all) well-established industries. The early market occupancy, reputational, and learning curve advantages of being a first mover are in many instances much more important. Because of these non-patent advantages, the enforcement of compulsory patent licensing in more than 100 US antitrust settlements does not appear to have had much, if any, adverse impact on the target companies' investments in research and development. Fine-tuning is needed in remedying antitrust abuses and avoiding impediments to further innovation caused when one inventor has a powerful position blocking the advances of others who could otherwise make significant improvement inventions. In particular, policymakers should take care to avoid the early erosion of rights for that class of innovators or creators for whom the uncertain hope of large rewards was a plausibly significant motivational factor.
A second implication follows. Although exceptions exist, there is reason to believe that the enforcement of intellectual property rights is biased in favor of large, well-established organizations, whose behavior conforms least well to the skewness paradigm, and against the independent innovators who conform most closely. David does occasionally slay Goliath in the courtroom, but more often than not, a well-heeled corporation can afford to persevere in costly litigation until the financially weaker party submits, if not in absolute defeat, then in a settlement that yields less than the spectacular reward that might otherwise have been achieved by an independent innovator. If the skewness hypothesis is anywhere near correct, there ought to be policy adjustments to remedy the imbalance.
There may also be implications for international economic policy. Just as the larger payoffs available with statewide (and perhaps also nationwide) lotteries encourage more intensive participation than smaller city-based lotteries, the larger rewards attainable by selling innovative products in a global marketplace probably strengthen incentives for both skewness-loving and risk-averse actors. There is much to be said therefore for continuing efforts to harmonize national intellectual property systems and, as under the single-filing-locus system endorsed by many European Union nations, to reduce the costs of securing protection outside one's home nation.
[*] Aetna Professor of Public Policy Emeritus, John F. Kennedy School of Government, Harvard University; visiting professor, Princeton University.
 JOSEPH A. SCHUMPETER, CAPITALISM, SOCIALISM, AND DEMOCRACY 73-74 (1942).
 F. M. Scherer, Firm Size, Market Structure, Opportunity, and the Output of Patented Inventions, 55 AM. ECON. REV. 1098 (1965).
 Benoit Mandelbrot, New Methods in Statistical Economics, 71 J. POL. ECON. 421 (1963).
 It is called log normal because if one takes logarithms of the observation values, the logarithmic values exhibit a symmetric normal distribution pattern. Indeed, Figure 1.1(b) was created by taking the 1,000 values used in Figure 1.1(a) (from a random number program generating normally distributed values with mean zero and a standard deviation of 1) and using them as exponents in the expression $10^n$, where $n$ is the value of a Figure 1.1(a) observation. Since the logarithm to base 10 of $10^n$ is $n$, the distribution of $n$ is normal while $10^n$ is skew-distributed. The mean (average) value of the distribution in Figure 1.1(b) is 10.50; the median value is 1.11.
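 The construction described in this note can be sketched in a few lines. This is a minimal illustration, not the program used to produce Figure 1.1: the random number generator and the seed are arbitrary assumptions, so the resulting mean and median will differ from the values reported above.

```python
import random
from statistics import fmean, median

# Assumption: Python's built-in generator with an arbitrary seed stands in
# for the unspecified random number program mentioned in the note.
rng = random.Random(0)

# Step 1: 1,000 normally distributed values, mean 0, standard deviation 1.
normal_values = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Step 2: use each value n as an exponent, giving 10**n.
skewed_values = [10 ** n for n in normal_values]

normal_mean, normal_median = fmean(normal_values), median(normal_values)
skew_mean, skew_median = fmean(skewed_values), median(skewed_values)

print(f"normal values:     mean={normal_mean:.2f}  median={normal_median:.2f}")
print(f"log normal values: mean={skew_mean:.2f}  median={skew_median:.2f}")
```

 In the symmetric sample the mean and median nearly coincide; after exponentiation the mean is pulled far above the median by a handful of very large values, which is the signature of a skew distribution.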
 Specifically, a straight line was fitted by least squares to the observations, yielding the equation log rank = 2.585 - 0.3996 (log value of observation). The slope value 0.3996 is called the Pareto alpha or the Pareto slope coefficient.
 This is true when the so-called Pareto slope coefficient has an absolute value less than 1.0. The slope coefficient for the Pareto distribution in Figure 1.2 is shown in n. 5 above to be 0.3996. Thus, asymptotically (as ever larger samples are drawn, and the straight dotted line in Figure 1.2 shifts in parallel upward), the sample mean diverges toward infinity.
 Acquaintances of long standing might observe that quite a lot of me has been gnawed away in the interim.
 Dietmar Harhoff et al., Exploring the Tail of Patent Value Distributions (working paper), Center for European Economic Research, Mannheim, Germany (1998).
 For the log normal data underlying Figure 1.2, the 100 largest observations (ie, the top 10 percent) accounted for 76.9 percent of total sample values. By this concentration measure, the Figure 1.2 data are somewhat less skew than the German patent data.
 F. M. Scherer, The Size Distribution of Profits from Innovation, 49/50 ANNALES D'ECONOMIE ET DE STATISTIQUE 495 (1998).
 See Henry Grabowski & John Vernon, A New Look at the Returns and Risks to Pharmaceutical R&D, 36 MGMT. SCI. 804 (1990); Returns on New Drug Introductions in the 1980s, 13 J. HEALTH ECON. 383 (1994).
 See Scherer, n. 10 above.
 GEORGE W. FENN ET AL., THE ECONOMICS OF THE PRIVATE EQUITY MARKET (1995). The weighted-average standard deviation was 9.60 percent.
 A more complete analysis is presented in F. M. Scherer et al., Uncertainty and the Size Distribution of Rewards from Technological Innovation, 10 J. EVOLUTIONARY ECON. 175 (2000).
 Twenty-one IPO companies left no discernible trace of stock trading in stock exchange records. To standardize the investments for timing differences, $1,000 was 'parked' in the NASDAQ index as of 1 January 1983, with the proceeds being invested on the date of any sample company's IPO.
 A posthumous record count is analogous to the publication citation counts used to assess the influence of scholars. See DEREK J. DE SOLLA PRICE, LITTLE SCIENCE, BIG SCIENCE (1963).
 Mozart's leadership is even stronger when composers are arrayed in order of records per year of working life, with the working life defined (somewhat inaccurately for the case of Mozart) to begin at age 16. The top eight composers by this criterion accounted for 57.1 percent of the total distribution.
 In ongoing research, I am attempting to assemble data on the size distribution of composers' earnings during their lifetimes. On Mozart, for whom the documentation is particularly rich, see W. J. Baumol & Hilda Baumol, On the Economics of Musical Composition in Mozart's Vienna, 18 J. CULTURAL ECON. 171 (1994).
 Arthur De Vany & W. D. Walls, Bose-Einstein Dynamics and Adaptive Contracting in the Motion Picture Industry, 106 ECON. J. 1493 (1996).
 How difficult it is was shown by computer simulations in which hypothetical random samples were drawn repeatedly from the skew distribution of new drug quasi-rents estimated by Grabowski and Vernon. Even when new product introductions were pooled to the full US industry-wide level (ie, with a single entity introducing 18 new products annually, each yielding profits over a 21-year span), annual profit swings as large as plus-or-minus 25 percent of long-run industry averages remained. See F. M. Scherer & Dietmar Harhoff, Technology Policy for a World of Skew-Distributed Outcomes, 29 RES. POLICY 559 (2000).
 See, eg, CHARLES T. CLOTFELTER & PHILIP J. COOK, SELLING HOPE: STATE LOTTERIES IN AMERICA (1989); Matthew Breuer et al., State Lotteries: The Determinants of Ticket Sales, 8 WAGNER REV. 15 (1997); Gregory Bresinger, The Lottery Racket, 16 FREE MARKET 1 (1998). The irrationality of lotteries appears to be one of the few things on which economists of all ideological persuasions agree.
 A fair gamble is one in which the actuarial value of the prize, ie, the value of the prize times the probability of winning it, equals the value of the bets.
 See CLOTFELTER & COOK, n. 24 above, at 95-104. It is for this reason that their book is titled 'Selling Hope'.
 Statisticians distinguish four main moments, or mathematical characterizations, of sample distributions. The first moment, or mean, characterizes the sample's central tendency; it is the sum of observation values divided by the number of observations. The second moment, or variance, characterizes the degree to which observations deviate from the mean. The third moment, the skewness of the distribution of observations, was clarified earlier. The fourth moment, or kurtosis, characterizes the 'peakedness' or height of the distribution compared to the length of its tails.
 Joseph Golec & Maurry Tamarkin, Bettors Love Skewness, Not Risk, at the Horse Track, 106 J. POL. ECON. 205 (1998).
 FRANZ SCHUBERT'S LETTERS AND OTHER WRITINGS 30, 122-23, 135 (Otto Erich Deutsch ed., 1974).
 For an early statement of this managerial risk aversion hypothesis, see WILLIAM J. FELLNER, COMPETITION AMONG THE FEW 172-73 (1949).
 See, eg, JOHN JEWKES ET AL., THE SOURCES OF INVENTION (1959); F. M. Scherer, Schumpeter and Plausible Capitalism, 30 J. ECON. LITERATURE 1416 (1992).
 See Richard C. Levin et al., Appropriating the Returns from Industrial Research and Development, Brookings Papers on Economic Activity (No. 3) 783-832 (1987); Wesley M. Cohen et al., Appropriability Conditions and Why Firms Patent and Why They Do Not in the American Manufacturing Sector (working paper), Carnegie-Mellon University (June 1997).
 See F. M. SCHERER, THE ECONOMIC EFFECTS OF COMPULSORY PATENT LICENSING Ch. IV (1977).
 The nerd's prayer: 'Oh Lord, help me to capture the market with my killer app; but if Thou willst otherwise, let me at least sell out to Microsoft'.
 See Philip J. Cook & Charles T. Clotfelter, The Peculiar Scale Economies of Lotto, 83 AM. ECON. REV. 634 (1993); The Massachusetts State Lottery, John F. Kennedy School of Government case study C16-91-1025.0 (1992).