Research Priorities in Open Source Software Development
James D. Herbsleb, Carnegie Mellon University, Pittsburgh, PA, USA 15213

Open source software development is taking on an increasingly important role in our technical infrastructure. As commercial companies begin to invest heavily in developments that were traditionally carried forward in a purely voluntary and independent fashion by individuals (e.g., IBM and Linux), and commercially developed software is opened up to the public (e.g., Netscape and Mozilla), new forms of development and collaboration are emerging. Open source also serves as a critical mechanism for providing software to people and institutions that have minimal resources, as well as to those who, for business reasons, do not wish to be locked into proprietary software.

Open source is not just a new method of software development. Rather, its challenge to traditional practice is much more fundamental, going to the basic motivations, economics, market structure, and philosophy of the institutions that develop, market, and use software. In comparison to its increasingly critical role, we know relatively little about the personal, organizational, economic, and technical factors that govern how the process works. As a consequence, we don’t understand the security risks it poses; the limitations in scale, structure, and domain; the motivations and decision-making of OSS developers; the new forms of virtual organization that may arise from patterns of patronage; or the potential for enhancing OSS with the deployment of more sophisticated collaboration technology than is currently used. I will briefly discuss each of these issues.

Security risks. Security in open source has received considerable attention lately (e.g., [1]). Many argue that open source poses security risks because potential hackers have access to the source code and can more easily identify and exploit security holes. Open source advocates, on the other hand, dispute the degree to which source code availability benefits hackers, and argue that with many more “friendly” eyes searching the source code to identify and remove flaws, security holes in open source systems are much more likely to be fixed before they can be exploited. Clearly, embedded in this dispute are divergent assumptions about how quickly, effectively, and reliably OSS processes actually identify and respond to threats. Careful empirical research to test these assumptions is required in order to resolve this dispute and to formulate sensible policy with respect to the security aspects of OSS.

Limitations in scale, structure, and domain. The runaway successes of open source thus far have been infrastructure systems, such as Linux and Apache. In contrast, open source applications such as Mozilla and OpenOffice have had relatively limited success. It is not at all clear at this point how far the potential of open source extends. Are there inherent limitations that dictate a modest place for open source, perhaps centered on a few key parts of network infrastructure? Or can nearly any type of software be developed effectively as open source? Is it essential that the developers be power users of the software, as is often suggested? What sort of critical mass of software must exist before an open source community can form around it? What architectural and other constraints must software satisfy so that new people can, with minimal training and little access to experts, quickly understand some part of it well enough to contribute? We must have the answers to such questions in order to understand the potential and limitations of OSS development.

Motivations and decision-making of OSS developers. The vast majority of potential open source developments seems to attract little or no activity, as an examination of SourceForge usage statistics reveals. We now know a little about why developers choose to participate in open source [2, 4]: they appear, for example, to value the freedom from time pressure, the intellectual stimulation, and the skill-building opportunities it affords, as compared to the commercial development in which most of them participate as a paid activity. But we know very little about why they choose one project over another, or why they choose to work on one feature or bug as opposed to another. Such decisions have enormous impact on OSS, of course, since they collectively determine the overall direction, feature set, de facto priority of defects, and response time for various types of problems. The OSS resource allocation mechanism does not function precisely like either a market or a hierarchy. Without a better understanding of these individual and collective mechanisms, it is unclear whether any particular OSS development will meet the needs of users (other than the developers) or of the government and industry leaders who try to influence it.

Patronage and new forms of virtual organization. Many companies have, for a variety of reasons, opened up previously closed source developments, even while still contributing a large proportion of the overall resources (e.g., Mozilla). In other cases, companies have elected to expend significant resources on systems such as Linux that have always been open source. Rather than being directed solely by a self-governing community of individuals, such heavily supported OSS developments will now be bent, to an extent, to the commercial advantage of patrons. What will the result of these various forms of patronage be, in terms of the effect on the direction of development and the decision-making of volunteer developers? Will multiple sources of patronage result in branching and splintering of OSS developments, and perhaps the loss of external participation? To the extent that OSS is relied upon, the death or redirection of the community that builds and maintains it poses a serious risk.

Collaboration technology. Commercial developments that span multiple geographic locations pose enormous problems, and in some ways clearly do not perform as well as co-located developments [3]. Collaboration technology holds promise for alleviating these problems, and perhaps even for improving the performance of distributed developments beyond the current state of co-located projects. While OSS seems not to suffer from precisely the same problems [5], it may be the case that OSS could be substantially enhanced with the deployment of appropriate collaboration technology. For example, a tool providing a visualization of the change management system, currently used in commercial developments to locate experts, might provide a way to keep current on exactly where in the code work is going on and who is working where, and to provide individual recognition by graphically displaying each participant’s contributions [6]. An examination of e-mail list archives and on-line discussions may reveal opportunities for the use of other types of communication media to enhance the process and coordination among developers. Such technology could potentially change some of the fundamental limitations of OSS by providing a means for more fine-grained coordination, and new ways of developing a tighter sense of community among developers.
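
As a rough illustration of the kind of tool described above, the following minimal sketch tallies who has changed what, using change records of the sort a change management system keeps. It is not the method of the tool cited in [6]; the sample records, the function names, and the choice to treat a top-level directory as a "module" are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical change records: (author, path of the file changed).
# A real tool would read these from the project's change history.
CHANGES = [
    ("alice", "net/http/parser.c"),
    ("alice", "net/http/parser.c"),
    ("bob", "net/http/cache.c"),
    ("carol", "ui/menu.c"),
    ("alice", "ui/menu.c"),
    ("bob", "net/http/parser.c"),
    ("carol", "ui/dialog.c"),
]


def module_of(path):
    """Treat the top-level directory as the module (an assumption)."""
    return path.split("/", 1)[0]


def contributions_by_module(changes):
    """Count changes per author within each module."""
    counts = defaultdict(Counter)
    for author, path in changes:
        counts[module_of(path)][author] += 1
    return counts


def likely_experts(changes, module, top_n=2):
    """Return the authors with the most changes in a module, most active first."""
    return contributions_by_module(changes)[module].most_common(top_n)


def text_chart(changes):
    """Render a crude text bar chart of who is working where."""
    lines = []
    counts = contributions_by_module(changes)
    for module in sorted(counts):
        lines.append(module)
        # Sort by descending change count, then by name for stable output.
        ranked = sorted(counts[module].items(), key=lambda kv: (-kv[1], kv[0]))
        for author, n in ranked:
            lines.append(f"  {author:<6} {'#' * n} ({n})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_chart(CHANGES))
```

Counting changes is only a crude proxy for expertise and for recognition; a real tool would presumably also weigh recency and the size of each change, and would draw its records from the project's actual history rather than a hard-coded list.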

Summary. The most important issues in research on open source software require a systematic program of empirical research to understand the individual, social, and system properties of open source software development, its limitations, and its potential.


[1] Brown, K. (2002). Opening the Open Source Debate. Report, Alexis de Tocqueville Institution.

[2] Ghosh, R.A., Glott, R., Krieger, B., & Robles, G. (2002). Free/Libre and Open Source Software: Survey and Study.

[3] Herbsleb, J.D., Mockus, A., Finholt, T.A., & Grinter, R.E. (2001). An Empirical Study of Global Software Development: Distance and Speed. In proceedings, International Conference on Software Engineering, pages 81-90, Toronto, Canada, May 15-18.

[4] Lakhani, K.R., Wolf, B., & Bates, J. (2002). The Boston Consulting Group Hacker Survey.

[5] Mockus, A., Fielding, R., & Herbsleb, J.D. Two Case Studies of Open Source Software Development: Apache and Mozilla. To appear in ACM Transactions on Software Engineering and Methodology.

[6] Mockus, A. & Herbsleb, J.D. (2002). Expertise Browser: A Quantitative Approach to Identifying Expertise. In proceedings, International Conference on Software Engineering, pages 503-512, Orlando, FL, May 19-25.