Contributing to Open Source Projects HOWTO
This page is aimed at programmers new to the Open Source / Free Software
world who want to make a contribution but aren't sure where to start.
The latest version of this document is at www.kegel.com/academy/opensource.html.
Triage is the fine art of looking at bug reports from users,
deciding if they're repeatable, and if so, passing the proper
information on to the developers. This is a great way to get
familiar with a project, and to earn karma points.
A guide to triage of OpenOffice bugs is online at kegel.com/openoffice
Other projects with bug report backlogs that could use triage
include
Most open source projects work like this: all the developers have
their own (not quite identical) copies of the source code.
When one developer has a change he wants to share with the
others, he emails them a patch.
To create a patch, you run a program called diff, and save
its output to a file. For instance, if the original source tree
is in directory "foobar.old", and your new sources are in directory "foobar.new",
the command
diff -Naur foobar.old foobar.new > blarg.patch
will create the file 'blarg.patch' containing your changes
in 'unified diff' format.
(See Diff, Patch, and Friends for more info.)
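To try the command without touching a real project, here is a minimal sketch that builds two throwaway trees and diffs them. The directory and patch names come from the text above; the file hello.txt and its contents are made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical "original" tree and an edited copy of it.
mkdir foobar.old
printf 'hello\nworld\n' > foobar.old/hello.txt
cp -r foobar.old foobar.new
printf 'hello\nopen source world\n' > foobar.new/hello.txt

# diff exits with status 1 when it finds differences; that is not an error.
diff -Naur foobar.old foobar.new > blarg.patch
status=$?

echo "diff exit status: $status"
cat blarg.patch
```

Lines starting with '-' in the output come from foobar.old, and lines starting with '+' come from foobar.new.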
You can also diff against individual files rather than directories,
but you still want to be in the directory above the project just
as if you were doing a directory diff. Also, to represent the addition
of a new file, diff it against /dev/null (since there was no original file).
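As a rough sketch of both cases, the following diffs one changed file and one brand-new file from the directory above the two trees. The file names main.c and NOTES are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical trees: main.c was edited, NOTES is a brand-new file.
mkdir foobar.old foobar.new
printf 'int x = 1;\n' > foobar.old/main.c
printf 'int x = 2;\n' > foobar.new/main.c
printf 'new notes\n'  > foobar.new/NOTES

# Stay in the directory above both trees so the filenames recorded in
# the patch keep their leading directory, just like a directory diff.
diff -u foobar.old/main.c foobar.new/main.c > single.patch

# A new file has no original, so compare it against /dev/null.
diff -u /dev/null foobar.new/NOTES > newfile.patch

cat single.patch newfile.patch
```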
If you're submitting a patch to a project that uses CVS, you may
be able to use 'cvs diff' to create the patch, but IIRC that doesn't
handle new files; you have to diff against /dev/null for those.
When creating a patch, remember that the people who receive it
will look at it very skeptically, and will probably reject
it if it contains lots of unrelated or unneeded changes.
To maximize the likelihood that other developers will bless your
patch, review your patch line by line to verify
that there are no extra changes included. If you see anything
extraneous (for instance, the unnecessary addition or deletion of whitespace),
you should go fix your edited file, regenerate the diff, and verify
that the extra changes are gone.
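Reading the patch line by line is still the real check, but a quick grep can flag one common kind of stray change. This sketch looks for added lines that end in whitespace. The trees and the file a.txt are made up, and the pattern is only a heuristic.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical edit: one intended change (beta -> BETA) plus an
# accidental trailing space after "alpha".
mkdir foobar.old foobar.new
printf 'alpha\nbeta\n'  > foobar.old/a.txt
printf 'alpha \nBETA\n' > foobar.new/a.txt

diff -Naur foobar.old foobar.new > blarg.patch

# Added lines start with '+'; flag the ones that end in whitespace.
suspect=$(grep -n -E '^\+.*[[:space:]]$' blarg.patch)

echo "Added lines with trailing whitespace:"
echo "$suspect"
```

If anything shows up, fix the edited file and regenerate the patch rather than editing the patch by hand.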
diff is part of the GNU Project; to learn more, read the GNU project's
manual for diff and/or
the Linux Documentation Project's
Software Release Practice HOWTO.
To use a patch -- that is, to automatically carry out the changes described in a
patch file -- you run a program called patch. For instance,
if you're trying to apply the patch 'blarg.patch' to a package called foobar-0.17, you might say
cd foobar-0.17; patch -p1 < ../blarg.patch
That would merge the changes from blarg.patch into your source tree.
(The -p1 tells patch to ignore the first directory in filenames in the patch;
that way a patch generated against the directory foobar-0.11 will still
apply properly.)
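Here is a small sketch of that scenario: a patch made against a tree named foobar-0.11 is applied inside a differently named tree, foobar-0.17. The README file and its contents are invented, and --dry-run is a GNU patch option that may not exist in other implementations.

```shell
work=$(mktemp -d)
cd "$work"

# Build a patch between two hypothetical trees.
mkdir foobar-0.11 foobar.new
printf 'version one\n' > foobar-0.11/README
printf 'version two\n' > foobar.new/README
diff -Naur foobar-0.11 foobar.new > blarg.patch

# A fresh copy of the package under a different directory name.
mkdir foobar-0.17
printf 'version one\n' > foobar-0.17/README

cd foobar-0.17
# --dry-run reports what would happen without changing any files.
patch -p1 --dry-run < ../blarg.patch
# -p1 strips the leading "foobar-0.11/" or "foobar.new/" component,
# so the patch applies to README in the current directory.
patch -p1 < ../blarg.patch
cat README
```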
patch is part of the GNU Project; to learn more, read
"Merging with Patch"
in the GNU project's manual for diff.
Before you submit a patch, be sure you understand the basics
of software licensing, otherwise your patch might be rejected
for legal reasons.
Here are some rules of thumb:
- Respect the copyright of others. Don't use their code in ways they don't permit.
- Make sure you know what license the project you're
contributing to uses.
- If posting a patch that contains code copied from
some other project, make sure both projects'
licenses are compatible.
- Preserve your own copyright.
When you post code, make it clear who wrote it, what year it was written, and
what license it's being made available under (preferably the
same license as the project you're contributing to).
See www.denniskennedy.com/opensourcelaw.htm for more info on licenses.
The culture of successful open source projects, e.g. the Linux kernel, is described in the following pages:
Many open source projects make use of the GNU Autotools to generate
their Makefiles. The learning curve for the Autotools is steep,
but a few tutorials are available. Here are some places to start:
CVS is a source code control system that automates and hides the work of using diff and patch.
You'll get to know cvs later; when starting out, it's best to learn
how to use diff and patch. (But if you're curious, you can learn about cvs at
cvshome.org.)
Here's how one newcomer jumped in and started contributing.
- Since he didn't have Linux, he downloaded and installed
Cygwin to give his Windows system the tools needed to build Linux programs to run on Windows.
(Note: you have to pick the "Development" package in the installer to get the C compiler! Cygwin's installer's user interface takes some learning.)
- He went to freshmeat.net and browsed
around until he found a project that looked interesting. His
choice was Clex, a curses-based file manager (see http://www.clex.sk).
- He downloaded the clex source code from its home page's download page
(in particular, he downloaded the file www.clex.sk/download/clex-3.1.7.src.tar.gz),
and unpacked it with the command
tar -xzvf clex-3.1.7.src.tar.gz
Then he read the README and INSTALL files for instructions on how to build it, and tried building it, using the commands
cd clex-3.1.7
./configure
make
This failed with the compiler error
Can't find file term.h
- At that point, I explained that term.h was a standard include
file that's part of Unix, but it lives at different places in
different versions of Unix. This is the kind of problem that
makes it hard to write programs that will compile on all versions of Unix!
Fortunately, since clex uses GNU Autotools, and they're good
at handling this kind of problem, the solution was fairly
simple: figure out what subdirectory of /usr/include term.h
is in on Cygwin (turns out it's in /usr/include/ncurses), and
tell configure.in to look for term.h in both the normal place
and in the ncurses subdirectory. (I helped with this part,
since this is one of those things that's not obvious to beginners.)
Then run autoheader and autoconf to regenerate the configure script
and whatnot, and away you go.
(Annoyingly, Cygwin ships with two versions of GNU Autotools,
and clex seems to prefer the older one, so we had to modify the
PATH environment variable. See the shell script below.)
- Having fixed the program, he now wanted to share his fix with the
Clex author and other developers. To do that, he needed to make a patch
containing his change. So he read a tutorial on how to use 'diff' to create patches, then
created a patch using the commands
make distclean
cd ..
mv clex-3.1.7 clex.new
tar -xzvf clex-3.1.7.src.tar.gz
diff -aur clex-3.1.7 clex.new > cygwin-clex.patch
- To make sure his patch really worked, he read a tutorial on how to use 'patch', then unpacked the clex source code
again into a new directory, applied the patch with the command
patch -p1 < cygwin-clex.patch
then made sure the new sources really worked.
- Because figuring out how to build the program after his
change was a challenge, he saved that knowledge for use by others
by writing a shell script that demonstrated the whole process of
downloading the sources, applying the patch, and building clex.
He verified that running that script actually produced a working
version of clex from scratch.
- Finally, he was ready to contribute his changes. Clex doesn't have
a mailing list, so he emailed the patch and shell script to the Cygwin mailing list
(so others in his situation would be able to find his patch if
they looked for "clex" in the Cygwin mailing list archives)
and the author of clex (so he could fix the next version of Clex).
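The header hunt from the walkthrough above (figuring out where term.h lives) can be sketched as a one-line search. The helper name find_header is made up for this example, and /usr/include is only the usual default location.

```shell
# find_header NAME [ROOT]: list every copy of header NAME under ROOT.
# ROOT defaults to /usr/include; errors for unreadable or missing
# directories are silenced.
find_header() {
    find "${2:-/usr/include}" -name "$1" 2>/dev/null
}

# Show where term.h lives on this system, if it is installed at all.
find_header term.h
```

If the only match is in a subdirectory such as ncurses, a plain #include <term.h> will fail the way it did here, unless the build adds that directory to the include path.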
The fruit of his labor was a single short message to a mailing list
that contained exactly what other programmers needed to
fix the bug easily. It said:
I ran into a compile problem while building clex under cygwin.
(Clex is a curses-based file manager; see http://www.clex.sk ).
A patch to fix it is attached, along with a little shell script
that demonstrates how to build clex under Cygwin.
Here's the patch he attached:
--- clex-3.1.7/configure.in.old 2003-02-12 12:09:34.000000000 -0800
+++ clex-3.1.7/configure.in 2003-02-12 12:25:56.000000000 -0800
@@ -12,7 +12,7 @@
AC_HEADER_MAJOR
AC_HEADER_SYS_WAIT
AC_HEADER_TIME
-AC_CHECK_HEADERS(ncurses.h fcntl.h unistd.h)
+AC_CHECK_HEADERS(ncurses.h fcntl.h unistd.h term.h ncurses/term.h)
AC_DECL_SYS_SIGLIST
AC_C_CONST
AC_TYPE_UID_T
--- clex-3.1.7/src/inout.c.old 2003-02-12 12:10:51.000000000 -0800
+++ clex-3.1.7/src/inout.c 2003-02-12 12:12:26.000000000 -0800
@@ -24,7 +24,11 @@
#else
# include <curses.h>
#endif
+#ifdef HAVE_TERM_H
#include <term.h> /* enter_bold_mode */
+#elif defined (HAVE_NCURSES_TERM_H)
+#include <ncurses/term.h>
+#endif
#include "clex.h"
#include "inout.h"
(See the GNU diff manual's description of unified diff format for an
explanation of what those +'s and -'s mean.)
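As a quick illustration of how to read those markers, this sketch counts the added and removed lines in a patch. The embedded patch is a trimmed-down copy of the configure.in hunk above, with a shortened hunk header and no timestamps. The counting is deliberately naive: a changed line whose own text begins with "-- " or "++ " would be mistaken for a file header.

```shell
patch_file=$(mktemp)

# Context lines start with a space, removed lines with '-',
# added lines with '+'.
cat > "$patch_file" <<'EOF'
--- clex-3.1.7/configure.in.old
+++ clex-3.1.7/configure.in
@@ -1,3 +1,3 @@
 AC_HEADER_TIME
-AC_CHECK_HEADERS(ncurses.h fcntl.h unistd.h)
+AC_CHECK_HEADERS(ncurses.h fcntl.h unistd.h term.h ncurses/term.h)
 AC_DECL_SYS_SIGLIST
EOF

# The "--- " and "+++ " lines are file headers, not changes, so skip them.
added=$(grep '^+' "$patch_file" | grep -v -c '^+++ ')
removed=$(grep '^-' "$patch_file" | grep -v -c '^--- ')

echo "added: $added, removed: $removed"
```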
And here's the shell script he attached:
wget http://www.clex.sk/download/clex-3.1.7.src.tar.gz
tar zxvf clex-3.1.7.src.tar.gz
patch -p0 < cygwin-clex.patch
cd clex-3.1.7
PATH=/usr/autotool/stable/bin:$PATH
rm missing
aclocal
touch NEWS README AUTHORS
autoheader
automake --add-missing
autoconf
./configure
make
Try installing Cygwin on a Windows system, and see if his shell script correctly downloads, patches, and
builds Clex! (You may need to apply the patch by hand, since it's hard to copy
and paste patches from web pages.)
Now that you've seen examples of how to contribute to an existing project,
here are some suggestions for picking projects to contribute to.
The best way to find projects to contribute to
is to simply use open source software for all your day to day
computing needs. As time goes on, you will find rough edges here
and there. Pick one of the smallest rough edges you can find, and
post a patch that makes things work better.
Some projects maintain "to-do" lists for people who want to
fix problems others have already identified. Here are a few:
Also, here are a few small project ideas:
Several of the following project
ideas have to do with testing. Here are a couple web pages on the subject that are worth reading:
All programs that process responses from remote servers should be able to
withstand invalid and possibly malicious responses without crashing.
Yet few programs are tested to make sure they can, let alone designed with this in mind.
See for instance:
- How to kill a web browser -- discussion of Michal Zalewski's "mangleme" tool and the bugs it found in many browsers (an updated version even found bugs in IE)
- Mozilla bug 264944 -- discussion of this class of bugs and their fixes for Mozilla
Your assignment, should you choose to accept it, is to write a tool
to do similar testing on some other program that accepts data
from strangers (e.g. OpenOffice, xpdf, Macromedia Flash plugin, Sun JVM)
and see if you can get it to crash in a nasty way. If you can,
submit your test program to the maintainers of the application,
and file a bug report in the application's bug tracking system.
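A very crude starting point, assuming the target program reads a file: corrupt a few bytes of a known-good sample and check whether the program dies from a signal. This is not how mangleme works; it is plain random byte corruption, and the helper names mangle and crashed are invented for this sketch.

```shell
# mangle IN OUT OFFSET COUNT: copy IN to OUT, then overwrite COUNT bytes
# starting at byte OFFSET with random data. conv=notrunc keeps the rest
# of OUT intact. (If OFFSET+COUNT runs past the end, OUT grows.)
mangle() {
    cp "$1" "$2"
    dd if=/dev/urandom of="$2" bs=1 seek="$3" count="$4" conv=notrunc 2>/dev/null
}

# crashed STATUS: succeed if an exit status means "killed by a signal".
# Shells report that as 128 plus the signal number (139 for SIGSEGV).
crashed() {
    [ "$1" -gt 128 ]
}

# Hypothetical usage; sample.pdf and some_viewer are placeholders:
#   mangle sample.pdf mangled.pdf 100 16
#   some_viewer mangled.pdf
#   crashed $? && echo "crash: keep mangled.pdf for the bug report"
```

Keep every input that triggers a crash, since the maintainers will need it to reproduce the problem.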
- Find some project that has a test suite (e.g. Wine, ACE/TAO, OpenOffice.org).
- Figure out how to build the project with the compiler options needed for gcov.
- If the project is autoconf-based, submit a patch to the project to add a --enable-gcov
configure option. (See e.g. Aaron Arvey's patch for Wine.)
- Measure the coverage of the test suite for one of the project's source files or modules,
e.g. for Wine, measure the coverage for the source code of one DLL.
(lcov may come in handy if you want an overview of a large project's test suite coverage,
but you'll probably want plain old gcov once you've picked what part of the project to focus on.)
- Add cases to the test suite to increase the coverage.
- Submit these test cases to the project.
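To get a feel for the build-and-measure steps before tackling a real project, here is a minimal sketch on a toy program. It assumes gcc and gcov are installed; demo.c is invented, and a single run of the program stands in for a test suite.

```shell
work=$(mktemp -d)
cd "$work"

# Tiny hypothetical program with a branch the "test" never takes.
cat > demo.c <<'EOF'
#include <stdio.h>
int main(int argc, char **argv) {
    if (argc > 1)
        printf("got an argument\n");
    else
        printf("no arguments\n");
    return 0;
}
EOF

# --coverage turns on the instrumentation gcov needs. Compiling and
# linking in separate steps keeps the data files named demo.gcno/.gcda.
gcc --coverage -c demo.c -o demo.o
gcc --coverage demo.o -o demo

./demo          # the "test suite": one run, with no arguments
gcov demo.c     # writes demo.c.gcov next to the source

# Lines that were never executed are marked with #####.
grep '#####' demo.c.gcov
```

The line grep prints is the branch that needs a new test case; running ./demo with an argument and then rerunning gcov should clear it.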
Wine comes with its own copy of <windows.h>, to allow
compiling Windows programs for Linux. However, Wine's
version of windows.h isn't quite complete. As Dimitrie Paun
pointed out, it
uses macros when it should be using inline functions; that
would let it compile more programs. Try converting one
of the macros into an inline function, verify that wine still
builds, send your one-line patch to [email protected] and ask
for feedback, and if they like it, submit it to [email protected].
Wine comes with wcmd.exe, a clone of the
Windows Command Shell, cmd.exe.
wcmd.exe doesn't have help text for some of its commands.
It's fairly easy to add the missing text, once you
know what the command does.
Here's an example patch
that adds the missing help text for one command.
There are several continuous build systems that watch over a
source code repository, continuously building the project.
This catches new errors much sooner than if some developer had
to run into them, and increases the chances that bugs will
be fixed before they get in the way. Here are links to a few:
Read up on a couple of them, pick one, and try to set it up to
watch a popular open source project (e.g. wine) and send
you email whenever it finds problems. Once that's working,
document how you set it up, and post the description to
the mailing list for the project you set up the autobuilder for.
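Before picking a real system, it may help to see the core idea in miniature: run the build, keep the log, and make noise only on failure. This is a hand-rolled sketch, not one of the continuous build systems referred to above. The function name check_build, the log path, and the mail address in the comment are all placeholders.

```shell
# check_build LOGFILE COMMAND...: run a build command, save all of its
# output to LOGFILE, and report whether it succeeded.
check_build() {
    log=$1
    shift
    if "$@" > "$log" 2>&1; then
        echo "build OK"
    else
        echo "build FAILED, see $log"
        return 1
    fi
}

# Hypothetical use from cron, inside a freshly updated source tree:
#   check_build /tmp/build.log make ||
#       mail -s "build broke" you@example.com < /tmp/build.log
```

A real continuous build system adds the parts this leaves out, such as fetching new changes, building on several platforms, and keeping a history of results.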
- Linux Weekly News - the best source of kernel information.
See their Kernel section; they have a good set of tutorials for kernel developers.
- "How to contribute code back to the open source community" by Kraft and Clavey, IBM DeveloperToolbox Magazine, 2001
- "Let your people code" by Russell Pavlicek, April 2002
- "Dealing with unhelpful comments on your open source software", by Ploppy, at Avagadro.org. (Read this for an idea of how the people receiving your patch feel about things.)
- Diff, Patch, and Friends by Michael K. Johnson, Linux Journal, 2003
- Contributing to the Linux Kernel by Joseph Pranevich, Linux Journal, 2003
Last Change 22 Nov 2004
(C) Dan Kegel 2003-2004