Most Read Technology Reporter For More Than Two Decades

Maureen O'Gara


Linus Torvalds: How the Kernel Group Can Prevent "SCO II" From Ever Happening

Linus had his usual busy weekend, judging by the posting he made on Sunday to the Linux Kernel Mailing List (LKML), the worldwide group of developers who are so concerned about Linux kernel development that they will happily patch their kernel once a week, suffer through the oopses, bugs and the resulting time and energy losses - those proud to be members of the Order of the Great Penguin, and to be called "Linux geeks" for the rest of their lives.

Labeled a "Request for Discussion," the e-mail outlines a suggestion to his fellow kernel developers that - spurred by the SCO lawsuits - the time has come for a systematic way to document the origin of the code that gets included in each new version of the Linux kernel.

The suggestion includes a proposal for "signing off" on patches, to show the path each one has come through, and to document what Torvalds calls the "chain of trust."

Since no one can marry technical insight with brisk prose and passing wit in quite the way that Linus can, we make no apologies for bringing you the suggestion in his own words.

It all goes to show that even SCO (whom Linus in this message refers to as the "Smoking Crack Organization") might ironically end up serving Linux well, since if adopted rapidly these improvements would already be implemented for the development of Linux 2.7, the very next version of the kernel.

List:       linux-kernel
Subject:    [RFD] Explicitly documenting patch submission
From:       Linus Torvalds
Date:       Sun May 23 2004 - 01:48:04 EST

Hola!

This is a request for discussion..

Some of you may have heard of this crazy company called SCO (aka "Smoking Crack Organization") who seem to have a hard time believing that open source works better than their five engineers do. They've apparently made a couple of outlandish claims about where our source code comes from, including claiming to own code that was clearly written by me over a decade ago.

People have been pretty good (understatement of the year) at debunking those claims, but the fact is that part of that debunking involved searching kernel mailing list archives from 1992 etc. Not much fun.

For example, in the case of "ctype.h", what made it so clear that it was original work was the horrible bugs it contained originally, and since we obviously don't do bugs any more (right?), we should probably plan on having other ways to document the origin of the code.

So, to avoid these kinds of issues ten years from now, I'm suggesting that we put in more of a process to explicitly document not only where a patch comes from (which we do actually already document pretty well in the changelogs), but the path it came through.

Why the full path, and not just originator?

These days, most of the patches in the kernel don't actually get sent directly to me. That not just wouldn't scale, but the fact is, there's a lot of subsystems I have no clue about, and thus no way of judging how good the patch is. So I end up seeing mostly the maintainers of the subsystem, and when a bug happens, what I want to see is the maintainer name, not a random developer who I don't even know if he is active any more. So at least for me, the _chain_ is actually mostly more important than the actual originator.

There is also another issue, namely the fact than when I (or anybody else, for that matter) get an emailed patch, the only thing I can see directly is the sender information, and that's the part I trust. When Andrew sends me a patch, I trust it because it comes from him - even if the original author may be somebody I don't know. So the _path_ the patch came in through actually documents that chain of trust - we all tend to know the "next hop", but we do _not_ necessarily have direct knowledge of the full chain.

So what I'm suggesting is that we start "signing off" on patches, to show the path it has come through, and to document that chain of trust. It also allows middle parties to edit the patch without somehow "losing" their names - quite often the patch that reaches the final kernel is not exactly the same as the original one, as it has gone through a few layers of people.

The plan is to make this very light-weight, and to fit in with how we already pass patches around - just add the sign-off to the end of the explanation part of the patch. That sign-off would be just a single line at the end (possibly after _other_ peoples sign-offs), saying:

Signed-off-by: Random J Developer <random@xxxxxxxxxxxxx>

To keep the rules as simple as possible, and yet making it clear what it means to sign off on the patch, I've been discussing a "Developer's Certificate of Origin" with a random collection of other kernel developers (mainly subsystem maintainers). This would basically be what a developer (or a maintainer that passes through a patch) signs up for when he signs off, so that the downstream (upstream?) developers know that it's all ok:

Developer's Certificate of Origin 1.0

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or

(c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.

This basically allows people to sign off on other people's patches, as long as they see that the previous entry in the chain has been signed off on. And at the same time it makes the "personal trust" explicit to people who don't necessarily understand how these things work.

The above also allows for companies that have "release criteria" to have the company "release person" sign off on a patch, so that a company can easily incorporate their own internal release procedures and see that all the patches have gone through the right channel. At the same time it is meant to not cause anybody to have to change how they work (ie there is no "extra paperwork" at any point).

Comments, improvements, ideas? And yes, I know about digital signatures etc, and that is not what this is about. This is not about proving authorship - it's about documenting the process. This does not replace or preclude things like PGP-signed emails, this is documenting how we work, so that we can show people who don't understand the open source process.

Linus
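
The sign-off convention described in the message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tool used by the kernel developers: the function names `signoff_chain` and `add_signoff`, the sample patch description, and the `example.org` addresses are all hypothetical. It shows how a standardized `Signed-off-by:` line lets the path of a patch be read back mechanically, and how each hop appends its own line after the earlier ones.

```python
import re
from typing import List, Tuple

# Matches one sign-off line in the format proposed in the email, e.g.
# "Signed-off-by: Random J Developer <random@example.org>"
SIGNOFF_RE = re.compile(r"^Signed-off-by:\s*(?P<name>.+?)\s*<(?P<email>[^>]+)>\s*$")


def signoff_chain(description: str) -> List[Tuple[str, str]]:
    """Return (name, email) pairs in the order the sign-offs appear."""
    chain = []
    for line in description.splitlines():
        match = SIGNOFF_RE.match(line)
        if match:
            chain.append((match.group("name"), match.group("email")))
    return chain


def add_signoff(description: str, name: str, email: str) -> str:
    """Append a sign-off at the end of the description, after any existing ones."""
    new_line = f"Signed-off-by: {name} <{email}>"
    body = description.rstrip("\n")
    if not body:
        return new_line + "\n"
    # The first sign-off is separated from the explanation by a blank line;
    # later sign-offs stack directly beneath the earlier ones.
    last_line = body.splitlines()[-1]
    separator = "\n" if SIGNOFF_RE.match(last_line) else "\n\n"
    return f"{body}{separator}{new_line}\n"


# Hypothetical patch description passing through two hands.
patch = "Fix off-by-one in example driver\n"
patch = add_signoff(patch, "Random J Developer", "random@example.org")
patch = add_signoff(patch, "Subsystem Maintainer", "maintainer@example.org")
print(patch)
print(signoff_chain(patch))
```

The list returned by `signoff_chain` is ordered from the original submitter to the last person who passed the patch along, which is the "path" the proposal wants recorded. As the message itself stresses, such a line documents the process and proves nothing about authorship on its own.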

More Stories By Linux News Desk

SYS-CON's Linux News Desk gathers stories, analysis, and information from around the Linux world and synthesizes them into an easy to digest format for IT/IS managers and other business decision-makers.


Most Recent Comments
JamesLyle 07/27/04 02:21:39 PM EDT

It appears that a basic open source operating system plus office software is going to be free for just the cost of the disks. Only specialized high-powered business software will cost more. There are not only the Linux varieties, but FreeBSD, the other BSDs and Darwin BSD, which is Apple X without the good Apple GUI, only a freeware GUI. The next thing the computer community needs is a new basic internet protocol, as the old one was written for the ARPANET, which had only trusted accessors, such as the Pentagon, US military bases, universities, and defense contractors. Now, on the other end may be Osama bin Laden himself, from an internet cafe in the tribal territories in Pakistan, or one of his subordinates or allies, or Red Chinese or Russian or French or German intelligence, none of whom may be particularly well-wishers of the US. I once found from my firewall that someone in the Sudan wanted to break into my computer. There is a wild, dangerous and sneaky world out there. Please do not post my email address.

Randy Poznan 05/25/04 01:30:57 PM EDT

I think the concept is great; however, it may add a burden and hamper the creative process, thereby slowing down the overall development of Linux.

An alternative method would be a legal solution: a contract that each contributor, individual or corporate, must sign, with terms that amount to them being accountable for their actions. Not in a security or bug sense, but that the code is not from somewhere else and that their employer (if applicable) is aware and supports them contributing to Linux/GNU. It should also make clear that they agree to submit all work under the GNU license.

No anonymous work should ever be accepted, because it risks the entire Linux system.

Short Circuit 05/24/04 12:02:00 PM EDT

So what happens to people who want to contribute code, but don't want their name attached to it, for various reasons?

  • Such as encryption development in France or China, where unauthorized encryption is illegal, IIRC.
  • Or some employee whose boss wants to own all his creative work, on and off the clock.
  • Or people who simply don't want to take the risk of being unfairly targeted by some software company for writing code that looks vaguely like the company's.
  • Or people who had a great idea, but couldn't possibly know someone else had come up with the idea and copyrighted or patented it.

    IMO, it has its ups and its downs. It allows a greater degree of delegate-the-blame (Good for any large project, Objectively speaking), but it will reduce contributions.

    shawn_willden 05/24/04 11:58:12 AM EDT

    What Linus is doing is making the accountability easier and somewhat more complete, not adding it. As he pointed out in his LKML post, Linux developers have been able to find the origin of every bit of code they've needed to, but the process has been painful and has required a little guesswork, particularly for the oldest stuff.

    What he's proposing here is just a slight formalization and elaboration of the process that has been used for years. Currently, if I submit a patch to LKML to fix, say, a VFS bug, it will get poked, prodded and adjusted on the mailing list until people think it's clean and solid. Then the subsystem maintainer (Al Viro, in this case) will pick it up, probably tweak it some more, attach a "From" comment, stating that I am the author and forward it to Linus. Linus will review it, accept it, and his scripts will add my name into the changelog and the CREDITS file.

    Since all of this happens on the public, archived, mailing list, there's plenty of accountability, but figuring out the sequence of events requires digging through the archives, and there may not be any obviously ideal search criteria.

    Now, Linus wants me to attach my name myself, and to do it in a standardized format so that it's more searchable. Further, he wants everyone else who modifies the patch in any way to add their stamp as well, providing a change history in the patch itself. It's a weak change history, since it doesn't describe what changed, but it provides the starting point for searching the archives.

    So, what Linus is asking for isn't so much to create a better accountability trail as it is to make the existing trail easier to follow. It's an ease-of-use optimization.

    Well, there is one way in which this is perhaps a significant enhancement, and that is that Linus wants to formally define the legal commitment a contributor makes. In a reasonable world, this should be unnecessary, since if I contribute some code that I don't own, I should be the one held liable for the copyright infringement, not the others who used it in good faith. In the litigious world we live in, however, it's a good idea to formally spell it out, and make clear to everyone that by attaching their name to a patch, they're providing a certain warranty of their right to contribute it.

    Good Idea 05/24/04 08:43:55 AM EDT

    This (in my experience) is standard procedure in industry, having to sign off on
    design forms, have code reviewed, etc. It's only surprising that it hasn't come
    to open source before.

    anon 05/24/04 08:41:10 AM EDT

    The authentication needs to be done using GPG (GNU Privacy Guard) or PGP (Pretty
    Good Privacy). This will prevent anyone in the future from inappropriately
    placing code in the kernel.

    These two programs provide an excellent means of determining the authenticity of
    the author.

    Moreover, the origins of all code submissions can easily be tracked and
    catalogued using some open source software some friend of mine and I have been
    working on.

    Friend of the LinuxKernel 05/24/04 08:19:38 AM EDT

    Wow, this is how fast the community now works: OSDL has already announced official adoption of the tracking suggestion.