May 07, 2003

bundled software

Sterling has put some serious effort (I'm insanely jealous of the free time) into integrating libxml with PHP, to the point of requesting that it now replace the expat library, which would require it to be bundled. In the past, one of my constant arguments on the internals@ (then php-dev@) list has been to stop this practice, as it provides no added benefit to the system as a whole. The interesting bit is that Brad and I have had this discussion before. The end result was to not bundle libxml, but the discussion did not really resolve any of the technical merits either. Let me explain my stance on this a bit.

- Advantage one: bundling ensures a specific base version of some library.

The problem isn't wanting to establish a base version of library code, but rather trying to keep in sync with an actively maintained library. One could argue that the bundled copy need only be updated when the community as a whole requests or requires it. Unfortunately, this doesn't take into consideration the power of a squeaky wheel who will complain that feature XYZ doesn't exist in your software but does in product ABC. For actively maintained software (and libxml is actively developed), keeping the bundled source current may become a full-time position. In the world of open source software, requiring a developer to do anything is not an exceptionally prosperous route for anyone.

- Advantage two: bundling allows fixes/patches to the code

Very true, but shouldn't such patches be passed back to the maintainers? In cases where they have been ignored (e.g. libgd), shouldn't this be reason enough to branch the code base in question and provide a new venue of distribution? It makes little to no sense to create a proprietary version of a code base when the work could be useful to others. This also ties into a problem found in advantage three.

- Advantage three: bundling allows my software to ensure functionality

Typically with server-based software, those installing it are cognizant of what they are trying to accomplish. With this in mind, an installer will (typically) take the time to read what is required for such functionality to be enabled, e.g. FAQs, configure help, and potentially the manual. I won't say that all installers do, as one can quickly scan the PHP mailing lists to prove me wrong. This type of reasoning, though, leads to a downward spiral that, in my mind, ends with an application of enormous size.

Where does the line get drawn between libraries that get bundled and libraries that don't? If you're really worried about always providing baseline functionality, surely it would be best to bundle every external dependency, right? The point is that this is exactly what a configure script is designed to do: locate, identify, and use the requested functionality. When the necessary support is not found, it throws an error and gives the person installing a chance to correct the problem or remove the configure option from their install.
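To make the locate, identify, use-or-fail idea concrete, here is a rough sketch in Python rather than the shell/m4 a real configure script is written in. It is not how PHP's configure works; the names check_dependencies and MissingDependency are made up for illustration, and the only real library call is ctypes.util.find_library from the Python standard library.

```python
from ctypes.util import find_library
from typing import Callable, Dict, Iterable, Optional


class MissingDependency(Exception):
    """Raised when a requested external library cannot be located (hypothetical name)."""


def check_dependencies(
    requested: Iterable[str],
    locate: Callable[[str], Optional[str]] = find_library,
) -> Dict[str, str]:
    """Locate each requested library, or fail with a message the installer can act on.

    `locate` defaults to the system lookup; it is a parameter only so the
    behaviour can be demonstrated without depending on what is installed.
    """
    found: Dict[str, str] = {}
    missing = []
    for name in requested:
        path = locate(name)  # None means the system does not provide the library
        if path is None:
            missing.append(name)
        else:
            found[name] = path
    if missing:
        # Nothing is silently substituted: the installer decides what to do next.
        raise MissingDependency(
            "could not find: " + ", ".join(missing)
            + "; install the library or remove the corresponding option"
        )
    return found


# Usage with a stand-in for the system lookup (assumed data, not a real system).
fake_system = {"xml2": "libxml2.so.2"}
print(check_dependencies(["xml2"], locate=fake_system.get))
```

The design choice worth noticing is that a missing library stops the build with an explanation instead of quietly falling back to a private copy, so the person installing always knows which libraries end up on the system.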

Leaving the installation of external dependencies to the user base has a few added side benefits. First, it only requires that the maintainers of the interface to the library keep up to speed with any API changes, which should typically be few and far between (a given for any library that aims to be usable).

Second, it shifts the onus of blame away from your software. As with all software today, bugs and holes are rather prevalent no matter what QA process is employed. The hope is to keep these to a minimum and their impact smaller still. By bundling external software you are now (potentially) adding functionality to a system without the sysadmin's knowledge. This is a dangerous path to take with respect to system security, and is (in some cases) considered a trojan horse. If a vulnerability is discovered in this software, it becomes the bundler's fault for including the vulnerable library. More importantly, if a sysadmin does not realize that this software is installed, they may not upgrade a vulnerable library, leading to system compromise. In either case, the stigma of insecure software is placed upon the project, and it is near impossible to remove once the seed has been planted.

Third, if a vulnerability is discovered in the bundled software, users must upgrade an entire product installation to fix one library, which gives them no added benefit over updating the library alone. Wasn't this the point of using shared libraries in the first place? And if you have already made customizations to the bundled software, the end user cannot simply grab the latest upstream version and drop it in.

So why not just unbundle everything?

Realize that unbundling an already established element becomes increasingly difficult. Within the PHP project there is a mindset that can be summed up as 'if it exists now, it should exist always'. In other words, user-perceived rules (e.g. expat being bundled) shouldn't be broken, and this is typically a good mantra to follow for providing backwards compatibility.

On the one hand, I agree with this: an established user base is hard to steer towards a new goal. On the other hand, I'd rather see PHP be minimalist. :) In either case, I think libxml should not be bundled...

Posted by Dan at May 7, 2003 07:52 AM

Trackback Pings

Listed below are links to weblogs that reference bundled software:

» Bundling in PHP from Bitflux Blog
There's a rather big discussion on php-dev (again..), if libxml2 should be bundled or not with PHP5 itself. Sterling did put the whole libxml2 source code (which is approx 3 MB large) into PHP CVS, which not everyone was happy about it and one of th... [Read More]

Tracked on May 12, 2003 12:41 AM

Comments

you have poopy teeth.

Posted by: zaq at May 12, 2003 12:32 AM