blog dds

2009.09.16

Applied Code Reading: Debugging FreeBSD Regex

When the code we're trying to read is inscrutable, inserting print statements and running various test cases can be two invaluable tools. Earlier today I fixed a tricky problem in the FreeBSD regular expression library. The code, originally written by Henry Spencer in the early 1990s, is by far the most complex I've ever encountered. It implements sophisticated algorithms with minimal commenting. Also, to avoid code repetition and increase efficiency, the 1200 line long main part of the regular expression execution engine is included in the compiled C code three times after modifying various macros to adjust the code's behavior: the first time the code targets small expressions and operates with bit masks on long integers, the second time the code handles larger expressions by storing its data in arrays, and the third time the code is also adjusted to handle multibyte characters. Here is how I used test data and print statements to locate and fix the problem.

Continue reading "Applied Code Reading: Debugging FreeBSD Regex"

2008.10.27

Monitor Process Progress on Unix

I often run file-processing commands that take many hours to finish, and I therefore need a way to monitor their progress. The Perkin-Elmer/Concurrent OS32 system I worked-on for a couple of years back in 1993 (don't ask) had a facility that displayed for any executing command the percentage of work that was completed. When I first saw this facility working on the programs I maintained, I couldn't believe my eyes, because I was sure that those rusty Cobol programs didn't contain any functionality to monitor their progress.

Continue reading "Monitor Process Progress on Unix"

2008.05.16

Open and Closed Source Kernels Go Head to Head

Earlier today I presented at the 30th International Conference on Software Engineering a research paper comparing the code quality of Linux, Windows (its research kernel distribution), OpenSolaris, and FreeBSD. For the comparison I parsed multiple configurations of these systems (more than ten million lines), and stored the results in four databases, where I could run SQL queries on them. This amounted to 8GB of data, 160 million records. (I’ve made the databases and the SQL queries available online.) The areas I examined were file organization, code structure, code style, preprocessing, and data organization. To my surprise there was no clear winner or looser, but there were interesting differences in specific areas.

Continue reading "Open and Closed Source Kernels Go Head to Head"

2008.02.02

The Power of an Integrated Platform

FreeBSD, unlike Linux, is not a kernel, but a complete operating system. This allows a much smoother integration of its components, which is a real boon when you try to locate and fix a problem. The source code for all the parts is all ordered in a single directory tree for you to examine and experiment with.

Continue reading "The Power of an Integrated Platform"

2008.01.07

The Relativity of Performance Improvements

Today, after receiving a 1.7MB daily security log message containing thousands of ssh failed login attempts from bots around the world, I decided I had enough. I enabled IPFW to a FreeBSD system I maintain, and added a script to find and block the offending IP addresses. In the process I improved the script's performance. The results of the improvement were unintuitive.

Continue reading "The Relativity of Performance Improvements"

2007.10.21

International BSD Conference in Turkey

I'm on my way back from the International BSD Conference in Turkey, which a group of enthusiastic members of our community organized on Friday and Saturday.

Continue reading "International BSD Conference in Turkey"

2007.10.06

The Memory Savings of Shared Libraries

A recent thread in the FreeBSD ports mailing list discusses the benefits and drawbacks of static builds. How can we measure the memory savings of shared libraries?

Continue reading "The Memory Savings of Shared Libraries"

2007.06.28

The Tools we Use

It is impossible to sharpen a pencil with a blunt ax. It is equally vain to try to do it with ten blunt axes instead.

— Edsger W. Dijkstra

What’s the state of the art in the tools we use to build software? To answer this question I let over a period of a month a powerful server build from source code about seven thousand open-source packages. The packages I built form a subset of the FreeBSD ports collection, comprising a wide spectrum of application domains: from desktop utilities and biology applications to databases and development tools. The collection is representative of modern software, because, unlike say a random sample of sourceforge.net projects, these are programs that developers have found useful enough to spend effort to port to FreeBSD. The build process involves fetching each application’s source code bundle from the internet, patching it for FreeBSD, and compiling the source code into executable programs or libraries. Over the monthly period I also setup the operating system to write an accounting record for each one of the commands it executed. I then tallied up the CPU times of the 144 million records corresponding to the work in order to get a picture of how our software builds exploit the power of modern GHz processors.

Continue reading "The Tools we Use"

2007.04.04

A Humbling Upgrade

Yesterday I upgraded one of the servers I maintain from FreeBSD 4.11, which had reached its end of life, into the latest production release 6.2. It was a humbling experience.

Continue reading "A Humbling Upgrade"

2006.09.30

Cross Compiling

Cross compiling software on a host platform to run on a different target used to be an exotic stunt to be performed by the brave and desperate. One had first to configure and build the compiler, assembler, archiver, and linker for the different architecture, then cross-build the other architecture's libraries, and finally the software. This week, while preparing a new release of the CScout refactoring browser I realized that what was once a feat is nowadays a routine operation.

Continue reading "Cross Compiling"

2006.09.17

NASSCOM Quality Summit 2006

Last week I attended NASSCOM's 2006 Quality Summit in Bangalore, India. There I gave a tutorial on tooling with open source software, and delivered a talk on Global Software Development in the FreeBSD Project. It was an edifying trip.

Continue reading "NASSCOM Quality Summit 2006"

2006.09.02

Hardware and Software Debugging

Debuggging day. The MySQL 5.0 server I tried to run as part of a MediaWiki installation under FreeBSD, crashed during initialization, and a Tomy Walkabout digital baby monitor started emitting a low beeping sound. I solved both cases through educated guesses.

Continue reading "Hardware and Software Debugging"

2006.05.07

Surprising Findings on Software Reuse

Kevin DeSouza and his colleagues in a recent article in the Communications of the ACM published some surprising findings regarding software reuse: reuse happens more by novices rather than by experts, within projects rather than across them, and in transient teams rather than permanent ones. The statement regarding the higher propensity of rookies to reuse compared to older professionals rang particularly true to my ears.

Continue reading "Surprising Findings on Software Reuse"

2006.04.24

Public Bookmarking

You're searching the internet to answer a question you have, and after some painstaking detective work you locate the answer. Where do you store the answer for future reference?

Continue reading "Public Bookmarking"

2006.01.07

A Tree of Mentors

In the FreeBSD project, new committers are assigned a mentor who overlooks their work, until they are judged to be confident enough to work on their own. As lots of things in the open-source landscape, having a mentor is a loan, which we should pay back by mentoring somebody else.

Continue reading "A Tree of Mentors"

2005.04.13

A Pipe Namespace in the Portal Filesystem

The portal filesystem allows a daemon running as a userland program to pass descriptors to processes that open files belonging to its namespace. It has been part of the *BSD operating systems since 4.4 BSD. I recently added a pipe namespace to its FreeBSD implementation. This allows us to perform scatter gather operations without using temporary files, create non-linear pipelines, and implement file views using symbolic links.

Continue reading "A Pipe Namespace in the Portal Filesystem"

2005.02.04

Maintainability of the FreeBSD System

Last November Ioannis Samoladas and his colleagues published an article in the Communications of the ACM [1] that compared the maintainability of open-source versus-closed source projects. I applied the maintainability index [2] they used on the FreeBSD source repository following the code's maintainability over time, and comparing the maintainability of different modules. Here are the results.

Continue reading "Maintainability of the FreeBSD System"

2004.12.11

Measuring the Effect of Shared Objects

For the Code Quality book I am writing I wanted to measure the memory savings of shared libraries. On a lightly loaded web server these amounted to 80MB, on a more heavilly loaded shell access machine these ammounted to 300MB.

Continue reading "Measuring the Effect of Shared Objects"

2004.09.11

System administration stories: The Revolt

Can a small embedded system the size of a paperback lead a group of machines into revolt? Apparently yes.

Continue reading "System administration stories: The Revolt"

2004.08.23

Detective Work and Dropped TCP Connections

I had problems with TCP connections (mostly long-lasting ssh sessions) getting dropped on my ADSL line. In the end, I found that the problem had two different roots. The detective work behind establishing them is, I believe, interesting. It also shows how accessible source code, and the will to use it, can be a tremendous boost to difficult system administration problems.

Continue reading "Detective Work and Dropped TCP Connections"

2004.08.16

The hypot() Mystery

I was writing a section for the Code Reading followup volume, and wanted to demonstrate the pitfalls of using homebrewn mathematical functions instead of the library ones. As an example, I chose to compare the C library hypot(x, y) function, against sqrt(x * x, y * y). I created a plot of "unit in last place" (ulp) error values between the two functions, which demonstrated how the error increased for larger values of y.

Continue reading "The hypot() Mystery"

2004.05.15

Optimizing ppp and Code Quality

The Problem

While debugging a problem of my ppp connection I noticed that ppp was apparently doing a protocol lookup (with a file open, read, close sequence) for every packet it read. This is an excerpt from the strace log, one of my favourite debugging tools.

Continue reading "Optimizing ppp and Code Quality"

2004.03.19

Binary File Similarity Checking

How can one determine whether two binary files (for example, executable images) are somehow similar? I started writing a program to perform this task. Such a program could be useful for determing whether a vendor had included GNU Public License (GPL) code in a propriatary product, violating the GPL license. After writing about 20 lines, I realized that I needed an accurate definition of similarity than the vague "the two files contain a number of identical subsequences" I had in mind.

Continue reading "Binary File Similarity Checking"

2003.10.03

Software Complexity: Open Source vs Microsoft

In a readable and interesting paper titled CyberInsecurity: the cost of a monopoly seven notable security experts argue that the Microsoft's near monopoly in the desktop operating system and office productivity markets is creating a dangerous monoculture that exacerbates the effect of security vulnerabilities.

Continue reading "Software Complexity: Open Source vs Microsoft"

2003.06.21

FreeBSD Committer

I became a FreeBSD committer. I've been using BSD Unix systems since 1986 starting with 4.3 BSD on a pair of VAX 780 machines. In 1992, as a bored PhD student, I reimplemented sed(1) and contributed it the unencumbered BSD version that was then being put together; it is now part of the *BSD family. I crossed again paths with BSD software when the prize of the 2000 Usenix technical conference ``win a pet Shark contest'', Digital's Network Appliance Reference Design-DNARD, came with a NetBSD boot image. I used that code for drawing about 500 examples for my book Code Reading: The Open Source Perspective (Addison-Wesley 2003), detailing how to read software code others have written . Since 2001 I 've been using FreeBSD to control my home's security, communications, and entertainment systems as described in a SANE conference paper and a recent article in Personal and Ubiquitous Computing (as an academic I have to live by the "publish or perish" motto).

Continue reading "FreeBSD Committer"


Creative Commons License Last update: Thursday, September 22, 2016 9:56 am
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.