Foreword
I read The Soul of a New Machine not long after it was published.
At the time, I was already working at Boeing on systems where correctness was not a matter of taste or optimization, but of responsibility. Earlier in my career there, I had designed and implemented something called the Automatic Test Control System (ATCS)—a compiled test scripting language that operated inside large engineering development simulators. Today it would be recognized as a domain-specific language for automated, repeatable system validation. At the time, it was simply a way to force complex systems to reveal their behavior honestly, at a scale that could not be achieved any other way.
These were not crew training simulators.
They were engineering development simulators used for current- and next-generation aircraft and flight control systems. We used them to explore designs, to integrate full systems, and to investigate failures—long before any of that software ever reached an aircraft, and long before a single life was put at stake. The simulator was where assumptions were challenged, where interactions that could not be reasoned about on paper were forced into the open, and where mistakes were meant to be found early, while they were still survivable.
ATCS would go on to save Boeing millions of dollars over the years that followed, and it became the basis for similar systems built later. For that work, I received a Boeing Special Achievement Award. More importantly, it earned me the trust and credentials to move from the Applications group—where I had been working on automatic flight control systems—into the Systems group, where I worked on and maintained the Simulation Executive itself and the infrastructure that supported it. Those were systems you did not guess at.
I could take a three-inch-thick system dump from a line printer and tell you exactly which line of code had caused the failure. That wasn’t a parlor trick. It was table stakes. If the simulator lied, it taught engineers the wrong lessons. If it failed quietly, it undermined confidence in everything that followed. Every one of us understood what that meant.
We knew that what we were doing mattered, because someday we—or people we loved—might be on one of those aircraft when something went wrong. The work we did in those simulators was about giving them the best chance of survival we could, by discovering failure modes early, understanding them deeply, and refusing to let ambiguity pass as correctness.
Tracy Kidder’s book resonated with me because it gave language to something I already understood from experience: machines are profoundly human artifacts. Not because they think or feel, but because their behavior reflects the discipline, compromises, and integrity of the people who build them. That understanding stayed with me across a long and varied career. It shaped how I debug, how I reason about failure, and how I decide whether a system deserves to be trusted. This book exists because that lesson never stopped being true.
Chapter One — PingEverything
The script was called pingEverything. It wasn’t elegant. It wasn’t clever. It didn’t do anything that hadn’t been done a thousand times before. It sent pings—five local machines and one address on the wider Internet—and recorded whether they came back. That was all.
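If it helps to picture it, here is a minimal sketch of what a script like that might look like. It is an illustration only, not the actual pingEverything: the hostnames and the outside address below are placeholders I have invented for the example, not my real configuration.

    #!/usr/bin/env python3
    # Hypothetical sketch of a pingEverything-style check.
    # Hostnames and the outside address are placeholders, not the real setup.
    import subprocess

    LOCAL_NODES = ["pi1", "pi2", "pi3", "pi4", "nuc"]  # placeholder names for the five machines
    OUTSIDE = "8.8.8.8"                                # stand-in for the one wider-Internet address

    def ping(host: str) -> bool:
        # One ping, two-second wait; True only if a reply came back.
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0

    for host in LOCAL_NODES + [OUTSIDE]:
        print(f"{host:<10} {'ok' if ping(host) else 'FAIL'}")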
I ran it on every node.
Four Raspberry Pis and a NUC, each asking the same questions, expecting the same answers. If anything was wrong anywhere—routing, naming, forwarding, authority—it would show up immediately. I had been staring at variations of this output for weeks.
Red where there should have been green. Timeouts that made no sense. A machine that could see three others but not the fourth. A node that could reach the Internet but not its peers. A system that worked perfectly until it didn’t, and never failed the same way twice. Most of the work happened in silence.
Long stretches where nothing changed visibly. Where I searched, researched, tried an idea, rejected it, tried another. Where I wrote code I knew I would throw away. Where I wiped the world and started over because there was no other honest move left. I learned not to rush the test.
Running it too often just produced noise. Running it at the wrong moment told me nothing. So I waited—until I thought I understood what the system was supposed to do—and only then did I ask it the question again.
That night, when I ran pingEverything, the output looked different.
Not dramatically. Not with fanfare. Just different.
Green.
Everywhere.
All five machines reported each other. Routes held. Names resolved. Traffic went where it was supposed to go and nowhere else. The default route worked, but it didn’t dominate. Google answered, but it wasn’t required. I ran it again.
Still green.
I waited a moment, then ran it from another node. Same result. No flicker. No hesitation. No silent failure hiding behind success. I didn’t celebrate.
I sat back and watched it scroll.
What it meant, as I took it in slowly and carefully, was not that the system was finished. I knew better than that. It wasn’t even close. This was the beginning, not the end—closer to the initial form than the final one. But it meant something else.
It meant that I finally understood—or at least had an understanding of—how the pieces fit together. How machines could find each other without assuming authority. How routes could appear and disappear without poisoning the system. How silence could mean absence instead of failure. It meant the thing was possible.
Not hypothetically. Not in theory. In practice.
Those long hours hadn’t been wasted. The weeks spent sitting alone, talking to the machine and whatever powers might have been listening, hadn’t been a delusion. I hadn’t been chasing something imaginary. And somewhere under all of that—under the work, the doubt, the restarts—was a quieter realization, one I didn’t say out loud: I was still as good as I believed myself to be.
Not because the screen was green.
But because I had stayed with the problem until it told me the truth.