Peter Naur’s classic 1985 essay “Programming as Theory Building” argues that a program is not its source code. A program is a shared mental construct (he uses the word theory) that lives in the minds of the people who work on it. If you lose the people, you lose the program. The code is merely a written representation of the program, and it’s lossy, so you can’t reconstruct a program from its code.
Ken Kocienda tells stories of how Apple built the iPhone and iPad: a small team (fewer than twenty people) charged with delivering on direction set by executives.
- Viewing product development decisions as economic tradeoffs.
- Shifting from deterministic planning models to a stochastic perspective.
- Managing the Fuzzy Front End: the time from opportunity identification to commitment.
- Assessing the reserve buoyancy of a project.
- Using loss of safety margin as a leading indicator of project deterioration.
- Becoming more rigorous about how we think about technical debt.
Dashboards are the human-facing views into our systems that provide concise summaries of how the system is behaving by displaying time series metrics, logs, traces, and alarms data.
A look at how Amazon do dashboards.
Below is a list of some lessons I’ve learned as a distributed systems engineer that are worth being told to a new engineer. Some are subtle, and some are surprising, but none are controversial.
161 suggestions from the staff at Automattic.
Trade offs and spooky stories from Stefan Tilkov.
Game design and unsuccessful monad tutorials.
As software engineers our job is not to produce code per se, but rather to solve problems. Unstructured text, like in the form of a design doc, may be the better tool for solving problems early in a project lifecycle, as it may be more concise and easier to comprehend, and communicates the problems and solutions at a higher level than code.
Since HEY made a big splash on arrival, I thought it’d be fun to share the backstory of how we ended up reinventing email. Because we certainly didn’t start by wanting to reinvent email.
Assume that every student you interact with has limited information, but infinite intelligence. That places the onus squarely on the shoulders of the mentor to make sure that their explanations make sense — which, given the inherent imbalance of power between a teacher and a learner, is a fine way to distribute the extra emotional labor.
Process is documented culture. How a team gets a familiar thing done should be broadly understood by the team. This is how we fix a bug. This is how we do a code check-in. This is how a feature is designed. This is how executive sign-off occurs.
Process comfortably and efficiently describes the common path. Process does not define what to do when the indescribable occurs. A crisis or a disaster does not neatly fit into the common path; it’s when you need someone to swoop in, break the glass, and put out the fire.
Finally, to increase communication, especially if the message is vital, use the three-way handshake. Tell your message to someone using whatever medium you’re using. Then, have that person tell you your message back (in their own words of course, no copy and paste). You then repeat that message back to them. Assuming everyone has it right, you’ve just completed a three-way handshake.
The purpose of a meeting, then, is not to convey information efficiently. It is to force an audience to pay exclusive attention to one thing, to get that creative focus pointed in a particular direction.
Complex systems, cascades (from neurons up to people and then crowds), failures, too much efficiency, not enough slack, and toilet paper.
Dan North on how to break the rules when applying the Theory of Constraints.
A look at negotiation and how to avoid the pitfalls associated with debate.
A nice rundown of implementing Sagas in a distributed system.
Makers receive constant praise for solving problems, and take pleasure in being the expert. Leaders in Maker mode go out of their way to show they have the right answer. They need to have the first and last say. They over invest in their own solutions and don’t create space for others to contribute.
Multipliers amplify or multiply the intelligence of the people around them. They lead organisations or teams that are able to understand and solve hard problems rapidly, achieve their goals, and adapt and increase their capacity over time.
There is a sweet spot of React: in moderately interactive interfaces. Complex forms that require immediate feedback, UIs that need to move around and react instantly.
I’m still a sucker for server-generated markup.
Michael Deng and Jonathan Chang:
Deploys require a careful balance of speed and reliability. At Slack, we value quick iteration, fast feedback loops, and responsiveness to customer feedback.
An interesting dive into how Slack handles deploys to a large fleet of users.
In the war to reclaim your attention, some battles have clearer fronts than others. It has become clear to me that these differences matter.
An attention charter is a document that lists the general reasons that you’ll allow for someone or something to lay claim to your time and attention. For each reason, it then describes under what conditions and for what quantities you’ll permit this commitment.
I use LucidChart for collaborative drawing at work but I’m going to give Excalidraw a crack over the next few weeks.
A solid collection of remote working advice. There is some overlap with the strategies I employ.
What I’ve learned is that if we want things to go fast, a sense of momentum is much more effective than a sense of urgency.
Stories focus on the why, sharing the experience and context around the decision and results.
Stories also eliminate most of the least productive follow-up conversations after giving advice, where the advice-requester then argues against the advice. Stories relieve the advice-giver from the obligation to defend their advice. There’s nothing to agree or disagree with, just a recounting of past events.
It’s really helpful to respond to a person’s ineffective behavior with curiosity rather than judgement.
If a person’s behavior doesn’t make sense to you, it is because you are missing a part of their context.
Every line of code written comes at a price: maintenance. To avoid paying for a lot of code, we build reusable software. The problem with code re-use is that it gets in the way of changing your mind later on.
Building reusable code is something that’s easier to do in hindsight with a couple of examples of use in the code base, than foresight of ones you might want later.
A look back at Carmack using academic research to improve rendering performance in the original Doom.
Jesse Frederik and Maurits Martijn:
Is online advertising working? We simply don’t know.
(we look at) how to handle writes to two independent backends without using two-phase commits. Instead we can rely on using at-least-once delivery guarantees and ask other systems to deduplicate our messages.
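A minimal sketch of the receiving side of that idea, assuming every message carries a unique ID chosen by the sender. The names `DedupingConsumer`, `handle`, and `deliver_at_least_once` are hypothetical and not taken from the linked post, and the in-memory set stands in for durable storage that a real system would update together with the side effect.

```python
class DedupingConsumer:
    """Applies each message at most once, even if it is delivered many times."""

    def __init__(self):
        self.seen = set()    # IDs already applied (in memory only, for the sketch)
        self.applied = []    # stand-in for the real side effect

    def handle(self, message_id, payload):
        # A redelivered message is accepted but not applied a second time.
        if message_id in self.seen:
            return False
        self.applied.append(payload)
        self.seen.add(message_id)
        return True


def deliver_at_least_once(consumer, message_id, payload, attempts=3):
    """Simulates a sender that retries because it cannot tell whether a write landed."""
    return [consumer.handle(message_id, payload) for _ in range(attempts)]


consumer = DedupingConsumer()
results = deliver_at_least_once(consumer, "order-42", {"total": 10})
```

The sender is free to retry as often as it likes; only the first delivery changes state, which is what makes at-least-once delivery safe to lean on.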
Inspired by CRDTs but simpler due to the server being a central authority.
Kent Beck comparing baskets of options to product roadmaps and goals:
I’ve come to hate the damage the “product roadmap” metaphor does to the brains of everyone involved in developing a product. When I use an actual map of actual roads, I assume that I know where I’m going and how I’m going to get there. This is never the case when developing a product.
When you encounter long lead times, you’re hearing option-on-a-basket thinking. “We need to know what features will be in the release in 8 months so Marketing has time to prepare.” What if product development doesn’t go according to plan? The value of the option on a basket falls to zero. What if the launch doesn’t come off? The value of the option on a basket falls to zero.
The man can present.
Estimates matter because most people and businesses are date-driven.
Estimation is difficult but developing it as a skill is helpful for delivering software.
it’s actually an unqualified good for engineers to be interacting with production on a daily basis, observing the code they wrote as it interacts with infrastructure and users in ways they could never have predicted.
A system’s resilience is not defined by its lack of errors; it’s defined by its ability to survive many, many, many errors. We build systems that are friendlier to humans and users alike not by decreasing our tolerance for errors, but by increasing it. Failure is not to be feared. Failure is to be embraced, practiced, and made your good friend.
When people have divided attention, work suffers. The area of code that you work on for months is something that you understand deeply. The framework, off to the side, that you update just to facilitate your work may not seem as important. This is a function of distance: cognitive, temporal, and locational distance. In a way, these are all the same.
Gary P. Pisano:
The cliché “celebrating failure” misses the point—we should be celebrating learning, not failure.
Without discipline, almost anything can be justified as an experiment. Discipline-oriented cultures select experiments carefully on the basis of their potential learning value, and they design them rigorously to yield as much information as possible relative to the costs.
The first step is acknowledging that our relationship is more important than the design of the system. As long as we have a productive working relationship we can move the design in any direction. When our relationship breaks down we don’t get anywhere.
Okay, so you just want to go implement the next feature and along I come and say no no no this should be designed completely differently. Even if you are right that the new structure will eventually make my behavior changes easier to implement it’s not eventually, it’s today.
First, acknowledge that our incentives diverge in this moment. It doesn’t help to pretend that we agree when we don’t.
Second, as the structure changer I need to acknowledge that I am placing a burden of learning on you. I think it’s worth it, but if I’m asking something of you I better be prepared to offer something to you.
Software design is a human relationship problem with interesting technical aspects. Geeks relating to geeks requires as much effort as geeks relating to their systems. Maintaining relationships may be hard and confusing and frustrating to geeks (I could be projecting here but yeah no I don’t think I am), but if you want your technical skills to matter you really have no choice but to improve your people skills.
We couldn’t just start replacing old code with new code willy-nilly; without some type of structure to keep the old and new code separate, they’d end up getting hopelessly tangled together and we’d never have our modern codebase. To solve this problem, we introduced a few rules and functions in a concept called legacy-interop:
- old code cannot directly import new code: only new code that has been “exported” for use by the old code is available
- new code cannot directly import old code: only old code that has been “adapted” for use by modern code is available.
The progressive approach to the rebuild is interesting. Especially the rules that enforced how the rewritten parts of the code base could interact with the old code they were ultimately replacing.
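The two import rules quoted above can be sketched as a small check that a linter or test might run. This is only an illustration: the module prefixes (`legacy.`, `modern.`, `interop.exported.`, `interop.adapted.`) and the function `import_allowed` are invented here, and the post's own tooling may look nothing like this.

```python
# Hypothetical layout: every name below is invented for illustration.
OLD = "legacy."
NEW = "modern."
EXPORTED = "interop.exported."   # new code made available to old code
ADAPTED = "interop.adapted."     # old code made available to new code


def import_allowed(importer, imported):
    """Returns True if `importer` may import `imported` under the two rules."""
    if importer.startswith(OLD) and imported.startswith(NEW):
        return False   # old code must go through the exported layer
    if importer.startswith(NEW) and imported.startswith(OLD):
        return False   # new code must go through the adapted layer
    return True


checks = [
    import_allowed("legacy.billing", "modern.invoices"),
    import_allowed("legacy.billing", EXPORTED + "invoices"),
    import_allowed("modern.invoices", "legacy.billing"),
    import_allowed("modern.invoices", ADAPTED + "billing"),
]
```

The point of the indirection is that every crossing between old and new code is explicit and easy to list, which keeps the two from tangling.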
Richard Hamming giving advice to researchers in 1995, plenty of which serves as general career advice.
Here’s a selection:
- Work on important problems.
- Luck favours a prepared mind.
- Work on problems you’re committed to.
- Talk to people outside of your field.
- Pursue opportunities when they’re presented.
- Schedule regular time for deep reflections.
- Take a step back to see the larger problem.
- Every defect can be looked at as an asset.
I often say that with knowledge workers, the biggest bottleneck is always getting up in the morning. Knowledge work requires not only our time and effort, but also our engagement and creativity. For that reason, personal motivation is the prime problem that supersedes all other problems.
BFFs are not about the shape of your endpoints, but about giving your client applications autonomy.
Rough consensus relies on the distinction between two types of objections:
“Not the best choice” feedback: “I don’t believe Solution A is the best choice, because XYZ. I believe Solution B would be better, but I accept that Solution A can work too.”
Fundamental flaws: “I believe Solution A is unacceptable because XYZ.”
A chair who asks, “Is everyone OK with choice A?” is going to get objections. But a chair who asks, “Can anyone not live with choice A?” is more likely to only hear from folks who think that choice A is impossible to engineer given some constraints. The objector might convince the rest of the group that the objections are valid and the working group might choose a different path.
In my very first programming role my manager said to me “You can make any mistake you like once. You’ll have my full support the first time you screw anything up. If you’re not making mistakes, you’re not learning, and if you’re repeating mistakes you aren’t either”.
A leader doesn’t shape people – a leader shapes an environment.
In order to succeed at production ownership, a team needs a roadmap for developing the necessary skills to run production systems. We don’t just need production ownership; we also need production excellence. Production excellence is a teachable set of skills that teams can use to adapt to changing circumstances with confidence. It requires changes to people, culture, and process rather than only tooling.
Even a perfect set of SLOs and instrumentation for observability do not necessarily result in a sustainable system. People are required to debug and run systems. Nobody is born knowing how to debug, so every engineer must learn that at some point. As systems and techniques evolve, everyone needs to continually update with new insights.
Standardizing technology is a powerful way to create leverage: improve tooling a bit and every engineer will get more productive. Adopting superior technology is, in the long run, an even more powerful force, with successes compounding over time. The tradeoffs and timing between standardizing on what works, exploring for superior technology, and supporting adoption of superior technology are at the core of engineering strategy.
An effective approach is to prioritize standardization, while explicitly pursuing a bounded number of explorations that are pre-validated to offer a minimum of an order of magnitude improvement over the current standard.
Ideas are funny things. It can take hours or days or months of noodling on a concept before you’re even able to start putting your thoughts into a shape that others will understand. And by then, you’ve explored the contours of the problem space enough that the end result of your noodling doesn’t seem interesting anymore: it seems obvious.
But as you get into more senior-type engineering roles, your most valuable contributions start to take the form not of concrete labor, but of conceptual labor. You’re able to draw on a rich mental library of abstractions, synthesizing and analyzing concepts in a way that only someone with your experience can do.
When building a dashboard, start with a set of questions you want to answer about a system’s behavior, and then choose where and how to add instrumentation; not the other way around.
The incident response team’s common ground is their theory of the system’s behavior – in order to make troubleshooting observations meaningful, that theory needs to be kept up to date with the data.
The purpose of quadratic voting is to determine “whether the intense preferences of the minority outweigh the weak preferences of the majority.”
This is something I’d like to try in planning meetings.
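A rough sketch of how quadratic voting might work for a planning meeting. The usual rule is that casting n votes on a single item costs n squared credits. The budget of 16 credits, the item names, and the function names are all assumptions made up for this example.

```python
from collections import Counter


def cost(votes):
    """Casting n votes on a single item costs n squared credits."""
    return votes * votes


def tally(ballots, budget=16):
    """Sums votes per item, rejecting any ballot that overspends its budget."""
    totals = Counter()
    for voter, ballot in ballots.items():
        spent = sum(cost(v) for v in ballot.values())
        if spent > budget:
            raise ValueError(f"{voter} spent {spent} credits, budget is {budget}")
        totals.update(ballot)
    return totals


ballots = {
    "ana": {"fix flaky tests": 4},                        # 16 credits on one item
    "ben": {"new onboarding": 2, "fix flaky tests": 1},   # 4 + 1 = 5 credits
    "cho": {"new onboarding": 2, "dark mode": 2},         # 4 + 4 = 8 credits
}
totals = tally(ballots)
```

Here one person who cares a lot about flaky tests spends their whole budget on it, and that item ends up ahead of an item two people mildly prefer.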
We’ve repurposed the idea of a technology tree, popular in many strategy video games, and used that as a vehicle to communicate the Up product roadmap.
At The Economist, we take data visualisation seriously. Every week we publish around 40 charts across print, the website and our apps. With every single one, we try our best to visualise the numbers accurately and in a way that best supports the story. But sometimes we get it wrong. We can do better in future if we learn from our mistakes — and other people may be able to learn from them, too.
Once you’ve learned enough that there’s a certain distance between the current version of your product and the best version of that product you can imagine, then the right approach is not to replace your software with a new version, but to build something new next to it — without throwing away what you have.
First and foremost, autonomous teams need to live with the consequences of their decisions.
A tired subject but a solid analogy.
“Silent Meetings” are meetings where most of the time is spent working and not talking. When done correctly most of the meeting is spent silently working together.
A look at radical transparency at the hedge fund Bridgewater.
The man can tell a story.
I’ve been playing with Sketch.systems a bit already. This post looks into adding verification on top of it.
Dark is a holistic programming language, structured editor, and infrastructure, for building backend web services. It’s aimed at frontend, backend, and mobile engineers.
Soup to nuts.
Australia is a place with more land than people, more geography than architecture. But it is not and never has been empty. Few landscapes have been so deeply known.
A post that builds up from simple principles. It looks into React, its programming model, its goals, and the trade offs it takes in solving its design challenges.
Startup strategy is like Kung Fu. There are many styles that work. But in a bar fight, you’re going to get punched in the face regardless.
I can only teach you my style. Others can only teach you theirs.
Lots to chew on.
the research is clear: Telling people what we think of their performance doesn’t help them thrive and excel, and telling people how we think they should improve actually hinders learning.
The only realm in which humans are an unimpeachable source of truth is that of their own feelings and experiences.
Speaking of lifting others up, your core group of friends can make or break your life. And your participation can make or break theirs as well.
Instead of working with a thing you love, think about how to work in a way you love.
This is totally my bag.
Still trying to learn how to think better.
Cindy Sridharan quoting Joe Armstrong:
We should identify the error kernel. The error kernel of a system is that part which must be correct. That’s what the error kernel is. All the other code can be incorrect, it doesn’t matter. The error kernel is the part of the system that must be correct. If it’s incorrect, then all bets are off. The error kernel must be correct.
John D. Cook:
The rule of three gives a quick and dirty way to estimate these kinds of probabilities. It says that if you’ve tested N cases and haven’t found what you’re looking for, a reasonable estimate is that the probability is less than 3/N. So in our proofreading example, if you haven’t found any typos in 20 pages, you could estimate that the probability of a page having a typo is less than 15%.
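To see where the 3 comes from: if the true probability is p, the chance of seeing nothing in N independent trials is (1 - p)^N. Setting that to 5% and solving gives p of roughly -ln(0.05)/N, and -ln(0.05) is very nearly 3. A small sketch comparing the shortcut with the exact bound, assuming independent trials and a 95% confidence level (the function names are my own):

```python
import math


def rule_of_three(n):
    """Approximate 95% upper bound on the probability after n clean trials."""
    return 3 / n


def exact_upper_bound(n, confidence=0.95):
    """Solves (1 - p)**n = 1 - confidence for p."""
    return 1 - (1 - confidence) ** (1 / n)


three = -math.log(0.05)             # about 2.996, which the rule rounds to 3
pages = 20
approx = rule_of_three(pages)       # 0.15, the 15% in the proofreading example
exact = exact_upper_bound(pages)    # about 0.139
```

The shortcut is slightly conservative, which is the direction you want an upper bound to err in.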
Hashing syntax trees and storing them directly in a database is very interesting. I’ve long wondered what will come after the grab bag of text files approach we’ve been using to date.
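A toy version of the idea, using Python's standard `ast` module as a stand-in for a real language's syntax tree. This is not the scheme the linked project uses; the `tree_hash` function and the dictionary playing the role of a database are assumptions for illustration.

```python
import ast
import hashlib


def tree_hash(source):
    """Hashes the parsed syntax tree rather than the text of the source."""
    tree = ast.parse(source)
    # ast.dump leaves out line and column attributes by default, so layout
    # and comments do not affect the result.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


a = tree_hash("def add(x, y):\n    return x + y\n")
b = tree_hash("def add(x,y):  # sum\n\n    return  x+y\n")
c = tree_hash("def add(x, y):\n    return x - y\n")

store = {a: "add"}   # stand-in for a database keyed by tree hash
```

The first two snippets differ only in whitespace and a comment, so they hash to the same key, while the third changes the operator and gets a different one.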
Embrace the power of compounding.
Now you can learn them too.
I ❤️ every time Sam goes on The Watch.
Alex Blumberg interviews Ira Glass.
Come for the Feynman anecdote, stay for the exploration of technology and culture.