Owning up to your mistakes
A long, long time ago I deployed an e-commerce website. On a Friday evening. By manually uploading files to a server.
I pinged the client to test on their side too.
I was the only person in the office, a young engineer preparing to go home for the weekend. I clicked through the system. A manual smoke test. Seemed okay.
My left leg was already out the door when the phone rang. The client. Ordering did not work. They couldn’t place a test order.
As you can already guess, the deployment went wrong. After two hours of testing and debugging, I found it. I had missed a single configuration variable in the web.config file. The site threw a null reference exception during the last phase of the order, where a new optional feature had been implemented a week earlier.
I had missed the step in my notes due to fatigue.
I fixed it manually. Two hours later. At the time, we did not have git flow or anything more advanced. No CI/CD servers or processes either.
I never complained that I had to work. It was not even a question in my mind. Business impact comes before who is to blame, why and how. I still have the same mindset: if I am responsible, I fix it. Of course, “I am responsible” has to be paired with “I am empowered to act”. I had all the tools and rights needed to deploy any fix. Otherwise I would never have attempted the deployment in the first place.
I respect it when others take accountability too. I may not have written all the deployed code. But I was ready to tackle any of it in case something went wrong. And it did.
It was not a problem that it did. Because I owned the issue, communicated about it and kept the client in the loop as I progressed toward the fix. They validated the fix too. Our relationship got stronger. Because of how I handled the issue. And of course it was caused not by negligence or bad faith, only by an honest mistake. That counts.
And the same mindset propelled me to higher responsibilities and to the best moments of my career and personal life too.
By the way, right after the event, I managed to convince our lead and the client to invest in automation, to keep this issue from ever repeating. Of course this wasn’t the first time something like this had happened. But at that company, it was the last. We built CI/CD from scratch, learned the ins and outs, and we all use those learnings in our jobs today.
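A check like the one below is the kind of small automation that could catch a missing setting before a deployment goes live. It is only a minimal sketch in Python, not a description of the pipeline from the story: the function name, the key names and the sample file content are all made up for illustration, and it assumes the settings live under appSettings in web.config.

```python
import xml.etree.ElementTree as ET


def missing_app_settings(config_xml: str, required_keys: list[str]) -> list[str]:
    """Return the required keys that are absent or empty in <appSettings>."""
    root = ET.fromstring(config_xml)
    # Map every <add key="..." value="..."/> entry under <appSettings>.
    present = {
        add.get("key"): add.get("value")
        for add in root.findall("./appSettings/add")
    }
    # A key counts as missing if it is not there or has an empty value.
    return [key for key in required_keys if not present.get(key)]


# Hypothetical web.config content: the second required key was forgotten.
SAMPLE = """<configuration>
  <appSettings>
    <add key="PaymentGatewayUrl" value="https://payments.example.test" />
  </appSettings>
</configuration>"""

# Hypothetical list of settings the release depends on.
REQUIRED = ["PaymentGatewayUrl", "GiftWrapFeatureMode"]

missing = missing_app_settings(SAMPLE, REQUIRED)
```

A deployment script could refuse to continue whenever `missing` is not empty, so a forgotten step in the notes fails loudly before the client ever places a test order.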
Making mistakes is just human. So is learning from them and not repeating them.
As engineers, we need to be mindful of risks and apply matching mitigations, with costs and business impact in mind.
It is perfectly okay to deploy manually if you impact only a few non-critical processes, or are still early in prototyping. It is perfectly okay to lose an hour of data if the business can withstand it. That way you don’t overspend on mitigations and tank the business from the inside.
The key is to learn and apply. Experience and expertise count.
What is your best “Friday deployment” insight? How do you handle mistakes?