Tactical retreats in software development

I needed to add a feature to our web application, which required making calls to a provider’s remote web service. This call needed to be made shortly after a user made a purchase in the app, but it didn’t need to happen right away.

In this situation, you don’t want to make that remote API call while handling the user’s web request. For all kinds of reasons: the remote service might be slow, flaky, or down, and none of that should block the purchase response.

The “correct” way to do this:

Set up a job queue, which will asynchronously perform that web service request.

Your web application can send the task to this queue, which is fast and reliable, and then quickly return the response to the client. After all, you don’t want to keep users waiting, especially when they just gave you money.
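The enqueue-then-respond pattern can be sketched with Python’s standard library, standing in for the Celery/RabbitMQ setup described below. The names (`handle_purchase`, `notify_provider`) and the in-process queue are illustrative assumptions, not the app’s real code — the point is only the shape: the request handler enqueues and returns immediately, and a worker does the slow remote call later.

```python
# Sketch of enqueue-then-respond, using a stdlib queue in place of a
# real broker like RabbitMQ. All names here are hypothetical.
import queue
import threading

task_queue = queue.Queue()
results = []  # stands in for "the remote API call happened"

def worker():
    # A real worker (e.g. a Celery worker process) would make the slow
    # remote web service call here, with retries on failure.
    while True:
        purchase_id = task_queue.get()
        if purchase_id is None:  # sentinel: shut the worker down
            task_queue.task_done()
            break
        results.append(f"notified provider for purchase {purchase_id}")
        task_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_purchase(purchase_id):
    # Fast path: enqueue the slow work, return the response right away.
    task_queue.put(purchase_id)
    return {"status": "ok", "purchase": purchase_id}

response = handle_purchase(42)
task_queue.put(None)
task_queue.join()  # only for this demo; a web app would never block here
```

The key property is that `handle_purchase` returns before the provider call runs, so a slow or failing provider never delays the user’s response.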

This type of job queue is a bit complex to set up, but it’s not bad. And I had four days. Plenty of time.

So I started doing it the “right way”… and long story short, it didn’t work.

I created a new node that would host Celery, a Pythonic task queue, backed by RabbitMQ. Plus another unnamed tool, which wrapped this node in a nice microservice API.

Unnamed, because after two days of work, I found that this tool liked to peg the CPU at 99% every hour or two, until I manually restarted the service.

Turns out this was a bug that upstream had been working on for a YEAR. I wasn’t going to solve it by myself, in the two days I had left.

In the long run, I thought it was better to run the job queue on its own node. But I still needed to ship this feature, so I decided to configure Celery alongside Django, on the same node where the web server was running.

I thought it would be quick and easy. Because it usually is.

Spoiler alert: it wasn’t.

In fact, after another full day, I ran into ANOTHER show-stopper with that approach. I won’t bore you with what it was: it was stupid and frustrating. But in short, I couldn’t do that either.

So with only a few hours left…what did I do?

I set up an hourly “cron” job.

If you’re not familiar with cron, it’s a standard Linux (and Unix) utility that will run a script for you on a regular schedule. It’s simple, rock solid, and I’ve used it a million times.

So I wrote a quick script that would check who recently made purchases and make that web service API call for each of them. And I told cron to run it once an hour.
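A minimal sketch of what such an hourly script might look like. The crontab entry, the `recent_purchases` lookup, and the purchase data are all hypothetical stand-ins — the real script would query the Django ORM and call the provider’s API instead of appending to a list.

```python
#!/usr/bin/env python3
# Hypothetical hourly job. A crontab entry for it might look like:
#   0 * * * * /usr/bin/python3 /path/to/notify_recent_purchases.py
from datetime import datetime, timedelta, timezone

# Stand-in data; the real script would query the database.
ALL_PURCHASES = [
    {"id": 1, "created": datetime.now(timezone.utc) - timedelta(minutes=30)},
    {"id": 2, "created": datetime.now(timezone.utc) - timedelta(hours=3)},
]

def recent_purchases(since):
    # Placeholder for an ORM query like
    # Purchase.objects.filter(created__gte=since)
    return [p for p in ALL_PURCHASES if p["created"] >= since]

def run(now=None):
    # Look back one hour, matching the hourly cron schedule.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=1)
    notified = []
    for purchase in recent_purchases(cutoff):
        # Real script: call the provider's web service here, with
        # error handling so one failure doesn't skip the rest.
        notified.append(purchase["id"])
    return notified

notified = run()
```

One real-world caveat with this design: the lookback window and the cron interval must line up, or a purchase can be processed twice (or missed) at the boundary; tracking a “notified” flag per purchase is the usual fix.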

Quick and easy. It has been running fine for months, doing what it’s supposed to do. No problems.

At some point, I’ll go back and fix my first approach. It’s better if the task runs immediately after purchase, rather than an hour later, and we’ll eventually need that separate task queue node running for other things, anyway.

But I think this story illustrates some real-world practicalities of writing software.

Sometimes we have to make tactical retreats, because an imperfect solution shipped on time is often much better than a perfect solution shipped late. And we have to be willing to recognize when an approach is at risk of not working, and change course.

Author: admin
