4 Tips for a healthy codebase

As developers, our days (and nights) revolve around writing code, moving from one piece of code to the next.

What makes a codebase easier and more fun to work with? Ask developers this question and you'll get a variety of answers.

For me, it's about efficiency, productivity, extensibility and readability.
It's about building a codebase that can grow over time, both in complexity and size, while keeping the same experience for the developer.
In my opinion, these represent a healthy codebase.

Here are four principles I've found useful for keeping a codebase healthy.

Disclaimer: I'm not going to talk about coding design principles like SOLID, advocate design patterns or automated tests. I'm assuming you've got all that figured out. These should be considered baseline for every codebase that can be maintained over time.

Clone -> Build -> Run

How many times have you cloned a piece of code from GitHub and found yourself following a list of instructions to build or run it? It's a very frustrating experience.

With the diversity of tools on the market, it's very easy to create a codebase that requires pretty complex steps just to run it: install dependencies, build the project, then run a CSS minifier, a React transpiler, database creation scripts and more and more and more...

A lot of boilerplate just to run an application.

You might say, "Sure, but I only have to do this once", right? Clone is a one-off action you do when starting to work on a product. And maybe again when you get an upgraded laptop.

Actually, every time you pull new changes, there's a chance that someone added a new thing, a new tool, a new dependency. Suddenly, stuff that worked before is not working now. Then someone throws in "Oh, man... you also need to run the new XXX thingy".

Application should be runnable without any black magic

Make sure your codebase is always runnable right off the bat. Modify your build script to run everything needed to build and run the project from a clean clone, without any overhead for the poor developer who just wants to implement a feature or fix a bug.

Keep developers mindful of the build process. Encourage them to raise a red flag when running the app takes complex operations, and to fix those when possible.
The build script should be as readable and accessible as any other piece of code, so developers can understand what makes the code work and are able to update it and add more steps.
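To make this concrete, here is a minimal sketch of what such a build script could look like in Python. The step names and commands are placeholders I made up (each one just prints a message); swap in whatever your project actually needs. The point is that the whole list lives in one readable place and a single command runs it.

```python
import subprocess
import sys

# Hypothetical steps: each command here only prints a message.
# Replace them with the real commands of your project.
STEPS = [
    ("install dependencies", [sys.executable, "-c", "print('installing dependencies')"]),
    ("build", [sys.executable, "-c", "print('building')"]),
    ("create database", [sys.executable, "-c", "print('creating database')"]),
]


def run_steps(steps):
    """Run every step in order and stop at the first failure."""
    completed = []
    for name, command in steps:
        print(f"==> {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            raise SystemExit(f"step '{name}' failed with exit code {result.returncode}")
        completed.append(name)
    return completed


if __name__ == "__main__":
    run_steps(STEPS)
```

Adding "the new XXX thingy" then means adding one line to STEPS, and everyone gets it on their next pull.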

Keep the build fast

The time between a developer making a change and the moment they actually see it running is critical. During this time, most of us (single threaded people) simply wait until we can verify that what we did actually works... Usually it doesn't, and we have to fix it and wait again.

Compilation is the number one waste for every developer

Coding is an iterative process: Write some code -> build it -> run it -> write some more code, and so on.

Every part of the loop should take as little time as possible. Every second wasted here is multiplied across all developers and all iterations, which ends up decreasing the velocity of feature development and costing the organization valuable resources.
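A quick back-of-the-envelope calculation shows how the multiplier works. All the numbers below are invented for illustration (team size, builds per day, working days); plug in your own.

```python
def yearly_wait_hours(build_seconds, builds_per_day, developers, working_days=225):
    """Total hours per year the whole team spends waiting for builds."""
    total_seconds = build_seconds * builds_per_day * developers * working_days
    return total_seconds / 3600


# Hypothetical team: 60-second build, 40 builds a day, 20 developers.
print(yearly_wait_hours(60, 40, 20))  # 3000.0
# Shaving 30 seconds off the build halves the waiting time.
print(yearly_wait_hours(30, 40, 20))  # 1500.0
```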

Increased lead time to production is another side effect. If your CI takes longer to run each build, a bug in production will take longer to fix, hurting your metrics and the reputation of your product.

How?

Earlier, we talked about how the build process should include everything required for running an application. However, it doesn't mean that everything should run all the time.

If you're doing a server change, there's no reason to minify CSS and JS files, right? If you are doing a client change, why compile the server? If you didn't add any dependencies, no need to restore anything (running the restore/install action even when there's nothing to install still has huge overhead).

Invest time to optimize your build->run processes. For example:

  • Allow developers to opt-out from tools they don't need for their current task.
  • Run one-offs only when needed. For example: no need to create the database on every build - only do it if the database scripts have changed.
  • Run incremental changes. Many tools allow incremental builds, which only build the code that was changed. Find these in your application.
  • Add file watchers that perform live tasks during development (React transpiling, for example).
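One simple way to run a step "only when needed" is to fingerprint its input files and skip the step when the fingerprint matches the last run. The sketch below is one possible approach, not the only one; the function names and the stamp file are my own invention.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Hash the names and contents of the given files, in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(str(path).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def run_if_changed(inputs, stamp_file, action):
    """Run `action` only when the inputs differ from the last recorded run."""
    stamp = Path(stamp_file)
    current = fingerprint(inputs)
    if stamp.exists() and stamp.read_text() == current:
        return False  # nothing changed, skip the step
    action()
    stamp.write_text(current)  # record only after the action succeeded
    return True
```

For the database example, `inputs` would be the database scripts and `action` would be whatever recreates the database. Keep the stamp file out of source control, so a clean clone has no stamp and still runs everything.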

Apply these principles to all the other parts of the development loop, like application startup and test run times.

Having said all that, don't hurt the clone->build->run ability. Run everything by default.
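Opt-out with "run everything by default" could look something like this. It's a sketch with made-up step names: with no arguments every step is selected, and a developer has to ask explicitly to skip something.

```python
import argparse

ALL_STEPS = ["restore", "server", "client", "database"]  # hypothetical step names


def select_steps(argv):
    """Return the steps to run: everything, minus whatever was skipped."""
    parser = argparse.ArgumentParser(description="Build the project")
    parser.add_argument(
        "--skip",
        action="append",
        default=[],
        choices=ALL_STEPS,
        help="step to leave out; may be given more than once",
    )
    args = parser.parse_args(argv)
    return [step for step in ALL_STEPS if step not in args.skip]


print(select_steps([]))                    # a clean clone runs it all
print(select_steps(["--skip", "client"]))  # a server-only change
```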

Everything should run on a local machine

All features should work on dev machines. Continuous Deployment makes this very easy to forget. When it's easy to deploy to production, it's also easy to "experiment" on production instead of checking issues on a local machine.
Bugs will happen and features will need to change. Allowing proper development on a local dev machine is a must. If there's a lot of overhead, it simply won't happen.

Data is an essential part of running code. Code is only one half of the equation - having the code without the necessary data is like having a car without gas. That means data should be available and fresh on dev machines.

Data affects the way the code behaves

It doesn't mean you need to keep ALL that data. However, some fraction of it is necessary.

You can enforce this with procedures, for example by requiring each new feature to ship with a script that adds the data it needs. Alternatively, create a migration process from your production environment that keeps the data developers work with fresh.
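As an illustration of the first option, here is what a per-feature seed script might look like. The "discount codes" feature, the table and the rows are all invented, and I'm using an in-memory SQLite database just to keep the example self-contained. The important property is that the script is safe to run again and again.

```python
import sqlite3

# Hypothetical seed rows shipped together with a "discount codes" feature.
SEED_DISCOUNTS = [("WELCOME10", 10), ("SUMMER25", 25)]


def seed_discounts(conn):
    """Create the table if needed and insert the seed rows; safe to re-run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS discounts (code TEXT PRIMARY KEY, percent INTEGER)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO discounts (code, percent) VALUES (?, ?)",
        SEED_DISCOUNTS,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
seed_discounts(conn)
seed_discounts(conn)  # running twice does not duplicate rows
print(conn.execute("SELECT COUNT(*) FROM discounts").fetchone()[0])  # 2
```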

Another aspect is the ability to run tests. Make sure a developer can run every test - unit, integration, system, UI and even mobile - on a local machine when fixing a bug or trying to reproduce an issue from the CI or, god forbid, production.

Enforce guidelines

Guidelines are usually written in documents or a wiki, or passed on by word of mouth. They are enforced in code reviews or even in design decisions.

That's all well and good, but as your codebase gets bigger and more developers become involved, it will become increasingly difficult to track all changes and enforce even the simplest conventions.
Guidelines also change over time: some are updated, some are added and some are discarded.
Moreover, in our line of business, developers move from team to team, leave and are recruited, making it even harder to spread the knowledge to everyone.

Use automated tools to enforce these guidelines. Use linters and meta tests in order to run static analysis on your codebase. Use existing rules for standard conventions and add custom rules of your own that are relevant to your organization.
Use integrity tests for dynamic analysis of your code while it executes, to make sure code paths are valid.

For example: a linter for your styling rules (wherever you think the correct position of the curly brackets is), an integrity test that validates there are no dependency loops between services that might cause recursion at runtime, and a meta test that validates the file system structure conforms to your conventions.
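The dependency loop check, for instance, boils down to finding a cycle in a graph. Here is a minimal sketch that assumes you can get the dependencies as a plain mapping from each service to the services it depends on; how you collect that mapping (from config, from the DI container at startup, and so on) depends on your stack. The service names in the example are made up.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of names, or None if there is none.

    `dependencies` maps each service to the services it depends on.
    """
    visiting, done = set(), set()
    path = []

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # We came back to a node on the current path: that is a loop.
            return path[path.index(node):] + [node]
        visiting.add(node)
        path.append(node)
        for dep in dependencies.get(node, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for service in dependencies:
        cycle = visit(service)
        if cycle:
            return cycle
    return None


# Hypothetical services with a loop: orders -> billing -> users -> orders
print(find_cycle({"orders": ["billing"], "billing": ["users"], "users": ["orders"]}))
```

A test then simply asserts that `find_cycle(...)` returns None, and the failure message can print the offending loop.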

The sky is the limit.

Make it easy to add more rules and more tests so the validation system can grow organically, aligned with the pace and needs of your developers.
Make sure all the rules are visible and each comes with sound reasoning. Otherwise they become fertile ground for arguments, good and bad.

Conclusion

Keeping a codebase healthy is not a one-time task. The team should stay mindful of the integrity of the codebase as it grows in size and complexity.
Broken windows are easily created, and fixing them gets harder as time goes by. Keep your ear to the ground and keep the state of the codebase in check, and developers will enjoy their coding.

How do you keep your codebase healthy? Comment or post back.


Post photo by Kevin on Unsplash

Yossi Shmueli

Keeping it green since 1995
