Previously I wrote about better PR reviews. I want to extend that theme and share my thoughts on testing software.
Getting started with tests is easy, but maintaining them requires the same rigour as writing the code base itself. There is much more to it than simply writing unit, integration, system and performance tests.
Understand the value
Add tests in order of their chance of failure × impact radius. Start with the high-impact paths, whether their chance of failure is high or low. Impact is measured in terms of user experience.
Tests should be correct
Make sure the assertions you add are actually correct and do not simply agree with an existing bug. Catching these requires careful manual review.
Updating tests should require inquiry
When an assertion fails, our instinct is to overwrite the asserted value with the new one without asking why it failed. This can introduce bugs unknowingly.
Make them reproducible
Do not call remote services
A test that relies on a remote service can fail whenever that service is slow, unavailable or changed, making the test irreproducible.
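One way to keep the remote dependency out of the test is to stub it with unittest.mock. This is a minimal sketch; PricingClient and the price values are hypothetical.

```python
from unittest.mock import Mock

class PricingClient:
    """Thin wrapper around a remote pricing API (hypothetical)."""
    def fetch_price(self, item_id):
        raise RuntimeError("would hit the network")

def get_discounted_price(client, item_id):
    return round(client.fetch_price(item_id) * 0.9, 2)

def test_discount_without_network():
    # The mock replaces the remote call, so the test is deterministic
    # and runs offline.
    client = Mock(spec=PricingClient)
    client.fetch_price.return_value = 100.0
    assert get_discounted_price(client, "sku-1") == 90.0
    client.fetch_price.assert_called_once_with("sku-1")
```

Passing the client in as a parameter (rather than constructing it inside the function) is what makes the substitution trivial.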
Lock the artefacts
Either mock artefacts or save them to blob storage so they can be used reliably. Pin the model versions and the configs used to produce them.
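A simple way to enforce the pin is to store a checksum next to the artefact reference and refuse to load anything that drifted. A sketch, with hypothetical names:

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def load_artefact(path, expected_sha256):
    """Refuse to use an artefact whose bytes differ from the pinned version."""
    digest = sha256_of(path)
    if digest != expected_sha256:
        raise ValueError(f"artefact drifted: {digest[:12]} != {expected_sha256[:12]}")
    return Path(path).read_bytes()  # stand-in for real model deserialisation
```

Committing the expected checksum alongside the model version means a silently re-uploaded artefact fails loudly instead of changing test behaviour.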
Random generator
Fix the random generator's seed if a test relies on values it generates.
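For example, with Python's stdlib random module, a seeded local generator makes the draw identical on every run (function and data here are illustrative):

```python
import random

def sample_user_ids(population, k, seed=None):
    """Draw k ids; a fixed seed makes the draw reproducible across runs."""
    rng = random.Random(seed)  # local generator: no global-state leakage
    return rng.sample(population, k)

def test_sampling_is_reproducible():
    ids = list(range(100))
    first = sample_user_ids(ids, 5, seed=42)
    second = sample_user_ids(ids, 5, seed=42)
    assert first == second  # same seed, same draw
    assert len(first) == 5
```

Using a local random.Random instance instead of seeding the global module also keeps one test's seed from leaking into another.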
Make the port number random if tests run in parallel and need to start a server on a port.
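One common way to get a free port is to bind to port 0 and let the OS pick one. A sketch:

```python
import socket

def free_port():
    """Ask the OS for an unused port by binding to port 0, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```

There is a small race window between releasing the port and the server binding it, but in practice this is reliable enough for parallel test runs.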
Make tests independent
Don’t make tests rely on artefacts generated by other tests. If one test fails, the tests that depend on its output will also fail or never run, giving no clarity on whether the later tests themselves are broken.
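A simple pattern is to have each test build its own artefact through a shared helper instead of reading a file left behind by an earlier test. A sketch, with hypothetical names:

```python
import tempfile
from pathlib import Path

def make_report(dirpath):
    """Each test creates its own artefact rather than depending on a
    previous test's output."""
    path = Path(dirpath) / "report.csv"
    path.write_text("user,score\nalice,0.9\n")
    return path

def test_report_has_header():
    with tempfile.TemporaryDirectory() as d:  # private workspace per test
        report = make_report(d)
        assert report.read_text().splitlines()[0] == "user,score"

def test_report_has_row():
    with tempfile.TemporaryDirectory() as d:  # does not reuse the other test's file
        report = make_report(d)
        assert "alice" in report.read_text()
```

Either test can now fail, run alone, or run in any order without affecting the other.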
Add enough logging
Test failures can require deep debugging. Add enough logging to understand the why behind the failure.
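For example, logging the inputs at the point of computation means a failing assertion can be traced back to its data. A sketch with a hypothetical function:

```python
import logging

logger = logging.getLogger("pipeline")  # hypothetical module-level logger

def normalise(values):
    total = sum(values)
    logger.debug("normalising %d values, total=%s", len(values), total)
    if total == 0:
        logger.warning("zero total, returning zeros for %r", values)
        return [0.0 for _ in values]
    return [v / total for v in values]
```

When run under pytest, the built-in caplog fixture captures these records, so the failure report shows exactly what the function saw.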
Clean up artefacts
Make sure artefacts generated by tests are written to unique directories and cleaned up after completion. This avoids collisions and storage issues. You can do this easily with the tmpdir (or the newer tmp_path) fixture in pytest.
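A sketch of the tmp_path fixture in use: pytest hands each test a fresh, uniquely named directory and cleans it up, so no two tests can collide on file paths (the CSV content here is a stand-in).

```python
def test_export_writes_csv(tmp_path):
    out = tmp_path / "export.csv"          # unique per test invocation
    out.write_text("id,name\n1,alice\n")   # stand-in for the code under test
    assert out.exists()
    assert out.read_text().startswith("id,name")
```

tmp_path is a pathlib.Path, so the test needs no manual setup or teardown code at all.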
Make them fast
Test suites almost never get smaller over time. Make sure to execute tests in parallel to save time.
Tip: If the tests rely on artefacts (e.g. an ML model) and every test tries to download the model from the remote at the same time, it can cause failures. pytest-parallel solves this problem.
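One way to avoid the stampede is to cache the download so it happens at most once. A minimal sketch, where the "download" just writes placeholder bytes and all names are hypothetical:

```python
from pathlib import Path

def ensure_model(cache_dir):
    """Download the artefact only if it is not already cached."""
    path = Path(cache_dir) / "model.bin"
    if not path.exists():
        # Real code would fetch from blob storage here.
        path.write_bytes(b"fake-model-weights")
    return path
```

Wiring this into a session-scoped pytest fixture (`@pytest.fixture(scope="session")`) gives one download per test session instead of one per test; note that separate worker processes may still each keep their own session.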
Add pre-hooks
Add pre-commit hooks to lint and format locally before pushing to the remote repo. This saves immense time and the cost of running tests in CI/CD.
Add code metrics
Integrate code-coverage report generation to find blind spots. Code with significant complexity is more likely to contain bugs, and complexity metrics can point you to where fixes are needed. Leverage tools like Codecov or SonarQube for this.
Connect with me on Medium, Twitter & LinkedIn.
Check out my anthology of writing and talks.