Testing, Processes, Python and Nuts (part 2)

The magical power of writing tests for all your code is that you don't write as much code, cuz you'd have to test it.

Chris McDonough

What are tests for?

Tests are a very important part of the development process, probably even more important than requirements in the Agile approach, and definitely more important than documentation:

Working software over comprehensive documentation.


Tests are the only way to guarantee that the code works and that no part of the system you develop has been broken by recent changes someone made. The other great thing is that tests are a very good way to show how code should be used, so it may make sense to treat tests as if they were documentation, especially if you have documentation-based tests.

Tests are not to prove it works now, they're to prove it works a year from now. And they're not even really for that, they're a design tool.

Chris McDonough

A general good practice for writing tests is to be suspicious of the code you write and to cover it with tests for edge cases, with elements of monkey testing. But don't get carried away, because tests are not really about working code; tests are about a working solution:

Software testing can provide objective, independent information about the quality of software and risk of its failure to users and/or sponsors.

Software testing by wikipedia

So I encourage you to be suspicious not only of the code, but also of the user stories, use cases, and general requirements and specifications, because the most expensive bugs are those hidden in the product by design. Here is a table from Wikipedia to scare and motivate you more:

Cost to fix a defect, by time introduced (rows) and time detected (columns)

Time introduced | Requirements | Architecture | Construction | System test | Post-release
Requirements    | 1×           | 3×           | 5–10×        | 10×         | 10–100×
Architecture    | -            | 1×           | 10×          | 15×         | 25–100×
Construction    | -            | -            | 1×           | 10×         | 10–25×

Where to start?

Of course, you should start with a test strategy. I guess you already have one developed by the PM/QA team, probably even before the actual project development started.

A test strategy is an outline that describes the testing approach of the software development cycle. It is created to inform project managers, testers, and developers about some key issues of the testing process. This includes the testing objective, methods of testing new functions, total time and resources required for the project, and the testing environment.

Test strategy by wikipedia

A test strategy consists of multiple components that play together, but as a developer you should pay the most attention to the following:

  • Environment Requirements   
  • Risks and Mitigation
  • Regression test approach
  • Test Groups
  • Test Priorities
  • Test Status Collections and Reporting

Environment Requirements

Reproducibility is the ability of an entire experiment or study to be duplicated, either by the same researcher or by someone else working independently. Reproducing an experiment is called replicating it. Reproducibility is one of the main principles of the scientific method.

Reproducibility by wiki

Reproducibility is the key concept when we talk about environment requirements, because that is what they are all about. It is a huge and complicated field of software development and I won't go into much detail about it, but there are a few very important things you should not miss if you want to achieve reproducibility:

  1. System Dependencies
  2. Project dependencies 
  3. Set up scripts

To make sure that everything is okay with system dependencies, you need a test that builds the environment on a clean instance. There are quite nice virtualization/containerization tools for that, such as docker or vagrant: just make a test that builds the project in a clean environment and runs all the tests inside it. In most cases that is more than enough to ensure that you don't have any hidden issues with system dependencies and setup. Also use a build system that allows efficient version management, including tracking of unpinned versions, and make your tests check that all versions are pinned.
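A check for pinned versions could look roughly like the sketch below. It assumes the project keeps its dependencies in a pip-style requirements file and that "pinned" means an exact `==` version; the helper names `find_unpinned` and `unpinned_in_file` are made up for this illustration.

```python
import re

# A pinned requirement looks like "name==1.2.3", optionally with extras,
# e.g. "requests[socks]==2.31.0". This is a simplification of the real syntax.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def find_unpinned(lines):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0]          # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as "-r base.txt"
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


def unpinned_in_file(path):
    """Hypothetical helper: run the check against a requirements file."""
    with open(path) as requirements:
        return find_unpinned(requirements)
```

A test in the suite can then simply assert that `unpinned_in_file("requirements.txt")` returns an empty list, where the file name is whatever your project actually uses.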

Types of testing

 Unit testing - Black box testing - Doctests

Not all code is good for testing

fuck state shared between tests

Chris McDonough

The original quote is about the `py.test` framework, but it gives a good idea that all tests should be as stateless as possible, and that this applies not only to test design but to code design as well. Tests and good architecture are tightly coupled, and with a small example I will try to show why. As an example I use a slightly modified implementation of the reddit ranking algorithm. Below is the code rewritten the way I often see people write their own code, without splitting it into units:

from datetime import datetime, timedelta
from math import log, sqrt
from operator import itemgetter

epoch = datetime(1970, 1, 1)

def get_sorted_by_ranks(request):
    date = datetime.now()
    ranked_posts = []
    for post in request.user.posts:
        # confidence score of every comment, summed up per post
        comments_score = 0
        for comment in post.comments:
            n = comment.ups + comment.downs
            if n == 0:
                continue  # a comment without votes adds nothing
            z = 1.281551565545
            p = float(comment.ups) / n
            left = p + 1 / (2 * n) * z * z
            right = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
            under = 1 + 1 / n * z * z
            comments_score += (left - right) / under
        # "hot" rank of the post itself
        s = post.ups - post.downs + comments_score
        order = log(max(abs(s), 1), 10)
        sign = 1 if s > 0 else -1 if s < 0 else 0
        td = date - epoch
        seconds = td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000) - 1134028003
        rank = round(sign * order + seconds / 45000, 7)
        ranked_posts.append({'post': post, 'rank': rank})
    return sorted(ranked_posts, key=itemgetter('rank'))

As you can see, it implements ranking of posts and of post comments and returns the posts sorted by rank. How would you test whether comment ranking works right? And what about post ranking? You can only test it all together with a kind of integration test, which would require tons of mocked-up data to test the flow and the edge cases. A request object also has to be constructed, and in the end you will never know what is actually failing, post ranking or comment ranking... A much better approach is how reddit actually does it: splitting the code into stateless units that can easily be covered by simple unit tests:

def confidence(ups, downs):
    n = ups + downs
    if n == 0:
        return 0
    z = 1.281551565545
    p = float(ups) / n
    left = p + 1 / (2 * n) * z * z
    right = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    under = 1 + 1 / n * z * z
    return (left - right) / under

def epoch_seconds(date):
    td = date - epoch
    return td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000)

def score(ups, downs, comments_score):
    return ups - downs + comments_score

def hot(ups, downs, date, comments_score):
    s = score(ups, downs, comments_score)
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003
    return round(sign * order + seconds / 45000, 7)

def get_sorted_by_ranks(posts):
    date = datetime.now()
    return sorted(
        ({
            'post': post,
            'rank': hot(
                post.ups, post.downs, date,
                sum(confidence(c.ups, c.downs) for c in post.comments)),
        } for post in posts),
        key=itemgetter('rank'))
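To show how little a unit test needs once the code is split up, here is a minimal sketch of tests for these units. The functions are copied into the snippet so it runs on its own; the test names, the `ZERO_DATE` constant and the seeded random check (a pinch of monkey testing) are my own illustration, written as plain `assert` functions that py.test or nose would pick up.

```python
import random
from datetime import datetime, timedelta
from math import log, sqrt

epoch = datetime(1970, 1, 1)

# Units under test, repeated here so the snippet is self-contained.
def confidence(ups, downs):
    n = ups + downs
    if n == 0:
        return 0
    z = 1.281551565545
    p = float(ups) / n
    left = p + 1 / (2 * n) * z * z
    right = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    under = 1 + 1 / n * z * z
    return (left - right) / under

def epoch_seconds(date):
    td = date - epoch
    return td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000)

def score(ups, downs, comments_score):
    return ups - downs + comments_score

def hot(ups, downs, date, comments_score):
    s = score(ups, downs, comments_score)
    order = log(max(abs(s), 1), 10)
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = epoch_seconds(date) - 1134028003
    return round(sign * order + seconds / 45000, 7)

# Illustrative constant: the moment at which the time term of hot() is zero.
ZERO_DATE = epoch + timedelta(seconds=1134028003)

def test_confidence_without_votes_is_zero():
    assert confidence(0, 0) == 0

def test_confidence_grows_with_more_upvotes():
    assert confidence(10, 0) > confidence(1, 0)

def test_confidence_stays_in_range():
    # Random vote counts with a fixed seed, so a failure can be reproduced.
    rng = random.Random(42)
    for _ in range(1000):
        c = confidence(rng.randint(0, 10000), rng.randint(0, 10000))
        assert -1e-9 <= c <= 1 + 1e-9

def test_epoch_seconds_counts_whole_days():
    assert epoch_seconds(datetime(1970, 1, 2)) == 86400

def test_hot_time_term():
    # With a zero score only the time term is left: 45000 seconds per point.
    assert hot(0, 0, ZERO_DATE, 0) == 0
    assert hot(0, 0, ZERO_DATE + timedelta(seconds=45000), 0) == 1

def test_hot_vote_term():
    # Ten net votes are worth log10(10) = 1 point, in either direction.
    assert hot(10, 0, ZERO_DATE, 0) == 1
    assert hot(0, 10, ZERO_DATE, 0) == -1
```

None of these tests needs a request object, a user or mocked posts, and when one of them fails its name tells you which unit is broken.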


Testing frameworks and libs

  • sphinx
  • robotframework
  • nose
  • py.test
  • tox

TDD/BDD/ADD what else?

The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown. There are several common ways to use doctest:

  • To check that a module’s docstrings are up-to-date by verifying that all interactive examples still work as documented.
  • To perform regression testing by verifying that interactive examples from a test file or a test object work as expected.
  • To write tutorial documentation for a package, liberally illustrated with input-output examples. Depending on whether the examples or the expository text are emphasized, this has the flavor of “literate testing” or “executable documentation”.
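As a small illustration of the first use case, here is a sketch of a docstring whose examples are executed by doctest. The `score` function is the one from the ranking example; `run_doctests` is a made-up helper built on the standard `doctest.DocTestFinder` and `doctest.DocTestRunner` classes, used here instead of the more common `doctest.testmod()` so the result can be inspected directly.

```python
import doctest


def score(ups, downs, comments_score):
    """Return the raw score of a post.

    >>> score(10, 3, 0)
    7
    >>> score(1, 1, 0.5)
    0.5
    """
    return ups - downs + comments_score


def run_doctests(func):
    """Run the examples found in func's docstring and return the results."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples see only the function itself in their namespace.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    results = run_doctests(score)
    print("failed: %d, attempted: %d" % (results.failed, results.attempted))
```

If someone changes `score` without updating the docstring, the example output no longer matches and the run reports a failure, which is exactly how doctest keeps documentation honest.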