Reusing Code, or: How I Learned to Stop Repeating Myself

One of the best things about coding is not having to do the same thing over and over again. You automate. You work things into functions and objects and have them worry about completing a series of actions for you. Why wouldn't you do the same thing when actually writing code?

There are times when you find yourself repeating code; when this happens, consider whether you can refactor the repeated logic into a reusable piece of code. Generally, the rule of three comes into play:

There are two "rules of three" in [software] reuse:

* It is three times as difficult to build reusable components as single use components, and
* a reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.

Facts and Fallacies of Software Engineering, #18. Credit to Jeff Atwood's Coding Horror post about the Rule of Three for bringing it to my attention.

About This Post

This post is a brief overview of common techniques and patterns for avoiding writing the same thing over and over again, starting with functions and moving into objects, inheritance, mixins, composition, decorators and context managers. There are plenty of other techniques, patterns and idioms that I don't touch on; this post isn't meant to be an exhaustive list.

Functions

Functions are a great way to ensure that a piece of code is always executed the same way. This could be as simple as a small expression like (a + b) * x or something that performs a complicated piece of logic. Functions are the most basic form of code reuse in Python.

In [1]:

def calc(a, b, x):
    """Our business crucial algorithm"""
    return (a + b) * x

calc(1,2,3)

Out[1]:

9

Python also offers a limited form of anonymous functions called lambda. They're limited to a single expression and can't contain statements. A lot of the time, they serve as basic callbacks or as key functions for a sort or group method. The syntax is simple and the return value is the outcome of the expression.

In [2]:

sorted([(1,2), (3,-1), (0,0)], key=lambda x: x[1])

Out[2]:

[(3, -1), (0, 0), (1, 2)]

While lambdas are incredibly useful in many instances, it's generally considered bad form to assign them to variables (since they're supposed to be anonymous functions), not that I've never done that when it suited my needs. ;)
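For comparison, here's a small sketch of the same sort written three ways: with a lambda, with a named function, and with operator.itemgetter from the standard library. The name second_item is made up for this illustration.

```python
from operator import itemgetter

pairs = [(1, 2), (3, -1), (0, 0)]

# An anonymous key function, as in the cell above.
by_lambda = sorted(pairs, key=lambda x: x[1])


# A named function gets a real __name__, which helps in tracebacks.
def second_item(pair):
    """Return the second element of a pair."""
    return pair[1]


by_def = sorted(pairs, key=second_item)

# The standard library already ships this common key function.
by_itemgetter = sorted(pairs, key=itemgetter(1))

print(by_lambda == by_def == by_itemgetter)
print(second_item.__name__, (lambda x: x[1]).__name__)
```

All three give the same ordering; the difference is only in how the key function identifies itself when something goes wrong.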

Objects

Objects are really the poster child for code reuse. Essentially, an object is a collection of data and functions that interrelate. Many in the Python community are fond of calling them a pile of dictionaries -- because that's what they essentially are in Python.
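To make the "pile of dictionaries" remark concrete, here's a minimal sketch using a hypothetical Point class (it isn't part of the spreadsheet example). It only pokes at vars() and __dict__ to show where instance data and methods live.

```python
class Point:
    """A hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def total(self):
        return self.x + self.y


p = Point(1, 2)

# Instance data lives in a plain dict...
print(vars(p))

# ...and methods live in the class's own namespace mapping.
print('total' in vars(Point))

# Attribute access and dict access see the same data.
p.__dict__['x'] = 10
print(p.x, p.total())
```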

Objects offer all sorts of possibilities such as inheritance and composition, which I'll briefly touch upon here. For now, a simple example will suffice: take our business critical algorithm and turn it into a spreadsheet row.

In [3]:

class SpreadsheetRow:
    
    def __init__(self, a, b, x):
        self.a = a
        self.b = b
        self.x = x
    
    def calc(self):
        return calc(self.a, self.b, self.x)
    
row = SpreadsheetRow(1,2,3)
print(row.calc())

9

Notice how we're already reusing code to find our business critical total of 9! If someone in accounting later realizes that we should actually be doing a * (b + x), we simply change the original calc function and every caller picks up the fix.

Inheritance

Inheritance is simply a way of giving access to all the data and methods of a class to another class. It's commonly called "specialization," though Raymond Hettinger aptly describes it as "delegating work." If later, accounting wants to be able to label all of our spreadsheet rows, we could go back and modify the original class or we could design a new one that does this for us.

Accessing information in the parent class is done through super(). I won't delve into its details here, but it is quite super.

In [4]:

class LabeledSpreadsheetRow(SpreadsheetRow):
    
    def __init__(self, label, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.label = label
        
row = LabeledSpreadsheetRow(label='1', a=1, b=2, x=3)
print("The total for {} is {}".format(row.label, row.calc()))

The total for 1 is 9
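super() isn't only for __init__. As a rough sketch of the "delegating work" idea, a subclass can override a method, hand the real work to its parent, and then adjust the result. Base and Discounted are hypothetical names standing in for the spreadsheet classes so the snippet runs on its own.

```python
class Base:
    """Hypothetical stand-in for SpreadsheetRow."""

    def __init__(self, a, b, x):
        self.a, self.b, self.x = a, b, x

    def calc(self):
        return (self.a + self.b) * self.x


class Discounted(Base):
    """Hypothetical subclass that reuses calc() and adjusts its result."""

    def __init__(self, discount, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.discount = discount

    def calc(self):
        # Delegate the real work to the parent, then specialize.
        return super().calc() - self.discount


row = Discounted(discount=1, a=1, b=2, x=3)
print(row.calc())
print([cls.__name__ for cls in Discounted.__mro__])
```

The second print shows the method resolution order, which is the list super() walks when it looks for the next implementation.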

Mixins

Mixins are a form of multiple inheritance, which I won't fully delve into here because it's a complicated and touchy subject. Python supports it, though. Because of this and its support for duck typing, we can completely forgo the Interfaces and Traits that are common in single inheritance languages.

Mixins are a way of writing logic that is common to many objects and placing it in a single location. Mixin classes aren't meant to be instantiated on their own, since they represent a small piece of a puzzle rather than the whole picture. A common problem I use mixins for is creating a generic __repr__ method for objects.

In [5]:

class ReprMixin:
    
    def __repr__(self):
        name = self.__class__.__name__
        attrs = ', '.join(["{}={}".format(k,v) for k,v in vars(self).items()])
        return "<{} {}>".format(name, attrs)
        
class Row(LabeledSpreadsheetRow, ReprMixin):
    pass

row = Row(label='1', a=1, b=2, x=3)
repr(row)

Out[5]:

'<Row b=2, x=3, a=1, label=1>'

This showcases the power of inheritance and mixins: composing complex objects from smaller parts into what you're wanting. The actual class we're using implements no logic of its own but we're now provided with:

* A repr method
* A calculation method
* A label attribute
* Data points to calculate
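One thing worth knowing is that the order of the base classes matters. In the cell above the mixin is listed last, which works because none of the row classes define __repr__ themselves. The sketch below uses a hypothetical Plain class that does define one, to show how the method resolution order decides which __repr__ wins.

```python
class ReprMixin:

    def __repr__(self):
        name = self.__class__.__name__
        attrs = ', '.join("{}={}".format(k, v) for k, v in vars(self).items())
        return "<{} {}>".format(name, attrs)


class Plain:
    """Hypothetical base class that defines its own __repr__."""

    def __init__(self, a):
        self.a = a

    def __repr__(self):
        return "Plain()"


class MixinLast(Plain, ReprMixin):
    pass


class MixinFirst(ReprMixin, Plain):
    pass


# Plain comes before ReprMixin here, so Plain.__repr__ is found first.
print([c.__name__ for c in MixinLast.__mro__])
print(repr(MixinLast(1)))

# Listing the mixin first lets it override the base class.
print(repr(MixinFirst(1)))
```

Listing mixins first is a common convention for exactly this reason, though it's a convention rather than a rule.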

Composition

Composition is a fancy way of saying we're going to build an object using other objects; in other words, composing it from parts. It's a similar idea to inheritance, but instead the objects we're using are stored as attributes on the main object. We have spreadsheet rows, why not a spreadsheet to hold them?

In [6]:

class Spreadsheet(ReprMixin):
    
    def __init__(self, name):
        self.name = name
        self.rows = []
        
    def show_all(self):
        for row in self.rows:
            print("The total for {} is {}".format(row.label, row.calc()))
            
    def total(self):
        return sum(r.calc() for r in self.rows)
        
sheet = Spreadsheet("alec's totals")
sheet.rows.extend([Row(label=1, a=1, b=2, x=3), Row(label=2, a=3, b=5, x=8)])
sheet.show_all()
print(sheet.total())
repr(sheet)

The total for 1 is 9
The total for 2 is 64
73

Out[6]:

"<Spreadsheet name=alec's totals, rows=[<Row b=2, x=3, a=1, label=1>, <Row b=5, x=8, a=3, label=2>]>"

Here we're not only reusing the ReprMixin so we can have accurate information about our Spreadsheet object, we're also reusing the Row objects to provide that logic for free, leaving us to just implement the show_all and total methods.
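Composition often goes hand in hand with delegation: the outer object forwards a few operations to the part it holds instead of inheriting from it. Here's a minimal sketch with a hypothetical Sheet class that forwards len() and iteration to its internal list. The add method and the tuple rows are assumptions made to keep the snippet short.

```python
class Sheet:
    """Hypothetical container, loosely modelled on Spreadsheet above."""

    def __init__(self, name):
        self.name = name
        self.rows = []  # the composed part: a plain list

    def add(self, row):
        self.rows.append(row)

    def __len__(self):
        # Delegate to the list instead of inheriting from it.
        return len(self.rows)

    def __iter__(self):
        return iter(self.rows)


sheet = Sheet("demo")
sheet.add((1, 2, 3))
sheet.add((3, 5, 8))

print(len(sheet))
print([(a + b) * x for a, b, x in sheet])
```

The sheet exposes only the list behaviour it chooses to, so callers can't accidentally sort or slice it unless we add that too.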

Decorators

Decorators are a way of factoring logic out of a class or function and into another class or function, or of adding extra logic to it. That sounds confusing, but it's really not. I've written about them elsewhere, so if you're unfamiliar with them I recommend reading that first. Here, we're going to use two decorators Python provides: total_ordering, from functools in the standard library, so we can sort our Row objects, and the built-in property decorator, which allows us to treat a function as if it were an attribute (via the descriptor protocol, which is a fantastic code reuse ability that I won't explore here).

In [7]:

from functools import total_ordering

@total_ordering
class ComparableRow(Row):
    
    @property
    def __key(self):
        return (self.a, self.b, self.x)
    
    def __eq__(self, other):
        return self.__key == other.__key
    
    def __lt__(self, other):
        return self.__key < other.__key
    
rows = sorted([ComparableRow(label=1, a=3, b=5, x=8), ComparableRow(label=2, a=1, b=2, x=3)])
print(rows)

[<ComparableRow b=2, x=3, a=1, label=2>, <ComparableRow b=5, x=8, a=3, label=1>]

What total_ordering does is provide the missing rich comparison operators for us. Even though we only defined __lt__ and __eq__ here, we also have __le__, __gt__ and __ge__ available to us. (__ne__ comes for free in Python 3, which derives it from __eq__.)
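As a quick check of that claim, here's a self-contained sketch with a hypothetical Version class. It defines only __eq__ and __lt__ and then exercises the other comparisons.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical class used only to exercise total_ordering."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


old, new = Version(1, 2), Version(1, 10)

print(old < new, old <= new, old > new, old >= new, old != new)

# The derived methods are attached to the class itself.
print('__ge__' in vars(Version))
```

Returning NotImplemented for foreign types is my own addition here; it lets Python fall back to the other operand instead of raising an AttributeError.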

Decorators are an incredibly powerful way to modify your regular Python functions and objects.
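If you haven't written one yourself, a function decorator is just a function that takes a function and returns a replacement. Here's a minimal sketch of a hypothetical logged decorator (the name and the calls attribute are invented for this example) wrapped around the calc function from earlier.

```python
from functools import wraps


def logged(func):
    """Hypothetical decorator: record each call's arguments and result."""
    @wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls.append((args, kwargs, result))
        return result
    wrapper.calls = []
    return wrapper


@logged
def calc(a, b, x):
    """Our business crucial algorithm"""
    return (a + b) * x


print(calc(1, 2, 3))
print(calc.calls)
print(calc.__name__, '-', calc.__doc__)
```

The logging logic lives in one place and can be stacked onto any function without touching that function's body.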

Context Managers

Context managers are a way of handling operations you typically do in pairs: open a file, close a file; start a timer, end a timer; acquire a lock, release a lock; start a transaction, end a transaction. Really, anything you do in pairs should be a candidate for a context manager.

Writing context managers is pretty easy, though how easy depends on which approach you take. I'll likely explore them in a future post. For now, I'm going to stick to using the generator context manager form as an example:

In [8]:

from contextlib import contextmanager

@contextmanager
def greeting(name=None):
    print("Before the greeting.")
    yield "Hello {!s}".format(name)
    print("After the greeting.")
    
with greeting("Alec") as greet:
    print(greet)

Before the greeting.
Hello Alec
After the greeting.
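One caveat with the generator form: if the body of the with block raises, code after the yield is skipped unless it sits in a finally clause. The sketch below shows that pattern next to the class-based form using __enter__ and __exit__. The names tracked, Tracked and events are made up for this example.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    """Generator form: the finally clause runs even if the body raises."""
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


class Tracked:
    """Class form of the same idea, using __enter__ and __exit__."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        events.append("enter " + self.name)
        return self.name

    def __exit__(self, exc_type, exc, tb):
        events.append("exit " + self.name)
        return False  # do not swallow exceptions


try:
    with tracked("generator"):
        raise ValueError("boom")
except ValueError:
    pass

with Tracked("class") as name:
    events.append("inside " + name)

print(events)
```

Both forms record their exit step, including the generator one whose body blew up.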

We won't be writing a context manager here, but rather using one to implement an "alternate constructor" for our Spreadsheet class. Alternate constructors are a way of initializing an object in a specific way. These are especially handy if you find yourself occasionally creating an object under certain conditions. Consider dict.fromkeys which lets you fill a dictionary with keys from an iterable that all have the same value:

In [9]:

print(dict.fromkeys(range(5), None))

{0: None, 1: None, 2: None, 3: None, 4: None}

In our case, we'll probably want to draw our information from a CSV file occasionally. If we do it often enough, rewriting the setup logic all over the place could become tedious.

In [10]:

import csv

class CSVSpreadsheet(Spreadsheet):
    
    @classmethod
    def from_csv(cls, sheetname, filename):
        sheet = cls(sheetname)
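        # newline='' is what the csv module recommends for files it reads
        with open(filename, newline='') as fh:
            # csv.reader accepts the file object directly
            reader = csv.reader(fh)
            # each CSV line holds label, a, b, x
            sheet.rows = [ComparableRow(*map(int, row)) for row in reader]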
        
        return sheet
    
sheet = CSVSpreadsheet.from_csv('awesome', 'row.csv')
sheet.show_all()

The total for 1 is 9
The total for 2 is 64
The total for 3 is 16
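If you'd like to try the alternate constructor idea without a file on disk, here's a self-contained sketch that reads from an in-memory io.StringIO instead. Row, Sheet, AuditedSheet and from_csv_file are hypothetical stand-ins, and the CSV contents are an assumption chosen to reproduce the totals above; the real row.csv isn't shown in this post.

```python
import csv
import io


class Row:
    """Hypothetical stand-in for the row classes defined earlier."""

    def __init__(self, label, a, b, x):
        self.label, self.a, self.b, self.x = label, a, b, x

    def calc(self):
        return (self.a + self.b) * self.x


class Sheet:

    def __init__(self, name):
        self.name = name
        self.rows = []

    @classmethod
    def from_csv_file(cls, name, fh):
        """Build a sheet from any open text file of label,a,b,x lines."""
        sheet = cls(name)  # cls, not Sheet, so subclasses get their own type
        sheet.rows = [Row(*map(int, line)) for line in csv.reader(fh)]
        return sheet


class AuditedSheet(Sheet):
    pass


# Assumed file contents, chosen to match the totals printed above.
data = io.StringIO("1,1,2,3\n2,3,5,8\n3,1,1,8\n")
sheet = AuditedSheet.from_csv_file("awesome", data)

print(type(sheet).__name__)
print([row.calc() for row in sheet.rows])
```

Because the classmethod builds the object through cls, the subclass gets an instance of itself back without overriding anything.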

Fin

Hopefully this gives you an idea for reusing code in your own projects. Maybe you'll write your own crappy spreadsheet object as well.

alec got a blog

Thursday, October 9, 2014

Code Reuse in multiple forms