A collection of computer, gaming and general nerdy things.

Wednesday, August 20, 2014

Decorator Day

An Aside: I originally wrote this as just a massive brain dump but, in retrospect, it wasn't helpful at all. In the process of preparing a presentation for PyATL, I decided to rewrite it to be much more informative and less example-heavy. The original post was essentially just a series of examples of how you'd use decorators or what you'd use them for, rather than why you'd use them, and it never addressed common patterns. I'm hoping this rewrite clarifies things instead of coming across as "Look at all this code you probably don't understand!"

Decorators are well championed, but most posts I've seen about them don't address many of the issues here. I wouldn't say they're beginner or novice friendly, since the topic delves into concepts like functional programming and "higher order functions" -- which is just scary talk for "functions that accept functions as arguments" -- and even currying, which basically boils down to "reducing the number of inputs a function takes." I'm grossly oversimplifying here for the sake of explanation, but I stand by these simplifications.

Another note: apparently there was a lot of hemming and hawing on the Python mailing list when this construct (for lack of a better word) was introduced because of confusion with the decorator design pattern. Given that design patterns aren't my strongest suit (at least in terms of knowing exactly what each does and how to best implement it), I think there's quite a bit of one-to-one correspondence between the two on a basic level. Each takes in an object and modifies it in a way that the original shouldn't be worried about. The decorator pattern seems to be more concerned with adding data or processing information and is compositional, whereas Python's decorators seem to be more about removing logic that, yes, is associated with the object in question but doesn't need to be mixed directly with the actual logic of the object. That makes it sound very confusing, but hopefully it will become clear through examples and explanation.


This is meant to be a "hit the ground running" sort of introduction to decorators. Just enough information to make you feel confident, but not enough that you know all there is to them. Though, honestly, once they click, they're not so complicated.

I'm going to make the dangerous assumption that you have a decent grasp on functions and how Python handles them. I'm going to briefly cover a bit about that, but just what -- I think -- is requisite knowledge for understanding decorators. If you're confident with the concept of first class functions and closures, feel free to skip the next two sections.

And just remember, a little knowledge is a dangerous thing; I'm giving you enough rope to hang yourself here but not enough to build a bridge. That comes with practice and exploration as well as digging deeper into subject matter. Don't mistake me for a guru expert code ninja wizard (though, I do hope to attain that title some day) as I'm in the mire with everyone else.

First Class Functions

The first, and probably most important, piece of background knowledge is that everything in Python is an object. Everything. Functions, Objects, Strings, and even Classes. This is what makes decorators possible in Python. We can assign functions to new names, stuff them in data structures, pass them as arguments to other functions and even return them from functions. That's really all there is to first class functions: they are treated just the same as anything else in the language. And since methods are simply functions attached to a class or object, the same applies to them as well. str.join is just as first class as map.

In [1]:
my_func = dir                  # assign a function to a new name

funcs = [filter, list, int]    # stuff functions in a data structure

map(str, [1, 2, 3])            # pass a function as an argument to another

def returns_func():
    return dict                # return a function from a function

Closures

The concept of closures scares a lot of people. But they're really simple: closures are just functions defined inside another function that reference names from the enclosing scope. That's it.

But there's magic involved, too. Closures close over the available outer scope and stuff it into their __closure__ attribute -- which is simply a tuple of cell objects holding the captured values. It feels like a snapshot of the outer function's scope at the moment the inner function is created, though the cells are actually live references to those variables rather than copies. Usually, the inner function is returned by the outer function.

In [2]:
def adds_four():
    x = 4                    # captured by the closure below
    def inner(y=0):
        return x + y
    return inner

o = adds_four()
print(o())
print(o.__closure__)
4
(<cell at 0x7f26c154bd68: int object at 0x9f8860>,)

The same logic applies to values passed into a function as well. When an argument is passed into a function (say adds_four actually looked like adds_four(x=4)), those values become part of the function's scope too and are available to closures if needed. Maybe I'll go deeper into closures in the future, maybe not.
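
To make that concrete, here's a minimal sketch of that adds_four(x=4) variant (adds_n is just a hypothetical name):

def adds_n(x=4):
    # the argument x is part of adds_n's scope, so the closure captures it
    def inner(y=0):
        return x + y
    return inner

add_ten = adds_n(10)
print(add_ten(5))            # 15 -- x=10 rode along in the closure
print(add_ten.__closure__)   # one cell, holding the int 10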

An aside

If you're wondering how Python knows which cell object to use: well, my understanding is that it doesn't, and it relies on the generated byte code to do the right thing. But I could very well be completely wrong on this matter.

So, Decorators

Now we finally get to decorators. Decorators are functions that take in a function, do something and return something -- typically a callable, but it could be anything. Usually, decorators are used to either remove extraneous logic from a function or to attach additional functionality to it.

There's two ways to use decorators: the old way, pre-2.4, and the special @decorator syntax. They do the same thing, like using list((1, 2, 3)) or [1, 2, 3]; one's just syntactic sugar.

In [3]:
def decorator(f):
    return f

def my_func():
    pass

my_func = decorator(my_func) # this is the old way

@decorator
def my_func():
    pass

The @decorator syntax is the one you'll see in most cases. Basically, what you're telling Python is: when it's done creating my_func, pass it to decorator and then assign the result back to my_func. It sounds complicated to some, but it's really not.

@decorator is just a way of shorthanding your intention to the interpreter. And more importantly, it gives a heads up to people reading your code that something more is happening behind the scenes. If you manually wrap functions (which is needed sometimes), this can get lost at the bottom of your function definition unless you make it glaringly obvious what's happening. This applies doubly to longer functions and objects.

If you've got that down, you'll probably get the common patterns that follow.

Patterns

There's plenty of patterns you'll encounter when working with decorators. I'm hoping to identify some of the most common ones.

Pass Through Decorators

Basically, this is a decorator that takes in an object, does something and returns the original object. It might register it somewhere, gather some information about it or add an attribute to it.

In [4]:
def pass_through(f):
    f.inspected = True
    return f

@pass_through
def thing():
    return "Thing()"

assert hasattr(thing, 'inspected')
assert thing.__name__ == 'thing'

It's common enough, but these decorators usually aren't doing anything too exciting or surprising. It's definitely worth distinguishing them from decorators that use closures, though.
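
To give a feel for the "register it somewhere" variant, here's a minimal sketch (HANDLERS and register are purely illustrative names):

HANDLERS = {}

def register(f):
    HANDLERS[f.__name__] = f   # record the function somewhere central
    return f                   # hand back the original, untouched

@register
def on_save():
    return "saved"

assert HANDLERS['on_save'] is on_save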

The rest of the patterns I'm going to cover are all function closure oriented, except for two which will be obvious.

Flexible Decorators

You're going to want your decorators to work with as many functions as possible. Not just the one you initially noticed and thought, "I can factor this other code out into a decorator." But what does a flexible decorator look like? It's best to compare it to a brittle decorator to understand fully.

In [5]:
def brittle(f):
    def wrapper(arg1, arg2, kwarg1=None, kwarg2=None):  # locked to one signature
        return f(arg1, arg2, kwarg1, kwarg2)
    return wrapper

def flexible(f):
    def wrapper(*args, **kwargs):                       # accepts anything
        return f(*args, **kwargs)
    return wrapper

What's so important about this is that we end up calling wrapper whenever we call a decorated function. If we used the first example, the brittle one, we'd have to write every function we decorate to accept exactly four arguments. Whereas the second example, the flexible one, can accept any number of positional or keyword arguments.

To give a concrete example...

In [6]:
@flexible
def dinner(food, beer): # <- becomes wrapper
    print('Eating', food, 'and drinking', beer)

dinner('bratzel', 'azazel') # <- actually wrapper
Eating bratzel and drinking azazel

I'm a little homesick for Memphis. If you're ever there, I'd recommend the Flying Saucer downtown. Anyways...

Preserving metadata

When you lock your original function up in a closure, you lose the original metadata: __name__, __doc__, etc. There's a few ways to retain it. Instead of jumping straight to the correct way, I'll show you why you'll want to use it. Just like ensuring defaults are set in a dictionary, there's an increasing level of "correctness" or "Pythonness" from what you'll start with to what you should end with.

The first thing you might try is manually transferring these attributes inside the decorator.

In [7]:
def loggit(f):
    def logger(*args, **kwargs):
        print("Calling {}".format(f.__name__))
        return f(*args, **kwargs)
    logger.__name__ = f.__name__
    logger.__doc__ = f.__doc__
    logger.__dict__.update(f.__dict__)
    return logger

Other than the fact that we're missing three crucial attributes (__module__, __qualname__ and __annotations__), this is a terrible way to migrate attributes. You'll have to do this in every decorator you write -- meaning it'll be no fun to maintain when a new attribute is introduced. This is very similar to the basic idiom of checking if a key is in a dictionary before we do something with it:

if something not in my_dict:
    my_dict[something] = [value]
my_dict[something].append(value)

Well, decorators are good at removing extraneous code from a function, so let's try writing a decorator decorator -- as confusing as that sounds, it's pretty common, so we're catching two pythons with one net.

In [8]:
def decorator_decorator(deco):
    def wrapper(f):
        g = deco(f)
        g.__name__ = f.__name__
        g.__doc__ = f.__doc__
        g.__dict__.update(f.__dict__)
        return g
    wrapper.__name__ = deco.__name__
    wrapper.__doc__ = deco.__doc__
    wrapper.__dict__.update(deco.__dict__)
    return wrapper

@decorator_decorator
def loggit(f):
    def logger(*args, **kwargs):
        print("Calling {}".format(f.__name__))
        return f(*args, **kwargs)
    return logger

@loggit
def double(x):
    """Doubles a number"""
    return 2 * x

assert double.__name__ == 'double'
assert double.__doc__ == 'Doubles a number'

That's almost better. Except it's really ugly. Like, really ugly. And actually, I find it more confusing than helpful. When I initially wrote this code it was as an example of what not to do, and it's here for that same reason. It requires you to think on multiple levels at the same time, which is great if you're just trying to anger people reading it. This decorator_decorator is pretty benign if you're comfortable with what each piece means and how it works. But if you're doing this in your code, unless you're having to work with Python 2.4 or below, you're doing it the wrong way.

This is most akin to using dict.setdefault to ensure a value is present in a dictionary, and just like that method, I find it confusing every time I look at it (though, there are many times when it's useful):

grouping = my_dict.setdefault(something, [])
grouping.append(value)

However, Python is batteries-included and comes with something that handles this for us. Enter functools.wraps and functools.update_wrapper. @wraps is just the decorator form of update_wrapper -- using a decorator to create decorators seems counterintuitive, but it's really not, and @wraps is a testament to its power; another example further down delves a little deeper. But update_wrapper is pretty simple, it merely:

  • Copies the needed attributes from the original function to the new one
    • This includes the name, docstring, module, qualified name and annotations
  • Updates the wrapper's __dict__ with the original function's __dict__
  • As of 3.2, it also supplies a __wrapped__ attribute that contains the original object
    • It doesn't appear that this was backported to Python 2 at all. I'd seriously recommend monkey patching update_wrapper to handle this, for reasons that'll become obvious later.

update_wrapper is also smart enough to not blow up if one of these attributes isn't present on the underlying object (such as __name__ on an instance of an object).

Here's how you should be preserving metadata through decorators:

In [9]:
from functools import wraps

def loggit(f):
    @wraps(f)
    def logger(*args, **kwargs):
        print("Calling {}".format(f.__name__))
        return f(*args, **kwargs)
    return logger

@loggit
def double(x):
    """Doubles a number"""
    return 2 * x

assert double.__name__ == 'double'
assert double.__doc__ == 'Doubles a number'
assert not double.__annotations__
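
A bonus you get for free on Python 3.2+: @wraps also stashes the original function in __wrapped__, so the undecorated version stays within reach:

assert double.__wrapped__(3) == 6   # the original, un-logged double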

Using @wraps and update_wrapper is like when you stumble across defaultdict in the collections module. The complexity and hand holding you were previously doing just melts away into a few easy to grasp lines of code:

my_dict = defaultdict(list)
my_dict[something].append(value)

Minimum Decorator Best Practices

The two best things you can do for yourself are writing flexible decorators (using *args and **kwargs) and preserving metadata with wraps and update_wrapper. 80% of the headaches you'll encounter will be fixed by these. The other 20% are... tricky. I'll touch on those a little later.

Decorators that accept optional kwargs

Occasionally, you need to use keyword arguments in your decorators, either to pass context, run setup or just set flags. There's a few ways to do it...

In [10]:
def print_before(phrase="Setting up!"):
    def actual_decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            print(phrase)
            return f(*args, **kwargs)
        return wrapper
    return actual_decorator

@print_before(phrase="such setup")
def doubles(x):
    return 2 * x

@print_before() # parens required
def random():
    return 4

In case you're wondering, the parens are required because we need to call the outermost function, which generates the decorator, which in turn generates the wrapper function we're actually using. To be 100% honest, outside of a handful of cases, I find this particular implementation lackluster because usually the outermost function just serves to pass information one level deeper.

However, one of the best arguments for using this particular pattern is passing positional arguments to a decorator. This has more to do with how Python handles unpacking into functions than anything else. But using an outer function like this is certainly the easiest way I've thought of to handle positional arguments to a decorator. You could fiddle with pushing the function to the front of *args, but I feel it would quickly become messy.
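
For instance, a minimal sketch of a positional argument flowing through the outer function (retry and its behavior are purely illustrative):

from functools import wraps

def retry(times):                      # positional argument to the decorator
    def actual_decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(times):     # retry the call up to `times` times
                try:
                    return f(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
            raise last_exc             # out of attempts, re-raise the last error
        return wrapper
    return actual_decorator

@retry(3)
def flaky_fetch():
    ...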

Back to optional kwargs: instead of having an outermost function that we must explicitly call, why not use functools.partial and set up a little check?

In [11]:
from functools import partial

def print_before(f=None, *, phrase="Setting up!"):
    if f is None:
        return partial(print_before, phrase=phrase)
    
    @wraps(f)
    def wrapper(*args, **kwargs):
        print(phrase)
        return f(*args, **kwargs)
    
    return wrapper

@print_before # parens not needed
def double(x):
    return 2*x

@print_before(phrase="such dice roll!")
def random():
    return 4

That's actually a lot better. The most interesting thing here is how using partial allows us to forgo the parens when we don't override keyword arguments. Since we're accepting only one positional argument (via Python 3's keyword-only syntax -- there's not really a good way to imitate this in Python 2 other than using positional arguments sparingly when calling a function), when Python passes the function into the decorator, it does so positionally. We don't need to call the outermost function explicitly because Python will call it for us. And if we need to override a keyword argument, we can call the function explicitly to do so.

However, we've just moved our primary concern inside the decorator and it's creating noise. Moreover, if we decide that this is the best thing in the world, we'll be copying that code into every decorator we write. And six months from now, when partial is considered as terrible as GOTO, we'll have to replace all these instances (though, you can never take my partials from me!). I'm not sure if that's better or worse than before.

We're already writing a decorator, and we're getting pretty good at them, so why not a decorator that'll handle this for us? This is probably the most confusing example, so hold on:

In [12]:
def optional_kwargs(deco):
    @wraps(deco)
    def wrapper(f=None, **kwargs): # <- actual function our decorator becomes
        if f is None:
            return partial(wrapper, **kwargs)
        return deco(f, **kwargs)
    return wrapper

@optional_kwargs
def print_before(f, *, phrase="Setting up!"): # <- becomes the closure inside optional_kwargs
    @wraps(f)
    def wrapper(*args, **kwargs):
        print(phrase)
        return f(*args, **kwargs)
    return wrapper

@print_before(phrase="much mind bend!") # <- optional_kwargs' closure
def random(): # <- becomes print_before's closure
    return 4

This looks really complex and intimidating, but it's really not once you understand decorators. All we've done is factor out the code print_before shouldn't be concerned with, namely whether it actually received a function or not. What happens is that we replace print_before with the closure inside optional_kwargs. This closure handles the logic of checking whether a function was passed.

The true power of this becomes apparent when you replace the print statement with something that will preprocess inputs for you: escaping characters or changing integers and floats to Decimal if you're working with financial data. Here, it's simply to illustrate how you would implement the pattern.
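
As a sketch of that Decimal idea (money_safe and add_tax are purely illustrative names, and this leans on optional_kwargs from the cell above):

from decimal import Decimal
from functools import wraps

@optional_kwargs  # defined in the previous cell
def money_safe(f, *, to=Decimal):
    @wraps(f)
    def wrapper(*args, **kwargs):
        # coerce ints and floats through str to dodge float representation noise
        args = [to(str(a)) if isinstance(a, (int, float)) else a for a in args]
        return f(*args, **kwargs)
    return wrapper

@money_safe
def add_tax(price, rate):
    return price + price * rate

print(add_tax(19.99, 0.08))  # 21.5892, as a Decimal -- no float surprises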

Like the horrible decorator_decorator example above, this sort of code forces your brain to operate on multiple levels at the same time. Once you're comfortable with it, this makes quite a bit of sense as to why you'd do it. However, before that, it merely causes a headache.

Using classes as decorators

Sometimes a function closure doesn't provide the umph needed to do something, or packing in the umph creates a twisted web of logic that causes your eyes to roll into the back of your head when you revisit it. Sometimes you just need a good ol' object to handle things for you. There's two different ways to use objects as decorators: one instance per wrapped function, or a single instance shared across every function it wraps.

In [13]:
from functools import update_wrapper

class wrapperobj:
    def __init__(self, f):
        self.f = f
        update_wrapper(self, f)
    
    def __call__(self, *args, **kwargs):
        return self.f(*args, **kwargs)

@wrapperobj
def random():
    return 4

When using a single instance per wrap, you first store the wrapped object on the instance (self.f) and then, when you need to call it, you simply pass it arguments. Fairly straightforward, but it's a pretty contrived example as well. This sort of pattern is used when your logic becomes much more complex than what a simple function should reasonably handle: things like shuttling data to and from an event loop or socket, or creating individual caches for each wrapped function.
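
As a sketch of that individual-cache idea (memoized is a hypothetical, bare-bones memoizer that only handles hashable positional arguments):

from functools import update_wrapper

class memoized:
    def __init__(self, f):
        self.f = f
        self.cache = {}        # each wrapped function gets its own cache
        update_wrapper(self, f)

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.f(*args)
        return self.cache[args]

@memoized
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

assert fib(30) == 832040   # fast, because intermediate results were cached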

Let's look at using a single instance to wrap multiple objects.

In [14]:
class onewrapper:
    def __call__(self, f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)
        return wrapper
    
inst = onewrapper()

@inst
def random():
    return 4

This is essentially the same thing as using a function closure as a decorator, except it comes with the extra umph of using an object as well. This sort of pattern can be used for linking several objects together with shared state; a rudimentary messaging queue should be easy to write this way. Your __init__ method in this case would set the initial state for the object rather than accepting the function there.

And of course, you don't have to use __init__ or __call__ to register and run functions; you can create other methods to handle these as well. register and run_wrapped, for example, could easily do the same thing.
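
A quick sketch of that idea, using those hypothetical names (and the same pass-through trick from earlier):

class TaskRunner:
    def __init__(self):
        self.tasks = []          # shared state across every wrapped function

    def register(self, f):       # used as a decorator: @runner.register
        self.tasks.append(f)
        return f                 # pass-through, so f is unchanged

    def run_wrapped(self):
        return [task() for task in self.tasks]

runner = TaskRunner()

@runner.register
def greet():
    return "hello"

@runner.register
def answer():
    return 4

assert runner.run_wrapped() == ["hello", 4]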

If you're wrapping multiple objects with the same instance, it may be helpful to use some sort of data structure to organize them if you're doing more than simply wrapping them. In most cases the weakref module will either provide the answer you need or set you on the right path; though, there are objects that can't be weakly referenced -- including most of the built-in or "primitive" types in Python (str, int, float, list, dict and tuple). However, for more complex objects, weakrefs are a powerful tool that lets you conserve memory, among other things. It's worth weighing something like WeakKeyDictionary against a regular dictionary for caching.

I got sidetracked, anyways...

Decorating Classes

As of Python 2.6, decorating classes is also supported. All the same reasons you'd want to decorate a function apply, with the added benefit that you can apply metaclasses uniformly between Python 2 and 3. Or maybe it's just simpler than writing a metaclass in the first place. This next example comes from the six.py project and lets you decorate a class so a metaclass is applied identically in Python 2 and 3.

In [15]:
# from six.py
def add_metaclass(metaclass):
    """Class decorator for creating a class with a metaclass."""
    def wrapper(cls):
        orig_vars = cls.__dict__.copy()
        slots = orig_vars.get('__slots__')
        if slots is not None:
            if isinstance(slots, str):
                slots = [slots]
            for slots_var in slots:
                orig_vars.pop(slots_var)
        orig_vars.pop('__dict__', None)
        orig_vars.pop('__weakref__', None)
        return metaclass(cls.__name__, cls.__bases__, orig_vars)
    return wrapper
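
A quick usage sketch (the Registered metaclass and Plugin class are purely illustrative): the decorated class gets rebuilt with the given metaclass, and the same line works under both Python 2 and 3:

class Registered(type):
    registry = []
    def __new__(meta, name, bases, ns):
        cls = super(Registered, meta).__new__(meta, name, bases, ns)
        meta.registry.append(cls)   # every class built with this metaclass is recorded
        return cls

@add_metaclass(Registered)
class Plugin:
    pass

assert Plugin in Registered.registry
assert type(Plugin) is Registered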

Applying multiple decorators

This generally works, especially if you've used wraps or update_wrapper consistently. The important thing to note here is that decorators are applied from the bottom up: Python first defines the original object, then walks up each decorator, passing the return value from each to the next. That's pretty much all there is to stacking decorators, except for one thing that I'll cover in the next section.

In [16]:
@wrapperobj
@print_before
def random():
    return 4

random()
Setting up!

Out[16]:
4

One nice thing: if you remember to use wraps or update_wrapper, Python will very helpfully pass along added attributes as well (since they are really just key-value pairs sitting in __dict__). In case you forgot, pass_through is the first example in this section; it merely adds an attribute called inspected to an object.

In [17]:
@wrapperobj
@pass_through
def random():
    return 4

assert hasattr(random, 'inspected')
print(random.__dict__)
{'__annotations__': {}, 'f': <function random at 0x7f26c0039d08>, '__qualname__': 'random', '__wrapped__': <function random at 0x7f26c0039d08>, 'inspected': True, '__doc__': None, '__module__': '__main__', '__name__': 'random'}

Issues and Problems

As with anything, there are certain problems and issues you'll run into. Decorators are no exception. At some point, at least one of these will bite you, maybe hard.

Stacking wrong

This is something that will happen more often than you think. However, as long as you remember that they're applied from the bottom up and you've used wraps and update_wrapper throughout, you'll probably be okay. Sometimes it's important to note which decorators allow an object to pass through and which lock the object in a closure -- you shouldn't have to, but at least once you will.

However, there is something to be said about the @property, @classmethod and @staticmethod decorators. These return descriptors, which are a slightly different sort of object that behaves in a peculiar way. I've gone into detail about them elsewhere, but the rule of thumb is to always apply these decorators last -- that is, put them at the top of the stack.
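
For example, a minimal sketch using the loggit decorator from earlier (Circle is just an illustration); flip the two decorators and loggit would be handed a property object its wrapper can't call:

class Circle:
    def __init__(self, r):
        self.r = r

    @property      # applied last, so it sits at the top of the stack
    @loggit
    def area(self):
        return 3.14159 * self.r ** 2

print(Circle(2).area)   # logs "Calling area", then evaluates to 12.56636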

Testing

This is something of a sticky point. Pass-through decorators are easy to test; those wrapped objects aren't hidden away anywhere. However, you're far more likely to encounter a closure decorator, which makes testing much harder.

The easiest way is to use the __wrapped__ attribute if it's available. Python 3.4 also provides a very nice function, inspect.unwrap, which keeps unravelling a decorated function until it's no longer wrapped.

But if neither of these is available to you, and your decorator does something like connect to a database or ensure a user's logged in, you might be in for a headache.
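
A sketch of both escape hatches, using the loggit-decorated double from earlier:

import inspect

# __wrapped__ is set by functools.wraps on Python 3.2+
assert double.__wrapped__(3) == 6

# inspect.unwrap (3.4+) follows the __wrapped__ chain until it runs out
assert inspect.unwrap(double)(3) == 6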

Introspection and blowing stuff up.

More than once I've been bitten by bad introspection (looking at you, Google API wrapper); thankfully it was only me manually inspecting a function to see how to use it. Even using update_wrapper isn't enough to solve these problems:

  • Argument specification
  • Closely related is annotations in Python 3
  • Ability to pull source code
  • type and isinstance stop working

I'll admit blowing up code is one of the ways I learn the best. "What did I do that I shouldn't have done and how do I fix it?" Though, sometimes the errors are hard to track down.

Python 2's inspect.getargspec doesn't like being passed an object and complains loudly about it. Python 3's version doesn't seem to mind, though take that with a grain of salt because I'm sure there's a corner case somewhere that blows it up. Python 2 will display the argument specification and source code of the wrapper. Python 3 seems to do the right thing (at least in 3.4, and this may be because of inspect.unwrap and the __wrapped__ attribute).

Wrapping classes can seriously put a kink in your inheritance pattern if you're not careful. If your class simply passes through the decorator, you'll be okay. If it doesn't, you need to take care to either return the actual class (most likely using descriptors) or use something like __wrapped__ or inspect.unwrap in your inheritance line, both of which are ugly.

Similar to inheritance, isinstance isn't too happy about having a function passed to it where a class is expected. This behavior looks consistent between Python 2 and 3. The same fixes should apply here.

The final thing, which I'm unsure about fixing, is using type to identify explicitly what an object is. Descriptors seem like they should work, and rudimentary testing with them seems to cause type to behave as expected, but I've not battle tested this -- and the real question is, "Why are you using type anyways?"

Actually fixing these

If you're curious about the proper way to fix these issues, I highly recommend Graham Dumpleton's blog series "How you implemented your Python decorator is wrong," which goes into much more depth than I have here and really opened my eyes to what I was actually doing when writing decorators, beyond just stuffing one function inside of another. It also serves as a decent introduction to how descriptors work and practical uses of them.

No Man is an Island Unto Himself

This post wouldn't have been possible without the writings of Graham Dumpleton (sourced above), Jeff Knupp and Simeon Franklin. All three of these -- as well as countless other blog posts and StackOverflow questions -- introduced decorators to me and explained what they are. I'd also like to thank both PyATL and Doug Hellman for sparking my interest in giving a presentation, as well as the online learning group I'm a member of for sitting through about forty minutes of me listening to the sound of my own voice and asking for clarifications on many things I glossed over initially, thinking that people think the same way I do (which is probably a lot more twisted and convoluted than most).

I'm probably going to rewrite several other posts as well in a more frank "This is what's happening" manner instead of the usual "Here's a bunch of code, I hope you get it!" that makes sense to me more than anyone else.
