A collection of computer, gaming and general nerdy things.

Wednesday, April 29, 2015

PEP 484 and Me

So PEP 484 is a thing. It's about type hinting in Python and seems to be heavily influenced by mypy-lang. However, this isn't a type system. It's meant as a helper for static code analysis. There's no type enforcement -- at least to my understanding. Basically, we'd be able to load up pyflakes or PyCharm and receive information on what the parameters are expected to be or if at some point we've got a type mismatch.
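To make that concrete (a throwaway function of my own, nothing from the PEP): the hints end up in `__annotations__` as plain metadata, and nothing checks them when the function is called.

```python
def double(n: int) -> int:
    return n * 2

# The hints are stored on the function object as plain metadata:
print(double.__annotations__)  # {'n': <class 'int'>, 'return': <class 'int'>}

# Nothing stops a call that contradicts them:
print(double("ha"))  # 'haha' -- no runtime error
```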

There's been a lot of talk about this. Some in favor, some not.

On one hand, I get it. This is super helpful for analysing a new code base -- assuming it's been used. :/ On the other hand, it's downright ugly. I'm not a big fan of inlining types, at all. Some things aren't so bad...

In [1]:
import typing as t

def add(x: int, y: int) -> int:
    return x+y

Not so bad. Just a simple add function; we see it takes two ints and returns an int. However, for something more complicated -- let's say zipWith -- it gets ugly really fast.

Here's the comparable Haskell type:

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

And here's the proposed PEP syntax:

In [2]:
A, B, C = t.TypeVar('A'), t.TypeVar('B'), t.TypeVar('C')

def zip_with(func: t.Callable[[A, B], C], a: t.List[A], b: t.List[B]) -> t.List[C]:
    return list(map(func, a, b))
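As a quick spot-check that the signature matches the behavior (definitions repeated so the snippet stands alone; note that map is lazy in Python 3, hence the list() call):

```python
import typing as t

A, B, C = t.TypeVar('A'), t.TypeVar('B'), t.TypeVar('C')

def zip_with(func: t.Callable[[A, B], C], a: t.List[A], b: t.List[B]) -> t.List[C]:
    # list() forces the lazy map object so the List[C] annotation is honest
    return list(map(func, a, b))

print(zip_with(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
```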

There's so much information in the parameter line I can hardly see what's actually relevant. This is something that really bothers me about all inlined types. Here's the proposed PEP syntax for something as simple as compose:

In [3]:
# compose :: (b -> c) -> (a -> b) -> (a -> c)
def compose(f: t.Callable[[B], C], g: t.Callable[[A], B]) -> t.Callable[[A], C]:
    return lambda x: f(g(x))

print(compose.__annotations__)
{'f': typing.Callable[[~B], ~C], 'return': typing.Callable[[~A], ~C], 'g': typing.Callable[[~A], ~B]}
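For what it's worth, the annotations are inert here too; the composed function behaves exactly as the Haskell comment promises (definitions repeated so the cell stands alone):

```python
import typing as t

A, B, C = t.TypeVar('A'), t.TypeVar('B'), t.TypeVar('C')

# compose :: (b -> c) -> (a -> b) -> (a -> c)
def compose(f: t.Callable[[B], C], g: t.Callable[[A], B]) -> t.Callable[[A], C]:
    return lambda x: f(g(x))

add_one = lambda x: x + 1
double = lambda x: x * 2

# compose(f, g)(x) == f(g(x)): 3 -> add_one -> 4 -> double -> 8
print(compose(double, add_one)(3))  # 8
```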

Using a decorator was explicitly shot down in the PEP under the argument that it's verbose and function parameters would need to be repeated. However, I find the currently proposed syntax to already be verbose.

Moreover, a special type of file was proposed: stub files. These are additional files maintainers write that mirror the structure of an existing project solely to provide annotated function signatures. If decorators are shot down as unnecessarily verbose, this should be too, even if it addresses the issue of Python 2 and 3 compatibility. I surely don't want to maintain what is essentially a second copy of my project structure to get the minimal benefits of type hinting. And I suspect projects that adopt stub files will see a decline in contributions -- if your project is using stub files, surely the onus will be on the committer to keep the stubs in sync with every change as well.

Breaking the type definitions out onto a separate line would go a long way toward cleaning this up. Retyping parameters shouldn't be needed; just doing something like this would help:

@typed(t.Callable[[B], C], t.Callable[[A], B], returns=t.Callable[[A], C])
def compose(f, g):
    return lambda x: f(g(x))

Using the keyword-only argument syntax introduced in Python 3.0 provides a clean break between input and output types. And using a decorator to separate the concern of "this is type information" from "these are the parameters" is exactly what decorators are for.

As a proof of concept:

In [4]:
import inspect
from functools import wraps

def typed(*types, returns):
    def deco(f):
        # todo handle *args, **kwargs
        params = inspect.getargspec(f).args
        if not len(types) == len(params):
            raise TypeError("Must provide types for all parameters")
        annotations = {a: t for a, t in zip(params, types)}
        annotations['return'] = returns
        # set before wrapping; @wraps copies __annotations__ onto the wrapper
        f.__annotations__ = annotations
        @wraps(f)
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)
        return wrapper
    return deco
In [5]:
@typed(t.Callable[[B], C], t.Callable[[A], B], returns=t.Callable[[A], C])
def compose(f, g):
    return lambda x: f(g(x))
In [6]:
compose.__annotations__
Out[6]:
{'f': typing.Callable[[~B], ~C],
 'g': typing.Callable[[~A], ~B],
 'return': typing.Callable[[~A], ~C]}
In [7]:
@typed(A, returns=C)
def mismatched(a, b):
    pass
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-2d7bfefaa7d3> in <module>()
----> 1 @typed(A, returns=C)
      2 def mismatched(a, b):
      3     pass

<ipython-input-4-e8ade1e4ee86> in deco(f)
      7         params = inspect.getargspec(f).args
      8         if not len(types) == len(params):
----> 9             raise TypeError("Must provide types for all parameters")
     10         annotations = {a: t for a, t in zip(params, types)}
     11         annotations['return'] = returns

TypeError: Must provide types for all parameters

Of course, there's still the issue of classes that accept instances of themselves as arguments to methods. The canonical example appears to be nodes:

In [8]:
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

Since a class's name isn't bound until its entire body has been evaluated, it's impossible to reference the class directly at the top level of the class, i.e.:

class Node:
    def __init__(self, value: t.Any, left: Node, right: Node):
        ...

This results in a NameError because of the inside-out evaluation (something that has bitten me before, but was easy enough to work around in that case). I believe the current fix for this is actually inheriting from something like Generic[T], i.e.:

In [9]:
T = t.TypeVar('T')

class Node(t.Generic[T]):
    def __init__(self, left: T, right: T):
        pass
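As an aside, the PEP offers another way around the NameError that doesn't require Generic at all: a forward reference, where the annotation is written as a string literal naming the class and left for the type checker to resolve later.

```python
import typing as t

class Node:
    # 'Node' as a string avoids the NameError; type checkers resolve it later
    def __init__(self, value: t.Any, left: 'Node' = None, right: 'Node' = None):
        self.value = value
        self.left = left
        self.right = right

root = Node(1, left=Node(2), right=Node(3))
print(root.left.value, root.right.value)  # 2 3
```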

Never mind that I think imposing this requirement is ridiculous. Not only should types stay out of the way, but inheriting from Generic[T] implies I'm gaining some tangible runtime benefit -- I'm not; static type analysis is an "offline" thing.

There's also the problem of using my own metaclass. These type variables are scaffolded around abc.ABCMeta as a base, which is fine until you remember we can only have one metaclass in a hierarchy. Wah wah wah.
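That conflict is easy to reproduce with plain abc, no typing involved (MyMeta here is a stand-in for whatever metaclass a project already uses):

```python
import abc

class MyMeta(type):
    """Stand-in for a project's own metaclass."""

class Base(abc.ABC):  # Base's metaclass is abc.ABCMeta
    pass

# Mixing Base with an unrelated metaclass fails immediately:
try:
    class Child(Base, metaclass=MyMeta):
        pass
except TypeError as e:
    print(e)  # metaclass conflict: the metaclass of a derived class must ...
```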

I don't think that type hinting is necessarily a bad thing. However, I think as the PEP is written currently, we're sacrificing quite a bit for minimal gain.

2 comments:

  1. As far as I understand, the problem of referencing a not-yet-declared type is solved by writing the name of the type as a string (that is, between quotes).
