A collection of computer, gaming and general nerdy things.

Showing posts with label monad. Show all posts
Showing posts with label monad. Show all posts

Wednesday, April 29, 2015

I wrote a monad tutorial for some reason...

I swore to myself up and down that I wouldn't write one of these. But then I went and hacked up Pynads. And then I wrote a post on Pynads. And then I posted explainations about Monads on reddit. So what the hell. I already fulfilled my "Write about decorators when I understand them" obligation and ditto for descriptors. So Monads, why not...

It's simple, a monad is like a...

No. Stooooooop. :( Burritos. Bucket brigades. Semicolons. All these analogies just confused me for a long time. And then I "got them" and by "got them" I mean "Even more hopelessly confused but I didn't know that." Like what does "programmable semicolon" even mean? Every language I've used (which isn't many) a semicolon means "This bit of code ends here, kthxbai". The burrito analogy was meant as a critique of this phenomenon -- and I'll likely fall victim of the "Monad Tutorial Curse". And the bucket brigade was a valiant effort by a SO user to explain them.

It's simple, a monad is like a Unix Pipe

Instead of reaching for some non-programming analogy like burritos or bucket brigades, I think Unix Pipes are a pretty good analogy to Haskell-style monads. Let's say I'm in a directory that has a bunch of different types of files -- maybe it's the bottomless bin that is ~/Downloads ): And I want to find all the MP4 files in the top level directory and print them out:

ls -lh ~/Downloads | grep -i "*mp4" | less

Super simple. We take the first command ls feed it some options and a directory to list out. Then | goes "Oh, you have output and I have this thing that needs input, here grep!" And then grep does its business and | steps back in and goes "Oh, you have output and I have this thing that needs input, here less!"

Of course it isn't a perfect analogy. But all analogies break down under scrutiny. But this is essentially what Haskell's >>= does. "Oh, you have output, let me feed it to this function that wants input!" That's it. Monads are about chaining together a series of actions of functions (depending on how you want to look at it) in a way that each action/function returns something that can carry the chain forward somehow.

But the short of monads is that they have nothing to do with I/O, impure values, side effects or anything else. Those are implementation specific to certain monads. Monads in general only deal with how to combine expressions.

But Python doesn't have monads

Eh. It all depends on how you want to look at it. Sure, it doesn't have Haskell style monads. But it doesn't need to. Let's look at something:

In [1]:
x = y = '     Fred\n Thompson '

I have that input. But I need output that looks like this: "JACK THOMPSON". The obvious way is doing it imperatively:

In [2]:
x = x.replace('Fred', 'Jack')
x = x.replace('\n', '')
x = x.strip()
x = x.upper()
print(x)
JACK THOMPSON

And it works. Or I could just chain all those operations together:

In [3]:
print(y.replace('Fred', 'Jack').replace('\n', '').strip().upper())
JACK THOMPSON

Each string method returns a new string that can carry the chain forward. We can add in as many string methods that return a string. But if we place something like split or find then our chain can't be continued as there's a list or a integer now. That's not to say we can't continue the chain, but we likely need to do in a separate expression (which is okay).

Worshipping at the altar of bind

So Haskell style monads are pretty much defined by the presence of >>= and return. return just lifts a value into a monad. And >>= is the sequencing operator. Neither of these are magic, we need to define them ourselves. I like using Maybe as an example because it's simple enough to explain but addresses a real world problem: Null Pointer Exceptions. (:

We usually avoid this sort of thing with this pattern in Python:

In [4]:
def sqrt(x):
    if x is None:
        return None
    return x**.5

print(sqrt(4))
print(sqrt(None))
2.0
None

We can use this to process information from STDIN (for example):

In [5]:
def int_from_stdin():
    x = input()
    return int(x) if x.isdigit() else None
In [6]:
maybe_int = int_from_stdin()
print(sqrt(maybe_int))
a
None
In [7]:
maybe_int = int_from_stdin()
print(sqrt(maybe_int))
4
2.0

We just have to make sure we include the if x is None check everywhere. That's easy. Right. ...right? guise? On top of it being something to remember, it's line noise. Completely in the way of what we're attempting to accomplish. Instead, let's look at Maybe in terms of Haskell and Python:

data Maybe a = Nothing | Just a

instance Monad Maybe where
    return = Just
    (Just x) >>= f = f x
    Nothing  >>= f = Nothing

We have the type constructor Maybe which has two data constructors Just and Nothing. In Python terms, we have an abstract class Maybe and two implementations Just and Nothing. When we have a Just and >>= is used, we get the result of the function with the input of whatever is in Just. If we have Nothing and >>=is used, we get Nothing (Nothing from nothing leaves nothing. You gotta have something, if you wanna be with me). Notice that onus to return a Maybe is on whatever function we bind to. This puts the power in our hands to decide if we have a failure at any given point in the operation.

In Python, a simplified version looks a lot like this:

In [8]:
class Maybe:
    @staticmethod
    def unit(v):
        return Just(v)
    
    def bind(self, bindee):
        raise NotImplementedError
    
class Just(Maybe):
    
    def __init__(self, v):
        self.v = v
        
    def __repr__(self):
        return 'Just {!r}'.format(self.v)
    
    def bind(self, bindee):
        return bindee(self.v)

class Nothing(Maybe):
    
    def bind(self, bindee):
        return self
    
    def __repr__(self):
        return 'Nothing'

And we can use this to reimplement our int_from_stdin and sqrt functions above:

In [9]:
def int_from_stdin():
    x = input()
    return Just(int(x)) if x.isdigit() else Nothing()

def sqrt(x):
    return Just(x**.5)

And chain them together like this:

In [10]:
int_from_stdin().bind(sqrt)
4
Out[10]:
Just 2.0
In [11]:
int_from_stdin().bind(sqrt)
a
Out[11]:
Nothing

What >>= does isn't just sequence actions together. That's easy to do, we could have accomplished them the same thing before with sqrt(int_from_stdin()). However, the real magic sauce of >>= is abstracting how they're sequenced. In this case, sequencing a Just results in feeding the contained value of Just to a function and getting back a Maybe. And sequencing a Nothing results in Nothing.

The great thing about Maybe is we're allowed to decide at an arbitrary point if we even want to continue with the computation or bail out completely. Let's say we have something against even numbers. Perhaps it's that only one of them is Prime. But we like odds. So if we get an even number from STDIN, we'll just bail out.

In [12]:
def only_odds(x):
    return Just(x) if x&1 else Nothing()

int_from_stdin().bind(only_odds).bind(sqrt)
4
Out[12]:
Nothing
In [13]:
int_from_stdin().bind(only_odds).bind(sqrt)
3
Out[13]:
Just 1.7320508075688772

Other ways to sequence

Obviously bind/>>= isn't the only way to interact with monads if they're just about sequencing functions together. For example, Scala has a suped-up version of Maybe called Option. It's the same basic structure: Some (our successful computation) and None (a failed computation). It also has ways of recovering from a possibly failed computation with its getOrX methods. For example, if we have Some("abc") we can do this to recover when check if d is present:

Some("abc") filter (i => match i indexOf "d" {
                      case -1 => None
                      case _  => Some(i)
                      }
                    }) getOr "d"

Which should return "d" but Scala isn't my mother tongue, so there's probably an error somewhere.

You could argue that SQLAlchemy is monadic as well based on how you build queries in it:

q = session.query(Person).filter(Person.name.startswith('A')).first()

SQLAlchemy queries return query objects that can carry the chain further, allowing us to craft complicated queries in a relatively simple manner.

I found a more clever example in a thread on /r/learnpython about what features would you implement in Python given that chance. Below the "Everything not nailed down in Haskell" comment, there was one about universal function call syntax from D. /u/AMorpork proposed simply creating a monad where __getattr__ is the sequencing operation (reproduced here):

In [14]:
from itertools import islice
import builtins as __builtin__

def take(n, it):
    return islice(it, n)

class UFCS(object):
    def __init__(self, value):
        self.state = value

    def __getattr__(self, item):
        try:
            func = getattr(__builtin__, item)
        except AttributeError:
            func = globals()[item]
        def curried(*args):
            if not args:
                self.state = func(self.state)
            else:
                args = list(args)
                args.append(self.state)
                self.state = func(*args)
            return self

        return curried

    def get(self):
        return self.state
In [15]:
x = ['#3.462289264065068', 
     '4.283990003510465', 
     '#1.7285949138067824', 
     '#2.6009019446392987', 
     '5.089491698891653', 
     '3.854140130424576', 
     '4.118846086899804', 
     '5.110436429053362', 
     '9.044631493138326', 
     '5.503343391187907', 
     '1.4415742971795897', 
     '2.7162342709197618', 
     '9.438995804377226', 
     '1.8698624486908322', 
     '4.008599242523804', 
     '8.914062382096017', 
     '4.120213633898632', 
     '6.9189185117106975',
     # more were included, but removed here
     ]

UFCS(x).filter(lambda s: s and s[0] != "#").map(float).sorted().take(10).list().print()
[1.4415742971795897, 1.8698624486908322, 2.7162342709197618, 3.854140130424576, 4.008599242523804, 4.118846086899804, 4.120213633898632, 4.283990003510465, 5.089491698891653, 5.110436429053362]
Out[15]:
<__main__.UFCS at 0x7fd4c064ee10>

It's simple, a monad is like a...

Hopefully this goes a long way to explaining the idea of Monads in terms of programming. Maybe I fell upon the Monad Tutorial Fallacy. However, in the event that I've hopeless confused someone more, drop me a line and I'll be happy to go into further detail.

Monday, April 6, 2015

Fizzbuzz With Pynads!

In [1]:
from pynads import Writer
from pynads.funcs import multibind
from itertools import repeat
from functools import partial

pairs = ((5, 'fizz'), (3, 'buzz'))

def fizzer(n, pairs):
    fizzed = ''.join([buzz for fizz, buzz in pairs if not n % fizz]) or n
    return Writer(n+1, {n:fizzed})

res = multibind(Writer(1, {}), *repeat(partial(fizzer, pairs=pairs), 15))
print(res)
Writer(16, {1: 1, 2: 2, 3: 'buzz', 4: 4, 5: 'fizz', 6: 'buzz', 7: 7, 8: 8, 9: 'buzz', 10: 'fizz', 11: 11, 12: 'buzz', 13: 13, 14: 14, 15: 'fizzbuzz'})

Truly an 11x Developer solution to the world's universe's premier code interview question!

What is "pynads"

All joking aside, I've been hacking together a collection of Haskell-esque tools for Python over the last few weeks. It started as a "Well, I'll learn Haskell better this way." and has become...well, honestly a tool for me learning Haskell still.

Check it out, I think it's nifty.

A Quick Tour of Pynads

pynads strives to be a pythonic form of Haskell. Which is buzzword for I did some stuff. There's:

  • Functors
  • Applicatives
  • Monads
  • Monoids
  • Helpers

All of the base classes are implemented as Abstract Base Classes which makes inheritance easy. Well, that's a lie, the root object of pynads is a concrete class that just serves as an endpoint for __new__ and __init__.

Container

pynads.Container is the root object of every pynads class. It serves as a final endpoint for __new__ and __init__ as well as providing a consistent name for accessing the values held by objects in pynads. Some would say that it's a silly idea, but it works! Every class in pynads is also slotted for memory reasons since it's built around the idea of not manipulating a container but creating a new one.

The only important thing to know about Container is that it defines v as a property which actually delagates to the _get_val method. Meaning that's all that needs to be overriden to get multiple values out of a container.

For most subclasses of Container, the provided __init__ is fine, but it's a-ok to override it as well as the only setup that happens is setting a single attribute _v.

In [2]:
from pynads import Container

class Person(Container):
    __slots__ = ('name', 'age')
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def _get_val(self):
        return {'name': self.name, 'age': self.age}
    
    def __repr__(self):
        return "Person(name={!s}, age={!s})".format(self.name, self.age)

print(Person(name="Alec", age=26).v)
{'age': 26, 'name': 'Alec'}

Functors

Functors are data types that can be mapped over with the fmap method. But calling methods isn't very Haskell like, so there's an operator that does the same thing: %.

In [3]:
from pynads import Functor

class FmapPerson(Person, Functor):
    __slots__ = ()
    def fmap(self, f):
        return self.__class__(f(self.name), self.age)

print(str.upper % FmapPerson(name="Alec", age=26))
Person(name=ALEC, age=26)

Of course, you could just use FmapPerson.fmap but that's not very Haskellic. pynads also exports a funcs name space that contains functional shortcuts to some of these (though f % functor doesn't get much shorter). In this case, there's also:

In [4]:
from pynads import funcs

print(funcs.fmap(str.upper, FmapPerson(name="Alec", age=26)))
Person(name=ALEC, age=26)

Every class in the pynads.concrete namespace is a functor except for Mempty (we'll get there!).

Applicatives

Applicative Functors are functors that hold functions and can then be applied to other Applicatives/Functors. Every class (except for Mempty) in pynads.concrete is also an Applicative!

In [5]:
from pynads import Just

print(Just(lambda x: x+2) * Just(2))
Just 4

Yeah, like Haskell there's an infix operator. Except of <*> I just dropped the angle brackets because Python doesn't let you define custom operators (easily). It also combines nicely with % because they have the same precedence level!

In [6]:
print((lambda x: lambda y: x+y) % Just(4) * Just(6))
Just 10

BOOM! Mind blown! Whaaaaaat. Applicative uses the abstract method apply to determine how * operates. Just inherit from Applicative, define fmap and apply and boom, you're on your way. Well, you also need the unit method -- which is a class method for all pynad types, but that's not a requirement -- which knows how to put a value in a minimal context.

But wait, what if you have a curried function and you stuffed it into a Just and now you don't want to write out just_f * just_v1 * just_v2 .... Sure, you could think "Well, what if I used reduce(operator.mul, *justs)" But I thought of that already.

In [7]:
add_three_together = lambda x: lambda y: lambda z: x+y+z

print(funcs.multiapply(Just(add_three_together), *[Just(x) for x in range(1,4)]))
Just 6

If you're mind isn't blown yet, it's because I haven't revealed...

MOOOOOOOOONAAAAAAAADS!!!!!

Monads get a bad rap because they get all sorts of overblown explainations. You want to know a what a monad is? It's another way to compute things. It's a glorified container with a special method. You have a value in a monad, a function that takes a regular value and returns a monad and you bind them together. That's it. Literally all there is to it.

In [8]:
from pynads import Nothing

inc_if_odd_else_nothing = lambda x: Just(x+1) if not x&1 else Nothing

print(Just(2) >> inc_if_odd_else_nothing)
Just 3

The Maybe monad (which consists of the Just and Nothing data types) is basically a glorified if expression. That's it! The bind operation will detect if you have a failure in your computation and short circuit it. It's essentially an abstraction over this:

In [9]:
def safe_func(x):
    if x is None:
        return None
    else:
        return x+1

print(safe_func(1), safe_func(None))
2 None

Notice how that didn't cause a nasty AttributeError, because None doesn't have attributes? That's all Maybe lets you do (this behavior is mimicked in its fmap and apply: fmap(f, Nothing) and apply(ap_f, Nothing) both return you a Nothing). Nothing is extraspecialsauce because it's a singleton. It's basically a monadic None. Actually, it is a monadic None because it represents...well, Nothing.

If you've got more binds than editor columns, then there's something for you as well!

In [10]:
print(multibind(Just(1), *repeat(lambda x: Just(x+1), 5)))
Just 6

Before you start thinking, "Well, monads are just glorified if expressions" becuase that's missing the point, Monads are the abstraction abstraction. They represent a way to compute something by abstracting away how it's computated.

There's the Writer monad above which is a value in a monadic context but it also keeps some extra side-output as well. Instead of us trying to track this extra information ourselves, Writer says, "Hey, I'll handle it for you!" It just wants a function that accepts a value and returns a Writer. But here's the really cool thing, I didn't have to use a dictionary. It could have a list, or a string or an integer, or a custom class! I hear, "But how!" Through the power of...

Monoids!

So monoids are pretty cool. They're a set of something, a "zero" and a binary operator that's transative. Lots of things form monoids. Numbers are monoids! There's two ways to make numbers monoids: with 0 and +, with 1 and *. However, pynads is lazy and only defines the first...sorry, hard choices were made.

Wait, "a zero value" but 1 != 0. That's missing the point, a zero value is a value that doesn't change the input when combined with the binary operator. x * 1 == x.

But Python's "primitive" types all form monads!

  • list is a monoid. That's the zero value right there and it's binary operator would be list.extend if it was actually a binop.
  • dict is a monoid with {} and dict.update
  • bool is a monoid with False and |\or
  • set and frozenset are also monoids with their empty instances and |
  • str is a monoid with '' and + (boooo using + to combine strings! but whatever)

Tuple, Complex and Float are also monoids in exactly the ways you expect. There's a catch though: When combining tuples, you actually get a list back. I'll probably rectify this at a later point, but it's just something to live with right now.

pynads also defines two three monoids of its own: List (the monadic form of tuple), Map (the applicative form of dict) and Mempty (which we're getting to!).

Making your own monoid is easy, and you've probably done it before (just not in this fashion). Just inherit from pynads.Monoid create a mempty attribute (it won't let you without it through __new__ hackery) and the mappend method for combining two instances of your monoid. Let's assume we want a real tuple Monoid. We'd do it like this:

In [11]:
from pynads import Monoid

# also inherits from Container
# so we get all the Container goodness for free
class MTuple(Monoid):
    mempty = ()
    
    def __init__(self, *vs):
        super(MTuple, self).__init__(vs)
    
    def mappend(self, other):
        # type checking optional
        if not isinstance(other, MTuple):
            raise TypeError("Can only mappend MTuple with MTuple")
        return MTuple(*(self.v + other.v))
    
    def __repr__(self):
        return "MTuple{!r}".format(self.v)
In [12]:
print(MTuple(4,5) + MTuple(6,7))
MTuple(4, 5, 6, 7)

pynads.Monoid overrides + to be a shortcut to mappend. That's all well and good, but why other than have a unified way of combining values?! Because we get a way to reduce a list of monoids into a single value for free!

In [13]:
print(MTuple.mconcat(MTuple(1), MTuple(2), MTuple(3)))
MTuple(1, 2, 3)

Monoid.mconcat actually delegates to the mappend method and essentially looks like reduce(cls.mappend, monoids). That's it. That's all there is. But you can define your own mconcat to get performace bonuses if you need to.

In [14]:
from itertools import chain

class CMTuple(MTuple):
    def __iter__(self):
        return iter(self.v)
    
    @classmethod
    def mconcat(cls, *MTuples):
        return CMTuple(*chain.from_iterable(MTuples))
In [15]:
print(CMTuple.mconcat(CMTuple(1), CMTuple(2), CMTuple(3)))
MTuple(1, 2, 3)

pynads.List and pynads.Map take advantage of this to create only one intermediate object rather than a bunch. pynads will also let you treat the builtin types as monoids as well through pynads.funcs.monoid namespace which has four functions we're interested in: mempty, mappend, mconcat and is_monoid. mempty returns the "zero" value for a type, mappend knows how to combine types, mconcat knows how to combine an iter of types into a single one and is_monoid knows if something is monoidal or not (generally, it doesn't declare decimal.Decimal to be a monoid but this is because I didn't want to add a special case -- special cases beget special cases).

This is done through introspection of types and abstract base classes (to make the type introspection more acceptable less painful).

In [16]:
from pynads.funcs import monoid

print(monoid.mempty(list()))
print(monoid.mappend({'a':10}, {'b': 7}))
print(monoid.mconcat("hello", " ", "world"))
print(monoid.is_monoid(set()))
[]
{'a': 10, 'b': 7}
hello world
True

The monoid namespace is just a nice interface to the nasty plumbing that lives in pynads.utils.monoidal. It's pretty gross and actually probably pretty fragile, but it works! IT WORKS!!!

Mempty

So here's the thing that lets Writer do its little trick by accept any monoid as a potential log. I can't know ahead of time what you're going to use to keep track of stuff with Writer -- I've drank plenty of spice water, but I've yet to develop prescient abilities. And rather than making a bunch of subclasses specialized to handle a dictionary and a list and a string and a blah blah blah and forcing you to make your own for WriterLogWithMyFirstMonoid I decided to create a mempty monoid -- Mempty. It's not an original idea. Really, it's kind of a dumb object.

It's a singleton, so that's a strike against it (two singletons in my library, my god!). It doesn't actually do anything. It just sits around, taking up space until a real monoid comes along and scares it off. It's mempty value is actually itself! It's mappend just returns whatever its mappended with. And it's mconcat filters out any Mempty values before trying to mconcat the remaining values (you get a Mempty if mconcat an iter of Mempties). There's even an __iter__ method that yields from an empty tuple! What's going on!

In Haskell, mempty can be used a place holder and Haskell knows to do the right thing already. However, we have to teach Python how to use a placeholder and silently step it out of the way when a real value comes along. I suspect that this is similar, maybe, to how Haskell handles it, but I've not dug at all.

In [17]:
from pynads import Mempty

print(monoid.mempty(Mempty))
print(Mempty + 4)
print(monoid.mconcat(Mempty, {4}, {5}, Mempty))
Mempty
4
{4, 5}

So Writer's little trick of allowing any Monoid be the log is really just, "I have this dumb thing that pretends to be a Monoid as a default log."

Helpers

I won't explore this section indepth because pynads.funcs has a whole host of helpers, with more being added as needed. There's helpers for making constants, composing functions, an identity function (as in lambda x: x not id(x)), turning values into Monads. There's also pynads.utils if you need to plumb some of the internal workings like compatibility (did I mention all the tests pass 3.4 and 2.7?!) or how monoids are determined on things that aren't explicitly monoids.

That's it!

Well, not really. There's about five other monads implemented that I didn't cover (like Reader, which was a pain but also a lot of fun to implement). There's more coming as well. Not just monads, but other things. A proof of concept for do notation via abusing coroutines exists in the repo (lifted from a blog called Valued Lessons which served as partial inspiration for this project). And maybe Monad Transformers if I work the gumption up for it (probably yes). And who knows what else.