Thursday, May 14, 2015

Moving to Github Pages

I'm in the process of migrating this blog to github pages. I find that using blogger as my medium complicates my workflow.

With github pages, I can use pelican to build my blog and push the output to a repo on github, which then shows up on the page.

However, with blogger, I need to convert the output to HTML, load it into my clipboard, open blogger and create a new post. Even a nice bash function only takes me so far as Google's API requires authentication with OAuth. And their Python wrapper doesn't support Python 3.

If I were building my blog through a web interface, that's one thing. But I'm not. I like being able to issue one or two short commands in bash to do everything. So, blogger, I bid you adieu. I hope you do well. But I'm ending this relationship because of you and your stubbornness.

Sunday, May 10, 2015

Quotly: Building a simple JSON API with Flask, Marshmallow and SQLAlchemy

My love of Flask is well known. It's a great microframework that puts you in control of the pieces needed in your webapp. It comes with templating, requests, responses, cookies, sessions, routing and... really, that's it. Routes can be built by decorating functions or by inheriting from flask.views.View or flask.views.MethodView depending on your needs.

Something I've explored in the past is building a simple RESTful service using Flask and SQLAlchemy. However, that blog post fell into the trap of "Look at all this code!" that happens to me regretfully often. D: Still, I'd like to revisit the idea and explore the topic a little more in depth.

I will forewarn that this post is still pretty code heavy. If that's the sort of thing that makes your eyes glaze over, you may want to go to /r/pitbulls instead.

Quotly: Cheeky Movie Quotes for Web 2.0

Okay, lame. But it's not another TODO list, though not much better. But by the end, hopefully we'll feel comfortable with Flask, Flask-SQLAlchemy, Marshmallow (12/10 serializer library) and a little bit of Flask-Restful (though, it's going to be used as a base that we're going to modify).

This post was written with Python 3.4, but it should work with at least 2.7 and 3.3. I'm not sure about others and I'm too lazy to add the deadsnakes PPA right now. D:

You can follow along with vim or PyCharm or whatever. The full application is at the bottom of the post as well if you'd rather peek now.

If you are following along, I recommend making two files: app.py and run.py. app.py is where we'll be building Quotly. run.py will simply run the debug server on the side. Just set that one up to look like this:

In [ ]:

from app import app

if __name__ == '__main__':
    app.run(debug=True, use_reloader=True)

Step Zero: Needed packages

Instead of installing packages along every step, let's just get all that mess out of the way now...

pip install --user -U marshmallow --pre
pip install --user -U flask-sqlalchemy flask-restful

That'll install everything you need to follow along here. A note about that Marshmallow install: This installs a pre-release version of Marshmallow, which we'll need to take advantage of some cool stuff that's coming in Marshmallow. If you're also wanting to use Flask-Marshmallow (perhaps to avoid writing your own hyperlink serializer), install it afterwards to avoid getting a potentially older version.

Step One: Considering Our Data

First, let's consider what our data looks like. We have quotes. People say quotes. And that's really about it. We could use a list of dictionaries for this, but since we'll eventually involve SQLAlchemy, which returns objects, let's use namedtuple instead.

In [3]:

from collections import namedtuple

# basic quote object
# can also use to represent many with [Quote(person..., quote=...)[,...]]
Quote = namedtuple('Quote', ['person', 'quote'])

# Let's prepopulate an example quote list
quotes = [Quote(p, q) for p, q in [
        ("Herbert West", "I was busy pushing bodies around as you well know "
         "and what would a note say, Dan? 'Cat dead, details later'?"),
        ("Jack Burton", "You remember what ol' Jack Burton always says at a time like that: "
         "'Have ya paid your dues, Jack?' 'Yessir, the check is in the mail.'"),
        ("Igor", "Well, why isn't it Froaderick Fronkensteen?")
]]

I wouldn't blame you if you took a break to track down one of these movies and turned it on in the background.

In [4]:

from IPython.display import Image
Image(url="http://i.imgur.com/9VeIbMZ.gif")

Out[4]:

Step Two: Serializing to JSON

Assuming you've not been distracted, let's see about taking these quote objects and turning them into JSON.

In [5]:

import json

print(json.dumps(quotes[0], indent=2))

[
  "Herbert West",
  "I was busy pushing bodies around as you well know and what would a note say, Dan? 'Cat dead, details later'?"
]

...um...er...That's not what we really wanted. Since JSON has no notion of tuples, let alone namedtuples, Python helpfully transforms them into JSON's nearest relation: lists. However, we'd probably find it nicer to have key-value pairs pop out the other side. Of course, we could just use a dictionary, or write a namedtuple_to_dict function that'll do it ourselves:

In [6]:

def namedtuple_to_dict(nt):
    # _asdict() is the documented API; vars() only works on some Python versions
    return nt._asdict()
print(json.dumps(namedtuple_to_dict(quotes[0]), indent=2))

{
  "person": "Herbert West",
  "quote": "I was busy pushing bodies around as you well know and what would a note say, Dan? 'Cat dead, details later'?"
}

But that's no fun and will only work one level deep. What happens when we need to serialize objects that have other objects living inside them? That won't work. I've seen lots of ways to handle this; most of them are just variations on a __json__ method on every object and subclassing json.JSONEncoder to invoke that method when it encounters something it can't serialize. Plus, it still wouldn't work for namedtuple: json already knows how to serialize it as a list, so the encoder is never asked.
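To make that concrete, here's a minimal sketch of the __json__ approach using only the standard library. The Person class and the DunderJSONEncoder name are made up for illustration. Watch the last dumps call: the default hook only fires for objects json can't already handle, so the namedtuple still comes out as a list.

```python
import json
from collections import namedtuple

Quote = namedtuple('Quote', ['person', 'quote'])


class Person:
    """Hypothetical class that knows how to describe itself as a dict."""
    def __init__(self, name):
        self.name = name

    def __json__(self):
        return {'name': self.name}


class DunderJSONEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json can't serialize natively
        if hasattr(o, '__json__'):
            return o.__json__()
        return super().default(o)


as_object = json.dumps(Person('Igor'), cls=DunderJSONEncoder)
# namedtuples are tuples, so default() never sees them: still a list
as_list = json.dumps(Quote('Shaun', "You've got red on you."), cls=DunderJSONEncoder)
print(as_object)
print(as_list)
```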

In [7]:

Image(url="http://i.imgur.com/mWU6lP6.gif")

Out[7]:

Rather than hacking some function or a mixin together and making the object responsible for knowing how to transform itself into a dictionary, why not use a robust, well tested object serializer library? No, not pickle -- pickles are unsafe and too vinegary for me. My sweet tooth is craving Marshmallows.

In [8]:

from marshmallow import Schema, pprint
from marshmallow.fields import String

class QuoteSchema(Schema):
    person = String()
    quote = String()
    
pprint(QuoteSchema().dump(quotes[0]).data)

{'person': 'Herbert West',
 'quote': 'I was busy pushing bodies around as you well know and what would '
          "a note say, Dan? 'Cat dead, details later'?"}

...wait, is it really that easy? Five lines, including the imports? It seems like it shouldn't be, but it is. Actually it can even be easier:

In [9]:

class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')

pprint(QuoteSchema().dump(quotes[1]).data)

{'person': 'Jack Burton',
 'quote': "You remember what ol' Jack Burton always says at a time like "
          "that: 'Have ya paid your dues, Jack?' 'Yessir, the check is in "
          "the mail.'"}

Marshmallow is smart enough to know how to serialize built-in types without us saying, "This is a string." Which is fantastic. We can take that schema and json.dumps and produce what we actually wanted:

In [10]:

print(json.dumps(QuoteSchema().dump(quotes[2]).data, indent=2))

{
  "quote": "Well, why isn't it Froaderick Fronkensteen?",
  "person": "Igor"
}

And unlike many other solutions, Marshmallow will also allow us to serialize a collection of objects as well:

In [11]:

pprint(QuoteSchema(many=True).dump(quotes).data)

[{'person': 'Herbert West',
  'quote': 'I was busy pushing bodies around as you well know and what '
           "would a note say, Dan? 'Cat dead, details later'?"},
 {'person': 'Jack Burton',
  'quote': "You remember what ol' Jack Burton always says at a time like "
           "that: 'Have ya paid your dues, Jack?' 'Yessir, the check is in "
           "the mail.'"},
 {'person': 'Igor', 'quote': "Well, why isn't it Froaderick Fronkensteen?"}]

While this is valid JSON (a root object can be either an object or an array), Flask will only allow objects at the root level to prevent stuff like this. However, asking a schema to create a dictionary if it serializes a collection isn't hard to do at all:

In [12]:

from marshmallow import post_dump

class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')
    
    @post_dump(raw=True)
    def wrap_if_many(self, data, many=False):
        if many:
            return {'quotes': data}
        return data
    
pprint(QuoteSchema(many=True).dump(quotes).data)

{'quotes': [{'person': 'Herbert West',
             'quote': 'I was busy pushing bodies around as you well know '
                      "and what would a note say, Dan? 'Cat dead, details "
                      "later'?"},
            {'person': 'Jack Burton',
             'quote': "You remember what ol' Jack Burton always says at a "
                      "time like that: 'Have ya paid your dues, Jack?' "
                      "'Yessir, the check is in the mail.'"},
            {'person': 'Igor',
             'quote': "Well, why isn't it Froaderick Fronkensteen?"}]}

In [13]:

Image(url="http://i.imgur.com/BUtt2Jd.gif")

Out[13]:

Step Three: Briefly Flask

Now that the Quote objects can be correctly serialized to JSON, feeding it from Flask is easy peasy.

In [ ]:

from flask import Flask, jsonify

app = Flask('notebooks')
# reuse the same QuoteSchema instance rather than creating a new one with each request
QuoteSerializer = QuoteSchema()

# the variable name in the route must match the view's argument name
@app.route('/quote/<int:idx>')
def single_quote(idx):
    if not 0 <= idx < len(quotes):
        # flask allows returning a tuple of data, status code, headers (dict)
        # status code is 200 by default
        return jsonify({'error': 'quote out of range'}), 400
    # jsonify turns the dictionary into a JSON response
    return jsonify(QuoteSerializer.dump(quotes[idx]).data)

Step Four: Deserialization

However, only getting a quote is pretty simple stuff. What if we wanted to create new Quote objects from JSON? This is pretty easy to do by hand with Flask's request object (note: the request.get_json method is currently the recommended method for plucking JSON out of the request rather than using the request.json attribute):

In [ ]:

from flask import jsonify, request

@app.route('/quote/', methods=['POST'])
def make_new_quote():
    # get_json returns dict or None on failure
    # (named payload so it doesn't shadow the json module)
    payload = request.get_json()
    # both keys are read below, so check for both
    if payload and 'person' in payload and 'quote' in payload:
        quotes.append(Quote(person=payload['person'], quote=payload['quote']))
        return jsonify({'success': True, 'msg': 'Added quote.'})
    msg = {'success': False, 'msg': 'must specify person and quote in JSON request'}
    return jsonify(msg), 400

However, suppose we're deserializing complex objects, say a tracklist that has an attribute that holds track objects which reference artist objects. Pretty soon manually deserializing an object becomes quite...messy. There is a better way. Marshmallow not only serializes objects, but will also handle deserialization if we give it a little bit of help:

In [14]:

class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')
    
    @post_dump(raw=True)
    def wrap_if_many(self, data, many=False):
        if many:
            return {'quotes': data}
        return data
    
    def make_object(self, data):
        assert 'person' in data and 'quote' in data, "Must specify person and quote in request"
        return Quote(person=data['person'], quote=data['quote'])
    
QuoteSchema().load({"person": "Ash", "quote": "Good. Bad. I'm the guy with the gun."}).data

Out[14]:

Quote(person='Ash', quote="Good. Bad. I'm the guy with the gun.")

Just the opposite of what we had before. Dictionary in, Object out. We can also deserialize a collection as well:

In [15]:

QuoteSchema(many=True).load([
        {'person':'Ash', 'quote':"Good. Bad. I'm the guy with the gun."}, 
        {'person': 'Shaun', 'quote': "You've got red on you."}
]).data

Out[15]:

[Quote(person='Ash', quote="Good. Bad. I'm the guy with the gun."),
 Quote(person='Shaun', quote="You've got red on you.")]
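For contrast, here's roughly what the by-hand route looks like for the nested tracklist case mentioned earlier. This is only a sketch: the Artist, Track and Tracklist types, the loader functions and the sample payload are all made up for illustration.

```python
from collections import namedtuple

# Hypothetical nested types, standing in for the tracklist example
Artist = namedtuple('Artist', ['name'])
Track = namedtuple('Track', ['title', 'artist'])
Tracklist = namedtuple('Tracklist', ['name', 'tracks'])


def load_artist(data):
    return Artist(name=data['name'])


def load_track(data):
    return Track(title=data['title'], artist=load_artist(data['artist']))


def load_tracklist(data):
    # every level of nesting needs its own loader (and its own error handling)
    return Tracklist(name=data['name'],
                     tracks=[load_track(t) for t in data['tracks']])


payload = {
    'name': 'Driving Mix',
    'tracks': [
        {'title': 'Track One', 'artist': {'name': 'Some Band'}},
        {'title': 'Track Two', 'artist': {'name': 'Another Band'}},
    ],
}
tracklist = load_tracklist(payload)
print(tracklist.tracks[0].artist.name)
```

Three loaders for three tiny types, and none of them validates anything yet.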

Hopefully the advantage of using Marshmallow for even sending and receiving simple JSON objects is apparent. With 11 lines we can take an object and cast it to a dictionary and we can take a dictionary with certain keys and build an object with it. Sure, we're just serializing and deserializing a namedtuple..."But that's how it always begins. Very small."

In [16]:

Image(url="http://i.imgur.com/zvv3ymL.gif")

Out[16]:

Step Five: Routing with Flask-Restful

Flask-Restful is a great library that builds on top of Flask's MethodView and makes it pretty easy to support multiple API styles (XML, CSV, JSON, etc). It ships with JSON serialization by default, leaving the others up to the user to implement. There's a bunch of other features as well, but I'm only going to tread on the incredibly useful routing mechanism here.

All we need to do to hook into this is to inherit from flask_restful.Resource and return dictionaries from our methods. Dictionaries like the ones produced by Marshmallow. Changing the routing from vanilla Flask routing to class based is a little weird at first, but it quickly becomes very intuitive.

And, since the methods for deserialization are in place, let's also handle accepting JSON and appending quotes to our little list.

In [17]:

from flask_restful import Resource
from flask import request

class SingleQuote(Resource):
    def get(self, idx):
        # no truthiness check on idx: 0 is a valid index
        if 0 <= idx < len(quotes):
            # flask-restful also allows data, status code, header tuples
            return QuoteSerializer.dump(quotes[idx]).data
        return {'error': 'quote index out of range'}, 400

class ListQuote(Resource):
    def get(self):
        return QuoteSerializer.dump(quotes, many=True).data

    def post(self):
        payload = request.get_json()
        if not payload:
            return {"success": False, "msg": "malformed request"}, 400

        if 'quote' not in payload:
            return {"success": False, "msg": "must specify quote in request"}, 400

        # remember QuoteSchema.make_object causes an assert
        try:
            # load the parsed JSON body, not the request object itself
            q = QuoteSerializer.load(payload).data
        except AssertionError as e:
            return {'success': False, 'msg': str(e)}, 400
        else:
            quotes.append(q)
            return {"success": True, "msg": "Quote added."}

And then we simply register these resources on an API object that's hooked up to our application:

In [ ]:

from flask_restful import Api

api = Api(app)
# the URL variable name must match the argument of SingleQuote.get
api.add_resource(SingleQuote, '/quote/<int:idx>')
api.add_resource(ListQuote, '/quotes')

Step Six: Persistence with SQLA

This is great and all, but these quotes will only last as long as the session runs. If we need to restart, we lose it all except for the preloaded quotes. To achieve real persistence, we should shack up with a database. SQLite is a good choice for this, plus bindings come native with Python.
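Those native bindings are the sqlite3 module. As a quick sketch of what talking to SQLite directly looks like (the table layout is my assumption, and this isn't how Quotly does it, since SQLAlchemy will manage the table for us):

```python
import sqlite3

# in-memory database for the demo; pass a filename such as 'quotes.db' to persist
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE quote (id INTEGER PRIMARY KEY, person TEXT, quote TEXT)')
# ? placeholders let sqlite3 handle the quoting for us
conn.execute('INSERT INTO quote (person, quote) VALUES (?, ?)',
             ('Igor', "Well, why isn't it Froaderick Fronkensteen?"))
conn.commit()

rows = conn.execute('SELECT id, person, quote FROM quote').fetchall()
print(rows)
conn.close()
```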

In [ ]:

from flask_sqlalchemy import SQLAlchemy

app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///quotes.db'

db = SQLAlchemy(app)

class Quote(db.Model):
    
    id = db.Column(db.Integer, primary_key=True)
    person = db.Column(db.Unicode(50))
    quote = db.Column(db.UnicodeText)

Our schema doesn't change at all. Marshmallow doesn't know or care if we're passing a namedtuple or a SQLA model, just that it has the correct attributes. This is great because we can write many quick tests with something like namedtuple to verify our schema behaves correctly and then just a few integration tests with the models.
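The idea that only the attributes matter can be sketched without Marshmallow at all. Here dump_quote is a hypothetical hand-rolled stand-in for the schema, and QuoteModel is a plain class pretending to be the SQLA model; neither is part of Quotly.

```python
from collections import namedtuple

FakeQuote = namedtuple('FakeQuote', ['person', 'quote'])


class QuoteModel:
    """Hypothetical stand-in for the SQLAlchemy model: just attributes."""
    def __init__(self, person, quote):
        self.person = person
        self.quote = quote


def dump_quote(obj):
    # attribute access only; the object's type is never inspected
    return {'person': getattr(obj, 'person'), 'quote': getattr(obj, 'quote')}


line = "Good. Bad. I'm the guy with the gun."
from_tuple = dump_quote(FakeQuote('Ash', line))
from_model = dump_quote(QuoteModel('Ash', line))
print(from_tuple == from_model)
```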

In [18]:

Image(url="http://i.imgur.com/mudwVxd.gif")

Out[18]:

However, our resource endpoints do need to change some, since we're dealing with SQLA models now and not just simple lists. The changes are trivial:

In [19]:

class SingleQuote(Resource):
    def get(self, idx):
        if idx:
            # query.get returns None if no row has this primary key
            q = Quote.query.get(idx)
            if q is None:
                return {'error': 'quote does not exist'}, 404
            # serialize the model; flask-restful can't return it directly
            return QuoteSerializer.dump(q).data
        return {'error': 'must specify quote id'}, 400

class ListQuote(Resource):
    def get(self):
        # read from the database instead of the old in-memory list
        return QuoteSerializer.dump(Quote.query.all(), many=True).data

    def post(self):
        payload = request.get_json()
        if not payload: # get_json will return a dict or None
            return {"success": False, "msg": "malformed request"}, 400

        if 'quote' not in payload:
            return {"success": False, "msg": "must specify quote in request"}, 400

        try:
            q = QuoteSerializer.load(payload).data
        except AssertionError as e:
            return {'success': False, 'msg': str(e)}, 400
        else:
            db.session.add(q)
            db.session.commit()
            return {"success": True, "msg": "Quote added."}

Just two simple changes to go from list to SQLA models. Be sure to run db.create_all() somewhere beforehand and load up some initial quotes, and Quotly is up and running, ready to send and receive cheeky movie quotes for everyone.

Parting Thoughts

While this was more of a "hit the ground running" guide to building a simple REST API with Flask and its little ecosystem, I hope it's been enlightening. I've included the whole application in this gist for reference. If you see a bug or have questions, hit me up on twitter (@just_anr) or on github (justanr).

Wednesday, April 29, 2015

PEP 484 and Me

So PEP 484 is a thing. It's about type hinting in Python and seems to be heavily influenced by mypy-lang. However, this isn't a type system. It's meant as a helper for static code analysis. There's no type enforcement -- at least to my understanding. Basically, we'd be able to load up pyflakes or PyCharm and receive information on what the parameters are expected to be or if at some point we've got a type mismatch.
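To see the "no enforcement" part for yourself, here's a tiny sketch. The double function is made up for illustration; the annotations just get stored on the function and nothing looks at them when it's called.

```python
def double(x: int) -> int:
    # the annotations are recorded on the function, but never checked at call time
    return x * 2


print(double(2))
print(double('ab'))  # a str sails right through
print(double.__annotations__)
```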

There's been a lot of talk about this. Some in favor, some not.

On one hand, I get it. This is super helpful for analysing a new code base -- assuming it's been used. :/ On the other hand, it's downright ugly. I'm not a big fan of inlining types, at all. Some things aren't so bad...

In [1]:

import typing as t

def add(x: int, y: int) -> int:
    return x+y

Not so bad. Just a simple add function; we see it takes two ints and returns an int. However, for something more complicated, let's say zipWith, it gets ugly really fast.

Here's the comparable Haskell type:

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

And here's the proposed PEP syntax:

In [2]:

A, B, C = t.TypeVar('A'), t.TypeVar('B'), t.TypeVar('C')

def zip_with(func: t.Callable[[A, B], C], a: t.List[A], b: t.List[B]) -> t.List[C]:
    # map returns a lazy iterator in Python 3, so build the list the annotation promises
    return list(map(func, a, b))

There's so much information in the parameter line I can hardly see what's actually relevant. This is something that really bothers me about all inlined types. Here's the proposed PEP syntax for something as simple as compose:

In [3]:

# compose :: (b -> c) -> (a -> b) -> (a -> c)
def compose(f: t.Callable[[B], C], g: t.Callable[[A], B]) -> t.Callable[[A], C]:
    return lambda x: f(g(x))

print(compose.__annotations__)

{'f': typing.CallableCallable[[~B], ~C], 'return': typing.CallableCallable[[~A], ~C], 'g': typing.CallableCallable[[~A], ~B]}

Using a decorator was explicitly shot down in the PEP under the argument that it's verbose and function parameters would need to be repeated. However, I find the current proposed syntax to already be verbose.

Moreover, a special type of file was proposed: stub files. These would be additional files that maintainers write, mirroring the structure of an existing project only to provide annotated functions. If decorators are being shot down as unnecessarily verbose, this should be too, even if it addresses the issue of Python 2 and 3 compatibility. I surely don't want to maintain essentially two copies of my project structure to get the minimal benefits of type hinting. And I certainly think that projects that begin using these will see a decline in contributions -- if your project is using stub files already, surely the onus will be on the committer to maintain changes in the stubs as well.

Breaking out the type definitions into a separate line would go a long way to clean it up. Retyping parameters shouldn't be needed, just doing something like this would help:

@typed(t.Callable[[B], C], t.Callable[[A], B], returns=t.Callable[[A], C])
def compose(f, g):
    return lambda x: f(g(x))

Using the keyword-only argument syntax introduced in Python 3.0 provides a clean break between input and output types. And separating the concern of "this is type information" from "these are the parameters" is exactly what decorators do.

As a proof of concept:

In [4]:

import inspect
from functools import wraps

def typed(*types, returns):
    def deco(f):
        # todo handle *args, **kwargs
        params = inspect.getargspec(f).args
        if not len(types) == len(params):
            raise TypeError("Must provide types for all parameters")
        annotations = {a: t for a, t in zip(params, types)}
        annotations['return'] = returns
        f.__annotations__ = annotations
        @wraps(f)
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)
        return wrapper
    return deco

In [5]:

@typed(t.Callable[[B], C], t.Callable[[A], B], returns=t.Callable[[A], C])
def compose(f, g):
    return lambda x: f(g(x))

In [6]:

compose.__annotations__

Out[6]:

{'f': typing.CallableCallable[[~B], ~C],
 'g': typing.CallableCallable[[~A], ~B],
 'return': typing.CallableCallable[[~A], ~C]}

In [7]:

@typed(A, returns=C)
def mismatched(a, b):
    pass

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-2d7bfefaa7d3> in <module>()
----> 1 @typed(A, returns=C)
      2 def mismatched(a, b):
      3     pass

<ipython-input-4-e8ade1e4ee86> in deco(f)
      7         params = inspect.getargspec(f).args
      8         if not len(types) == len(params):
----> 9             raise TypeError("Must provide types for all parameters")
     10         annotations = {a: t for a, t in zip(params, types)}
     11         annotations['return'] = returns

TypeError: Must provide types for all parameters
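A caveat on the proof of concept: inspect.getargspec was removed in Python 3.11. Below is a sketch of the same idea built on inspect.signature instead, which also happens to cover the *args/**kwargs todo since they show up as ordinary parameters. The total function is made up for illustration, and I've dropped the pass-through wrapper since the decorator only attaches annotations.

```python
import inspect


def typed(*types, returns):
    def deco(f):
        params = inspect.signature(f).parameters
        if len(types) != len(params):
            raise TypeError("Must provide types for all parameters")
        # signature() lists *args and **kwargs too, so each gets a type slot
        annotations = dict(zip(params, types))
        annotations['return'] = returns
        f.__annotations__ = annotations
        # nothing is wrapped: the original function comes back annotated
        return f
    return deco


@typed(int, int, int, returns=int)
def total(x, *rest, **named):
    return x + sum(rest) + sum(named.values())


print(total.__annotations__)
```

Because returns is keyword-only, forgetting it fails loudly at decoration time rather than silently swallowing the last positional type.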

Of course, there's still the issue of things like classes that accept instances of themselves as arguments to methods. The canonical example appears to be Nodes:

In [8]:

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

Since a class name isn't bound until the entire body of the class has been evaluated, it's impossible to straight up reference the class in the top level of the class, i.e.:

class Node:
    def __init__(self, value: t.Any, left: Node, right: Node):
        ...

This results in a NameError because of the inside out evaluation (something that has bitten me before, but was easy enough to work around in that case). I believe the current fix for this is actually inheriting from something like Generic[T], i.e.:

In [9]:

class Node(t.Generic[t.T]):
    def __init__(self, left: t.T, right: t.T):
        pass

Never mind that I think imposing this requirement is ridiculous. Not only should types stay out of the way, but inheriting from Generic[T] implies that I'm gaining some tangible runtime benefit. I'm not; static type analysis is an "offline" thing.
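For what it's worth, PEP 484 also describes forward references written as string literals, which sidesteps the NameError without inheriting from anything. A minimal sketch of that, reusing the Node class from above:

```python
import typing as t


class Node:
    # 'Node' as a string is stored as-is and never looked up while the class body runs
    def __init__(self, value: t.Any, left: 'Node' = None, right: 'Node' = None):
        self.value = value
        self.left = left
        self.right = right


print(Node.__init__.__annotations__['left'])
```

It's then up to the analysis tool to resolve the string to the class.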

Inheriting from Generic also raises the problem of using my own metaclass. These type variables are scaffolded around using abc.ABCMeta as a base, which is fine until the fact that a class hierarchy can only have one metaclass comes into play. Wah wah wah.
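The metaclass clash is easy to reproduce with abc.ABCMeta alone. In this sketch Registry is a made-up metaclass and Base stands in for anything built on ABCMeta; the combined metaclass at the end is the usual workaround, not something the PEP prescribes.

```python
import abc


class Registry(type):
    """Hypothetical custom metaclass, unrelated to ABCMeta."""


class Base(metaclass=abc.ABCMeta):
    pass


try:
    # Python can't pick one metaclass when the bases' metaclasses are unrelated
    class Broken(Base, metaclass=Registry):
        pass
except TypeError as e:
    conflict = str(e)
    print(conflict)


# the usual workaround: a metaclass that derives from both
class RegistryABC(Registry, abc.ABCMeta):
    pass


class Works(Base, metaclass=RegistryABC):
    pass
```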

I don't think that type hinting is necessarily a bad thing. However, I think as the PEP is written currently, we're sacrificing quite a bit for minimal gain.

I wrote a monad tutorial for some reason...

I swore to myself up and down that I wouldn't write one of these. But then I went and hacked up Pynads. And then I wrote a post on Pynads. And then I posted explanations about Monads on reddit. So what the hell. I already fulfilled my "Write about decorators when I understand them" obligation and ditto for descriptors. So Monads, why not...

It's simple, a monad is like a...

No. Stooooooop. :( Burritos. Bucket brigades. Semicolons. All these analogies just confused me for a long time. And then I "got them" and by "got them" I mean "Even more hopelessly confused but I didn't know that." Like what does "programmable semicolon" even mean? In every language I've used (which isn't many) a semicolon means "This bit of code ends here, kthxbai". The burrito analogy was meant as a critique of this phenomenon -- and I'll likely fall victim to the "Monad Tutorial Curse". And the bucket brigade was a valiant effort by a SO user to explain them.

It's simple, a monad is like a Unix Pipe

Instead of reaching for some non-programming analogy like burritos or bucket brigades, I think Unix Pipes are a pretty good analogy to Haskell-style monads. Let's say I'm in a directory that has a bunch of different types of files -- maybe it's the bottomless bin that is ~/Downloads ): And I want to find all the MP4 files in the top level directory and print them out:

ls -lh ~/Downloads | grep -i '\.mp4$' | less

Super simple. We take the first command, ls, and feed it some options and a directory to list out. Then | goes "Oh, you have output and I have this thing that needs input, here grep!" And then grep does its business and | steps back in and goes "Oh, you have output and I have this thing that needs input, here less!"

Of course it isn't a perfect analogy. But all analogies break down under scrutiny. But this is essentially what Haskell's >>= does. "Oh, you have output, let me feed it to this function that wants input!" That's it. Monads are about chaining together a series of actions or functions (depending on how you want to look at it) in a way that each action/function returns something that can carry the chain forward somehow.

But the short of monads is that they have nothing to do with I/O, impure values, side effects or anything else. Those are implementation specific to certain monads. Monads in general only deal with how to combine expressions.
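A rough Python rendering of the pipe idea, with a hypothetical pipeline helper and a made-up list of filenames. This is plain function chaining rather than a monad; the only point is that each stage hands its output to the next.

```python
from functools import reduce


def pipeline(value, *funcs):
    # feed the output of each function to the next, left to right, like |
    return reduce(lambda acc, f: f(acc), funcs, value)


filenames = ['notes.txt', 'holiday.MP4', 'talk.mp4', 'cat.gif']

result = pipeline(
    filenames,
    lambda names: [n for n in names if n.lower().endswith('.mp4')],  # the grep step
    sorted,
    '\n'.join,  # one name per line, ready for a pager
)
print(result)
```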

But Python doesn't have monads

Eh. It all depends on how you want to look at it. Sure, it doesn't have Haskell style monads. But it doesn't need to. Let's look at something:

In [1]:

x = y = '     Fred\n Thompson '

I have that input. But I need output that looks like this: "JACK THOMPSON". The obvious way is doing it imperatively:

In [2]:

x = x.replace('Fred', 'Jack')
x = x.replace('\n', '')
x = x.strip()
x = x.upper()
print(x)

JACK THOMPSON

And it works. Or I could just chain all those operations together:

In [3]:

print(y.replace('Fred', 'Jack').replace('\n', '').strip().upper())

JACK THOMPSON

Each string method returns a new string that can carry the chain forward. We can add in as many string methods as we like, so long as each returns a string. But if we place something like split or find in there, our chain can't be continued as there's a list or an integer now. That's not to say we can't continue the chain, but we likely need to do it in a separate expression (which is okay).
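Here's a small sketch of where the chain snaps, using the same input as before:

```python
y = '     Fred\n Thompson '

words = y.strip().split()      # split hands back a list, not a str
print(words)

try:
    y.strip().split().upper()  # so the str-only chain stops here
except AttributeError as e:
    error = str(e)
    print(error)

# picking the chain back up in a separate expression
print(' '.join(words).upper())
```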

Worshipping at the altar of bind

So Haskell style monads are pretty much defined by the presence of >>= and return. return just lifts a value into a monad. And >>= is the sequencing operator. Neither of these are magic, we need to define them ourselves. I like using Maybe as an example because it's simple enough to explain but addresses a real world problem: Null Pointer Exceptions. (:

We usually avoid this sort of thing with this pattern in Python:

In [4]:

def sqrt(x):
    if x is None:
        return None
    return x**.5

print(sqrt(4))
print(sqrt(None))

2.0
None

We can use this to process information from STDIN (for example):

In [5]:

def int_from_stdin():
    x = input()
    return int(x) if x.isdigit() else None

In [6]:

maybe_int = int_from_stdin()
print(sqrt(maybe_int))

a
None

In [7]:

maybe_int = int_from_stdin()
print(sqrt(maybe_int))

4
2.0

We just have to make sure we include the if x is None check everywhere. That's easy. Right. ...right? guise? On top of it being something to remember, it's line noise. Completely in the way of what we're attempting to accomplish. Instead, let's look at Maybe in terms of Haskell and Python:

data Maybe a = Nothing | Just a

instance Monad Maybe where
    return = Just
    (Just x) >>= f = f x
    Nothing  >>= f = Nothing

We have the type constructor Maybe which has two data constructors Just and Nothing. In Python terms, we have an abstract class Maybe and two implementations Just and Nothing. When we have a Just and >>= is used, we get the result of the function with the input of whatever is in Just. If we have Nothing and >>= is used, we get Nothing (Nothing from nothing leaves nothing. You gotta have something, if you wanna be with me). Notice that the onus to return a Maybe is on whatever function we bind to. This puts the power in our hands to decide if we have a failure at any given point in the operation.

In Python, a simplified version looks a lot like this:

In [8]:

class Maybe:
    @staticmethod
    def unit(v):
        return Just(v)
    
    def bind(self, bindee):
        raise NotImplementedError
    
class Just(Maybe):
    
    def __init__(self, v):
        self.v = v
        
    def __repr__(self):
        return 'Just {!r}'.format(self.v)
    
    def bind(self, bindee):
        return bindee(self.v)

class Nothing(Maybe):
    
    def bind(self, bindee):
        return self
    
    def __repr__(self):
        return 'Nothing'

And we can use this to reimplement our int_from_stdin and sqrt functions above:

In [9]:

def int_from_stdin():
    x = input()
    return Just(int(x)) if x.isdigit() else Nothing()

def sqrt(x):
    return Just(x**.5)

And chain them together like this:

In [10]:

int_from_stdin().bind(sqrt)

Out[10]:

Just 2.0

In [11]:

int_from_stdin().bind(sqrt)

Out[11]:

Nothing

What >>= does isn't just sequence actions together. That's easy to do; we could have accomplished the same thing before with sqrt(int_from_stdin()). However, the real magic sauce of >>= is abstracting how they're sequenced. In this case, sequencing a Just results in feeding the contained value of Just to a function and getting back a Maybe. And sequencing a Nothing results in Nothing.

The great thing about Maybe is we're allowed to decide at an arbitrary point if we even want to continue with the computation or bail out completely. Let's say we have something against even numbers. Perhaps it's that only one of them is Prime. But we like odds. So if we get an even number from STDIN, we'll just bail out.

In [12]:

def only_odds(x):
    return Just(x) if x&1 else Nothing()

int_from_stdin().bind(only_odds).bind(sqrt)

Out[12]:

Nothing

In [13]:

int_from_stdin().bind(only_odds).bind(sqrt)

Out[13]:

Just 1.7320508075688772
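Since the cells above depend on whatever gets typed into STDIN, here's a self-contained variant you can run in one go. It repeats the Maybe classes and swaps int_from_stdin for a hypothetical parse_int that takes the text as an argument; everything else is as above.

```python
class Maybe:
    def bind(self, bindee):
        raise NotImplementedError


class Just(Maybe):
    def __init__(self, v):
        self.v = v

    def __repr__(self):
        return 'Just {!r}'.format(self.v)

    def bind(self, bindee):
        return bindee(self.v)


class Nothing(Maybe):
    def __repr__(self):
        return 'Nothing'

    def bind(self, bindee):
        # nothing to feed forward, so the rest of the chain is skipped
        return self


def parse_int(text):
    # hypothetical stand-in for int_from_stdin
    return Just(int(text)) if text.isdigit() else Nothing()


def only_odds(x):
    return Just(x) if x & 1 else Nothing()


def sqrt(x):
    return Just(x ** .5)


results = {text: repr(parse_int(text).bind(only_odds).bind(sqrt))
           for text in ['a', '4', '9']}
print(results)
```

Bad input and even numbers both fall out as Nothing without sqrt ever being called.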

Other ways to sequence

Obviously bind/>>= isn't the only way to interact with monads if they're just about sequencing functions together. For example, Scala has a souped-up version of Maybe called Option. It's the same basic structure: Some (our successful computation) and None (a failed computation). It also has ways of recovering from a possibly failed computation with methods like getOrElse. For example, if we have Some("abc") we can do this to fall back to a default when "d" isn't present:

Some("abc") flatMap (i => i indexOf "d" match {
                      case -1 => None
                      case _  => Some(i)
                    }) getOrElse "d"

Which should return "d" but Scala isn't my mother tongue, so there's probably an error somewhere.

You could argue that SQLAlchemy is monadic as well based on how you build queries in it:

q = session.query(Person).filter(Person.name.startswith('A')).first()

SQLAlchemy queries return query objects that can carry the chain further, allowing us to craft complicated queries in a relatively simple manner.
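The mechanics behind that style of chaining can be sketched in a few lines. This Query class is a made-up toy over a plain list, not how SQLAlchemy is implemented; it only shows the "return something that can carry the chain forward" part.

```python
class Query:
    """Hypothetical toy query over a list; not SQLAlchemy's implementation."""
    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # return a new Query so the chain can keep going
        return Query(self._rows, self._predicates + (predicate,))

    def all(self):
        return [r for r in self._rows if all(p(r) for p in self._predicates)]

    def first(self):
        # first() ends the chain: it returns a row (or None), not a Query
        matches = self.all()
        return matches[0] if matches else None


people = ['Ash', 'Shaun', 'Alan', 'Igor']
q = Query(people).filter(lambda n: n.startswith('A')).filter(lambda n: len(n) > 3)
print(q.first())
```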

I found a more clever example in a thread on /r/learnpython about what features you would implement in Python given the chance. Below the "Everything not nailed down in Haskell" comment, there was one about universal function call syntax from D. /u/AMorpork proposed simply creating a monad where __getattr__ is the sequencing operation (reproduced here):

In [14]:

from itertools import islice
import builtins as __builtin__

def take(n, it):
    return islice(it, n)

class UFCS(object):
    def __init__(self, value):
        self.state = value

    def __getattr__(self, item):
        try:
            func = getattr(__builtin__, item)
        except AttributeError:
            func = globals()[item]
        def curried(*args):
            if not args:
                self.state = func(self.state)
            else:
                args = list(args)
                args.append(self.state)
                self.state = func(*args)
            return self

        return curried

    def get(self):
        return self.state

In [15]:

x = ['#3.462289264065068', 
     '4.283990003510465', 
     '#1.7285949138067824', 
     '#2.6009019446392987', 
     '5.089491698891653', 
     '3.854140130424576', 
     '4.118846086899804', 
     '5.110436429053362', 
     '9.044631493138326', 
     '5.503343391187907', 
     '1.4415742971795897', 
     '2.7162342709197618', 
     '9.438995804377226', 
     '1.8698624486908322', 
     '4.008599242523804', 
     '8.914062382096017', 
     '4.120213633898632', 
     '6.9189185117106975',
     # more were included, but removed here
     ]

UFCS(x).filter(lambda s: s and s[0] != "#").map(float).sorted().take(10).list().print()

[1.4415742971795897, 1.8698624486908322, 2.7162342709197618, 3.854140130424576, 4.008599242523804, 4.118846086899804, 4.120213633898632, 4.283990003510465, 5.089491698891653, 5.110436429053362]

Out[15]:

<__main__.UFCS at 0x7fd4c064ee10>

It's simple, a monad is like a...

Hopefully this goes a long way to explaining the idea of Monads in terms of programming. Maybe I fell upon the Monad Tutorial Fallacy. However, in the event that I've hopelessly confused someone more, drop me a line and I'll be happy to go into further detail.

alec got a blog
