A collection of computer, gaming and general nerdy things.

Thursday, May 14, 2015

Moving to GitHub Pages

I'm in the process of migrating this blog to GitHub Pages. I find that using Blogger as my medium complicates my workflow.

With GitHub Pages, I can use Pelican to build my blog and push the output to a repo on GitHub, which then shows up on the page.

However, with Blogger, I need to convert the output to HTML, load it into my clipboard, open Blogger and create a new post. Even a nice bash function only takes me so far, as Google's API requires authentication with OAuth. And their Python wrapper doesn't support Python 3.

If I were building my blog through a web interface, that would be one thing. But I'm not. I like being able to issue one or two short commands in bash to do everything. So, Blogger, I bid you adieu. I hope you do well. But I'm ending this relationship because of you and your stubbornness.

Sunday, May 10, 2015

Quotly: Building a simple JSON API with Flask, Marshmallow and SQLAlchemy

My love of Flask is well known. It's a great microframework that puts you in control of the pieces needed in your webapp. It comes with templating, requests, responses, cookies, sessions, routing and... really, that's it. Routes can be built by decorating functions or by inheriting from flask.views.View or flask.views.MethodView depending on your needs.

Something I've explored in the past is building a simple RESTful service using Flask and SQLAlchemy. However, that blog post fell into the trap of "Look at all this code!" that happens to me regrettably often. D: Still, I'd like to revisit the idea and explore the topic in a little more depth.

I will forewarn you that this post is still pretty code heavy. If that's the sort of thing that makes your eyes glaze over, you may want to go to /r/pitbulls instead.

Quotly: Cheeky Movie Quotes for Web 2.0

Okay, lame. But it's not another TODO list, though not much better. But by the end, hopefully we'll feel comfortable with Flask, Flask-SQLAlchemy, Marshmallow (12/10 serializer library) and a little bit of Flask-Restful (though, it's going to be used as a base that we're going to modify).

This post was written with Python 3.4, but it should work with at least 2.7 and 3.3. I'm not sure about others and I'm too lazy to add the deadsnakes PPA right now. D:

You can follow along with vim or PyCharm or whatever. The full application is at the bottom of the post as well if you'd rather peek now.

If you are following along, I recommend making two files: app.py and run.py. app.py is where we'll be building Quotly. run.py will simply run the debug server on the side; just set that one up to look like this:

In [ ]:
from app import app

if __name__ == '__main__':
    # the reloader flag is spelled use_reloader; debug=True already turns it on
    app.run(debug=True, use_reloader=True)

Step Zero: Needed packages

Instead of installing packages along every step, let's just get all that mess out of the way now...

pip install --user -U marshmallow --pre
pip install --user -U flask-sqlalchemy flask-restful

That'll install everything you need to follow along here. A note about that Marshmallow install: This installs a pre-release version of Marshmallow, which we'll need to take advantage of some cool stuff that's coming in Marshmallow. If you're also wanting to use Flask-Marshmallow (perhaps to avoid writing your own hyperlink serializer), install it afterwards to avoid getting a potentially older version.

Step One: Considering Our Data

First, let's consider what our data looks like. We have quotes. People say quotes. And that's really about it. We could use a list of dictionaries for this, but since we'll eventually involve SQLAlchemy, which returns objects, let's use namedtuple instead.

In [3]:
from collections import namedtuple

# basic quote object
# can also use to represent many with [Quote(person..., quote=...)[,...]]
Quote = namedtuple('Quote', ['person', 'quote'])

# Let's prepopulate an example quote list
quotes = [Quote(p, q) for p, q in [
        ("Herbert West", "I was busy pushing bodies around as you well know "
         "and what would a note say, Dan? 'Cat dead, details later'?"),
        ("Jack Burton", "You remember what ol' Jack Burton always says at a time like that: "
         "'Have ya paid your dues, Jack?' 'Yessir, the check is in the mail.'"),
        ("Igor", "Well, why isn't it Froaderick Fronkensteen?")
]]

I wouldn't blame you if you took a break to track down one of these movies and turned it on in the background.

In [4]:
from IPython.display import Image
Image(url="http://i.imgur.com/9VeIbMZ.gif")
Out[4]:

Step Two: Serializing to JSON

Assuming you've not been distracted, let's see about taking these quote objects and turning them into JSON.

In [5]:
import json

print(json.dumps(quotes[0], indent=2))
[
  "Herbert West",
  "I was busy pushing bodies around as you well know and what would a note say, Dan? 'Cat dead, details later'?"
]

...um...er...That's not what we really wanted. Since JSON has no notion of tuples, let alone namedtuples, Python helpfully transforms them into JSON's nearest relation: lists. However, we'd probably find it nicer to have key-value pairs pop out the other side. Of course, we could just use a dictionary, or write a namedtuple_to_dict function that'll do it ourselves:

In [6]:
# _asdict() is the documented conversion; vars() fails on namedtuples in newer Pythons
namedtuple_to_dict = Quote._asdict
print(json.dumps(namedtuple_to_dict(quotes[0]), indent=2))
{
  "person": "Herbert West",
  "quote": "I was busy pushing bodies around as you well know and what would a note say, Dan? 'Cat dead, details later'?"
}

But that's no fun and will only work one level deep. What happens when we need to serialize objects that have other objects living inside them? That won't work. I've seen lots of ways to handle this; most of them are just variations on a __json__ method on every object and subclassing json.JSONEncoder to invoke that method when it encounters something it can't serialize. Plus, it still wouldn't work for namedtuple: since a namedtuple can already be serialized as a list, the encoder never asks the subclass for help.
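To make that pattern concrete, here's a minimal sketch of the __json__ hook approach using only the standard library. The JSONHookEncoder, Person and HookedQuote names are made up for illustration, and this is just one of the many variations floating around, not a canonical recipe.

```python
import json
from collections import namedtuple


class JSONHookEncoder(json.JSONEncoder):
    """Hypothetical encoder: falls back to obj.__json__() when json gets stuck."""

    def default(self, obj):
        # default() is only called for objects json cannot serialize itself
        hook = getattr(obj, '__json__', None)
        if callable(hook):
            return hook()
        return super().default(obj)


class Person:
    """Plain class that knows how to turn itself into a dictionary."""

    def __init__(self, name):
        self.name = name

    def __json__(self):
        return {'name': self.name}


class HookedQuote(namedtuple('HookedQuote', ['person', 'quote'])):
    """namedtuple with the same hook bolted on."""

    def __json__(self):
        return self._asdict()


plain = json.dumps(Person('Igor'), cls=JSONHookEncoder)
hooked = json.dumps(
    HookedQuote('Igor', "Well, why isn't it Froaderick Fronkensteen?"),
    cls=JSONHookEncoder,
)
print(plain)
print(hooked)
```

The plain class comes out as a JSON object, but the namedtuple still comes out as a list: it's a tuple, so json serializes it directly and default() is never consulted.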

In [7]:
Image(url="http://i.imgur.com/mWU6lP6.gif")
Out[7]:

Rather than hacking some function or a mixin together and making the object responsible for knowing how to transform itself into a dictionary, why not use a robust, well tested object serializer library? No, not pickle -- pickles are unsafe and too vinegary for me. My sweet tooth is craving Marshmallows.

In [8]:
from marshmallow import Schema, pprint
from marshmallow.fields import String

class QuoteSchema(Schema):
    person = String()
    quote = String()
    
pprint(QuoteSchema().dump(quotes[0]).data)
{'person': 'Herbert West',
 'quote': 'I was busy pushing bodies around as you well know and what would '
          "a note say, Dan? 'Cat dead, details later'?"}

...wait, is it really that easy? Five lines, including the imports? It seems like it shouldn't be, but it is. Actually it can even be easier:

In [9]:
class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')

pprint(QuoteSchema().dump(quotes[1]).data)
{'person': 'Jack Burton',
 'quote': "You remember what ol' Jack Burton always says at a time like "
          "that: 'Have ya paid your dues, Jack?' 'Yessir, the check is in "
          "the mail.'"}

Marshmallow is smart enough to know how to serialize built-in types without us saying, "This is a string." Which is fantastic. We can take that schema and json.dumps and produce what we actually wanted:

In [10]:
print(json.dumps(QuoteSchema().dump(quotes[2]).data, indent=2))
{
  "quote": "Well, why isn't it Froaderick Fronkensteen?",
  "person": "Igor"
}

And unlike many other solutions, Marshmallow will also allow us to serialize a collection of objects as well:

In [11]:
pprint(QuoteSchema(many=True).dump(quotes).data)
[{'person': 'Herbert West',
  'quote': 'I was busy pushing bodies around as you well know and what '
           "would a note say, Dan? 'Cat dead, details later'?"},
 {'person': 'Jack Burton',
  'quote': "You remember what ol' Jack Burton always says at a time like "
           "that: 'Have ya paid your dues, Jack?' 'Yessir, the check is in "
           "the mail.'"},
 {'person': 'Igor', 'quote': "Well, why isn't it Froaderick Fronkensteen?"}]

While this is valid JSON (the root can be either an object or an array), Flask's jsonify will only allow objects at the root level, as a guard against JSON array hijacking in older browsers. However, asking a schema to create a dictionary when it serializes a collection isn't hard to do at all:

In [12]:
from marshmallow import post_dump

class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')
    
    @post_dump(raw=True)
    def wrap_if_many(self, data, many=False):
        if many:
            return {'quotes': data}
        return data
    
pprint(QuoteSchema(many=True).dump(quotes).data)
{'quotes': [{'person': 'Herbert West',
             'quote': 'I was busy pushing bodies around as you well know '
                      "and what would a note say, Dan? 'Cat dead, details "
                      "later'?"},
            {'person': 'Jack Burton',
             'quote': "You remember what ol' Jack Burton always says at a "
                      "time like that: 'Have ya paid your dues, Jack?' "
                      "'Yessir, the check is in the mail.'"},
            {'person': 'Igor',
             'quote': "Well, why isn't it Froaderick Fronkensteen?"}]}
In [13]:
Image(url="http://i.imgur.com/BUtt2Jd.gif")
Out[13]:

Step Three: Briefly Flask

Now that the Quote objects can be correctly serialized to JSON, feeding it from Flask is easy peasy.

In [ ]:
from flask import Flask, jsonify

app = Flask('notebooks')
# reuse the same QuoteSchema instance rather than creating a new one with each request
QuoteSerializer = QuoteSchema()

# the URL variable name must match the view function's argument name
@app.route('/quote/<int:idx>')
def single_quote(idx):
    if not 0 <= idx < len(quotes):
        # flask allows returning a tuple of response, status code, headers (dict)
        # status code is 200 by default
        return jsonify(error='quote out of range'), 400
    # jsonify turns the serialized dictionary into a JSON response
    return jsonify(QuoteSerializer.dump(quotes[idx]).data)

Step Four: Deserialization

However, only getting a quote is pretty simple stuff. What if we wanted to create new Quote objects from JSON? This is pretty easy to do by hand with Flask's request object (note: the request.get_json method is currently the recommended method for plucking JSON out of the request rather than using the request.json attribute):

In [ ]:
from flask import request

@app.route('/quote/', methods=['POST'])
def make_new_quote():
    # with silent=True, get_json returns the parsed JSON or None on failure
    payload = request.get_json(silent=True)
    # check for both keys, since both are used to build the Quote
    if payload and 'person' in payload and 'quote' in payload:
        quotes.append(Quote(person=payload['person'], quote=payload['quote']))
        return jsonify(success=True, msg='Added quote.')
    return jsonify(success=False, msg='must specify person and quote in JSON request'), 400

However, suppose we're deserializing complex objects, say a tracklist that has an attribute that holds track objects which reference artist objects. Pretty soon, manually deserializing an object becomes quite...messy. There is a better way, though. Marshmallow not only serializes objects, but will also handle deserialization if we give it a little bit of help:

In [14]:
class QuoteSchema(Schema):
    class Meta:
        additional = ('person', 'quote')
    
    @post_dump(raw=True)
    def wrap_if_many(self, data, many=False):
        if many:
            return {'quotes': data}
        return data
    
    def make_object(self, data):
        assert 'person' in data and 'quote' in data, "Must specify person and quote in request"
        return Quote(person=data['person'], quote=data['quote'])
    
QuoteSchema().load({"person": "Ash", "quote": "Good. Bad. I'm the guy with the gun."}).data
Out[14]:
Quote(person='Ash', quote="Good. Bad. I'm the guy with the gun.")

Just the opposite of what we had before. Dictionary in, Object out. We can also deserialize a collection as well:

In [15]:
QuoteSchema(many=True).load([
        {'person':'Ash', 'quote':"Good. Bad. I'm the guy with the gun."}, 
        {'person': 'Shaun', 'quote': "You've got red on you."}
]).data
         
Out[15]:
[Quote(person='Ash', quote="Good. Bad. I'm the guy with the gun."),
 Quote(person='Shaun', quote="You've got red on you.")]
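For comparison, here's a rough sketch of what deserializing that tracklist example by hand might look like. The Artist, Track and Tracklist shapes, the helper names and the sample payload are all invented for illustration; nothing here is part of Quotly.

```python
from collections import namedtuple

# Hypothetical shapes for the tracklist example; not part of Quotly.
Artist = namedtuple('Artist', ['name'])
Track = namedtuple('Track', ['title', 'artist'])
Tracklist = namedtuple('Tracklist', ['name', 'tracks'])


def artist_from_dict(data):
    if 'name' not in data:
        raise ValueError('artist needs a name')
    return Artist(name=data['name'])


def track_from_dict(data):
    if 'title' not in data or 'artist' not in data:
        raise ValueError('track needs a title and an artist')
    # every nested object needs its own helper and its own checks
    return Track(title=data['title'], artist=artist_from_dict(data['artist']))


def tracklist_from_dict(data):
    if 'name' not in data:
        raise ValueError('tracklist needs a name')
    tracks = [track_from_dict(t) for t in data.get('tracks', [])]
    return Tracklist(name=data['name'], tracks=tracks)


# made-up sample payload
payload = {
    'name': 'Example Mix',
    'tracks': [
        {'title': 'Track One', 'artist': {'name': 'Some Artist'}},
    ],
}

tracklist = tracklist_from_dict(payload)
print(tracklist)
```

Three tiny objects already need three hand-written functions with their own validation, and each new level of nesting adds another.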

Hopefully the advantage of using Marshmallow for even sending and receiving simple JSON objects is apparent. With 11 lines we can take an object and cast it to a dictionary and we can take a dictionary with certain keys and build an object with it. Sure, we're just serializing and deserializing a namedtuple..."But that's how it always begins. Very small."

In [16]:
Image(url="http://i.imgur.com/zvv3ymL.gif")
Out[16]:

Step Five: Routing with Flask-Restful

Flask-Restful is a great library that builds on top of Flask's MethodView and makes it pretty easy to support multiple API styles (XML, CSV, JSON, etc). It ships with JSON serialization by default, leaving the others up to the user to implement. There's a bunch of other features as well, but I'm only going to touch on the incredibly useful routing mechanism here.

All we need to do to hook into this is to inherit from flask_restful.Resource and return dictionaries from our methods. Dictionaries like the ones produced by Marshmallow. Changing the routing from vanilla Flask routing to class based is a little weird at first, but it quickly becomes very intuitive.
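If class-based routing feels odd, the core idea is just "look up a method named after the HTTP verb and call it." Here's a toy sketch of that dispatch in plain Python. SketchResource and Ping are made-up names, and this is emphatically not Flask-Restful's actual implementation, only the general shape of the idea.

```python
class SketchResource:
    """Toy stand-in for a class-based view; not Flask-Restful's real code."""

    def dispatch(self, method, *args, **kwargs):
        # find a handler named after the HTTP verb: GET -> self.get, etc.
        handler = getattr(self, method.lower(), None)
        if handler is None:
            return {'error': 'method not allowed'}, 405
        return handler(*args, **kwargs)


class Ping(SketchResource):
    def get(self):
        # handlers just return dictionaries, like our resources will
        return {'ping': 'pong'}


ok = Ping().dispatch('GET')
missing = Ping().dispatch('DELETE')
print(ok)
print(missing)
```

Supporting another verb on an endpoint is then just a matter of adding another method to the class.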

And, since the methods for deserialization are in place, let's also handle accepting JSON and appending quotes to our little list.

In [17]:
from flask_restful import Resource
from flask import request

class SingleQuote(Resource):
    def get(self, idx):
        # compare explicitly: a bare `if idx` would reject index 0
        if 0 <= idx < len(quotes):
            # flask-restful also allows data, status code, header tuples
            return QuoteSerializer.dump(quotes[idx]).data
        return {'error': 'quote index out of range'}, 400

class ListQuote(Resource):
    def get(self):
        return QuoteSerializer.dump(quotes, many=True).data
    
    def post(self):
        # silent=True makes get_json return None instead of aborting on bad JSON
        payload = request.get_json(silent=True)
        if not payload:
            return {"success": False, "msg": "malformed request"}, 400
            
        if 'quote' not in payload:
            return {"success": False, "msg": "must specify quote in request"}, 400
        
        # remember QuoteSchema.make_object can raise an AssertionError
        try:
            q = QuoteSerializer.load(payload).data
        except AssertionError as e:
            return {'success': False, 'msg': str(e)}, 400
        else:
            quotes.append(q)
            return {"success": True, "msg": "Quote added."}

And then we simply register these resources on an API object that's hooked up to our application:

In [ ]:
from flask_restful import Api

api = Api(app)
# the URL variable name must match the argument name of SingleQuote.get
api.add_resource(SingleQuote, '/quote/<int:idx>')
api.add_resource(ListQuote, '/quotes')

Step Six: Persistence with SQLA

This is great and all, but these quotes will only last as long as the session runs. If we need to restart, we lose it all except for the preloaded quotes. To achieve real persistence, we should shack up with a database. SQLite is a good choice for this, plus bindings come native with Python.

In [ ]:
from flask_sqlalchemy import SQLAlchemy

app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///quotes.db'

db = SQLAlchemy(app)

class Quote(db.Model):
    
    id = db.Column(db.Integer, primary_key=True)
    person = db.Column(db.Unicode(50))
    quote = db.Column(db.UnicodeText)
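For a rough feel of what that model maps to underneath, here's a sketch using the sqlite3 module from the standard library. The table name and column types are assumptions about roughly what SQLAlchemy would generate, and the in-memory database is only there to keep the example self-contained; none of this is part of Quotly itself.

```python
import sqlite3

# in-memory database so the sketch leaves nothing behind on disk
conn = sqlite3.connect(':memory:')

# assumed layout: one table with an integer key and two text columns
conn.execute(
    'CREATE TABLE "quote" ('
    '"id" INTEGER PRIMARY KEY, '
    '"person" VARCHAR(50), '
    '"quote" TEXT)'
)

# parameter placeholders keep quote characters in the text from breaking the SQL
conn.execute(
    'INSERT INTO "quote" ("person", "quote") VALUES (?, ?)',
    ('Igor', "Well, why isn't it Froaderick Fronkensteen?"),
)
conn.commit()

rows = conn.execute('SELECT "id", "person", "quote" FROM "quote"').fetchall()
print(rows)
conn.close()
```

SQLAlchemy writes this sort of SQL for us and hands back Quote objects instead of bare tuples.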

Our schema doesn't change at all. Marshmallow doesn't know or care if we're passing a namedtuple or a SQLA model, just that it has the correct attributes. This is great because we can write many quick tests with something like namedtuple to verify our schema behaves correctly and then just a few integration tests with the models.

In [18]:
Image(url="http://i.imgur.com/mudwVxd.gif")
Out[18]:

However, our resource endpoints do need to change some, since we're dealing with SQLA models now and not just simple lists. The changes are trivial:

In [19]:
class SingleQuote(Resource):
    def get(self, idx):
        # query.get returns None when no row has this primary key
        q = Quote.query.get(idx)
        if q is None:
            return {'error': 'quote does not exist'}, 404
        # serialize the model; it can't be turned into JSON as-is
        return QuoteSerializer.dump(q).data

class ListQuote(Resource):
    def get(self):
        # pull the quotes from the database instead of the old list
        return QuoteSerializer.dump(Quote.query.all(), many=True).data
    
    def post(self):
        payload = request.get_json(silent=True)
        if not payload: # with silent=True, get_json returns the parsed JSON or None
            return {"success": False, "msg": "malformed request"}, 400
            
        if 'quote' not in payload:
            return {"success": False, "msg": "must specify quote in request"}, 400
        
        try:
            q = QuoteSerializer.load(payload).data
        except AssertionError as e:
            return {'success': False, 'msg': str(e)}, 400
        else:
            db.session.add(q)
            db.session.commit()
            return {"success": True, "msg": "Quote added."}

Just a few small changes to go from a list to SQLA models. Be sure to run db.create_all() somewhere beforehand and load up the initial quotes, and Quotly is up and running, ready to send and receive cheeky movie quotes for everyone.

Parting Thoughts

While this was more of a "hit the ground running" guide to building a simple REST API with Flask and its little ecosystem, I hope it's been enlightening. I've included the whole application in this gist for reference. If you see a bug or have questions, hit me up on twitter (@just_anr) or on github (justanr).
