A collection of computer, gaming and general nerdy things.

Wednesday, April 9, 2014

SQLAlchemy, many-to-many-to-many relations, association proxies and I.

I'm currently building an API driven/based web jukebox for shits and giggles. It's based on Flask and Flask-SQLAlchemy (I went back and forth if I should use Flask-SQLAlchemy or vanilla SQLAlchemy) primarily, with a few other things thrown in, but that'll be the focus of another post. This post is about hitting snags while working with SQLAlchemy, bear in mind I am by no means an expert with SQLAlchemy or even Flask or Python.

Most relationships with SQLAlchemy I've found to be extremely straightforward:

  • Many-to-many uses an intermediary table
  • One-to-many places the relationship command on the one side and a foreign key on the many side.
  • Many-to-one places both the relationship and the foreign key on the many side.
  • One-to-one can place the foreign key and the relationship on either as long as uselist is flagged false in relationship.
  • You can flag backref on relationship to make it bidirectional.
What happens when you need a many-to-many-to-many relationship? This is where I hit my snag at. I'm not great at holding complex relationship maps in my head all at once, so I tend to write them out or utilize my underutilized whiteboard.
FlaskAmp model relationships.
As you can see from the poor quality photo above (you'd think it'd be better since my girlfriend is a photographer), most of the relationships are pretty obvious. The only ones that might not be are between Tracklist-Album-Playlist, in that case it's an example of SQLA's table inheritance: Album and Playlist are subclasses of Tracklist. But that's not where I'm having issues at. It's the relationship between Member-Tag-Artist.

What's maddening is that I know how to do this with SQL. Sure, it'd give me a little trouble but I know where to start. With SQLAlchemy...I'm pretty much lost. My girlfriend asked me, "Well, if you know how to do it that other way, why don't you?" Of course, the answer is obvious to a programmer: dropping down to raw SQL causes the ORM to not know anything special about the query. Moreover, it largely defeats the purpose of using an ORM: not having to write SQL (I know it's necessary sometimes to fine tune results).

More importantly, the issue isn't even the relations. It's how I want to the end result to come out.
  • Member(name='justanr').tags['Death Metal'].append('Gorguts'), or even Member('justanr').tags['Death Metal'].append(Artist.get_from(name='Gorguts')) (though, I feel the former is cleaner, even if more magical)
  • Artist(name='Gorguts').tags = {'Death Metal':345, 'Tech Death':299, 'Weird':194}
  • Tag(name='Death Metal').artists = {'Gorguts':345, 'Immolation':311, 'The Faceless':148}
Members represent their tagged artists as a dictionary of lists: keys are tags, list values are artists. Artists and Tags represent their relationships with a dictionary where keys are either tags or artists and values are integers (sounds like collections.Counter, to me...).

Here's my logic for approaching the situation currently:
  • Members can use many tags on many artists, artists can have many tags from many members.
  • Ergo: Members are the actors, tags are the direct object and artists are indirect objects.
  • Split the relationship in two 
  • Since members directly apply tags, those two get their own relationship: MemberTag.
  • Since members affect artists indirectly, artists are paired with membertags.
So, instead of one extra table mapping a many-to-many-to-many like I would have done by hand, I now have two: MemberTag and MemberTaggedArtist. So far, I'm good. Now let's consider proper restraints on the system:
  • Member and Tag both have a one-to-many relationship with MemberTag (each member-tag combination is unique).
  • MemberTag has a one-to-many relationship with MemberTaggedArtist (and each pairing of MemberTag and Artist is unique).
  • Artist has a one-to-many relationship with MemberTaggedArtist.
  • Member, Tag and Artist all have unique constraints on their name attributes.
  • MemberTag has a composite unique restraint on member_id and tag_id.
  • MemberTaggedArtist has a composite unique restraint on membertag_id and artist_id.
  • Tag.artists and Artist.tags are read only (not hard to manage, prevents tampering and integrity errors).
What I need to happen, for this to work how I want, is:
  • Tag to proxy it's name attribute to Member.tags as a key, adding attribute_mapped_collection('tag') to the relationship between Member and MemberTag solves that.
  • Proxy a list of artists from MemberTag to Member and be able to create values in it with just strings on Member. An association proxy, or two, should handle this.
  • Proxy a set of artists to serve as keys on Tag.artists and count the number of times they appear total as values.
  • Proxy a set of tags to serve as keys on Artists.tags and count the number of time they appear total as values.
Since the logic behind Artist.tags and Tag.artists is likely to be exactly the same except for a few words switched around, I'll likely make it a function that both models can use.

Hopefully this'll work since I've just now begun to wrap my head around it.