On Rap Genius we have songs that are "published" and those that are "works in progress". Recently we decided to streamline our editorial process by creating a third category ("under review") for songs that are basically complete and just need to be edited. Here's how I did it.
Before the change
How were we distinguishing between published and non-published songs before adding the concept of "under review"? One strategy here is to add a field in your songs database table for is_published. This field would store a 1 if the song was published, and a 0 if it was unpublished.
This isn't bad, but it's more efficient to use a field called published_at, which stores either the time of publication if the song is published, or NULL if the song is unpublished. This strategy allows you to store the time of publication without using an extra field, and is what I was doing.
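As a rough illustration of the published_at idea, here is a minimal plain-Ruby sketch. The Song class and its publish! and published? methods are hypothetical stand-ins for a database-backed model, not the actual Rap Genius code.

```ruby
# Hypothetical stand-in for a row in the songs table.
class Song
  attr_reader :published_at

  def initialize
    @published_at = nil # NULL in the database: the song is unpublished
  end

  # Publishing records the time, which doubles as the "is published" flag.
  def publish!(now = Time.now)
    @published_at = now
  end

  # The boolean is derived from the timestamp instead of being stored separately.
  def published?
    !@published_at.nil?
  end
end

song = Song.new
song.published? # => false
song.publish!
song.published? # => true
```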
A false start
We could use the same approach for adding the concept of "under_review". I.e., we could add a field for nominated_for_publishing_at (I prefer "nominate for publishing" to "marked under review") that would store the time something was marked under review, and NULL if the song was not under review.
There are a few problems with this approach:
- It requires us to create a lot of fields.
  - Both now. Actually I lied -- in addition to published_at, we also had published_by_id (which stored the user ID of the user who published the song). If we used this approach for adding "under review", we would also need nominated_for_publishing_by_id.
  - And later. Adding nominated_for_publishing_by_id isn't such a big deal, but suppose down the line we want to be able to mark some songs hidden (e.g., those with very few explanations). If we went with this approach, we would then have to add 2 more fields (hidden_at and hidden_by_id). This strategy doesn't scale.
- It doesn't capture the relationship between published and under review. Obviously a song cannot be both published and under review. This wasn't a problem before because a single published_at field is either NULL or a timestamp, so no song could end up in an invalid state. Not so any longer -- we have to write our own application-level code to make sure that no song is both published and under review.
It gets worse: once we've written the code that ensures that no song can be in more than one "state" simultaneously, we might need still more code to ensure that a song's state transitions are "valid". E.g., we might want to prohibit songs from transitioning from published to under review, or maybe we want to require that songs pass through under review before getting published.
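To make the problem concrete, here is a small plain-Ruby sketch of the one-timestamp-per-status approach. The class name SongWithStatusTimestamps and the valid_status? check are hypothetical, introduced only for illustration.

```ruby
# Hypothetical model with one timestamp per status.
class SongWithStatusTimestamps
  attr_accessor :published_at, :nominated_for_publishing_at

  # Nothing in the schema stops both timestamps from being set at once,
  # so the application has to check for the contradiction itself.
  def valid_status?
    published_at.nil? || nominated_for_publishing_at.nil?
  end
end

song = SongWithStatusTimestamps.new
song.nominated_for_publishing_at = Time.now
song.published_at = Time.now # forgot to clear the nomination
song.valid_status? # => false
```

Every new status would add another pair of fields and another clause to a check like this one, and the check still says nothing about which transitions are allowed.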
The right approach here is to implement a state machine. Instead of dealing with several disconnected X_at fields, a state machine allows you to deal in concepts like:
- State. A song can be in one state at a time
- Events. A song has events (e.g., "publish") that transition from certain states to other states (e.g., under review to published). If you try to transition a song to an invalid state (e.g., straight to published without passing through under review), you get an error.
A state machine also needs only three database fields, however many states we add later:
- state
- state_updated_at
- state_updated_by_id
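The concepts above can be sketched in plain Ruby without any plugin. This is not AASM and not the Rap Genius implementation: the StatefulSong class, the InvalidTransition error, the fire method, and the exact transition table are all assumptions made for illustration.

```ruby
# Hypothetical plain-Ruby state machine.
class InvalidTransition < StandardError; end

class StatefulSong
  # event name => { from: allowed starting states, to: resulting state }
  EVENTS = {
    nominate_for_publishing: { from: [:work_in_progress], to: :under_review },
    publish:                 { from: [:under_review],     to: :published }
  }.freeze

  # These three attributes mirror the three database fields.
  attr_reader :state, :state_updated_at, :state_updated_by_id

  def initialize
    @state = :work_in_progress
  end

  # Fire an event on behalf of a user, or raise if the song's current
  # state is not a valid starting point for that event.
  def fire(event, user_id, now = Time.now)
    rule = EVENTS.fetch(event)
    unless rule[:from].include?(@state)
      raise InvalidTransition, "cannot #{event} from #{@state}"
    end
    @state = rule[:to]
    @state_updated_at = now
    @state_updated_by_id = user_id
  end
end

song = StatefulSong.new
song.fire(:nominate_for_publishing, 7)
song.fire(:publish, 7)
song.state # => :published
```

Adding a "hidden" state in this design would mean one more entry in the transition table rather than two more columns.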
My implementation
AASM (which kind of stands for "acts as state machine" -- for a while it was voguish to name your Rails plugin "acts as whatever") is the most popular Rails state machine plugin (state_machine is actually better, but I couldn't get it to work). Here's my AASM state machine:

What's going on here?
- 52: I tell AASM the name of the state column for my songs table (I believe the default here is aasm_state, which isn't ideal because it ties the state concept of your song model to your current state machine implementation -- I'd like to be able to swap out AASM for a different plugin without modifying my database schema)
- 53: All songs start with state = "work_in_progress"
- 55-57: Songs can also be "under_review", and "published". Note the information duplication here -- AASM should be able to infer that my default state ("work_in_progress") is valid without me having to list it again here. The same is true of the states listed in the transitions below; AASM should automatically recognize any state referenced by a transition as valid
- 59-70: Song state transitions. Every transition is automatically exposed as a method I can call on any song object to change its state. For example, when I want to publish some_song, I'll call some_song.publish. Transitions also allow you to specify:
  - Valid to and from states. For example, it doesn't make sense to "nominate_for_publishing" an already-published song. By listing "work_in_progress" as the only valid from state for "nominate_for_publishing", I tell AASM to raise an error if I try to nominate a song for publishing that's in any state other than "work_in_progress"
  - Requirements (guards) for the state transition to take place. For example, the "publish" event will fail for any song that doesn't have a description, embed link, rating, and category
    - The implementation of guards is another design error in the AASM plugin. Specifically, the conditions for moving a song into a given state are likely to be identical to the conditions for being able to save a song that's already in that state. E.g., if I can't publish a song that lacks a description, I probably can't take an already-published song and remove its description. Since AASM doesn't take this into account, I have to duplicate the guard logic as a validation to account for the latter case.
  - A callback function to run on successful transition. I haven't implemented any yet, but it's easy to imagine an announce_new_song_to_mailing_list method that executes whenever a song is published
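The relationship between guards, validations, and callbacks can be sketched in plain Ruby. This is an illustration under assumptions, not AASM's API: GuardedSong, publishable?, and valid? are hypothetical names, the callback only records that it ran, and a failed publish returns false here instead of raising.

```ruby
# Hypothetical sketch of one rule shared by a guard and a validation.
class GuardedSong
  REQUIRED_WHEN_PUBLISHED = [:description, :embed_link, :rating, :category].freeze

  attr_accessor :description, :embed_link, :rating, :category
  attr_reader :state, :announcements

  def initialize
    @state = :under_review
    @announcements = []
  end

  # One predicate holds the rule that a published song needs these fields.
  def publishable?
    REQUIRED_WHEN_PUBLISHED.none? { |field| public_send(field).nil? }
  end

  # Guard: the transition only happens when the predicate passes.
  def publish
    return false unless @state == :under_review && publishable?
    @state = :published
    announce_new_song_to_mailing_list # callback on successful transition
    true
  end

  # Validation: an already-published song must keep satisfying the same rule.
  def valid?
    @state != :published || publishable?
  end

  private

  # Stand-in for the callback; it only records that it ran.
  def announce_new_song_to_mailing_list
    @announcements << :new_song
  end
end

song = GuardedSong.new
song.publish # => false, the required fields are still missing
```

Keeping the rule in a single predicate means the guard and the validation cannot drift apart, which is the duplication the plugin leaves to the application.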
