CakePHP hasAndBelongsToMany relationship queries demystified (HABTM, Contained, Pagination and more)

29 Jul
29/07/2012

Introduction

CakePHP is a pretty powerful development php framework. If you are like me, you understand that Ruby on Rails is way cooler these days, but do not feel ashamed to admit you’re coding in PHP because sometimes it just gets the job done faster. And without being less elegant if you’re using the right tools.

I’ve used CakePHP for Google+ Counter‘s front end tier and used it for numerous other professional web projects and frequently share my findings via Twitter (@ralf).

However, my relationship with the framework is a love-hate one. The main reason: CakePHP’s documentation is far from good. At a first glance, it appears to be fairly complete and comprehensive. There are tutorials, an API documentation and even complete online books. Unfortunately, once you start coding real-world projects, you’ll find that many parts of the docs to not be unambiguous and leaving many questions open.

Worst thing: The above is mostly true for stuff related to CakePHP’s powerful object-relational mapper (ORM).

As efficiently querying the model is an extremely important part of any web application, it’s a shame that CakePHP leaves developers guessing and in trial-and-error mode most of the time. Hop over to Stack Overflow and see for yourself. Even on the #CakePHP IRC channel, users are asking the same query related questions over and over again – a clear sign for a lack of documentation.

Recently I ran into what I expected to be a very basic and everyday scenario that ultimately took me days to figure out. Once I got it right, I felt so happy, that I wanted to give something back to the community. Hence this article.

All of the below is valid for CakePHP 2.x.

The Model

I had to model a pretty common scenario: Users belonging to groups. Obviously, any user can belong to many groups and any group can have more than a single user. In CakePHP the relationship the User and the Group model share is called “has and belongs to many”, abbreviated as HABTM.

I started with fairly straightforward model classes (omitting the validation code for brevity):

Note that I’ve set the models up having the HABTM relationship with each others. Behind the scenes CakePHP will leverage a join table which – by convention – is named GroupsUsers. Three columns (id, group_id and user_id) will be populated to link m users to n groups.

Challenge I:
How to save a single user belonging to multiple groups and have CakePHP add the correct records to the join table?

Sounds pretty basic, I know. And in fact, it’s not all that complicated. The problem: CakePHP’s documentation! For example, the chapter about saving data for the saveAssociated() method states: “Method used to save multiple model associations at once.”

Immediate questions that would need clarification:

  • What exactly does saving model associations mean? Does it mean that the related models are saved to the database, too? If so, what else does saveAll() do in addition?
  • In case neither of the data already exists in the database, does save also mean initially create?
  • Does this only work for hasOnehasMany and belongsTo relationships?
  • Does it work for hasAndBelongsToMany relationships, too?

The docs leave us guessing.

My finding: Whatever I tried to do, neither could I get saveAssociated() nor saveAll() to create and save a new user along with a couple of groups to my database. If there is a way, at least nobody in the Cake community seems to be aware of it [1][2].

When reading through the docs, you get the impression that your code should look somewhat like this:

If prior to creating the user you create a group and put a reference to it as in line 13 of the following code snippet, CakePHP correctly inserts the appropriate record into the join table. So this works:

In line 1 we fetch a group object for an existing group from the database and in line 14 we associate it with the to-be-created user. Saving the user does also insert a new record into the join table.

You might be tempted to just hand over an array of groups as in the following snippet (fetching it via find(‘all’) in line 1 and handing it to the new user object in line 14):

Turns out, CakePHP does not insert anything into the join table.

The complete association gets lost. To be clear, if you change line 14 from ‘Group’=>$groups to ‘Group’=>$groups[1] a record is inserted into the join table. If you hand over more than a single group, nothing gets inserted. Now, that’s what I call non-intuitive!

The correct solution is to hand over an array of just the group ids:

In lines 3-6 we create an array of group ids which we hand to the to-be-created user in line 20. CakePHP correctly inserts all the right records into the join table. Why the hell this is not clearly stated nor we find any example of such a frequent use case anywhere in the documentation remains the secret of the authors.

Challenge II:
Finding all users belonging to a specific group

Assume we’ve used the above to create a couple of groups and hundreds of users with each user belonging to more than one of these groups. Our join tables contains all records linking user IDs to group IDs. Next we want to get all users belonging to a specific group. Pretty common scenario, one would think. Again, not so in CakePHP.

Without bothering you with my trial-and-error learning curve, here is the second piece of totally non-intuitive code:

In order to find all Users belonging to a specific Group you actually do have to perform your find on the Group model and not on the User model. I found absolutely no way doing it the other way around. No matter how complex find parameters I handed over to the User model, I never got the result I was looking for. The response array for the above query looks something like this:

Challenge III:
Finding all users belonging to a specific group filtered on a property of the User model

While in written, the headline might sound crazy, there are actually very common use cases demanding for this sort of query:

  • Show me all users belonging to the admin group that have a status of “locked”.
  • Show me all posts with the tag “cakephp” authored by @ralf.

Again, I spare you my days of trial and error. Here is the correct find query:

Before this will work though, there is one more thing you have to do and again, I don’t understand why the documentation does not at all emphasize this in the utmost possible way: You have to make your Group model containable!

It’s as simple as putting public $actsAs = array(Containable’); into your model’s class declaration. If you don’t do this, CakePHP will not give you any error, but it will simply ignore the conditions you set for the related model. This is how my Group model ends up looking like:

Another one of my favorite non-intuitive CakePHP aspects: In the above example you have to make the Group model containable, while it actually is the User model that is contained. Note, that the User model does not have to act as containable but again, once you detach the containable behavior from the Group model, you will no longer be able to constrain on properties of the User model in this HABTM scenario.

Supporting pagination

If your app requires you to support pagination, you might feel tempted to extend the above query like so:

This will yield a Model “User” is not associated with model “3″ error. The limit statement alone works, it’s the page that causes the error. The reason for this: page is syntactic sugar offered by CakePHP without a one-to-one representation in SQL. The framework doesn’t seem to recognize it when attached to contained models.

You can make use of paging by using SQL’s native syntax which is:

limit <start index>, <number of records>

CakePHP’s

'limit' => 5,
'page' => 5

would become

limit' => $page*$limit.', '.$limit

which would evaluate to

'limit' => '25, 5'

That way, you can page through the records of any contained model. Just one of those little CakePHP inconsistencies you should be aware of.

Challenge IV: Adding the user to an additional group

You might guess it by now: No, it’s not straight forward to have CakePHP insert an additional record to the join table.

While I don’t provide a code snippet for this, all you need to know is that CakePHP always deletes all existing records in the join table for a given model when you make a change. Though this is absolutely counter intuitive, you did understand correctly: Prior to inserting any new record for an existing user into the join table, CakePHP deletes all existing entries for this user.

In fact, this has been such a long standing and well known issue in the CakePHP community, that people started creating behaviors to work around this. Unfortunately, I could get none of these to work.

To solve Challenge IV you would have to do the following:

  • Find all groups the user belongs to.
  • Move just their IDs to a new array. (Remember Challenge I?)
  • Add the ID for the additional group the the array.
  • Save the user setting the Group field to the array of IDs.

If you do it in any other way, CakePHP will remove existing join table records without asking.

Closing remarks

I’m not a CakePHP professional.

That means, all of the above are my findings from days of trying and starting to hate the incomplete documentation. I don’t know, whether what I’ve outlined is the only way of doing what I wanted or if I even describe the best approach. I just did not get any better response from the community, nor could anybody point me to the “Cake way” of addressing these common requirements. All I’m sure of is, it works.

And while folks in the #CakePHP IRC channel more often than not accused me to RTFM, I couldn’t find any of the above clearly explained anywhere else.

Should you have any comments, please do make use of the comments feature. And don’t forget to follow me on Twitter. I’m @ralf there and always open for a constructive conversation.


Tags: ,
9 replies
  1. Mike says:

    Thank you so much for this line of code: “public $actsAs = array(‘Containable’);” (Challenge III). I am myself a cakeHP beginner and I must say that sometimes it can be very hard and time consuming to find the right information (especially when we don’t know what we need to search !).

    Once again, thank you for this ^^

    Reply
  2. Tim says:

    Hey Ralf,

    Thanks for this article. It definitely gave me some insights on HABTM relationships in CakePHP. Easy to follow your code, and easy to see why you would get frusturated :)

    One suggestion if you don’t mind: Have you thought about trying to add some of these insights to the CakePHP docs themselves? (easy enough by forking on github and submitting a pull request) I saw how you said in your closing remarks that the community on the IRC channel wern’t the friendliest, and maybe that would make you not want to help them out, but I found this article very helpful, and only came across it on the 2nd page of Google (I know its only the 2nd page but most don’t look past the first, right?) If you fix the docs themselves, more CakePHP users will benefit, not just the readers of this blog post. Just a thought.

    Thanks again.

    ps: looks like some of the issues raised here might get addressed in the upcoming 3.0 version – hopefully – http://bakery.cakephp.org/articles/lorenzo/2012/07/06/3_0_a_peek_into_cakephps_future

    Reply
  3. Martin says:

    works perfect, thanks for saving a lot of hours.
    Gruesse aus Madrid
    Martin

    Reply
  4. Martin says:

    o could resolve it also this way:

    $this->Post->recursive = 1;

    $this->Post->bindModel(array(
    ‘hasOne’ => array(‘ChannelsPost’),
    ),false);

    $this->paginate = array(
    ‘conditions’ => array(‘ChannelsPost.channel_id’ => $channelId),
    ‘order’ => array(‘Post.created’ => ‘DESC’),
    ‘limit’ => 2,
    //’group’ => ‘Post.id’,
    ‘recursive’ => 1
    );

    $this->set(‘posts’, $this->paginate());

    regards, martin

    Reply
  5. Meg says:

    So glad I’m not the only one who is struggling with CakePHP’s inadequate documentation! Thank you so much for sharing your finds!

    Reply
  6. Edwin says:

    Wow, thank you so much. I’ve been doing so much trail and error for couple of days that the situation was starting to look grim. Until I read your Challenge III that is.

    Reply
  7. Faiz Shukri says:

    Thanks for the info. Doc about HABTM in cakephp site really unhelpful but this one save my day.

    Reply
  8. Kornel says:

    Hello Ralf,

    Thank you so much for ending a two week torment. I have to integrate a system with my app via JSON and the problem to save various id’s for the same user gave me some really bad headaches, cause like you said documentation of cakephp is not quiet consistent.

    all my respects,

    Kornel

    Reply
  9. Oliver says:

    I am considering using Cake for some development. I just browsed through the documentation tonight, an d was totally horrified by its inconsistency and lack of clarity.
    By comparison, it seems that the Django framework, for Python, is so cutting-clear and powerful. With such a straightforward documentation. For instance, it seems easy to query objets via filtered search on their associated objets:
    https://docs.djangoproject.com/en/dev/topics/db/examples/many_to_many/ See paragraph “Many-to-many relationships can be queried using lookups across relationships”.
    I think It is nearly worth it to learn Python just to acces the sheer clarity of Django, instead of waisting my time with that creepy Cake…
    (See also the excellent Apprendre la programmation web avec Python et Django, in French only, http://www.amazon.com/Apprendre-programmation-Python-Edition-ebook/dp/B00A8LS8ZU)

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *

* Copy This Password *

* Type Or Paste Password Here *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

© Copyright 2013 by Ralf Rottmann. rottmann.net is a work in progress by Ralf Rottmann. This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
If you would like to make use of any of the content you see here, please contact the author.