CakePHP hasAndBelongsToMany relationship queries demystified (HABTM, Contained, Pagination and more)
CakePHP is a pretty powerful development php framework. If you are like me, you understand that Ruby on Rails is way cooler these days, but do not feel ashamed to admit you’re coding in PHP because sometimes it just gets the job done faster. And without being less elegant if you’re using the right tools.
I’ve used CakePHP for Google+ Counter‘s front end tier and used it for numerous other professional web projects and frequently share my findings via Twitter (@ralf).
However, my relationship with the framework is a love-hate one. The main reason: CakePHP’s documentation is far from good. At a first glance, it appears to be fairly complete and comprehensive. There are tutorials, an API documentation and even complete online books. Unfortunately, once you start coding real-world projects, you’ll find that many parts of the docs to not be unambiguous and leaving many questions open.
Worst thing: The above is mostly true for stuff related to CakePHP’s powerful object-relational mapper (ORM).
As efficiently querying the model is an extremely important part of any web application, it’s a shame that CakePHP leaves developers guessing and in trial-and-error mode most of the time. Hop over to Stack Overflow and see for yourself. Even on the #CakePHP IRC channel, users are asking the same query related questions over and over again – a clear sign for a lack of documentation.
Recently I ran into what I expected to be a very basic and everyday scenario that ultimately took me days to figure out. Once I got it right, I felt so happy, that I wanted to give something back to the community. Hence this article.
All of the below is valid for CakePHP 2.x.
I had to model a pretty common scenario: Users belonging to groups. Obviously, any user can belong to many groups and any group can have more than a single user. In CakePHP the relationship the User and the Group model share is called “has and belongs to many”, abbreviated as HABTM.
I started with fairly straightforward model classes (omitting the validation code for brevity):
Note that I’ve set the models up having the HABTM relationship with each others. Behind the scenes CakePHP will leverage a join table which – by convention – is named GroupsUsers. Three columns (id, group_id and user_id) will be populated to link m users to n groups.
How to save a single user belonging to multiple groups and have CakePHP add the correct records to the join table?
Sounds pretty basic, I know. And in fact, it’s not all that complicated. The problem: CakePHP’s documentation! For example, the chapter about saving data for the saveAssociated() method states: “Method used to save multiple model associations at once.”
Immediate questions that would need clarification:
- What exactly does saving model associations mean? Does it mean that the related models are saved to the database, too? If so, what else does saveAll() do in addition?
- In case neither of the data already exists in the database, does save also mean initially create?
- Does this only work for hasOne, hasMany and belongsTo relationships?
- Does it work for hasAndBelongsToMany relationships, too?
The docs leave us guessing.
My finding: Whatever I tried to do, neither could I get saveAssociated() nor saveAll() to create and save a new user along with a couple of groups to my database. If there is a way, at least nobody in the Cake community seems to be aware of it .
When reading through the docs, you get the impression that your code should look somewhat like this:
If prior to creating the user you create a group and put a reference to it as in line 13 of the following code snippet, CakePHP correctly inserts the appropriate record into the join table. So this works:
In line 1 we fetch a group object for an existing group from the database and in line 14 we associate it with the to-be-created user. Saving the user does also insert a new record into the join table.
You might be tempted to just hand over an array of groups as in the following snippet (fetching it via find(‘all’) in line 1 and handing it to the new user object in line 14):
Turns out, CakePHP does not insert anything into the join table.
The complete association gets lost. To be clear, if you change line 14 from ‘Group’=>$groups to ‘Group’=>$groups a record is inserted into the join table. If you hand over more than a single group, nothing gets inserted. Now, that’s what I call non-intuitive!
The correct solution is to hand over an array of just the group ids:
In lines 3-6 we create an array of group ids which we hand to the to-be-created user in line 20. CakePHP correctly inserts all the right records into the join table. Why the hell this is not clearly stated nor we find any example of such a frequent use case anywhere in the documentation remains the secret of the authors.
Finding all users belonging to a specific group
Assume we’ve used the above to create a couple of groups and hundreds of users with each user belonging to more than one of these groups. Our join tables contains all records linking user IDs to group IDs. Next we want to get all users belonging to a specific group. Pretty common scenario, one would think. Again, not so in CakePHP.
Without bothering you with my trial-and-error learning curve, here is the second piece of totally non-intuitive code:
In order to find all Users belonging to a specific Group you actually do have to perform your find on the Group model and not on the User model. I found absolutely no way doing it the other way around. No matter how complex find parameters I handed over to the User model, I never got the result I was looking for. The response array for the above query looks something like this:
Finding all users belonging to a specific group filtered on a property of the User model
While in written, the headline might sound crazy, there are actually very common use cases demanding for this sort of query:
- Show me all users belonging to the admin group that have a status of “locked”.
- Show me all posts with the tag “cakephp” authored by @ralf.
Again, I spare you my days of trial and error. Here is the correct find query:
Before this will work though, there is one more thing you have to do and again, I don’t understand why the documentation does not at all emphasize this in the utmost possible way: You have to make your Group model containable!
It’s as simple as putting public $actsAs = array(Containable’); into your model’s class declaration. If you don’t do this, CakePHP will not give you any error, but it will simply ignore the conditions you set for the related model. This is how my Group model ends up looking like:
Another one of my favorite non-intuitive CakePHP aspects: In the above example you have to make the Group model containable, while it actually is the User model that is contained. Note, that the User model does not have to act as containable but again, once you detach the containable behavior from the Group model, you will no longer be able to constrain on properties of the User model in this HABTM scenario.
If your app requires you to support pagination, you might feel tempted to extend the above query like so:
This will yield a Model “User” is not associated with model “3” error. The limit statement alone works, it’s the page that causes the error. The reason for this: page is syntactic sugar offered by CakePHP without a one-to-one representation in SQL. The framework doesn’t seem to recognize it when attached to contained models.
You can make use of paging by using SQL’s native syntax which is:
limit <start index>, <number of records>
'limit' => 5, 'page' => 5
limit' => $page*$limit.', '.$limit
which would evaluate to
'limit' => '25, 5'
That way, you can page through the records of any contained model. Just one of those little CakePHP inconsistencies you should be aware of.
Challenge IV: Adding the user to an additional group
You might guess it by now: No, it’s not straight forward to have CakePHP insert an additional record to the join table.
While I don’t provide a code snippet for this, all you need to know is that CakePHP always deletes all existing records in the join table for a given model when you make a change. Though this is absolutely counter intuitive, you did understand correctly: Prior to inserting any new record for an existing user into the join table, CakePHP deletes all existing entries for this user.
In fact, this has been such a long standing and well known issue in the CakePHP community, that people started creating behaviors to work around this. Unfortunately, I could get none of these to work.
To solve Challenge IV you would have to do the following:
- Find all groups the user belongs to.
- Move just their IDs to a new array. (Remember Challenge I?)
- Add the ID for the additional group the the array.
- Save the user setting the Group field to the array of IDs.
If you do it in any other way, CakePHP will remove existing join table records without asking.
I’m not a CakePHP professional.
That means, all of the above are my findings from days of trying and starting to hate the incomplete documentation. I don’t know, whether what I’ve outlined is the only way of doing what I wanted or if I even describe the best approach. I just did not get any better response from the community, nor could anybody point me to the “Cake way” of addressing these common requirements. All I’m sure of is, it works.
And while folks in the #CakePHP IRC channel more often than not accused me to RTFM, I couldn’t find any of the above clearly explained anywhere else.
Should you have any comments, please do make use of the comments feature. And don’t forget to follow me on Twitter. I’m @ralf there and always open for a constructive conversation.