Vhost caching issue

While I was deploying the new website, I ran into some issues where CakePHP was blowing up on missing model database tables. The weirdness was that these models were not part of my application, but were part of another application running on the same nginx box. I immediately deduced that the problem was the cache, but where was the disconnect? Since the issue was related to model caching, it had to be part of the internal CakePHP caching mechanism.

The problem was a simple one: I forgot to change the $prefix cache variable (it defaults to myapp_) in Config/core.php. A small oversight, but a problematic one. Just a reminder to everyone else that this variable exists and should be changed for each application when running multiple vhosts on the same box.
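To see why a shared prefix bites, here is a minimal sketch that stands in for a cache backend shared by two vhosts. The $sharedCache array, the cacheKey() helper, the model_list key and the newsite_ prefix are all hypothetical; they only illustrate the collision and are not CakePHP's internals.

```php
<?php
// Hypothetical stand-in for a cache backend shared by every vhost on the box.
$sharedCache = array();

// Builds the final key the way a prefixed cache would: prefix + key name.
function cacheKey($prefix, $name) {
    return $prefix . $name;
}

// Application A stores its list of models under the default prefix.
$sharedCache[cacheKey('myapp_', 'model_list')] = array('Forum', 'Topic');

// Application B kept the default prefix, so it reads A's entry.
$collision = $sharedCache[cacheKey('myapp_', 'model_list')];

// With its own prefix, application B no longer sees A's data.
$isolated = isset($sharedCache[cacheKey('newsite_', 'model_list')]);
```

Giving each application its own prefix keeps the keys apart, which is all the fix in core.php really does.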

Naming your cache keys

Everyone caches; that's a pretty well-known fact. However, the problem I always seemed to have was how to properly name my cache keys. After many trials and tribulations, I believe I have found a great way to name them. To make things easy, my keys usually follow this format.

<model|table>__<function|method>[-<params>]

To clear up any confusion, it goes as follows. The first word of your cache key should be your model name (or database table name), as most cached data relates to a database query result. The model name is followed by a double underscore, then the function/method name (which helps to identify exactly where the cache is set), and finally one or more optional parameters separated by dashes. Here's a quick example:

public function getUserProfile($id) {
	$cacheKey = __CLASS__ .'__'. __FUNCTION__ .'-'. $id;
	// Check the cache or query the database
	// Cache the query result with the key
	// Return the result
}

The $cacheKey above would become: User__getUserProfile-1337, assuming the user's ID is 1337. Pretty easy, right? The constants are a bit verbose to write, but they work rather well (you can always write the class and method names manually instead). You may also have noticed that I used __FUNCTION__ over __METHOD__ -- this was on purpose. The main reasoning is that __METHOD__ returns the class and method name, like User::getUserProfile, while __FUNCTION__ just returns the method name.
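If you want to see the difference between the two constants for yourself, this small standalone sketch returns both values side by side. The User class here is a bare stand-in with no CakePHP behind it.

```php
<?php
// Bare stand-in class; a real model would extend AppModel.
class User {
    public function getUserProfile($id) {
        return array(
            // Class name + double underscore + method name + parameter.
            'key'    => __CLASS__ . '__' . __FUNCTION__ . '-' . $id,
            // __METHOD__ already includes the class name and "::".
            'method' => __METHOD__,
        );
    }
}

$user = new User();
$result = $user->getUserProfile(1337);
// $result['key']    is "User__getUserProfile-1337"
// $result['method'] is "User::getUserProfile"
```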

The example above will work in most cases, but some cases need something more creative. The main difficulty is dealing with an array of options. There are a few ways to handle that. The first is to check whether an ID or limit is present and, if so, use that as the unique value. If none of the options in the array are unique, you can implode or serialize the array and run md5() on the string to create a unique value.

User::getTotalActive();
// User__getTotalActive
Topic::getPopularTopics($limit);
// Topic__getPopularTopics-15
Forum::getLatestActivity($id, $limit);
// Forum__getLatestActivity-1-15
Post::getAllByUser(array('user_id' => $user_id, 'limit' => $limit));
// Post__getAllByUser-1-15
User::searchUsers(array('orderBy' => 'username', 'orderDir' => 'DESC'));
// User__searchUsers-fcff339541b2240017e8d8b697b50f8b
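The rules above can be sketched as a small helper. This is only one way to do it: the function names buildCacheKey() and cacheKeyPart() are made up for illustration, and treating keys named id, limit or ending in _id as "unique" is my assumption about what counts as an identifier.

```php
<?php
// Turns one parameter into a key fragment.
function cacheKeyPart($param) {
    if (!is_array($param)) {
        return (string) $param;
    }
    // Prefer IDs and limits when the options array carries them.
    $unique = array();
    foreach ($param as $name => $value) {
        $name = (string) $name;
        if ($name === 'id' || $name === 'limit' || substr($name, -3) === '_id') {
            $unique[] = $value;
        }
    }
    if (!empty($unique)) {
        return implode('-', $unique);
    }
    // Nothing unique: hash the whole array. Sorting by key first means the
    // same options in a different order produce the same hash.
    ksort($param);
    return md5(serialize($param));
}

// Builds <model>__<method>[-<params>] from the pieces.
function buildCacheKey($class, $method, array $params = array()) {
    $key = $class . '__' . $method;
    foreach ($params as $param) {
        $key .= '-' . cacheKeyPart($param);
    }
    return $key;
}

$simple = buildCacheKey('Forum', 'getLatestActivity', array(1, 15));
$byUser = buildCacheKey('Post', 'getAllByUser', array(array('user_id' => 1, 'limit' => 15)));
$hashed = buildCacheKey('User', 'searchUsers', array(array('orderBy' => 'username', 'orderDir' => 'DESC')));
```

Inside a model method you would pass __CLASS__, __FUNCTION__ and func_get_args() instead of hard-coded strings.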

In most cases an ID or query limit can be used as a unique identifier. If you have another way that you name your cache keys or an example where creating the key can be difficult, be sure to tell us about it!

Using the session within models

This is something that everyone wants to do, but many are afraid it breaks the MVC paradigm. Theoretically, the session should be a model, seeing as how it represents data and manages adds, edits, deletes, etc. Regardless, it's a much easier approach to use the session within the model directly, instead of having to pass it as an argument within each method call. Other developers who have attempted this either import the SessionComponent or use $_SESSION directly.

If you use the component, then you are using the class outside of its scope (a controller helper). If you use the $_SESSION global, then you don't have the fancy Cake dot notation access (Auth.User.id, etc) as well as its session management and security. But don't worry, Cake comes packaged with this powerful class called CakeSession, which both the SessionComponent and helper extend. Merely instantiate this class within your AppModel and you are set.

// Import the class
App::import('Core', 'CakeSession');
// Instantiate in constructor
public function __construct($id = false, $table = null, $ds = null) {
	parent::__construct($id, $table, $ds);
	$this->Session = new CakeSession();
}
// Using it within another model
$user_id = $this->Session->read('Auth.User.id');

Now you have control of the session within the model, bundled with Cake's awesome session management.

Fixing the missing Model and AppModel replacement errors

If you are a CakePHP developer, I am almost certain you have run into this problem on multiple occasions. Which problem, you ask? The one where your model is not found and Cake automatically substitutes AppModel for it. Cake does this on purpose so that your relations and HABTMs do not need the junction Model to operate correctly. This works in most cases, but sometimes you get weird errors or missing method problems. Before I continue, let's set up a quick scenario so I can better explain the problem. I will be using an example of relating users to teams.

User Model - users Table
Team Model - teams Table
TeamsUser Model - teams_users Table (Join)
User -> hasAndBelongsToMany -> Team
Team -> hasAndBelongsToMany -> User
TeamsUser -> belongsTo -> User, Team

In our code we are stating that a user can be on multiple teams, and a team can have multiple users. Our relation seems pretty simple, but let's not rely on Cake to magically figure everything out. In some cases we would receive the following errors:

  • Model "User" is not associated with model "Team"
  • Warning (512): SQL Error: 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'myCustomMethod' at line 1 [CORE\cake\libs\model\datasources\dbo_source.php, line 514] Query: myCustomMethod

Now what do those errors mean? The first one states that our Models cannot be related. So the first thing we need to do is check that our Models' naming conventions are correct: model names are singular and camelcased, whereas the filenames are singular, underscored and lowercased (without _model in the filename). The second error means we are trying to call a method on a Model that does not have that method in its class. In other words, our Model is not being loaded and the AppModel is being loaded in its place, so the unknown method name gets passed along as a raw query. This happens when Cake's naming magic fails to load the correct Model or we have not followed the conventions properly.

Most of these problems can be fixed by following the proper naming conventions. I'd also like to note that most of these errors appear during HABTM relations, and can easily be fixed with the following:

  • With any relation, it's best to define the "className" parameter. By doing this, we are telling Cake the exact name of our Model, instead of having Cake rely on the relation name (which can easily be changed to custom text).
  • When defining conditions or containments, be sure to include the model name followed by the column name: Model.field
  • Important! When working with complex HABTM relations, define the "className", "joinTable" and "with" parameters. If you have complex relations with crazy table names, defining joinTable and with is all but required.

During one of my projects, I had a Model with a HABTM relationship, and the related Model had its own HABTM relationship, so it got quite complex. I ran into problems over and over again where Models would not be related and the AppModel was used instead. I later found out that the problem was the with parameter. Since I didn't define it manually, Cake could not figure out the correct HABTM Model to use, hence the whole application broke. It's pretty funny that my whole app didn't work for weeks, all because of this small declaration.

So, since we know how to fix this problem, let's define our relations:

class User extends AppModel {
	public $hasAndBelongsToMany = array('Team' => array(
		'className' => 'Team',
		'joinTable' => 'teams_users',
		'with' => 'TeamsUser',
		'foreignKey' => 'user_id',
		'associationForeignKey' => 'team_id'
	));
}
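For completeness, here is a sketch of the other two models from the scenario, mirroring the User definition with the keys swapped. The table and key names come from the scenario above; the empty AppModel stub is only there so the snippet runs outside CakePHP and should be dropped in a real app.

```php
<?php
// Stub so the sketch runs outside CakePHP; in a real app AppModel already exists.
if (!class_exists('AppModel')) {
    class AppModel {}
}

// Team side of the HABTM: same join table, foreign keys swapped.
class Team extends AppModel {
    public $hasAndBelongsToMany = array('User' => array(
        'className' => 'User',
        'joinTable' => 'teams_users',
        'with' => 'TeamsUser',
        'foreignKey' => 'team_id',
        'associationForeignKey' => 'user_id'
    ));
}

// Join model: belongs to both sides of the relation.
class TeamsUser extends AppModel {
    public $belongsTo = array(
        'User' => array('className' => 'User', 'foreignKey' => 'user_id'),
        'Team' => array('className' => 'Team', 'foreignKey' => 'team_id')
    );
}
```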

So in conclusion, follow the naming conventions and define the required parameters when creating relations, and I promise you, you won't have any problems.

Custom method for grabbing a row based on its ID

More often than not when working with a database, you need a general purpose method for grabbing fields from a row that matches an id. Cake has built-in magic methods based on the table column that do just that, for example findById() or findBySlug(), but sometimes they grab associated data that you do not want. Below is a basic method that you can place in your AppModel to grab a row based on its id, with optional parameters for restricting what fields to grab or what associations to contain.

/**
 * Grab a row and defined fields/containables
 *
 * @param int $id
 * @param array $fields
 * @param array $contain
 * @return array
 */
public function get($id, $fields = array(), $contain = false) {
	if (empty($fields)) {
		$fields = $this->alias .'.*';
	} else {
		foreach ($fields as $row => $field) {
			$fields[$row] = $this->alias .'.'. $field;
		}
	}
	return $this->find('first', array(
		'conditions' => array($this->alias .'.id' => $id),
		'fields' => $fields,
		'contain' => $contain
	));
}

With a little bit of editing, you can make it work for fields other than id. You must also have Containable listed in your behaviors for the third argument to work. If you still aren't sure how to use this method, the following examples should help.

// Grab a basic row based on id
$user = $this->User->get($id);
// Grab a row and limit fields
$user = $this->User->get($id, array('id', 'username'));
// Grab a row, fields and associations
$user = $this->User->get($id, array('id', 'username'), array('Country', 'Profile'));
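As for making it work with fields other than id, one possible approach is to pull the option building into a helper that takes the field name as an argument. The sketch below is standalone so it can be run without Cake; buildGetOptions() and its parameter names are hypothetical, and the returned array is simply what you would hand to find('first', ...).

```php
<?php
// Builds the options array for find('first') for any lookup field.
function buildGetOptions($alias, $field, $value, array $fields = array(), $contain = false) {
    if (empty($fields)) {
        // No restriction: grab every column of this model.
        $fields = $alias . '.*';
    } else {
        // Prefix each requested column with the model alias.
        foreach ($fields as $i => $name) {
            $fields[$i] = $alias . '.' . $name;
        }
    }
    return array(
        'conditions' => array($alias . '.' . $field => $value),
        'fields' => $fields,
        'contain' => $contain
    );
}

$options = buildGetOptions('User', 'slug', 'milesj', array('id', 'username'));
```

Inside the model you would pass $this->alias as the first argument and return $this->find('first', $options).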

Stripping HTML automatically from your data

About a week ago I talked about automatically sanitizing your data before it's saved. Now I want to talk about automatically stripping HTML from your data before it's saved, which is good practice. Personally, I hate saving any type of HTML to a database; that's why I prefer a BB code type system for this website. To strip all tags from your data, add this method to your AppModel.

/**
 * Strip all html tags from an array
 *
 * @param array $data
 * @return array
 */
public function cleanHtml($data) {
	if (is_array($data)) {
		foreach ($data as $key => $var) {
			$data[$key] = $this->cleanHtml($var);
		}
	} else {
		$data = Sanitize::html($data, true);
	}
	return $data;
}

Pretty simple, right? The next and final step is to add it to AppModel::beforeSave(). In the next example, I will use the code snippet from my previous related article. Once you have done this you are finished, so go give it a test drive.

function beforeSave() {
	if (!empty($this->data) && $this->cleanData === true) {
		$connection = (!empty($this->useDbConfig)) ? $this->useDbConfig : 'default';
		$this->data = Sanitize::clean($this->data, array('connection' => $connection, 'escape' => false));
		$this->data = $this->cleanHtml($this->data);
	}
	return true;
}
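If you want to try the recursive walk without a Cake install, here is a standalone sketch. It uses PHP's strip_tags() as a stand-in for Sanitize::html($data, true); that substitution is my assumption and the two may not behave identically. The function name stripTagsDeep() is made up for this example.

```php
<?php
// Recursively strips HTML tags from every string in a nested array.
function stripTagsDeep($data) {
    if (is_array($data)) {
        foreach ($data as $key => $value) {
            $data[$key] = stripTagsDeep($value);
        }
        return $data;
    }
    // Leave non-string values (ints, nulls, booleans) untouched.
    return is_string($data) ? strip_tags($data) : $data;
}

$clean = stripTagsDeep(array(
    'User' => array(
        'id' => 1,
        'username' => '<b>milesj</b>',
        'signature' => '<a href="#">link</a> text'
    )
));
```

Note that strip_tags() removes the tags but keeps the text between them.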

Automatically sanitizing data with beforeSave()

So if you are like me and hate having to sanitize or clean your data manually within each action, and were hoping there was an easier way, there is. Simply combine the magic of Model::beforeSave() and the power of Sanitize::clean().

function beforeSave() {
	if (!empty($this->data) && $this->cleanData === true) {
		$connection = (!empty($this->useDbConfig)) ? $this->useDbConfig : 'default';
		$this->data = Sanitize::clean($this->data, array('connection' => $connection, 'escape' => false));
	}
	return true;
}

The previous code will attempt to clean all data before it is saved. Note that it will encode HTML rather than strip the tags completely. So if you do not want HTML in your database, you will need to add some extra functionality and set encode to false in the clean() options.

But that's not it; we're not finished just yet. You may have noticed a $cleanData variable and are probably wondering what it does. This is a custom property that should be placed in your AppModel and IS NOT a CakePHP property. By placing it in the AppModel we will receive no error notices and all data will be cleaned. Additionally, you can disable cleaning in certain models by setting the property to false in the respective model.

public $cleanData = true;

Known Errors

So far this has worked smoothly, except for the following exceptions:

- Serialized arrays will be escaped incorrectly and will break when you try to unserialize() them. Set $cleanData to false in the affected model so the serialized arrays are not escaped.

- When escape is set to true, all data will have slashes added on top of the slashes already added by the Model class, so it's best to turn escaping off.
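The serialized array problem is easy to reproduce outside of Cake. In this sketch addslashes() stands in for whatever escaping the cleaning step applies (an assumption on my part), but the effect is the same: the extra backslashes no longer match the structure recorded in the serialized string.

```php
<?php
// A serialized array containing quotes, as it might be stored in a text column.
$serialized = serialize(array('name' => "O'Reilly"));

// Escaping inserts backslashes before the quotes.
$escaped = addslashes($serialized);

// The escaped string is no longer valid serialized data, so this fails.
// The @ silences the notice PHP raises for the malformed input.
$restored = @unserialize($escaped);

// The untouched string still round-trips fine.
$original = unserialize($serialized);
```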

Validating an image's dimensions through Model validation

Since Cake's default model validation really doesn't support files that well, we have to build the methods ourselves. Right now I will be showing you how to validate an image's dimensions. The method treats the given width and height as maximums and will check the width, the height, or both. Simply place the following code in your AppModel.

/**
 * Checks an image dimensions
 *
 * @param array $data
 * @param int $width
 * @param int $height
 * @return boolean
 */
public function dimension($data, $width = 100, $height = null) {
	$data = array_values($data);
	$field = $data[0];
	if (empty($field['tmp_name'])) {
		return false;
	} else {
		$file = getimagesize($field['tmp_name']);
		if (!$file) {
			return false;
		}
		$w = $file[0];
		$h = $file[1];
		$width = intval($width);
		$height = intval($height);
		if ($width > 0 && $height > 0) {
			return ($w > $width || $h > $height) ? false : true;
		} else if ($width > 0 && !$height) {
			return ($w > $width) ? false : true;
		} else if ($height > 0 && !$width) {
			return ($h > $height) ? false : true;
		} else {
			return false;
		}
	}
	return true;
}
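The comparison at the heart of the method can be tried in isolation. This is a condensed sketch rather than a drop-in replacement: fitsWithin() is a hypothetical name, it takes the measured size directly instead of calling getimagesize(), and it treats any non-positive limit as "not set".

```php
<?php
// Returns true when the measured size is within the given maximums.
function fitsWithin($w, $h, $maxWidth = null, $maxHeight = null) {
    $maxWidth = intval($maxWidth);
    $maxHeight = intval($maxHeight);

    // No usable limit at all: fail, as the method above does.
    if ($maxWidth <= 0 && $maxHeight <= 0) {
        return false;
    }
    // Only check the sides that actually have a limit.
    if ($maxWidth > 0 && $w > $maxWidth) {
        return false;
    }
    if ($maxHeight > 0 && $h > $maxHeight) {
        return false;
    }
    return true;
}

$ok = fitsWithin(400, 300, 500, 500);        // both sides within limits
$tooWide = fitsWithin(600, 300, 500, 500);   // width exceeds the maximum
$heightOnly = fitsWithin(600, 300, null, 500); // width is not checked
```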

To use this validation, you would write it like any other custom validation rule. Let's set up our example view and model validation.

// View
echo $form->create('TestModel', array('type' => 'file'));
echo $form->input('image', array('type' => 'file'));
echo $form->end('Upload');
// Model
class TestModel extends AppModel {
    public $validate = array(
        'image' => array(
            'rule' => array('dimension', 500, 500),
            'message' => 'Your image dimensions are incorrect: 500x500'
        )
    );
}

Now all you have to do is validate the data using your Model's save() or validates() method. If the image fails the dimension check, the error should appear next to the field.

Problems with allowEmpty on files

Since the file support in Cake is lacking, you cannot set allowEmpty to true. This means that your image validation fields will always be required. There is currently a Trac bug for this with a quick fix.

Caching each query individually

So lately I have been delving into the caching capabilities of CakePHP. Most, if not all, of its capabilities work wonderfully, although I personally can't get into $cacheAction (within the controllers). The $cacheAction property only works for static and non-user generated pages. In other words, any content that changes depending on the logged in user won't work correctly with $cacheAction (unless you want thousands and thousands of cache files). So I stopped using $cacheAction altogether in my latest application, and instead built a method that caches individual queries instead of the whole page. All the modifications have been applied to the model's find() method. To use this, place the following code within your app/app_model.php.

/**
 * Wrapper find to cache sql queries
 * @param array $conditions
 * @param array $fields
 * @param string $order
 * @param string $recursive
 * @return array
 */
public function find($conditions = null, $fields = array(), $order = null, $recursive = null) {
	if (Configure::read('Cache.disable') === false && Configure::read('Cache.check') === true && isset($fields['cache']) && $fields['cache'] !== false) {
		$key = $fields['cache'];
		$expires = '+1 hour';
		if (is_array($fields['cache'])) {
			$key = $fields['cache'][0];
			if (isset($fields['cache'][1])) {
				$expires = $fields['cache'][1];
			}
		}
		// Set cache settings
		Cache::config('sql_cache', array(
			'prefix' 	=> strtolower($this->name) .'-',
			'duration'	=> $expires
		));
		// Load from cache
		$results = Cache::read($key, 'sql_cache');
		if (!is_array($results)) {
			$results = parent::find($conditions, $fields, $order, $recursive);
			Cache::write($key, $results, 'sql_cache');
		}
		return $results;
	}
	// Not caching
	return parent::find($conditions, $fields, $order, $recursive);
}

Next, create a folder called sql within your tmp/cache/ directory and chmod its permissions to 777. Once you have created the folder, open up your app/config/core.php file and place the following code at the bottom (near the default cache settings).

Cache::config('sql_cache', array(
    'engine'		=> 'File',
    'path'		=> CACHE .'sql'. DS,
    'serialize'	=> true,
));

By default, caching will not be applied to your application's queries; you need to set an additional "cache" option within your find(). Each SQL cache should have its own unique identifier so that it does not conflict with other queries. Also by default, queries will be cached for one hour and will be saved as a serialized array. The following examples explain how the cache option works.

// Cache query to /tmp/cache/sql/model-test_sql_query
$results = $this->Model->find('all', array(
	'cache' => 'test_sql_query'
));
// Cache query to /tmp/cache/sql/model-another_query that expires in 24 hours
$results = $this->Model->find('all', array(
	'cache' => array('another_query', '+24 hours')
));
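The two forms of the cache option shown above boil down to a key and an expiry. Here is that parsing step pulled out into a standalone sketch so you can poke at it; parseCacheOption() is a hypothetical name, and the '+1 hour' default matches the wrapper find() above.

```php
<?php
// Splits the "cache" option into a key and an expiry string.
function parseCacheOption($cache, $defaultExpires = '+1 hour') {
    if (is_array($cache)) {
        // array('key') or array('key', '+24 hours')
        $key = $cache[0];
        $expires = isset($cache[1]) ? $cache[1] : $defaultExpires;
    } else {
        // Plain string: just the key, default expiry.
        $key = $cache;
        $expires = $defaultExpires;
    }
    return array('key' => $key, 'expires' => $expires);
}

$short = parseCacheOption('test_sql_query');
$long = parseCacheOption(array('another_query', '+24 hours'));
```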

What if I have a query that's used multiple times but each has its own limit (custom method), but uses the same cache slug? Simply give the cache slug a dynamic name like so:

// Cache query to /tmp/cache/sql/model-dynamic_query-15 
$results = $this->Model->find('all', array(
	'limit' => $limit, // 20, 30, etc
	'cache' => 'dynamic_query-'. $limit
));

I have personally seen load times improve by 150-200% using this method. It should only be applied to queries that are used on landing pages, and to queries that do not change according to which user is logged in. Have fun.

Fixing a Model's result array when doing subqueries

This approach should no longer be used in the later versions of CakePHP. I highly suggest using the ContainableBehavior.

In some cases, you want to grab extra data in the find() method by calling an SQL statement like COUNT() AS, or SELECT(). When you do this, your extra data is not nested in the Model index of your resulting array. In the example below, we are doing a test find() and taking a look at the returned array.

// Find() query
$this->User->find('all', array(
	'fields' => array(
		'User.*',
		'COUNT(User.id) AS totalUsers'
	)
));
/* Resulting array
[User] => Array (
    [id] => 1
    [username] => milesj
)
[0] => Array (
    [totalUsers] => 100
)*/

Now there are two ways to fix this problem, one is doing it in the afterFind() of your model, and the other is editing the core CakePHP DBO files. The first method can be found at the link below, and was written by a fellow baker, Teknoid. This technique would only apply to the model it was put in, the next technique applies it globally.

http://teknoid.wordpress.com/2008/09/29/dealing-with-calculated-fields-in-cakephps-find/

The second method is editing the resultSet() method of your datasource (does not apply to all datasources), which was brought to my attention by grigri. This technique will work for both MySQL and MySQLi, but I will be using MySQLi in my example. We need to open the MySQLi datasource found at cake/libs/model/datasources/dbo/dbo_mysqli.php, copy the whole code and save our own version at app/models/datasources/dbo/dbo_mysqli.php. Once we have created our own file, we navigate down to the resultSet() method. All we need to do is add another if statement in the while loop that looks for a column name in the form Model__fieldName. Below you can see the before and after edits (only a part of the method):

// Old code block
while ($j < $numFields) {
    $column = mysqli_fetch_field_direct($results, $j);
    if (!empty($column->table)) {
        $this->map[$index++] = array($column->table, $column->name);
    } else {
        $this->map[$index++] = array(0, $column->name);
    }
    $j++;
}
// New code block
while ($j < $numFields) {
    $column = mysqli_fetch_field_direct($results,$j);
    if (!empty($column->table)) {
        $this->map[$index++] = array($column->table, $column->name);
    } else {
        if (strpos($column->name, '__')) {
            $parts = explode('__', $column->name);
            $this->map[$index++] = array($parts[0], $parts[1]);
        } else {
            $this->map[$index++] = array(0, $column->name);
        }
    }
    $j++;
}

This technique is probably the easiest to do, and will apply to all models. Once we have altered our datasource, we can change our find() method and our result should be working correctly now.

// Find() query
$this->User->find('all', array(
	'fields' => array(
		'User.*',
		'COUNT(User.id) AS User__totalUsers'
	)
));
/* Resulting array
[User] => Array (
    [id] => 1
    [username] => milesj
    [totalUsers] => 100
)*/
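The alias-splitting rule added to resultSet() can also be checked in isolation. This standalone sketch copies the decision from the new code block into a plain function; mapColumn() is a hypothetical name and takes the table and column name directly instead of a mysqli field object.

```php
<?php
// Decides which index a column lands under in the result array.
function mapColumn($table, $name) {
    if (!empty($table)) {
        // Real table column: nest it under its table/model name.
        return array($table, $name);
    }
    if (strpos($name, '__')) {
        // Calculated field aliased as Model__field: split on the double underscore.
        $parts = explode('__', $name);
        return array($parts[0], $parts[1]);
    }
    // Calculated field with a plain alias: falls back to index 0.
    return array(0, $name);
}

$regular = mapColumn('User', 'username');
$aliased = mapColumn('', 'User__totalUsers');
$plain = mapColumn('', 'totalUsers');
```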