
Converting a SimpleXML object to an array

This functionality can now be found within the Titon Utility library.

If you have been following my Twitter, you would have heard me complaining about converting a SimpleXML object into an array. I am still having that problem, so if you can get it working correctly (my test so far is below), I would be greatly appreciative. If you have never used the SimpleXML object, it can be quite awesome when actually reading an XML document, but once it comes to converting it to something else, it comes straight from the darkest depths of hell. Every property of the object is also a SimpleXML object, and so on and so forth. Each property/object has a method children(), which returns more properties, and attributes(), which returns attributes; weirdly enough, children() also returns attributes. Furthermore, you can't just echo the object out to get a value; you have to turn it into a string. You can see where this can get quite difficult and confusing, as it always spits out data you're not expecting.

After countless hours, I was able to get it to properly convert to an array... about 95% of the time... while keeping attributes and the parent/children hierarchy. The only scenario where it doesn't convert properly is when you have nodes within a node that has attributes (which is kind of rare in my opinion). Here's a small example:

// Works just fine
<root>
	<node foo="bar">I'm a node!</node>
</root>
// Does not work
<root>
	<node foo="bar">
		<childNode>I'm here to make your life miserable!</childNode>
		<childNode>Me too!</childNode>
	</node>
</root>

Besides that little instance, I am able to properly turn an XML document with attributes, and multiple nodes with the same name, all into a perfectly replicated array. Here is the code I wrote to achieve such an amazing task (sarcasm).

/**
 * Convert a SimpleXML object into an array (last resort).
 *
 * @access public
 * @param object $xml
 * @param boolean $root - Should we append the root node into the array
 * @return array
 */
public function xmlToArray($xml, $root = true) {
	if (!$xml->children()) {
		return (string)$xml;
	}
	$array = array();
	foreach ($xml->children() as $element => $node) {
		$totalElement = count($xml->{$element});
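		if (!isset($array[$element])) {
			// Repeated elements collect into a list (PHP 7.1+ no longer allows [] on a string)
			$array[$element] = ($totalElement > 1) ? array() : "";
		}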
		// Has attributes
		if ($attributes = $node->attributes()) {
			$data = array(
				'attributes' => array(),
				'value' => (count($node) > 0) ? $this->xmlToArray($node, false) : (string)$node
				// 'value' => (string)$node (old code)
			);
			foreach ($attributes as $attr => $value) {
				$data['attributes'][$attr] = (string)$value;
			}
			if ($totalElement > 1) {
				$array[$element][] = $data;
			} else {
				$array[$element] = $data;
			}
		// Just a value
		} else {
			if ($totalElement > 1) {
				$array[$element][] = $this->xmlToArray($node, false);
			} else {
				$array[$element] = $this->xmlToArray($node, false);
			}
		}
	}
	if ($root) {
		return array($xml->getName() => $array);
	} else {
		return $array;
	}
}

I know exactly where the problem resides, too. It's the value index of the $data array. (The little bastard below.)

$data = array('attributes' => array(), 'value' => (string)$node);
// Should be
$data = array('attributes' => array(), 'value' => $this->xmlToArray($node, false));

A simple fix, right? Nope! When you do that, it totally breaks... for some reason. The check on the first line of the function (!$xml->children()) gets skipped because the element passed in does have children, since it has attributes; I can never understand why attributes count as children when you have attributes(). I tried many different conditionals to get it working, I tried unsetting the attributes (but couldn't determine the property), and all these other routes... but to no avail. But I digress, seeing as how it works 95% of the time, and the case where it doesn't work isn't used that much. However, if you can figure it out, I will be in your debt forever.
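For anyone who wants to poke at the failing case, here is a minimal sketch of a different route. It is not the function above and not what ended up in Titon; the name xmlNodeToArray() is made up for illustration. It leans on SimpleXMLElement::count() (PHP 5.3+) to decide whether a node has child elements instead of testing children() as a boolean, and it assumes you can live with every child name mapping to a list, even when there is only one.

```php
<?php
// Hypothetical helper, a sketch only: every child name maps to a list of converted children.
function xmlNodeToArray(SimpleXMLElement $node) {
    // count() reports child elements only, so attributes cannot confuse the check
    $value = ($node->count() === 0) ? (string) $node : array();

    foreach ($node->children() as $name => $child) {
        $value[$name][] = xmlNodeToArray($child);
    }

    $attributes = array();
    foreach ($node->attributes() as $attr => $attrValue) {
        $attributes[$attr] = (string) $attrValue;
    }

    // Only wrap the value when there are attributes to keep
    if ($attributes) {
        return array('attributes' => $attributes, 'value' => $value);
    }

    return $value;
}

$xml = simplexml_load_string(
    '<root><node foo="bar"><childNode>I\'m here to make your life miserable!</childNode>'
    . '<childNode>Me too!</childNode></node></root>'
);
$result = xmlNodeToArray($xml);
print_r($result);
```

The trade-off is that single elements are wrapped in a one-item list, so the output shape differs from the function above, and text mixed in between child elements is dropped.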

Minimalistic approach to class getters and setters

Getters and Setters are the backbone of many PHP classes (or classes in any programming language), as they allow you to alter and retrieve class properties during runtime. Many well thought out and powerful scripts and frameworks make use of Getters and Setters, but is there such a thing as too much? Or are there better and easier alternatives? (To make it easier on me to type, and you to read, I will refer to Getters and Setters as GnS from now on.)

By now everyone should know of the Zend Framework. It's a highly customizable and robust system built with multiple components. Zend follows the pure OOP paradigm in which most, if not all, classes have an abstract, an interface, and tons of GnS. For a beginner, this might look like a huge cluster of code with very large documentation. But for an advanced user, it could be a godsend. Personally, I find Zend's use of GnS to be too much, as it can easily be trimmed down and packaged accordingly.

Before we begin, I want to outline when and how GnS should be used. GnS should be used to alter protected properties only. Why protected, you ask? If a property were public, then you could just alter the property manually without the need for an intermediary method (unless of course the method does some manipulation on the argument). If a property is private, then that property should not be altered at all during runtime, as it is data specifically generated/built by the class internally. So that leaves protected properties to act as our configurable properties.

In a typical class, when dealing with the property $_name, you would have a method getName() and setName(). Now imagine you have a class with 15+ properties; you will immediately begin to realize the scale and amount of code required to do GnS. That's where our little friends __set(), __get() and __isset() come in handy. We can easily scale down the code from 30 methods (15 for getting, 15 for setting) to 3. Take the following before and after classes:

// Using individual methods
class User {
    protected $_name;
    protected $_email;
    public function getName() {
        return $this->_name;
    }
    public function setName($value) {
        $this->_name = $value;
    }
    public function getEmail() {
        return $this->_email;
    }
    public function setEmail($value) {
        $this->_email = $value;
    }
}
// Using magic methods
class User {
    protected $_name;
    protected $_email;
    public function __get($property) {
        return (isset($this->{'_'. $property}) ? $this->{'_'. $property} : null);
    }
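    public function __set($property, $value) {
        // property_exists() rather than isset(): isset() is false while the property is still null
        if (property_exists($this, '_'. $property)) {
            $this->{'_'. $property} = $value;
        }
    }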
    public function __isset($property) {
        return isset($this->{'_'. $property});
    }
}

Does that not look a lot easier? Sure, the code looks a bit "hacky", but it's perfectly usable and valid code. Of course, there are a few downsides related to this approach. For example, you may want to format a string before assigning it within setName(). With the magic methods alone you cannot do so, but that doesn't stop you from creating a setName() alongside the magic methods. Furthermore, the get syntax is different; you simply call the property (assuming there is no $name that conflicts with $_name).

// Using individual methods
$name = $User->getName();
$User->setName('Miles');
// Using magic methods
$name = $User->name;
$User->name = 'Miles';
// Times when a set method is needed
public function setName($name) {
    $this->_name = ucwords($name);
}

I would like to take this a step further, as I still believe that this is too "cluttered". The next approach solves the problem of conflicting property names, as well as reducing the code required. The approach is straightforward; simply create a single class-wide $_config (or $_data, whatever suits you) property that will deal with all the getting and setting of data.

class User {
    protected $_config = array(
        'name' => null,
        'email' => null
    );
    public function __get($property) {
        return (isset($this->_config[$property]) ? $this->_config[$property] : null);
    }
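    public function __set($property, $value) {
        // array_key_exists() rather than isset(): isset() is false while the value is still null
        if (array_key_exists($property, $this->_config)) {
            $this->_config[$property] = $value;
        }
    }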
    public function __isset($property) {
        return isset($this->_config[$property]);
    }
}

In the end, it really boils down to the architecture of your application and your personal coding preferences. Each approach has its pros and cons, but the best solution (in my mind) is combining them. You would begin by creating the class-wide $_config property and building the magic methods. If you ever need to customize a get or set, then you create a specific method for it.
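To make the combined approach concrete, here is a rough sketch under a couple of assumptions of my own: the setXxx() naming convention and the lookup inside __set() are illustrative choices, not the only way to wire it up. The magic __set() hands off to a dedicated setter when one exists and falls back to the $_config array otherwise.

```php
<?php
class User {
    protected $_config = array(
        'name' => null,
        'email' => null
    );

    public function __get($property) {
        return array_key_exists($property, $this->_config) ? $this->_config[$property] : null;
    }

    public function __set($property, $value) {
        // Assumed convention: a setName() style method takes priority over the generic path
        $method = 'set' . ucfirst($property);

        if (method_exists($this, $method)) {
            $this->{$method}($value);
        } else if (array_key_exists($property, $this->_config)) {
            $this->_config[$property] = $value;
        }
    }

    public function __isset($property) {
        return isset($this->_config[$property]);
    }

    // The one property that needs formatting gets its own method
    public function setName($name) {
        $this->_config['name'] = ucwords($name);
    }
}

$user = new User();
$user->name = 'miles johnson';        // routed through setName()
$user->email = 'miles@example.com';   // stored directly in $_config
$user->age = 21;                      // unknown key, silently ignored
```

Silently ignoring unknown keys matches the classes above; throwing an exception there instead would make typos easier to catch.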

PHP Pro Tip: Don't close your PHP documents with ?>

Recently on my Twitter I mentioned this exact tip, "PHP Pro Tip: Don't close your PHP documents with ?>.", and got many responses asking why or if it's possible. I will briefly explain the benefits of this technique and when it applies. Many assumed I meant that you should never close your PHP scopes with ?>, but that's not the case; I was merely stating that the closing tag (at the very very bottom of your PHP file) does not need to be there.

If you are within a template file that has multiple opening and closing PHP tags, then of course it would be required to close those. If you have a PHP class or file that's purely PHP and consists of no front-end markup, then the closing tag at the bottom of the page is optional. It even says so in the official PHP.net documentation.

The closing tag of a PHP block at the end of a file is optional, and in some cases omitting it is helpful when using include() or require(), so unwanted white space will not occur at the end of files, and you will still be able to add headers to the response later. It is also handy if you use output buffering, and would not like to see added unwanted white space at the end of the parts generated by the included files.

Now onto the benefits of this technique. The main benefit is that it solves the "unwanted white space" problem: any white space or new lines after a closing ?> are sent to the browser as output, so the next script that tries to send HTTP headers errors out with "headers already sent". Leaving the tag off means the parser treats everything up to the end of the file as PHP, so nothing trailing can leak out of an included file. It also means you don't have to spend the time making sure there is no white space or new lines at the end of your files.
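If you want to see the leak for yourself, here is a small sketch that can be run from the command line. The helper name captureInclude() and the temporary files are made up for the demo. It writes two throwaway include files, one with a closing tag followed by blank lines and one without the tag, then captures whatever each one outputs when included.

```php
<?php
// Two throwaway include files: the first has a closing tag plus blank lines, the second has no closing tag
$withTag = tempnam(sys_get_temp_dir(), 'tag');
$withoutTag = tempnam(sys_get_temp_dir(), 'tag');

file_put_contents($withTag, "<?php\n\$loaded = true;\n?>\n\n");
file_put_contents($withoutTag, "<?php\n\$loaded = true;\n\n");

// Hypothetical helper: include a file and return anything it printed
function captureInclude($file) {
    ob_start();
    include $file;
    return ob_get_clean();
}

$leaked = captureInclude($withTag);    // PHP swallows one newline after the tag, the second one leaks
$clean = captureInclude($withoutTag);  // nothing is output

unlink($withTag);
unlink($withoutTag);

var_dump($leaked, $clean);
```

That single stray newline is exactly the kind of output that gets sent before your headers.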

Very handy if you ask me. So just a heads up with a little tip. Enjoy :]

Why pure OOP, just for the sake of doing pure OOP

Lately all I have seen is the "Pure OOP is the way to go" mentality without many reasons to back up why it's beneficial. I write OOP code, but I don't write overly verbose, bloated and cluttered OOP code either. I attend ZendCon, I have my PHP certification, I read PHP blogs on a daily basis, I have had multiple OOP discussions with other developers, but in the end I still ask the question, "Why do you spend so much time writing and separating your objects, when in the end it's unnecessary as you are not extending them in the future anyway?"

Before I begin, I want to do a quick review of how I got onto this topic. It started a few days ago when Brandon Savage posted his article Why Great Development Tools Don't Seem To Be Written In PHP, in which I agreed with him. Further into the comments, many users mentioned the PHP bug tracker Arbit Tracker. After downloading the source and browsing the code, I came to the following conclusion and posted this comment on Brandon's entry:

After looking at Arbit... Why do people always need to use SO MANY files and classes to do such a basic task? Its unbelievable how bloated some projects can get. There is a thing with being to modular.

Just take a guess at how many files are required for a basic issue tracker... just guess. There are 2,077 files and 402 folders, and remember this is still an alpha version, so that can grow even larger. Basically my first reaction was, "Seriously? Do we really need all these files and classes to create a simple and basic bug/issue tracker?" I have nothing against Arbit, nor the developers (they know their PHP for sure), but there is something terribly wrong about this in my opinion. This is a "3rd party script that is installable on a user's machine", yet it looks like a full blown framework and application. When I download a script to install on my machine, I want it to be lightweight, usable, easy and customizable. I find something really wrong about this when it takes this many classes to do such a basic task, and on top of that it requires multiple dependencies like eZ Components and CouchDB (and from what I could find, this wasn't even in the source! Now add more files into the final count). After my comment everyone seemed to post the same response: "Well of course, it's a pure OOP project", and "This is what happens when you do it in true OOP", and "Nothing beats an OOP setup!".

I mentioned that a project does not need an abstract, an interface or even an exception for every class in its system, and I received this lovely little response.

If you want superior software then you use the standard and that (just) happens to be object oriented methodologies.

Drop object oriented methodologies and you drop way too many other natural benefits to software development; so enough of your bullsh*t okay?

I find this response laughable. Please tell me what those natural benefits are? To expand on classes if you want to customize them or create your own? To extend them even further? Dependency Injection? Multiple other reasons... You get my point. That's what abstracts and interfaces are for. So why create them when YOU KNOW you will never extend them further, or even bother creating child classes? It seems everyone is in this mentality of "I'll do pure OOP just for the sake of doing OOP."

I can see when abstracts and interfaces are required, like Zend Framework for example, but even then it's overkill. Very rarely does anyone ever extend or use the abstracts/interfaces in Zend. I've worked at many jobs and on many projects that utilized Zend, and have even worked with individuals that do the "Pure OOP" approach, and still I have not seen them extend objects like they are supposed to, or use OOP to its fullest. I can only recall very few instances when people actually extended the objects, and it was primarily for Auth classes to customize to fit their application.

Another thing I dislike about pure OOP approaches is the amount of code you have to write to do such basic tasks. Take for example the routing components of Zend:

$this->getRouter()->addRoute('user', new Zend_Controller_Router_Route('user/:username', array('controller' => 'user', 'action' => 'info')));

I mean, what's so difficult about the following approach (below)? Why do you need to create a whole new object as an argument, when the object is just going to be a toArray() anyways and set as a property? Oh I know, it's because you want to maybe use a Route_Regex class, or a Route_Hostname class. Now my question is, why would those even need to be their own class? Because they each serve a different purpose and have their own methods that override the abstracts? I guess that's where we disagree then. I honestly don't see a reason why these should be "split" up. The Router should manage all routes and package the class appropriately. Breaking up the classes isn't being "Pure OOP", it's just being overly modular.

Say and think what you wish, but there's nothing wrong with the following piece of code either. But you'll probably find something wrong with static methods as well, seeing as how it doesn't add any OOP functionality.

Router::addRoute('user', 'user/:username', array('controller' => 'user', 'action' => 'info'));
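To show what I mean by the Router managing everything itself, here is a bare-bones sketch. To be clear, this is not how Zend or CakePHP implement routing; the resolve() method, the $_routes property and the regex conversion are all assumptions of mine, and the pattern is not escaped, so it only suits simple routes like the one above.

```php
<?php
// Hypothetical static router, a sketch only
class Router {
    protected static $_routes = array();

    public static function addRoute($name, $pattern, array $defaults = array()) {
        // Turn "user/:username" into a regex with one named group per :token
        $regex = preg_replace('/:([a-zA-Z_]+)/', '(?P<$1>[^/]+)', $pattern);

        self::$_routes[$name] = array(
            'regex' => '#^' . $regex . '$#',
            'defaults' => $defaults
        );
    }

    public static function resolve($path) {
        foreach (self::$_routes as $route) {
            if (preg_match($route['regex'], trim($path, '/'), $matches)) {
                $params = $route['defaults'];

                // Keep only the named captures, skip the numeric duplicates
                foreach ($matches as $key => $value) {
                    if (is_string($key)) {
                        $params[$key] = $value;
                    }
                }

                return $params;
            }
        }

        return null;
    }
}

Router::addRoute('user', 'user/:username', array('controller' => 'user', 'action' => 'info'));
print_r(Router::resolve('/user/miles'));
```

A regex or hostname route would then be a different branch inside addRoute() rather than a different class, which is the whole point of the argument.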

I also have a problem with the amount of getters and setters I see everywhere. Have you not heard of __call, __get, __set, __isset, or __unset? But I digress, that topic is for another entry at another time.

I'll stop ranting now, it's a matter of preference. I'd also like to note that CakePHP (a non-pure OOP approach, but still OOP) has the same functionality and customization as Zend and Symfony (both pure OOP approaches). So why do you think Cake is inferior to the others? Personally, I think it has more beneficial and useful functionality.

The awesomeness of 5.3 Namespaces

If you haven't heard of Namespaces, or even PHP 5.3, you need to stop living under a rock. I've had quite a bit of fun the past month messing around with namespaces in 5.3, and I have got to say, they are the greatest addition to PHP yet. If you are unfamiliar with namespaces, it's a feature that allows you to add another layer of ownership and packaging to functions and classes, similar to how classes add a layer onto methods (à la functions).

In most applications and frameworks, you would see classes with very long names (I'm pointing the finger at you, Zend!). This was done so that the classes would not conflict with internal and external (3rd party vendor) scripts. This is no longer necessary; we can simply use namespaces! Let's take a look at some code we would write in 5.2 and lower.

class App_Utilities_Session { 
	public static function get($key) { }
	public static function set($key, $value) { }
}
// Basically, annoying!
App_Utilities_Session::set('User', array());

You can see how this would get really annoying and time consuming, having to write the long class names over and over again throughout your application. We can now rewrite the code above using namespaces. Do note that the namespace declaration has to be at the very top of your code, although comments and the declare() statement can be above it.

namespace App\Utilities;
class Session { 
	public static function get($key) { }
	public static function set($key, $value) { }
}
// Still kind of annoying
\App\Utilities\Session::set('User', array());

Now you may be thinking how this would make any difference, it's still a long class name to write... actually it's even longer! Yeah well, namespaces also come bundled with aliases, if you don't feel like writing such long strings. Aliases allow you to declare alternative names for namespaces within the current scope (file). To apply an alias, you would write the "use" statement with the namespace path. You can either refer to the class by its base class name, or you can give it an alias. For example:

use App\Utilities\Session;
// Now that is much easier!
Session::get('user');
use App\Utilities\Session as S;
// Even easier!
S::get('user');

Even though namespaces are great and add extra functionality, there are a few things you should know about. Firstly, when you are inside a namespace, any class that isn't namespaced, or is within the global scope (SPL, Exception, etc), must be prepended with a backslash. If the backslash is not present, PHP will believe the class is within the current namespace and it will fail.

$d = new \DirectoryIterator(__DIR__);
$e = new \Exception();

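You would expect to prepend any global function as well. However, while developing I found that the backslash is not needed, because PHP falls back to the global function when the current namespace does not define one. It only matters when you have a custom function with the same name as a global function inside the current namespace.

namespace App\Utilities;
// A namespaced function that shares its name with the global log()
function log($message) { }
// Use the global log function
$o = \log(10);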
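Here is a small runnable sketch of that fallback behaviour. The namespaced log() and the variable names are purely illustrative.

```php
<?php
namespace App\Utilities;

// Custom function that shares its name with the global log()
function log($message) {
    return 'logged: ' . $message;
}

$custom = log('hi');          // resolves to App\Utilities\log()
$global = \log(1);            // the backslash forces the global math function
$upper = strtoupper('abc');   // no App\Utilities\strtoupper() exists, so PHP falls back to the global one

var_dump($custom, $global, $upper);
```

Classes get no such fallback, which is why the backslash on \Exception and friends is mandatory inside a namespace.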

Another benefit of using namespaces is when you structure your application and file paths to mirror the namespace names. For example, if I use the Session namespace above, I may have a folder structure like so.

// App\Utilities\Session
/app/utilities/session.php

This comes in handy when you have a custom function to convert a namespace into a file path, and vice versa. Furthermore, the get_class() function will now return the class name with the full namespace declaration.

$class = get_class($session);
// App\Utilities\Session
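As for converting a namespace into a file path, a converter could look something like the sketch below. The function name namespaceToPath() and the lowercase file naming are assumptions that mirror the folder structure shown earlier; hooking it into spl_autoload_register() is one possible use, not a requirement.

```php
<?php
// Hypothetical helper: map a fully qualified class name to a lowercase file path
function namespaceToPath($class, $baseDir = '') {
    // Drop a leading backslash, then swap namespace separators for directory separators
    $relative = strtolower(str_replace('\\', '/', ltrim($class, '\\')));

    return rtrim($baseDir, '/') . '/' . $relative . '.php';
}

// Optional: let PHP load classes on demand using the same mapping
spl_autoload_register(function ($class) {
    $file = namespaceToPath($class, __DIR__);

    if (is_file($file)) {
        require $file;
    }
});

echo namespaceToPath('App\Utilities\Session'), "\n";
```

Since get_class() returns the full namespaced name, its result can be passed straight into the same function.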

There are loads of features related to namespaces; I have only named a few of the important ones. If you have not used 5.3 yet, I highly recommend you do so, either by upgrading your server or installing a local server; my localhost of choice is XAMPP (it comes with 5.3 by default!). I also highly suggest looking at the namespace guide for more information.

Code snippets now available

So over the years I have written many small code snippets, functions and what have you, and thought it would be a good idea to release them to you guys. I use most of these snippets on my own projects and applications, and they are easy to re-use everywhere. Most, if not all, of the snippets deal with PHP and JavaScript, however I have thrown in some CakePHP and jQuery ones.

View all 21 code snippets

Ajax Handler

On top of releasing my code snippets, I have recently made my AjaxHandler component available. The component can be placed in any CakePHP application and then applied to specific controller actions to handle them as Ajax requests. The handler does nearly everything automatically and even responds with the correct data structure and content type.

Check it out and let me know what you think!

Download the Ajax Handler

Calling functions within your CSS files

I always thought CSS should have a little bit more power, like the use of inline programming functions and dynamic variables. That's exactly why I set out to write my Compression class. On top of compressing the CSS files, saving cached versions and assigning dynamic variables, the class can now call PHP functions that have been written inline within the CSS itself.

This allows for tons of new possibilities for allowing dynamic content, values and structure within boring old stylesheets. As an example, here is the PHP function, the CSS written with the function and the result.

// The PHP function
public function colWidth() {
	$args = func_get_args(); 
	$width = $args[0] * 100;
	return $width .'px';
}
// The CSS
.col1 { width: colWidth(5); }
.col2 { width: colWidth(3); }
.col3 { width: colWidth(1); }
// After being parsed
.col1 { width: 500px; }
.col2 { width: 300px; }
.col3 { width: 100px; }
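If you are curious how this kind of parsing can work, here is a stripped-down sketch. This is not the actual Compression code; parseCssFunctions() and the whitelist argument are assumptions made for the example. It uses preg_replace_callback() to find name(args) patterns and only calls functions that have been explicitly allowed, so native CSS functions such as url() are left alone.

```php
<?php
// The PHP function made available to the stylesheet
function colWidth($columns) {
    return ($columns * 100) . 'px';
}

// Hypothetical parser: replace whitelisted function calls found in the CSS with their return value
function parseCssFunctions($css, array $allowed) {
    return preg_replace_callback('/\b([a-zA-Z_]\w*)\(([^()]*)\)/', function ($match) use ($allowed) {
        // Leave anything that is not whitelisted untouched, e.g. url() or rgb()
        if (!in_array($match[1], $allowed, true)) {
            return $match[0];
        }

        $args = ($match[2] === '') ? array() : array_map('trim', explode(',', $match[2]));

        return call_user_func_array($match[1], $args);
    }, $css);
}

$css = ".col1 { width: colWidth(5); }\n.logo { background: url(logo.png); }";
$parsed = parseCssFunctions($css, array('colWidth'));

echo $parsed;
```

The whitelist is the important part; calling any function name that happens to appear in a stylesheet would be asking for trouble.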

This new feature has been released in the new Compression version 1.4, which is now available! You can view the updated documentation, or download the files to see the example usages, enjoy! Be sure to spread the word!

Download Compression v1.4

New scripts galore

It has been quite a while since I posted a real entry; I just haven't worked in CakePHP for the past few weeks. Hopefully these two new scripts and the new script versions will make up for it. Be sure to update your Databasic if you are using it!

New Scripts

Compression
Compression is a lightweight class that will load a CSS stylesheet, bind and translate given variables, compress and remove white space, and cache the output for future use. Upon each request it will check whether the original has been modified and, if not, load the cached file.

Formation
Formation is a lightweight class that can build all necessary elements for a form, process the posted data (post, get, request), validate the data based on a schema and finally return a cleaned result. It has logic for error handling and applying a class to invalid fields. Additionally, all validator functions are static and can be used externally.

Updated Scripts

Databasic v2.3
Added more support for specific operators and rewrote how statement conditions are parsed. View the full changelog.

ZendCon 09

I just bought my ticket for ZendCon, 2 hours before the discount ended, pretty stoked about that. This will be my first PHP/Developers conference and I am thoroughly looking forward to it. I never really had an interest in these before, but my passion for PHP has grown tremendously this past year, so it's only fitting that I go. I will be going with some friends and colleagues, so it should be a blast and not just me by my lonesome.

I am looking forward to taking the Zend certification test, and the tutorial panels at the convention should teach me all I need to know beforehand. On top of that, I'm quite excited to see Nate Abele talk about CakePHP (pretty awesome seeing as how everything else is Zend). If any of the small group of people reading my blog are attending, be sure to say hello and we can have lunch.

See you all at ZendCon in October.

Databasic 2.1, now with more operator support!

I got pretty bored the other day and also noticed this contest over at NetTuts, and thought to myself, "Why not enter Databasic into the contest?". Well that is my plan, but I also wanted to fix some problems and constraints in the current version. The new version supports AND/OR operators in the conditions, and the column operators (!=, <=, etc) have been rebuilt. With this change the new version is not backwards compatible! Sorry, but it shouldn't be too hard to fix your scripts to work correctly.

Here's a quick example of the new operator support in conditions.

$conditions = array(
    'OR' => array(
        array('name' => 'Miles'),
        array('name' => 'Johnson')
    ),
    'status' => 'active',
    'age >=' => 21
);
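To give a rough idea of how an array like that can end up as SQL, here is a simplified sketch. It is not the Databasic source; buildConditions(), the operator whitelist and the use of ? placeholders are my own assumptions for the example, and column names are not validated.

```php
<?php
// Hypothetical builder: turn a nested conditions array into a WHERE fragment plus bound values
function buildConditions(array $conditions, array &$params, $glue = 'AND') {
    $operators = array('=', '!=', '<', '<=', '>', '>=');
    $parts = array();

    foreach ($conditions as $key => $value) {
        if ($key === 'AND' || $key === 'OR') {
            // Nested group, joined with its own operator and wrapped in parentheses
            $parts[] = '(' . buildConditions($value, $params, $key) . ')';

        } else if (is_int($key) && is_array($value)) {
            // Numerically indexed entry inside a group
            $parts[] = buildConditions($value, $params, 'AND');

        } else {
            // "age >=" splits into column and operator, a bare column defaults to "="
            $pieces = explode(' ', trim($key), 2);
            $operator = isset($pieces[1]) ? $pieces[1] : '=';

            if (!in_array($operator, $operators, true)) {
                throw new InvalidArgumentException('Unsupported operator: ' . $operator);
            }

            $parts[] = $pieces[0] . ' ' . $operator . ' ?';
            $params[] = $value;
        }
    }

    return implode(' ' . $glue . ' ', $parts);
}

$conditions = array(
    'OR' => array(
        array('name' => 'Miles'),
        array('name' => 'Johnson')
    ),
    'status' => 'active',
    'age >=' => 21
);

$params = array();
$sql = buildConditions($conditions, $params);

echo $sql, "\n";
print_r($params);
```

Collecting the values separately means they can be bound to a prepared statement instead of being concatenated into the query.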

Download the new 2.1!
View the full change log and features