Jun 5, 2008 10:10:16 PM
One of the Zend Framework's strongest drawing cards, as I see it, is its loosely-coupled structure. The name Zend Framework may be a misnomer, in fact, as ZF is more a set of reusable libraries than an actual application framework. I won't go into detail about the advantages of loose coupling, but a recent discussion on the ZF mailing list prompted me to investigate just how loosely coupled the framework is.
Measuring the level of coupling in a set of libraries is rather difficult in an interpreted language. Dependencies may be set at runtime and not in code, as is clear in this code snippet:
<?php
$files = array(
    'Zend/Db.php',
    'Zend/Loader.php',
    'Zend/Currency.php'
);
require_once $files[rand(0,2)];
The example above is pretty useless, but it demonstrates that coupling cannot be measured exactly by inspecting code alone. However, ignoring fringe cases such as this one, it is possible to gauge coupling to a reasonable degree by determining which classes require() which other classes.
Coupling as a directed graph
Coupling can be pictured as a directed graph, with each class being a node and each requirement of another class being an edge. For instance
X -> Y
means that somewhere in class X there exists at least one of:
<?php
require_once 'Y.php';
require 'Y.php';
include_once 'Y.php';
include 'Y.php';
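As a rough illustration of how such statements can be detected, here is a minimal sketch that pulls the target of each require/include statement out of a string of PHP source with a regular expression. The function name required_files() is hypothetical, and the sketch assumes PHP 7 or later. It only sees literal file names, so runtime-computed paths like the rand() example earlier are missed, and it does not distinguish real statements from ones that appear inside comments or strings.

```php
<?php
// Hypothetical helper: list the literal files a piece of PHP source
// requires or includes. Handles both `require_once 'a.php';` and
// `require_once('a.php');`, with single or double quotes.
function required_files(string $source): array
{
    $pattern = '/\b(?:require|include)(?:_once)?\s*\(?\s*([\'"])([^\'"]+)\1\s*\)?\s*;/';
    preg_match_all($pattern, $source, $matches);

    // $matches[2] holds the captured file names; drop duplicates.
    return array_values(array_unique($matches[2]));
}

$source = <<<'PHP'
require_once 'Zend/Filter.php';
require_once('Zend/Form/Exception.php');
include "Zend/Form.php";
require_once 'Zend/Filter.php';
PHP;

// Returns Zend/Filter.php, Zend/Form/Exception.php and Zend/Form.php
print_r(required_files($source));
```

Each distinct file name returned for class X corresponds to one edge X -> Y in the graph.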
Level of coupling = graph density
It could be said, then, that the density of a directed graph representing the coupling of a set of classes is the level of coupling between those classes. The density of a directed graph is measured by
D = 2|E| / (|V| (|V| - 1))
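where D is density, |E| is the number of edges in the graph, and |V| is the number of vertices. In this context, |V| is equal to the number of classes being examined, and |E| is roughly the number of require() statements in those classes that refer to other classes in the set. Clearly, the higher the number of vertices, or classes, relative to the number of edges, or require()s, the lower the density, and therefore the weaker the coupling. Strictly speaking, the factor of 2 belongs to the density formula for undirected graphs; a directed graph on |V| vertices can have up to |V| (|V| - 1) edges, so its density is |E| / (|V| (|V| - 1)), exactly half the value given here. The factor is constant, so it doesn't affect comparisons between values calculated the same way.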
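To make the formula concrete, here is a small sketch of the calculation using the formula exactly as given above, factor of 2 included. The function name graph_density() is hypothetical, and the scalar type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper implementing D = 2|E| / (|V| (|V| - 1)),
// the density formula as used in this article.
function graph_density(int $vertices, int $edges): float
{
    if ($vertices < 2) {
        // With fewer than two vertices there are no possible edges.
        return 0.0;
    }

    return (2 * $edges) / ($vertices * ($vertices - 1));
}

// Toy example: 4 classes with 3 require statements between them.
printf("%.4f\n", graph_density(4, 3)); // 6 / 12 = 0.5000
```

Adding classes without adding require statements lowers the result, which matches the intuition that the set as a whole has become less tightly coupled.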
Deep coupling
Using this form of measurement may not be truly indicative of coupling within the ZF because framework components may be scattered over multiple files. For instance, the coupling between Zend_Db_Table_Row and Zend_Db_Table_Rowset may be very strong, but it says nothing about how coupled Zend_Db is with the rest of the framework. Having said that, analyzing each file in its own right provides a useful starting point.
Determining the density of deep coupling within the ZF is as simple as counting the number of .php files (|V|) and the number of require_once() statements within those files (|E|). More than anything, the simplicity of this calculation is an indication of how internally consistent ZF code is. Using find, grep and wc on ZF 1.5.1, here are the results:
|V| = find /usr/share/php5/ZendFramework/library/Zend -name "*.php" | wc -l = 1075
|E| = grep -r 'require_once' /usr/share/php5/ZendFramework/library/Zend | wc -l = 3138
D = 2 * 3138 / (1075 * 1074) = 6276 / 1154550 ≈ 0.00544
Shallow coupling
It is more useful to consider coupling between ZF components than between their constituent classes. Due to ZF's naming conventions, it's easy to determine what component a class belongs to -- it's simply the first part of the file's path following Zend/. For instance, Zend/Db/Table.php is part of Zend_Db.
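The naming convention can be sketched as a small function that maps a file path, relative to the library directory, to its component name. The name component_of() is hypothetical and the code assumes PHP 7 or later; it mirrors the "first two parts of the path" rule described above.

```php
<?php
// Hypothetical helper: map a library-relative path to its ZF component,
// e.g. Zend/Db/Table.php belongs to Zend_Db.
function component_of(string $path): string
{
    // Drop the .php extension, then keep only the first two path segments.
    $parts = explode('/', preg_replace('/\.php$/', '', $path));

    return implode('_', array_slice($parts, 0, 2));
}

echo component_of('Zend/Db/Table.php'), "\n"; // Zend_Db
echo component_of('Zend/Loader.php'), "\n";   // Zend_Loader
```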
Generating the vertex/edge information for shallow coupling is a little more complex, so we'll do it in two stages. First, we'll generate a text file containing .php filenames followed by the files they require() using bash:
for i in `find /usr/share/php5/ZendFramework/library -name "*.php"`; do
    echo $i | sed -e 's/\/usr\/share\/php5\/ZendFramework\/library\///' >> output.txt
    grep 'require_once' $i | sed -e 's/^\s*//g' -e 's/require_once //' \
        -e "s/[';]//g" -e 's/"//g' -e 's/^/#/' >> output.txt;
done;
The script above produces a file which looks something like
...
Zend/Form/Element.php
#Zend/Filter.php
#Zend/Validate/Interface.php
#Zend/Form/Exception.php
#Zend/Form/Exception.php
#Zend/Form.php
...
meaning that in the file Zend/Form/Element.php, there exists the statement require_once 'Zend/Filter.php'.
The second step is to run through this file with PHP, determine components and count |V| and |E|.
<?php
// process_shallow.php
$file = file('output.txt');
$vertices = array();
$edges = array();
$cur = null;
foreach ($file as $line)
{
    $line = trim($line);
    $line = preg_replace('/\.php$/', '', $line);
    // we only want the first two parts of the filename
    $parts = explode('/', $line);
    $component = $parts[0] . '_' . $parts[1];
    // is this line a vertex or an edge?
    if (! preg_match('/^#/', $component))
    {
        // this is a vertex
        $cur = $component;
        if (! in_array($component, $vertices))
        {
            $vertices[] = $component;
        }
    }
    else
    {
        // this is an edge
        $component = preg_replace('/^#/', '', $component);
        $edge = array($cur, $component);
        if (! in_array($edge, $edges) && $cur != $component)
        {
            $edges[] = $edge;
        }
    }
}
echo sizeof($vertices).' vertices and '.sizeof($edges).' edges'."\n";
// for 1.5.1, output is
// 43 vertices and 138 edges
// D = 2 * 138 / (43 * 42) = 276 / 1806 = 0.1528
It's interesting that ZF is more coupled at the component level than it is at the class level. This is arguably because much of the functionality offered by the framework for a particular component is pulled in at runtime, and is not forced on the user in the code. Furthermore, density values are rather arbitrary when viewed in isolation. It would be a useful exercise to run the same test on other frameworks and to compare values.
Visualizing ZF shallow coupling
By modifying the above PHP script, it's possible to generate a graphviz-compatible directed graph file.
<?php
// ... insert above code but remove the last echo ...
echo 'digraph D {' . "\n";
foreach ($edges as $edge)
{
    echo "\t" . '"' . $edge[0] . '" -> "' . $edge[1] . '";' . "\n";
}
echo '}';
You can now run that file and pipe it to dot, graphviz's graphing tool:
php process_shallow.php | dot -Tpng -o zf_coupling.png
The result can be seen here (738KB PNG).
Discussion
Koen
Jun 7, 2008 5:27:35 PM
That's a lot of coupling. Even when components can be used in isolation, you'll have to be careful that, when you use them as such, you don't miss a case where the component does need another component (a dependency which you missed).
Neil Garb
Jun 7, 2008 5:31:04 PM
Thanks for your comments, Koen. As I mentioned, these results have little meaning in isolation -- it's absolutely expected for there to be a degree of coupling within a solid framework. It's not a bad thing -- it just means that out of the box, the framework's components can work well together.
Another modification which would have an impact on results is to consider only private or protected functions. Public functions are more likely to be 'optional', and are more likely to be the point of contact between two components.
Federico
Jun 8, 2008 11:26:06 AM
Neil, genius! Great way of getting all the dependencies, you saved me 1 or 2 days of planning and coding, as I was about to do the same. Great use of sed :)
Neil Garb
Jun 8, 2008 1:53:47 PM
Thanks, Federico. Glad you found this useful. To be honest I cheated a bit -- the regular expressions I used picked up only require_once 'file.php', and not require_once('file.php'). I manually edited my output.txt file and fixed the different formats (there weren't many).