Detecting cyclic arrays in PHP – A new approach

Part 2 of series Detecting cyclic arrays in PHP.

In Part 1 of this series I explained the problem of detecting cyclic arrays, and why it cannot be done in pure PHP. I also promised to present a solution that is both simple and correct for all possible arrays, which is what this post is about. But before that, let me restate the definition of cyclic arrays from the previous post: An array is cyclic if iterating all of its items and subitems does not terminate.

Here is an example of a cyclic array:

$array = [1, [2, 3]];
$array[1][1] = &$array;

Did you know?

The PHP standard library already performs checks to see if arrays are cyclic in order to avoid infinite recursion. One such example is json_encode (run code):

$array = [1, [2, 3]];
$array[1][1] = &$array;
json_encode($array);
echo json_last_error_msg() . ', ' . json_last_error ();
Recursion detected, 6

This already gives us an easy and correct solution, right (run code)?

function is_cyclic(array &$array) {
    json_encode($array);
    return json_last_error() === JSON_ERROR_RECURSION;
}
Stop sign
Photo by Wendelin Jacober from Pexels

We can do better

While being technically correct, serializing an array to JSON just to find out whether it may or may not be cyclic sounds wasteful. Thankfully, there are other functions that we can use. The one that I think has the least overhead is good ol’ count. It takes a not very well known second parameter, called $mode, that can either have the value COUNT_NORMAL or COUNT_RECURSIVE. This is what happens when we recursively count a cyclic array (run code):

$array = [1, [2, 3]];
$array[1][1] = &$array;
count($array, COUNT_RECURSIVE);
Warning: count(): Recursion detected

The documentation for count says that it will emit an E_WARNING when called with COUNT_RECURSIVE on a cyclic array. With this knowledge, we can set up a custom error handler to trap the warning and check if it was indeed caused by a cyclic array (run code).

function is_cyclic(&$array) {
    $isRecursive = false;
    set_error_handler(function ($errno, $errstr) use (&$isRecursive) {
        $isRecursive = $errno === E_WARNING && mb_stripos($errstr, 'recursion');
    });
    try {
        count($array, COUNT_RECURSIVE);    
    } finally {
        restore_error_handler();
    }
    return $isRecursive;
}

This is already our complete solution. Pretty neat.

Does this really work?

Yep, it does. There is, however, one important thing to consider: PHP is known for changing warnings into exceptions with new versions, and in fact, that is what happened with the upgrade from PHP 7 to PHP 8. This means that in the future the function may need to be slightly adjusted, which I am comfortable with. More importantly, this solution also works for the edge case that was described in Part 1.

Performance is also excellent because the count function barely adds any overhead (see count source code). In my benchmarks, it is considerably faster than a pure PHP solution, on top of being correct.

Does it also work for objects?

Sadly, there is no function like count to efficiently detect recursive objects. There is json_encode and var_dump, but as I already pointed out, they do a lot of unnecessary work. It would be nice if the standard library exposed a function to detect cycles in both arrays and objects.

That being said, lacking a library function to detect cycles in objects is noncritical since it is possible to implement a working algorithm in plain PHP.

Wrapping up

While coming up with my solution I did a lot of looking around in the PHP source code. I wanted to find out how the runtime is able to detect cycles and where this is surfaced in the standard library. It was only through this that I was even able to find out about the count function’s second parameter. I would also encourage everyone to dig a bit deeper and see for yourself how the sausage is made. Even if you have never written a single line of C code, you will probably be able to understand the gist of it and find something interesting.

I wish people were more fearless. There is too many people who don’t understand that it’s just code. It may not be the code that you wrote, but it’s just code. It follows the same rules. It may be in language that you are not particularly familiar with. But it’s still code running on a von Neumann architecture. There is nothing magic here.

Larry Osterman, https://channel9.msdn.com/Shows/Checking-In-with-Erik-Meijer/Checking-In-Larry-Osterman-26-Years-of-Programming-at-Microsoft

Code

You can find tests and benchmarks for the function on GitHub. I also included the broken implementation from Part 1 as well as the json_encode variant for comparison.

Series: Detecting cyclic arrays in PHP

One thought on “Detecting cyclic arrays in PHP – A new approach”

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.