In this post I will show how you can leverage PHPStan to constrain your existing classes for better type safety. This may sound abstract at first, but I do have a particular application in mind that I want to showcase.
Let me set the background for you:
I am building an ecommerce application. It has various entities like products, orders, invoices, customers, and addresses which are related to each other.
Examples:
- A customer has zero or more addresses.
- An address belongs to a single customer.
- A customer has zero or more orders.
My goal is to represent entities and relations in PHP in a type-safe manner. In particular, I want to be able to represent selectively loaded relations in the type system. For example, a type for “customer with loaded addresses but without orders” which can be statically checked with PHPStan.
In this post I will be working with the following product entity as an example to illustrate my points:
readonly class Product {
public function __construct(
public string $name,
public int $price,
public Category $category,
public ?Manufacturer $manufacturer,
public array $reviews,
) {}
}
The Product has the properties name and price as well as the following relations:
- A single required Category entity.
- A single optional Manufacturer entity.
- Zero or more Review entities.
The Problem: Selectively loading relations
The product is used in various places like in a detail page and list page. Pages generally only need to load a few select relations. The list page, for instance, only needs the category and manufacturer information, but no reviews.
Currently the Product class does not support selectively loaded relations because the types are too strict. Let me fix that:
enum NotLoaded { case Value; }
readonly class Product {
public function __construct(
public string $name,
public int $price,
public Category|NotLoaded $category,
public Manufacturer|null|NotLoaded $manufacturer,
public array|NotLoaded $reviews,
) {}
}
A relation can now have the enum value NotLoaded::Value. It allows us to identify which relations are loaded, and which are not. But this comes at a cost. The additional loading state makes reading relations more complex. Here is what one would have to do:
function renderProduct(Product $product): void {
// Fail early if a required relation was not loaded.
if ($product->category instanceof NotLoaded) {
throw new LogicException('Category not loaded');
}
if ($product->manufacturer instanceof NotLoaded) {
throw new LogicException('Manufacturer not loaded');
}
$manufacturerInformation = match ($product->manufacturer) {
null => ' ',
default => " made by {$product->manufacturer->name} ",
};
// Nested property access needs the {$...} syntax inside a string.
echo "[{$product->category->name}] {$product->name}{$manufacturerInformation}for $ {$product->price}\n";
}
The function renderProduct can only work correctly if the category and manufacturer relations are loaded. Assertions ensure that this precondition is satisfied. However, with a setup like this, PHPStan is unable to detect when renderProduct is erroneously called with insufficiently loaded relations.
I really want the relation requirements to be visible in the type system in order to statically verify that renderProduct() is called correctly. The type of the parameter $product must be strict enough to guarantee that category and manufacturer relations are loaded.
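To make the gap concrete, here is a minimal sketch, assuming PHP 8.1 or later. The classes are trimmed versions of the ones above, and categoryLabel() is a hypothetical helper that is not part of the original code. The declared types accept a product without a loaded category, so the mistake only shows up when the code runs.

```php
<?php
declare(strict_types=1);

enum NotLoaded { case Value; }

final class Category {
    public function __construct(public readonly string $name) {}
}

final class Manufacturer {
    public function __construct(public readonly string $name) {}
}

final class Product {
    public function __construct(
        public readonly string $name,
        public readonly int $price,
        public readonly Category|NotLoaded $category,
        public readonly Manufacturer|null|NotLoaded $manufacturer,
    ) {}
}

// Hypothetical helper: needs the category relation to be loaded.
function categoryLabel(Product $product): string {
    if ($product->category instanceof NotLoaded) {
        throw new LogicException('Category not loaded');
    }
    return "[{$product->category->name}]";
}

// The declared parameter type is just Product, so this call looks valid.
$partial = new Product('Desk lamp', 2499, NotLoaded::Value, null);

try {
    echo categoryLabel($partial), "\n";
} catch (LogicException $e) {
    // The missing relation is only noticed when this code path executes.
    echo 'Runtime failure: ', $e->getMessage(), "\n";
}
```

Running the script prints "Runtime failure: Category not loaded". Nothing in the signature of categoryLabel() tells a static analyzer that the call is wrong.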
I came up with a couple of approaches to tackle this issue.
Approach 1: Class per use-case
We can simply create a class for each combination of required relations.
readonly class ProductForListPage {
public function __construct(
public string $name,
public int $price,
public Category $category,
public Manufacturer|null $manufacturer,
) {}
}
readonly class ProductForDetailPage {
public function __construct(
public string $name,
public int $price,
public Category $category,
public Manufacturer|null $manufacturer,
public array $reviews,
) {}
}
The main advantage of this approach, besides statically guaranteed type-safety, is that it is easy to understand.
However, this approach has a couple of disadvantages:
- Tool-assisted refactoring and code analysis is made more difficult. Since there is no link between the two product classes in the language, tools do not see the logical connection between shared properties like $name.
- Flexibility is limited. Say a function requires a product with a loaded category. What type should the function parameter have? Maybe the union of all product classes with a loaded category, ProductForListPage|ProductForDetailPage? But then we would be spreading implementation details to places where they do not belong.
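The flexibility problem can be sketched as follows, assuming PHP 8.1 or later. The classes are cut down to the properties needed here, reviews are plain strings for brevity, and categoryName() is a hypothetical consumer.

```php
<?php
declare(strict_types=1);

final class Category {
    public function __construct(public readonly string $name) {}
}

final class ProductForListPage {
    public function __construct(
        public readonly string $name,
        public readonly Category $category,
    ) {}
}

final class ProductForDetailPage {
    /** @param array<string> $reviews */
    public function __construct(
        public readonly string $name,
        public readonly Category $category,
        public readonly array $reviews,
    ) {}
}

// Hypothetical consumer that only needs a loaded category.
// Every new use-case class with a category has to be added to this union,
// here and in every other function with the same requirement.
function categoryName(ProductForListPage|ProductForDetailPage $product): string {
    return $product->category->name;
}

$category = new Category('Lighting');
echo categoryName(new ProductForListPage('Desk lamp', $category)), "\n";
echo categoryName(new ProductForDetailPage('Desk lamp', $category, [])), "\n";
```

Both calls print "Lighting". The function works, but its signature now lists page-specific classes it should not have to know about.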
Approach 2: Interfaces
A standard way to address the disadvantages from our first approach would be inheritance—or better yet, interfaces.
interface ProductBase {
public string $name { get; }
public int $price { get; }
}
interface ProductWithLoadedCategory extends ProductBase {
public Category $category { get; }
}
interface ProductWithLoadedManufacturer extends ProductBase {
public ?Manufacturer $manufacturer { get; }
}
interface ProductWithLoadedReviews extends ProductBase {
/** @var array<Review> */
public array $reviews { get; }
}
readonly class ProductForListPage
implements
ProductBase,
ProductWithLoadedCategory,
ProductWithLoadedManufacturer
{
public function __construct(
public string $name,
public int $price,
public Category $category,
public ?Manufacturer $manufacturer,
) {}
}
readonly class ProductForDetailPage
implements
ProductBase,
ProductWithLoadedCategory,
ProductWithLoadedManufacturer,
ProductWithLoadedReviews
{
/**
* @param array<Review> $reviews
*/
public function __construct(
public string $name,
public int $price,
public Category $category,
public Manufacturer|null $manufacturer,
public array $reviews,
// ...
) {}
}
Here I am using property hooks, which were introduced in PHP 8.4, for brevity, but it works just fine with good old trusty getter functions.
The common properties of the product classes are now linked by implementing common interfaces. Furthermore, since all relations have their own dedicated interface, it is possible to compose types that require a specific set of loaded relations:
function needsProductWithCategoryAndReviews(
ProductWithLoadedCategory&ProductWithLoadedReviews $product
): void {
}
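Here is a small runnable sketch of such a composed type. It assumes PHP 8.1 or later (native intersection types) and uses getter methods instead of property hooks so that it does not require PHP 8.4. ProductForTeaser and summarize() are hypothetical names, and reviews are plain strings for brevity.

```php
<?php
declare(strict_types=1);

final class Category {
    public function __construct(public readonly string $name) {}
}

interface ProductBase {
    public function getName(): string;
}

interface ProductWithLoadedCategory extends ProductBase {
    public function getCategory(): Category;
}

interface ProductWithLoadedReviews extends ProductBase {
    /** @return array<string> */
    public function getReviews(): array;
}

// Hypothetical class that loads the category but no reviews.
final class ProductForTeaser implements ProductWithLoadedCategory {
    public function __construct(
        private readonly string $name,
        private readonly Category $category,
    ) {}
    public function getName(): string { return $this->name; }
    public function getCategory(): Category { return $this->category; }
}

final class ProductForDetailPage implements ProductWithLoadedCategory, ProductWithLoadedReviews {
    /** @param array<string> $reviews */
    public function __construct(
        private readonly string $name,
        private readonly Category $category,
        private readonly array $reviews,
    ) {}
    public function getName(): string { return $this->name; }
    public function getCategory(): Category { return $this->category; }
    public function getReviews(): array { return $this->reviews; }
}

// The intersection type demands both relations at once.
function summarize(ProductWithLoadedCategory&ProductWithLoadedReviews $product): string {
    return sprintf(
        '%s in %s, %d review(s)',
        $product->getName(),
        $product->getCategory()->name,
        count($product->getReviews()),
    );
}

$category = new Category('Lighting');
echo summarize(new ProductForDetailPage('Desk lamp', $category, ['Great'])), "\n";

try {
    // ProductForTeaser does not implement ProductWithLoadedReviews.
    summarize(new ProductForTeaser('Desk lamp', $category));
} catch (TypeError $e) {
    echo "Rejected: reviews are not loaded\n";
}
```

Because the requirement is part of the native signature, the second call fails with a TypeError at runtime, and a static analyzer can report it before the code ever runs.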
This works well enough for the first level of relations. But what about nested relations? Let’s add an author relation to the review. Now we can have products with/without reviews with/without an author.
Here is how I would model it:
interface ReviewBase {
public string $text { get; }
public int $rating { get; }
}
interface ReviewWithLoadedAuthor extends ReviewBase {
public Author $author { get; }
}
/**
* @template-covariant TReview of ReviewBase = ReviewBase
*/
interface ProductWithLoadedReviews extends ProductBase {
/** @var array<TReview> */
public array $reviews { get; }
}
/**
* @implements ProductWithLoadedReviews<ReviewWithLoadedAuthor>
*/
class ProductWithReviewsWithAuthors
implements
ProductBase,
ProductWithLoadedReviews
{
/**
* @param array<ReviewWithLoadedAuthor> $reviews
*/
public function __construct(
public string $name,
public int $price,
public array $reviews,
// ...
) {}
}
By making the interface ProductWithLoadedReviews generic over the type of review, it is possible to have products with reviews with authors and products with reviews without authors.
What do I make of this approach?
It’s horrible.
I believe it is ultimately futile to simulate structural type semantics on top of PHP’s nominal type system. When applied in this generalized manner, the approach results in code that nobody wants to read or write.
Approach 3: Generics all the way
While the previous approach does not qualify as a general solution, which is what I am after, we can still learn something from it. It used generics to parameterize a product over the type of reviews it has. We can generalize this concept and introduce a template parameter for each relation. The template parameter is then used to constrain the relation type.
I will go into more detail later. First, here is the new code:
/**
* @template-covariant TAuthor of Author|NotLoaded = Author|NotLoaded
*/
class Review {
/**
* @param TAuthor $author
*/
public function __construct(
public readonly string $text,
public readonly int $rating,
public readonly Author|NotLoaded $author,
) {}
}
/**
* @template-covariant TCategory of Category|NotLoaded = Category|NotLoaded
* @template-covariant TManufacturer of Manufacturer|null|NotLoaded = Manufacturer|null|NotLoaded
* @template-covariant TReviews of array<Review>|NotLoaded = array<Review>|NotLoaded
*/
class Product {
/**
* @param TCategory $category
* @param TManufacturer $manufacturer
* @param TReviews $reviews
*/
public function __construct(
public readonly string $name,
public readonly int $price,
public readonly Category|NotLoaded $category,
public readonly Manufacturer|null|NotLoaded $manufacturer,
public readonly array|NotLoaded $reviews,
) {}
}
At first glance the templating may look intimidating but all templates follow the same pattern. Let’s take TManufacturer as an example and break it down line by line:
@template-covariant TManufacturer
of Manufacturer|null|NotLoaded
= Manufacturer|null|NotLoaded
Line 1 declares the template TManufacturer. The covariant modifier is obligatory but you can ignore it if you are not already familiar with it.
Line 2 constrains the possible types that TManufacturer can take on. In this case TManufacturer must match Manufacturer|null|NotLoaded. If you need the manufacturer information to be loaded, you should specify Manufacturer|null.
Line 3 defines the default type of TManufacturer and is just there for convenience. If you do not care about the manufacturer, you can use the default Manufacturer|null|NotLoaded, meaning “no guarantees”.
Here are some example usages of the generic product:
/**
* I do not care about the category.
* The manufacturer must be loaded.
* @param Product<*, Manufacturer|null> $p
*/
function func1(Product $p) {}
/**
* The manufacturer must be loaded and not null.
* @param Product<*, Manufacturer> $p
*/
function func2(Product $p) {}
/**
* Reviews must be loaded with their authors.
* @param Product<*, *, array<Review<Author>>> $p
*/
function func3(Product $p) {}
/**
* Reviews must be loaded but I do not care about their authors.
* @param Product<*, *, array<Review>> $p
*/
function func4(Product $p) {}
/**
* I do not care about any relations.
*/
function func5(Product $p) {}
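For illustration, this is roughly how a call site could look. It is a sketch under a few assumptions: PHP 8.1 or later, a Product trimmed to two relations, and a hypothetical consumer manufacturerLabel(). The comments describe what I would expect PHPStan to report; I have not reproduced its exact messages here.

```php
<?php
declare(strict_types=1);

enum NotLoaded { case Value; }

final class Category {
    public function __construct(public readonly string $name) {}
}

final class Manufacturer {
    public function __construct(public readonly string $name) {}
}

/**
 * Trimmed to two relations for brevity.
 *
 * @template-covariant TCategory of Category|NotLoaded = Category|NotLoaded
 * @template-covariant TManufacturer of Manufacturer|null|NotLoaded = Manufacturer|null|NotLoaded
 */
final class Product {
    /**
     * @param TCategory $category
     * @param TManufacturer $manufacturer
     */
    public function __construct(
        public readonly string $name,
        public readonly Category|NotLoaded $category,
        public readonly Manufacturer|null|NotLoaded $manufacturer,
    ) {}
}

/**
 * Hypothetical consumer: the manufacturer relation must be loaded.
 *
 * @param Product<*, Manufacturer|null> $p
 */
function manufacturerLabel(Product $p): string {
    return $p->manufacturer === null ? 'no manufacturer' : $p->manufacturer->name;
}

// Category is not loaded, manufacturer is: this satisfies the requirement.
$listProduct = new Product('Desk lamp', NotLoaded::Value, new Manufacturer('Acme'));
echo manufacturerLabel($listProduct), "\n";

// Neither relation is loaded. Static analysis should reject the call below,
// because NotLoaded is not part of Manufacturer|null.
$bareProduct = new Product('Desk lamp', NotLoaded::Value, NotLoaded::Value);
// manufacturerLabel($bareProduct);
```

The first call prints "Acme". The second call is commented out because it is exactly the kind of mistake the template parameters are meant to catch before runtime.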
I quite like this solution.
We are back to just a single class per entity, which is nice. It scales well with any number of relations and any nesting depth. Heck, we could even make every property a template if we really wanted to.
Still, there are two things I do not like:
For starters, runtime type-safety is limited. Templated types in combination with static analysis are great, but the runtime only cares about the actual type declaration. If the declared type is Manufacturer|null|NotLoaded, then this is the only thing the runtime will check. Generics are effectively erased at runtime.
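The erasure is easy to observe in a minimal sketch, assuming PHP 8.1 or later and a Product trimmed to a single relation. manufacturerRuntimeType() is a hypothetical function whose docblock demands a loaded, non-null manufacturer.

```php
<?php
declare(strict_types=1);

enum NotLoaded { case Value; }

final class Manufacturer {
    public function __construct(public readonly string $name) {}
}

/**
 * @template-covariant TManufacturer of Manufacturer|null|NotLoaded = Manufacturer|null|NotLoaded
 */
final class Product {
    /** @param TManufacturer $manufacturer */
    public function __construct(
        public readonly string $name,
        public readonly Manufacturer|null|NotLoaded $manufacturer,
    ) {}
}

/**
 * The docblock asks for a loaded, non-null manufacturer,
 * but the native signature only says Product.
 *
 * @param Product<Manufacturer> $p
 */
function manufacturerRuntimeType(Product $p): string {
    return get_debug_type($p->manufacturer);
}

// No TypeError: PHP only checks the native parameter type Product.
echo manufacturerRuntimeType(new Product('Desk lamp', NotLoaded::Value)), "\n";
```

The script prints "NotLoaded". A static analyzer should flag the call, but if the analysis is skipped or suppressed, nothing stops the wrong value from getting through.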
However, my biggest criticism is about the generic type notation. In real world applications, entities have many relations. Imagine a Product with ten relations. If we only needed the 10th relation of type Color to be loaded, then we would have to write the type like this:
/**
* @param Product<*,*,*,*,*,*,*,*,*,Color> $p
*/
function ugly(Product $p) {}
Now imagine having two or more relations of the same type Color. You would have to remember the exact positions to distinguish relations of the same type.
/**
* @param Product<*,*,*,Color,*,*,*,*,*,Color> $p
*/
function confusing(Product $p) {}
The problem stems from template parameters being strictly positional. It would be great if we had named template parameters. In the case of regular PHP functions, named parameters really elevated the usability of default parameter values. The same would also be true for template parameters:
/**
* This is pseudo code.
* @param Product<TColor = Color> $p
*/
function sweet(Product $p) {}
Approach 4: Object shapes to the rescue
While generics come pretty close to providing structural type semantics, we might as well take a look at the two real structural types that PHPStan provides: array shapes and object shapes (function types are also structural but they are of no use for us here).
First of all, let me say that I am not planning to replace our product class with ad-hoc shapes.
/**
* @param object{name: string, price: int, ...} $product
*/
function doNotWriteThisCode(object $product): void {}
We lose all runtime type checking as well as tool-assisted refactoring and analysis. This is a maintenance nightmare.
Instead, we can use object shape intersections to constrain our existing class types. If you are familiar with TypeScript, then the following will look familiar:
class MyClass {
public function __construct(
public readonly int|null $value,
) {}
}
/**
* @param MyClass&object{value:int} $mc
*/
function myFunc(MyClass $mc): void {
\PHPStan\dumpType($mc->value); // int
}
By intersecting a class with an object shape, we are effectively narrowing common properties down to their largest common subtype: The type of value inside the function is equal to (int|null)&int which simplifies to just int.
It also works equally well with nested objects:
class MyClass {
public function __construct(
public readonly int|null $value,
public readonly MyNestedClass $nested,
) {}
}
class MyNestedClass {
public function __construct(
public readonly string|null $value,
) {}
}
/**
* @param MyClass&object{nested:object{value:string}} $mc
*/
function myFunc(MyClass $mc): void {
\PHPStan\dumpType($mc->nested->value); // string
}
Alright. Looking good so far but there are some weak points.
For one, object shapes are not type-checked at runtime, same as with templates. Second, there is no IDE support for property renaming across classes and intersected object shapes. Refactoring in general will be more difficult.
Overall I believe the object shape approach can work well in practice.
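Applied to the product example from the beginning, the parameter side could look like the sketch below. It assumes PHP 8.1 or later and a Product trimmed to two relations, and it returns the string instead of echoing it. How call sites obtain a value that PHPStan recognizes as this narrowed type (for instance from a loader whose return type carries the same shape) is an open question that the sketch does not address.

```php
<?php
declare(strict_types=1);

enum NotLoaded { case Value; }

final class Category {
    public function __construct(public readonly string $name) {}
}

final class Manufacturer {
    public function __construct(public readonly string $name) {}
}

final class Product {
    public function __construct(
        public readonly string $name,
        public readonly int $price,
        public readonly Category|NotLoaded $category,
        public readonly Manufacturer|null|NotLoaded $manufacturer,
    ) {}
}

/**
 * The object shape removes NotLoaded from both relations,
 * so no loading assertions are needed inside the function.
 *
 * @param Product&object{category: Category, manufacturer: Manufacturer|null} $product
 */
function renderProduct(Product $product): string {
    $madeBy = $product->manufacturer === null
        ? ''
        : " made by {$product->manufacturer->name}";

    return sprintf(
        '[%s] %s%s for $ %d',
        $product->category->name,
        $product->name,
        $madeBy,
        $product->price,
    );
}

$product = new Product('Desk lamp', 2499, new Category('Lighting'), new Manufacturer('Acme'));
echo renderProduct($product), "\n";
```

At runtime this prints "[Lighting] Desk lamp made by Acme for $ 2499". The native signature still says Product, so the shape is a purely static constraint.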
Summary and outlook
The goal was to find a universal mechanism to constrain the types of class properties. Generics and object shape intersections both turned out to be promising, although generics would only be practical if we had named template parameters.
What I plan to explore in future posts is the viability of applying these concepts in practice. The proof of the pudding is in the eating.