Type safety when working with unknown data

07 May, 2024

Thanks to TypeScript, we can increase the type safety of our JavaScript applications. That doesn't mean that our TypeScript programs cannot "go wrong": most TypeScript programs are still anything but well-typed.¹

Explicit and Implicit Pitfalls

There are various pitfalls when working with TypeScript. With type assertions, the non-null assertion operator, or the any type, we effectively turn off type checking for parts of our program. Avoiding these pitfalls is fairly simple: we avoid, or even forbid, the explicit use of these features.²

But there are parts of our programs that are much harder to control. When receiving unknown data from any external source like a file or an API, we often just trust that the data we get is correct.

Let's think of some oversimplified JSON data we receive from somewhere.

const result: number = JSON.parse('{ "string": 5 }')

console.log(`The result type is: ${typeof(result)} 🎉`)

The parsed value is an object, so the logged type is "object", neither number nor string. Yet we told TypeScript to treat it as a number, and it did not complain. We can do this because JSON.parse() is declared to return the any type. Unfortunately, the noImplicitAny compiler flag doesn't catch such cases, because this any is declared explicitly rather than inferred.

A good first step toward more type safety in such situations is to tell TypeScript the truth.

const result: unknown = JSON.parse('{ "string": 5 }')

// 👇 We can also overwrite the return type on a global level.
declare global {
    interface JSON {
        parse(text: string): unknown;
    }
}

Like the response delivered by any HTTP client, the result of JSON.parse() can be anything, including primitives or null.³
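
Once a value is typed as unknown, TypeScript forces us to narrow it before we can use it. The following is a minimal sketch of such narrowing, assuming TypeScript 4.9 or later (for narrowing with the in operator). The helper name readCount is made up for illustration and is not part of any library.

```typescript
// Hypothetical helper: reads the numeric `string` property from untrusted JSON text.
function readCount(jsonText: string): number | undefined {
    const parsed: unknown = JSON.parse(jsonText);

    // Rule out primitives and null before looking at properties.
    if (typeof parsed !== "object" || parsed === null) {
        return undefined;
    }

    // The `in` check narrows `parsed` so that the property can be read as `unknown`.
    if (!("string" in parsed)) {
        return undefined;
    }

    const value: unknown = parsed.string;
    return typeof value === "number" ? value : undefined;
}

console.log(readCount('{ "string": 5 }')); // 5
console.log(readCount('null'));            // undefined
```

Every step that TypeScript would otherwise have taken on trust now has to be written down as a runtime check.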

In a more realistic scenario, we may have some type definition that describes what we expect to get delivered from an external API endpoint.

type Client = {
	id: string,
	displayName: string,
	zipCode: number
}

The problem is that, by design, TypeScript cannot guarantee that the data we receive from an external data source conforms to our type definition.⁴ Types and interfaces don't exist at runtime. If we want to achieve more type safety, we must validate the data ourselves.
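
To illustrate that types are erased at runtime, here is a small sketch with an invented payload in which zipCode arrives as a string. The payload and the variable names are assumptions made for this example only.

```typescript
type Client = {
    id: string;
    displayName: string;
    zipCode: number;
};

// Hypothetical payload: `zipCode` is a string here, not a number.
const payload = '{ "id": "c-1", "displayName": "Ada", "zipCode": "12345" }';

// This compiles because JSON.parse() returns `any`. Nothing is checked at runtime.
const client: Client = JSON.parse(payload);

// Statically, `zipCode` is a number. At runtime, it is still a string.
const runtimeType = typeof client.zipCode;

// TypeScript sees an addition of numbers, but JavaScript concatenates strings.
const sum = client.zipCode + 1;

console.log(runtimeType); // "string"
console.log(sum);         // "123451"
```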

Validating unknown data

TypeScript allows us to write functions that seem ideal for validating unknown data. So-called type guards are functions whose return type is a type predicate. A type predicate tells TypeScript that when our type guard function returns true, the function's parameter is of a given type. TypeScript does not verify that the function body actually checks anything, though. In other words, a type guard that just returns true is nothing more than a type assertion.

function isClient(something: unknown): something is Client {
	// 👇 DON'T DO THIS!
	return true
}

const something: unknown = { id: 25 }

if (isClient(something)) {
	// TypeScript now treats `something` as a `Client`.
	console.log(something.zipCode)
}

If we want to sleep well, our type guard should validate that the given parameter is an object with several properties of specific types.

function isClient(something: unknown): something is Client {
	return typeof something === 'object' &&
		something !== null &&
		'id' in something &&
		typeof something.id === 'string' &&
		'displayName' in something &&
		typeof something.displayName === 'string' &&
		'zipCode' in something &&
		typeof something.zipCode === 'number'
}

This is already quite complex and very error-prone, even for such a simple type definition. There is no connection to our actual type definition: if we check for the wrong property names or expect the wrong types, TypeScript will not complain.

It is understandable that people often skip this step and just trust the data source.
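
A sketch of how such a guard can silently drift away from the type it claims to check, assuming TypeScript 4.9 or later. The function isClientBroken and the sample object are invented for illustration. The guard looks for zipcode (lowercase c) as a string, yet it compiles without any warning.

```typescript
type Client = {
    id: string;
    displayName: string;
    zipCode: number;
};

// Deliberately flawed guard: wrong property name and wrong expected type.
function isClientBroken(something: unknown): something is Client {
    return typeof something === "object" &&
        something !== null &&
        "id" in something &&
        typeof something.id === "string" &&
        "displayName" in something &&
        typeof something.displayName === "string" &&
        "zipcode" in something &&
        typeof something.zipcode === "string";
}

const wrong: unknown = { id: "c-1", displayName: "Ada", zipcode: "12345" };

// The guard accepts the object although it is not a valid `Client`.
const accepted = isClientBroken(wrong);
console.log(accepted); // true

if (isClientBroken(wrong)) {
    // Typed as `number`, but the property does not exist at runtime.
    console.log(wrong.zipCode); // undefined
}
```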

Connect Type Definition and Validation

A more practical approach would be to combine type definitions and their validation. In the following section, we use the Zod validation library to achieve exactly this.⁵

With Zod we can define validation schemas to describe the data our program expects.

import { z } from "zod";

const ClientSchema = z.object({
	id: z.string(),
	displayName: z.string(),
	zipCode: z.number()
});

With this schema, we can now write a type guard to check if some unknown data is a valid client response.

function isClient(something: unknown): something is Client {
	return ClientSchema.safeParse(something).success;
}

Or we can directly parse unknown data.

// 👇 This will throw an error if `something` is not a valid client response.
const response = ClientSchema.parse(something);
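
If throwing is not desired, safeParse() returns a result object that can be inspected instead. The following sketch assumes the zod package is installed. The helper parseClient and the shape of its return value are made up for this example and are not part of Zod.

```typescript
import { z } from "zod";

const ClientSchema = z.object({
    id: z.string(),
    displayName: z.string(),
    zipCode: z.number(),
});

// Hypothetical helper: returns the validated client or a list of readable problems.
function parseClient(input: unknown) {
    const result = ClientSchema.safeParse(input);

    if (result.success) {
        // `result.data` is typed according to the schema.
        return { ok: true as const, client: result.data };
    }

    // Each issue carries the path to the offending property and a message.
    const problems = result.error.issues.map(
        (issue) => `${issue.path.join(".")}: ${issue.message}`
    );
    return { ok: false as const, problems };
}

const good = parseClient({ id: "c-1", displayName: "Ada", zipCode: 12345 });
const bad = parseClient({ id: 25 });

console.log(good.ok); // true
console.log(bad.ok);  // false
```

Returning a discriminated union like this keeps the error handling explicit at the call site. Whether that is preferable to a thrown exception depends on the application.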

Since we already have a type definition of our client response, we can tell Zod to take it into account.

type Client = {
	id: string,
	displayName: string,
	zipCode: number
}  

const ClientSchema: z.ZodType<Client> = z.object({
	id: z.string(),
	displayName: z.string(),
	zipCode: z.string()
	// 👆 TypeScript will complain about this line,
	// 👆 because the ZIP code must be a number.
});

But we can also infer the client response type definition from the validation schema.

const ClientSchema = z.object({
	id: z.string(),
	displayName: z.string(),
	zipCode: z.number()
});

type Client = z.infer<typeof ClientSchema>;

That way, our validation schema and our type definition go hand in hand with almost no additional effort.

Zod offers a wide range of features that make it easy to describe even complex models. And even JSDoc comments don't get lost during the type inference.⁶

Conclusion

Handling unknown data in a type-safe manner can be time-consuming and error-prone. Because Zod can infer types from validation schemas, we can use the schemas themselves to model our data. This significantly increases type safety when working with unknown data, while the time investment remains almost the same.

¹ Milner, Robin (1978), "A Theory of Type Polymorphism in Programming", Journal of Computer and System Sciences, 17 (3): 348–375.
² Linters can help to avoid the explicit pitfalls, for example with rules such as no-explicit-any or consistent-type-assertions.
³ See the MDN Web Docs on JSON.parse().
⁴ Angular's HTTP client even offers generic functions that fool us into thinking the response really has the given type.
⁵ Other libraries will probably also support the described approach. A comparison between Zod and other validation libraries can be found on zod.dev.
⁶ At least Visual Studio Code displays JSDoc comments and block tags like @deprecated when using the inferred type.

#code_quality #type_safety #typescript