Migrating from immutable.js types to vanilla JS types

Author: Lasse Holmstedt

When we first started building our frontend at Alloy, we chose to use Immutable.js for our core data types. We were using JavaScript (ES2017) at the time, and having immutability guarantees was a key architectural choice. As we migrated to TypeScript, the language could provide these same guarantees for us, but we were stuck with immutable.js across all of our data types. It was not necessary to use it in our core type system, it had a runtime cost at load time, and added a layer of complexity to our frontend codebase we wanted to get rid of.

This article outlines how we successfully migrated away from immutable.js for our core data types.

Why move away from immutable.js?

Our earlier article outlines how we migrated from JS to TS. It also covers some of the problems we encountered on the way with immutable.js when used with our core data types – the types that we return from our backend’s REST API responses.

To be clear, Immutable.js is a great library and we still use it today, but it is not the best choice for your core data types, for a number of reasons. There is a performance impact in doing the JS -> immutable.js conversion, its core use case of immutability can be enforced with TypeScript’s type checker, and the additional APIs don’t offer much that modern JS versions or late-stage proposals don’t also offer.

Automatically generated TypeScript data type definitions

The first step we took to move away from immutable.js came from Gwylim, one of our engineers, during a hackathon. We found an elegant way to automatically generate TypeScript definitions for our REST API endpoints, based on two key technologies – OpenAPI annotations on the backend side, and TypeScript’s compiler API as the basis for a code generator.
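To make the idea concrete, here is a minimal sketch of turning an OpenAPI-style schema into a readonly TypeScript type. It uses plain string templates rather than the TypeScript compiler API mentioned above, and the schema shapes and the function names (primitiveToTs, generateType) are assumptions made for this example, not the actual generator.

```typescript
// Hypothetical, simplified subset of an OpenAPI 3.0 schema object.
type PrimitiveSchema = {
  type: 'string' | 'number' | 'integer' | 'boolean';
  nullable?: boolean;
};

type ObjectSchema = {
  type: 'object';
  properties: { [key: string]: PrimitiveSchema };
};

// Map an OpenAPI primitive to the TypeScript type it is rendered as.
function primitiveToTs(schema: PrimitiveSchema): string {
  const base = schema.type === 'integer' ? 'number' : schema.type;
  return schema.nullable ? `${base} | null` : base;
}

// Render one schema as an exported type alias whose fields are all readonly.
function generateType(name: string, schema: ObjectSchema): string {
  const fields = Object.entries(schema.properties).map(
    ([key, prop]) => `  readonly ${key}: ${primitiveToTs(prop)};`
  );
  return [`export type ${name} = {`, ...fields, '};'].join('\n');
}

const retailerSchema: ObjectSchema = {
  type: 'object',
  properties: {
    id: { type: 'integer', nullable: true },
    name: { type: 'string' },
    vendorId: { type: 'integer', nullable: true },
  },
};

// Source text of the generated type, ready to be written to a .ts file.
const generatedRetailer = generateType('Retailer', retailerSchema);
```

A real generator also has to handle nested objects, arrays and references between schemas, which is where a proper AST-based approach starts to pay off over string concatenation.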

We had already written the code for the core data types in a very predictable, mechanical way from the days of the initial JS -> TS conversion, and so the code lent itself to being automatically generated. However, the maintenance effort of updating a type definition in TypeScript as you’re touching the Java one, or vice versa, is low. As such, generating our types on the fly from a single source of truth wasn’t a key goal for a good while. But as your codebase grows, you’ll have more types, more engineers on board, and more risk of this being forgotten, potentially causing bugs. You’ll want to automate repetitive tasks like this away, at the right time.

For us, that right time came a bit by accident! At Alloy, we have quarterly hackathons that last two days each. One of our engineers took a day or two to explore code generation of API endpoint definitions with TypeScript, and in those couple of days, we had a working prototype ready. We suddenly had a fully working code generator for all of our REST API endpoints that would produce typesafe Promise APIs. Below is a sample of the generated API code:

export namespace Retailers {
  export function getAllRetailers(): Promise<ReadonlyArray<Retailer>> {
    return apiRequestWrapper('get', 'retailers', {}, {});
  }
  export function saveRetailer(requestBody: Retailer): Promise<Retailer> {
    return apiRequestWrapper('put', 'retailers', {}, {}, requestBody);
  }
}

You’d then invoke the API simply by calling Retailers.getAllRetailers().then(retailers => ...). This is straightforward, readable and typesafe, and it eliminates the risk of typos in API endpoint names and the like. This was great!

However, the corresponding generated types were not compatible with our then-existing types that used immutable.js. Below is an example of what the code generator would produce:

export type Retailer = {
  readonly id: number | null;
  readonly name: string;
  readonly vendorId: number | null;
};

However, below is what we had in the frontend at the time:

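import { List, Record } from 'immutable';

// IRetailer and T are defined elsewhere in the codebase.
export class Retailer extends Record({
  id: undefined,
  name: undefined,
  vendorId: undefined
}) {
  // Convert a raw REST response object into an Immutable.js record.
  static fromJS(value: IRetailer) {
    return new Retailer(value);
  }
  // Convert a raw REST response array into an Immutable.js List of records.
  static fromArray(array: any[]): List<Retailer> {
    return T.fromArray(array, Retailer);
  }
}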

We could have generated most of the above code as well with some more effort, but it would have been throwaway code. We wanted to move away from using immutable.js and get rid of these convention-based converter functions like fromJS, as TypeScript supported our core requirement for immutability just fine. As we got the generator up and running, we set our sights on the quest to move away from immutable.js, step by step.
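As a minimal sketch of what relying on the language for immutability looks like, the example below reuses the generated Retailer type from above. The renameRetailer helper and the sample values are hypothetical and only serve the illustration.

```typescript
type Retailer = {
  readonly id: number | null;
  readonly name: string;
  readonly vendorId: number | null;
};

// Updates build a new object with spread syntax instead of mutating the input.
function renameRetailer(retailer: Retailer, name: string): Retailer {
  return { ...retailer, name };
}

const original: Retailer = { id: 1, name: 'Acme', vendorId: null };
const renamed = renameRetailer(original, 'Acme Foods');

// original.name = 'Oops';
// ^ rejected at compile time: 'name' is a read-only property.
```

Note that readonly is checked by the compiler only. Nothing prevents mutation at runtime if the types are bypassed, for example through a cast to any, so this guarantee is only as strong as the type coverage of the codebase.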

All things equal

One of the neat features of immutability is that two fully immutable objects can be shallowly compared for equality, and it’s very fast. If the keys differ or the values of any keys differ between the two objects under comparison, you can fail fast. Immutable.js has lots of smarts around this, too. If you store an object in an immutable Set or a Map, the library has a hashing function whose results it caches. However, it only does this once the set size reaches a threshold of about 30 items – most collections don’t have many objects, and hashing is actually more expensive in these cases.

Immutable.js is very well architected, and its internal object comparison functions use shared code to do equality comparison. Immutable.js follows these semantics: for objects that have an implementation of equals and hashCode, an equality comparison is attempted. If these functions are not implemented, only referential equality is checked for. Deep equality checks are only done if the equals method implements those – such checks are not done for value objects. This makes a lot of sense if you are using immutable.js objects for all your needs, but if you want to use vanilla JS objects, deep equality enforced at the library level will lead to issues, especially if you use JS objects as keys. After all, {} !== {}.

The Map and Set data types from immutable.js are very useful and don’t have a great equivalent for us, due to the standard library equivalents being mutable. We had no intention of fully moving away from those in the application code, but wanted to keep the data types as pure TS wherever possible. As such, we needed to change those equality semantics in immutable.js.

When we were implementing this, it was 2020 and immutable.js was still stuck in a release candidate state, so we made the decision to fork and add a couple of patches of our own. There are deep equality libraries for JS objects already out there, so we leveraged fast-deep-equal. The code changes are simple, and all available at https://github.com/alloytech/immutable-js.
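The gap between referential and structural equality can be illustrated with a short sketch. The deepEqual function below is a simplified stand-in written for this example; it is not the implementation of fast-deep-equal, nor the code in the fork, and it ignores edge cases such as NaN, Dates and cyclic references.

```typescript
// Simplified structural equality for plain JSON-like values (illustration only).
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const aObj = a as { [key: string]: unknown };
  const bObj = b as { [key: string]: unknown };
  const aKeys = Object.keys(aObj);
  const bKeys = Object.keys(bObj);
  if (aKeys.length !== bKeys.length) return false;
  // Every key of `a` must exist on `b` with a structurally equal value.
  return aKeys.every(
    key => Object.prototype.hasOwnProperty.call(bObj, key) && deepEqual(aObj[key], bObj[key])
  );
}

const first = { id: 1, name: 'Acme', tags: ['food'] };
const second = { id: 1, name: 'Acme', tags: ['food'] };

const sameReference = first === second;          // false: two distinct objects
const sameStructure = deepEqual(first, second);  // true: same keys and values

// A native Map compares object keys by reference, so a structurally equal key misses.
const byReference = new Map<object, string>([[first, 'found']]);
const lookup = byReference.get(second);          // undefined
```

For a collection to treat such objects as the same key, its hashing has to be structural as well, so that equal objects land in the same bucket.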

Deep equality checks could quickly become slow if executed too often. This was a concern, but not a problem in our case. In practice, our largest objects persisted in the frontend never change, and so their hashes don’t need to be recomputed at runtime. In the end, we actually shaved hundreds of milliseconds off the frontend load time by not having to convert some of those deeply nested JS objects to immutable.js ones.

Forking libraries is something you’d rather avoid as much as possible – every fork increases your potential maintenance burden. We might have chosen a different approach if immutable.js had been regularly updated.

Converting types, one-by-one

At the outset, we had about 250 core data types – models for all the things the user sees in the frontend, from retailers, data payloads and raw files to models about marketing events and demand plans. Our frontend codebase was about 100k lines of strongly typed TypeScript code. So we started this journey with a lot of work ahead of us, if we wanted to call this done.

One of our core values is focusing on what matters – focusing on the highest-impact work. While having a robust data model in the frontend is important, this was not the only thing we needed to do, and arguably, the only user-facing issues one would see would be occasional long loading times or abrupt freezes when loading very large data sets into Alloy. So this was more of a pure engineering challenge, and putting a lot of time aside for just this would not have been right. Another core value we have is iterate to excellence – finding a way to ship something small first, and then taking steps to bring that work to completion, while constantly getting those results into production. That value applied well in a situation like this.

When you have a set of data models, they tend to form hierarchies and pyramid-like structures. All the baselayer data models that are reused frequently are the base of that pyramid. Examples of such models depend on the application. In Alloy’s case, these baselayer models include, for example, models for metrics that capture how sales numbers and the like are computed. At the top level of this pyramid are your high-level models. Again, in Alloy’s case, these top-level models capture things like the concept of a user-facing dashboard – what the user sees in the analytics UI. You need to start this type of conversion work from the bottom level, and work your way to the top.

The baselayer models would be like the Retailer model above: all of its properties are primitives, and there is no nesting going on. The conversion for such models was trivial and very mechanical. A substantial portion of our types leveraged primitive types like strings, numbers and booleans only, which made a lot of the work very easy.

The more interesting models were those at the second layer, like the example below:

type ICalendarEvent = {
  readonly createdAt: string | null;
  readonly expectedSalesRelativeLift: number | null;
  readonly products: ReadonlyArray<IAttributeFilter>;
  ...
}
export class CalendarEvent extends Record({
  createdAt: undefined,
  expectedSalesRelativeLift: undefined,
  products: undefined
}) {
  static fromJS(value: ICalendarEvent) {
    return new CalendarEvent(value).merge({
      products: AttributeFilter.fromArray(value.products)
    });
  }
  static fromArray(array: any[]): List<CalendarEvent> {
    return T.fromArray(array, CalendarEvent);
  }
}

While redacted for brevity, this marketing calendar event class models real-life marketing events, like “30% off 200g Nutella jars at Walmart on the week of 2020-05-04”. The model contains a list of product filters that allow the event to be scoped to one or more products. The filter model is also part of our data model and reused elsewhere.

The above example illustrates one of the targets of our migration – the use of container types. In Immutable.js, rather than JavaScript’s native arrays or objects, you instead use Lists, Sets, Maps, and so on. In order to reduce our dependence on Immutable.js with our core types as much as possible, we wanted to eliminate these usages – including more complex behaviour like sorting, filtering, mapping, and reducing. The Immutable.js APIs for these either have no equivalent in the JS standard library, or often differ from their stdlib counterparts in API or behaviour. There are many Immutable.js specific APIs we were using that do not exist in standard JS arrays. How do you do such a conversion at scale?

It turned out to be quite simple – we were able to rely on regular expressions and a few step-by-step actions with the safety of TypeScript covering our backs! By this time, our adoption of TypeScript had been complete for a year, we had no implicit anys in the codebase, and our dependencies all had typings thanks to the TS community. This allowed us to safely pull off this type of huge refactoring at will.
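As an illustration of the kind of mechanical rewrite this involves, the sketch below turns Immutable.js Record access such as .get('name') into plain property access. The pattern and the function name are assumptions made for this example, not the expressions actually used in the migration.

```typescript
// Hypothetical rewrite: Record access `.get('field')` becomes `.field`.
const RECORD_GETTER = /\.get\('([A-Za-z_$][\w$]*)'\)/g;

function rewriteRecordGetters(source: string): string {
  return source.replace(RECORD_GETTER, '.$1');
}

const before = "const label = retailer.get('name') + event.get('createdAt');";
const after = rewriteRecordGetters(before);
// after: "const label = retailer.name + event.createdAt;"
```

A regular expression cannot tell whether the receiver is a Record or a Map, so a rewrite like this will over-match. In a fully typed codebase that is acceptable, because the compiler flags every place where the rewritten access does not type-check, and those can then be fixed by hand.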

Immutable.js has convenient APIs we were using, like first(), last(), remove() and flatMap(), that JS-native arrays either lack or, in the case of flatMap(), only gained in ES2019. But there are libraries that fill these gaps, and writing our own helper functions for the few we needed was straightforward. Working through the issues one by one, investing in a dozen or so library functions that bring the convenience of immutable.js’s API to arrays in an immutable way, and then moving on was a winning strategy.
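A few such helpers might look like the sketch below. The names mirror the Immutable.js List methods they stand in for, but the implementations are simplified assumptions for illustration (for instance, remove() here does not support negative indices the way List.remove() does).

```typescript
// Hypothetical array helpers that never mutate their input.
function first<T>(items: ReadonlyArray<T>): T | undefined {
  return items[0];
}

function last<T>(items: ReadonlyArray<T>): T | undefined {
  return items[items.length - 1];
}

// Returns a copy without the element at `index`; out-of-range indices are a no-op.
function remove<T>(items: ReadonlyArray<T>, index: number): ReadonlyArray<T> {
  if (index < 0 || index >= items.length) return items;
  return [...items.slice(0, index), ...items.slice(index + 1)];
}

// Array.prototype.sort sorts in place, so copy first to keep the input untouched.
function sorted<T>(
  items: ReadonlyArray<T>,
  compare: (a: T, b: T) => number
): ReadonlyArray<T> {
  return [...items].sort(compare);
}

const names: ReadonlyArray<string> = ['Walmart', 'Target', 'Costco'];
const withoutTarget = remove(names, 1);
const alphabetical = sorted(names, (a, b) => a.localeCompare(b));
```

The sorted() helper shows the kind of behavioural difference that is easy to miss: List.sort() returns a new list, whereas the native array sort() reorders the array it is called on. Typing inputs as ReadonlyArray makes the compiler reject an accidental in-place sort.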

We touched tens of thousands of lines of code as a part of this refactoring, and we had no major problems on the way. The payoff came years later, but investing in fully typing our codebase paid off handsomely. It allowed us to execute this refactoring with confidence, spread the conversion out across four months, convert models either in batches or one by one depending on the amount of work needed, and finally review those changes in manageable chunks as well.

What could’ve been done better?

We shipped too much at once. In retrospect, one thing is obvious – we should absolutely have shipped the code generator with a bit of throwaway code, so that it would have generated API wrappers like the Retailers API and produced immutable.js-compatible objects from the start. We would have had to throw away a bit of that code later on, but “later on” was four months away, long enough to justify a few extra hours of work.

The problem was that for those four months, our codebase had two sources of truth for all our data types, which were often out of sync with each other. Inconsistency at such a fundamental level made contributing to the codebase confusing and frustrating at times, making it unclear which types you should be using and forcing you to sometimes litter type casts and guards in your code to pass the checker. This is less than ideal – you need to have consistency at the type system level. Shipping a bit of throwaway code would have helped us complete a key part of the migration first – invoking all of our REST API endpoints in a typesafe way, without explicit anys – and go from there with the rest of the journey. In the end, we had no real troubles during these four months, but looking back, moving slower could have reduced any potential risks even further.

Nullability issues with OpenAPI schema generation. We had started to use OpenAPI originally to produce online API documentation for the engineering team and for our stakeholders. To do this, we use swagger-core and related libraries on our Java backend side to annotate endpoint methods as well as the payload objects where necessary. As discussed above with e.g. the CalendarEvent example, our APIs often nest types. In these cases, it’s convenient to use OpenAPI’s references to refer to another object by using the $ref property. For example, this is what the output JSON might look like:

"conversionAttribute": {
  "$ref": "#/components/schemas/Attribute"
},

The above field could correspond to the below Java definition:

abstract class Convertable {
  abstract Attribute conversionAttribute();
}

But shockingly, it could also correspond to this:

abstract class Convertable {
  @Nullable abstract Attribute conversionAttribute();
}

That is, when your types are referencing other types in your schema, the nullability is ignored by the OpenAPI spec generator! The nullability is handled correctly with all primitives like strings, booleans and numbers, and also with arrays, even if the array itself is a nested type. However, when the ref types are used as-is, like in the example above, there are problems. Losing track of nullability is problematic and makes it much harder to turn on strict nullability checks in TypeScript. OpenAPI 3.1 fixes this problem and should allow us to substantially improve here.
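The effect on generated types can be sketched as follows. In OpenAPI 3.0, a reference object cannot carry sibling properties, so there is nowhere to put a nullable flag next to $ref. OpenAPI 3.1 builds on JSON Schema, where null can be listed as an explicit alternative. The schema shapes and function names below are simplified assumptions for illustration, and the anyOf form is only one way a 3.1 document might express this.

```typescript
// Simplified, hypothetical subset of the schema shapes a generator might see.
type RefSchema = { $ref: string };
type NullSchema = { type: 'null' };
type AnyOfSchema = { anyOf: ReadonlyArray<RefSchema | NullSchema> };
type PropertySchema = RefSchema | AnyOfSchema;

// "#/components/schemas/Attribute" -> "Attribute"
function refName(ref: string): string {
  return ref.substring(ref.lastIndexOf('/') + 1);
}

function propertyToTs(schema: PropertySchema): string {
  if ('$ref' in schema) {
    // A bare $ref carries no nullability information at all.
    return refName(schema.$ref);
  }
  // OpenAPI 3.1 style: nullability is an explicit `null` alternative.
  return schema.anyOf
    .map(option => ('$ref' in option ? refName(option.$ref) : 'null'))
    .join(' | ');
}

const bareRef = propertyToTs({ $ref: '#/components/schemas/Attribute' });
// bareRef: "Attribute", whether or not the Java getter was @Nullable

const nullableRef = propertyToTs({
  anyOf: [{ $ref: '#/components/schemas/Attribute' }, { type: 'null' }],
});
// nullableRef: "Attribute | null"
```

With only the bare reference to go on, a generator has to pick one interpretation for every nested field, and either choice is wrong for some of them once strict null checks are enabled.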

Conclusions

We started this major frontend refactoring project to simplify our architecture and rely on a language rather than a library. The project was successful and took about 4 months in total, with the work rolling along with the rest of our product priorities. It would have been impossible to do this work if we had rolled out only a partially type-safe version of TypeScript years back.

The largest remaining problem we’ve got is the nullability of OpenAPI-produced object definitions for nested types, which prevents us from moving forward with strict null type checks. This issue is fixed in the schema spec’s version 3.1, but that version is not yet supported by the swagger-core library. We’re eagerly waiting for this to land upstream, as it will allow us to take our frontend a step further so we can better serve our customers.
