At Hootsuite, we’ve been working on restructuring our front-end architecture using React and Flux. This has given us the opportunity to explore the benefits we gains by structuring the data on the front-end as immutable collections. As part of the Engagement team, a group of us are working on Streams, the part of the product our users directly interact with when they use the Hootsuite dashboard. This is one of the major chunks of the product being migrated over from Backbone and jQuery.
Flux is an architectural pattern that complements React by utilizing a uni-directional data flow. When a user interacts with a React view, the View fires an Action that goes through a Dispatcher to update a Store that holds the application’s data and state, which in turn updates the Views. Uni-directional data flow ensures that a change in application’s state is updated wherever the state is used without forcing the developer to update the code everywhere the state is used.
Making React Even Faster
As part of building out our new front-end, it was important that performance was a concern that we kept in mind. With customers relying on our product to be highly efficient and robust, this was going to be a primary concern as we moved over to React and Flux. A key improvement metric that can be looked into with regards to React is how often a component re-renders. That’s where immutable data comes into play and provides a better way to optimize the process of re-rendering.
The way React works is by maintaining its own fast implementation of the DOM tree called the virtual DOM. Whenever there is a change in the UI, React makes a new virtual DOM and compares it with the old one, and if they’re different, it updates the actual DOM, minimizing the number of mutations. To improve performance, we need to ensure that only the part affected by the changes in the data is re-rendered. To allow developers to do this, React provides a component lifecycle function called
shouldComponentUpdate which runs every time a component is re-rendered. We can couple this function with immutable data structures to ensure that a re-render only happens when the data has actually changed.
The Problem with Mutable collections
Let’s look at some of the problems that we may run into if incase we were dealing with mutable data and states within React. Imagine a Message component that takes the data necessary to render a message on the page. If a user was to edit the message text, we want to ensure that the component reflects the change, so we might have something like this in our lifecycle function that allows us to re-render only if the message text has changed when we receive a new list of messages.
Instead, we need to do a deep comparison to determine whether the values we’ve received have actually changed. However, with nested objects, we would need to do a recursive deep comparison operation that is really expensive. And since such a comparison would need to be done every time
shouldComponentUpdate runs, which React invokes very frequently, we would take a large performance hit.
ImmutableJS to the Rescue
What does that mean if you’re using Flux stores to manage the data as a state of the UI? Well, it means that the data within the Flux stores will need to be modelled to be immutable. With the state of our UI being immutable data, we gain the ability to perform very fast comparisons. This allows us to use such comparisons within the
shouldComponentUpdate without negatively affecting the performance.
One way that immutable collections achieve space and time efficiency is through sharing structure. Whenever a new collection is made from a new one, it reuses the parts that haven’t changed, and only creates the parts that are new, or have changed. This avoids the time lost spent copying an entire structure, making it time efficient. Moreover, the memory footprint is also smaller since the same parts are being used for this new structure. We can illustrate this through the following example of a tree structure where a we’re updating a node.
When we need to make changes to the blue node, we create a copy of it, marked by the orange one. It can still point to its old child since we don’t need to change that. However, the parent of the blue node needs to point to the new one, so we need to create a copy of it, marked by the red node, and point to the copy of the blue node, however, we can still reuse the parent’s left child. Notice how more than 50% of the structure is being shared.
Arrays and Objects as Immutable collections
Now that’s pretty cool but what about Arrays and Objects? Arrays, which are a list of values, and Objects, which are key value pairs are the bread and butter of our modern applications. Immutable provides two highly efficient implementations of these two data structures. Arrays are represented by Bitmapped Vector Tries, whereas Objects are represented by Hash Array Mapped Tries. Let’s take a look at what these two are and how they work.
Bitmapped Array Tries can be imagined as a tree of arrays. All the actual values are only stored in the leaves. In the example below, let’s say we wish to find the value ‘114′. First, we’ll convert the number to a binary. This gives us 01110010. We lookup the first two most significant bits. We get 01, which is the number 1, so we follow along index 1 of our first array. Next we lookup the next two significant bits and we get 11. That’s the number 3, so we we follow along the index 3 of our next array. Then we lookup 00, which corresponds to index 0. And lastly, we lookup 10, which corresponds to index 2. And voila! We’ve found our value ‘114′. If in case we wanted to store something at value ‘114′, say a string ‘foo’, we’d lookup value using the algorithm described and place the value in the 114th place. This implementation is used by the immutable collection called List and can be used as an Array.
Hash Mapped Array Tries work in a similar fashion, the keys and their corresponding values are only stored at the leaf nodes of the tree. If we wish to find a value associated to the key ‘m’, we first pass the key through a hash function. A hash function can be thought of a function that takes anything and returns a number. Let’s say that for ‘m’, our hash function gives us the value 45. We convert 45 to binary, and this time, instead of looking up the most significant bits, we lookup the two least significant bits first. Which, in this case, are 01. We go follow through to index 1, we lookup the next two least significant bits, which are 11, so we follow through to index 4. Then we lookup 10, which corresponds to index 10. And ta-da! We’ve found the key ‘m’. We lookup the value and return ‘munchkin’. A hash array mapped trie has the ability to increase its dynamically, unlike a conventional hash map. This makes dealing with collisions (adding two values to the same node) significantly less expensive. This implementation is used by the immutable collection called Map and can be used as an Array.
References & Additional Reading
- Lee Byron – Immutable Data and React (React.js Conf 2015)
- Jean Niklas L’Orange – Understanding Clojure’s Persistent Vectors (Part 1 of 4)
- Phil Bagwell – Ideal Hash Trees
The resources above were essential towards helping me understand immutable collections and their implementation details described in the GIF illustrations above. The illustrations of the tree structures were drawn in Sketch and then converted to a GIF using Photoshop.