At Slack, we’re on a mission to make people’s working lives simpler, more pleasant, and more productive — improving the performance of our products falls into the “more productive” part of that mission. Today, we’d like to tell you about a change we’ve made to Slack’s web app to speed up initial load times — that is, how quickly we go from a blank window to being online and ready to work.

How Slack works on your desktop

Slack’s web app is what you’re using if you use Slack from a web browser or one of our desktop apps. It builds a complete model of your team client-side, so using Slack is quick and responsive: whether you’re switching channels, flipping through emoji, or looking at your teammates’ profiles, Slack already has all of the data it needs on hand, which means our users aren’t stuck waiting for API calls before they can do what they want.

It does this using the rtm.start API method of Slack’s real time messaging API. This single API call delivers all of the information we need to boot Slack: a complete copy of the model (all of the channels you have access to, all of the members on the team, all of the custom emoji, and so on) as well as a WebSocket URL to connect to for real-time updates. Depending on what information Slack already has loaded in local storage, rtm.start may be able to omit some data — for example, only sending you members who have changed since the last time you were online.
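To make this concrete, here's a hypothetical sketch of how a client might index an rtm.start-style payload for fast lookups. The field names and payload shape are illustrative, not Slack's exact response format.

```javascript
// Hypothetical sketch: index an rtm.start-style payload into maps so
// channel and user lookups are O(1). Field names are illustrative.
function buildModel(rtmStartResponse) {
  const model = {
    socketUrl: rtmStartResponse.url, // WebSocket URL for real-time updates
    channelsById: new Map(),
    usersById: new Map(),
  };
  for (const channel of rtmStartResponse.channels) {
    model.channelsById.set(channel.id, channel);
  }
  for (const user of rtmStartResponse.users) {
    model.usersById.set(user.id, user);
  }
  return model;
}

// A trimmed-down payload shaped like the description above.
const model = buildModel({
  url: "wss://example.test/websocket",
  channels: [{ id: "C123", name: "general" }],
  users: [{ id: "U456", name: "ada" }],
});
```

Once the model is built, rendering a channel or a profile is just a map lookup — no further API calls needed.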

Keeping everything in a local model is great for app responsiveness, and it even helps Slack stay usable during service interruptions or problems with our users’ internet connections. However, building that model can be expensive, especially on large teams: each time you load Slack, it has to build that model in memory all over again. We cope with that by making heavy use of client-side caching to minimize how much data we have to transfer from our API, but on large enough teams, even that can’t help us completely: at the end of the day, our JavaScript code has to iterate over all of the channels, users, and messages on your team to get them ready for use.

Historically, all of this work has happened while our users are looking at an inspirational or fun loading message, like this:

Seeing these messages is certainly better than staring at a blank screen, and they’re an example of the sort of user experience that generally separates web apps from native apps. However, the less time our users spend reading these messages, the better: nothing is as inspirational as actually getting back to work, right?

Boot timeline

  1. Fetch the HTML, which includes our loading screen.
  2. Fetch the JavaScript and CSS assets.
  3. Call the rtm.start API.
  4. Process the data from rtm.start.
  5. Build the client’s UI — the channels list, the message box, the sidebar, and so on — and then hide the loading screen.
  6. Connect to the WebSocket — this marks you as online and allows us to send messages and receive real-time updates.
  7. Fetch history for the conversation you’re viewing (if we don’t already have it in local storage).
  8. Display the messages in the current conversation. Content is visible and we’re fully loaded — we’re done!
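The steps above can be sketched as a sequence of awaited calls. This is a simplified illustration with hypothetical function names, injected here as stubs so the ordering is visible — not Slack's actual boot code.

```javascript
// A sketch of the boot timeline above. All helper names are hypothetical
// and injected so the step ordering can be observed.
async function boot({ fetchAssets, callRtmStart, connectSocket, fetchHistory }) {
  const steps = [];
  await fetchAssets();                  // steps 1–2: HTML, JS, and CSS
  steps.push("assets");
  const payload = await callRtmStart(); // step 3: one big API call
  steps.push("rtm.start");
  // steps 4–5: process the payload, build the UI, hide the loading screen
  await connectSocket(payload.url);     // step 6: we're online
  steps.push("socket");
  await fetchHistory();                 // step 7: history for current channel
  steps.push("history");
  return steps;                         // step 8: messages displayed — done
}
```

Note that every step waits on the one before it: nothing renders until rtm.start has been fetched and processed, which is exactly the bottleneck the rest of this post is about.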

How Slack works on mobile

Life on a mobile device is very different from life on the desktop: devices and connections are often much slower, so it’s important to make it easy for people to get in, do what they need to do, and get out.

Rather than starting out by building a complete model, our mobile apps just fetch the information they need to render the first screen of the app. If you’re coming into the app after tapping on a notification, that means only fetching information about that DM or channel — much less data than a complete model.

As Slack teams get larger and larger, the difference between a partial view-based fetch and a complete model fetch gets bigger and bigger. We decided it was time to pursue this model on the web as well.

Incremental boot

Since its earliest days, Slack’s web app has expected to have a complete model in memory at all times. Adapting to a universe where this was no longer true was a daunting task to contemplate, so we introduced a two-phase incremental boot process:

We start by fetching only the information needed for the initial channel/conversation. This initial payload includes information about the team and the logged-in user, plus whichever channel we’re rendering. This includes recent messages in the channel and all of the users mentioned in those messages. We use all of this data to build a local model, exactly as if we had gotten a full team payload during an ordinary boot.
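A partial model seeded this way might look something like the following sketch. The payload shape and field names are our own invention for illustration, not Slack's actual format.

```javascript
// Hypothetical sketch: seed a partial model from a single-channel payload.
// Only one channel and the users mentioned in its recent messages exist.
function buildPartialModel(payload) {
  return {
    complete: false, // only one channel's worth of data so far
    team: payload.team,
    self: payload.self,
    channelsById: new Map([[payload.channel.id, payload.channel]]),
    // just the users referenced by the recent messages, not the whole team
    usersById: new Map(payload.users.map((u) => [u.id, u])),
  };
}

const partial = buildPartialModel({
  team: { id: "T1", name: "example" },
  self: { id: "U1", name: "you" },
  channel: { id: "C1", name: "general", messages: [] },
  users: [{ id: "U2", name: "grace" }],
});
```

The `complete: false` flag is the key idea: the rest of the client can check it to decide which features are safe to enable.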

Since we don’t have a complete model, however, many parts of Slack aren’t usable yet: switching to other channels won’t work, and newly arriving messages aren’t guaranteed to render correctly, so we don’t connect to our WebSocket. We also can’t render your channels list at this point, since we don’t know what to put in it. Here’s what Slack looks like at this moment:

We introduced a new metric, “content visible”, to capture the moment users reach this stage of the boot. This partially loaded stage intentionally disables most of the UI until we have the rest of the data needed to render your team — for example, it doesn’t make sense to let users open our Quick Switcher if there’s nothing to switch to.

At the same time as we fetch the data for that first phase, we start fetching the data for a full model boot via our old friend, rtm.start. We sit on this response until we get the first channel up on screen; at that point, we start another boot on top of the previous “incremental” boot, filling in all of the channels, users, and other objects that were omitted from the initial load. Once this is done, we’re ready to go — the partially loaded stage is over, so we enable the user interface, connect to our WebSocket, and remove the placeholder graphics.
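The trick is that both requests go out at once, but the second response isn't processed until the first channel is on screen. A minimal sketch, with hypothetical fetch helpers standing in for the real network calls:

```javascript
// A sketch of the two-phase incremental boot. Both requests start
// immediately; the full payload is held until content is visible.
// All helper names here are hypothetical.
async function incrementalBoot({ fetchChannelView, fetchRtmStart, render, applyFullModel }) {
  const events = [];
  const fullPayloadPromise = fetchRtmStart();   // phase two starts in parallel
  const channelView = await fetchChannelView(); // small, fast payload
  render(channelView);                          // "content visible"
  events.push("content-visible");
  const fullPayload = await fullPayloadPromise; // may already have resolved
  applyFullModel(fullPayload);                  // fill in channels, users, emoji
  events.push("fully-loaded");                  // enable UI, connect WebSocket
  return events;
}
```

Because the slow rtm.start call overlaps with rendering the first channel, much of its cost is hidden rather than added to the critical path.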

At this point, Slack is ready to go — and hopefully, our users are none the wiser about all of our behind-the-scenes hijinks.

Incremental boot timeline

  1. Fetch the HTML, which includes our loading screen.
  2. Fetch the JavaScript and CSS assets.
  3. Call the single-channel-view and rtm.start APIs.
  4. Process the data from the single-channel-view response.
  5. Build the client’s UI and display the messages in the current conversation. Note: unlike the old boot process, we always have these messages, because they’re in the single-channel-view response. At this point, we also disable parts of the UI that aren’t ready yet — things like the channel sidebar or the message input that won’t work until we finish our entire boot process.
  6. Hide the loading screen. Content is now visible.
  7. Process the data from rtm.start — this includes all of the members on your team, the channels you’re in, your custom emoji, etc. Depending on what information your client has in local storage, this response may omit some subset of that data.
  8. Re-enable the parts of the UI that we disabled earlier.
  9. Connect to the WebSocket.
  10. Client is now fully loaded — we’re done!


Prior to launching this feature, we re-evaluated the metrics we use when we think about our performance. We defined two key metrics:

  • “content visible” — this is the first moment that any of your messages are visible on screen and you can start reading them.
  • “fully loaded” — this is the moment when the client has finished loading, the interface is interactive, and we are connected to our WebSocket.

Without this new feature, these two metrics happened to represent the same moment in time, because we didn’t hide our loading message until we had fully loaded.

After this project, content visible happened much sooner, because we hide the loading message as soon as we can render your initial channel. Because an incremental boot inherently does more work (it runs through the loading process twice), we were willing to accept a small negative impact to our fully loaded number if we were able to bring the content visible number down substantially.

Fortunately, we didn’t have to accept that trade-off: incremental boot does more work, but it parallelizes a lot of it behind a potentially slow API call, which led to a net win. Incremental boot not only gets content on screen faster: it finishes loading faster, too.

We rolled this feature out gradually, both to ensure we didn’t cause any problems for our users and to get enough performance data to make apples-to-apples comparisons. We were proud of the result: incremental boot brought the content visible metric down from around 7 seconds to under 5 seconds…

…while also bringing the fully loaded metric down from around 7 seconds to around 6 seconds.

On the average day, Slack’s web app is loaded 4.2 million times. Loading one second faster may not sound like much, but we are saving our users 49½ days’ worth of waiting, every single day. That’s great for this project, but our work here is far from done: we have other projects coming down the pipeline to improve performance and resource utilization further. We have a mature codebase and a broad feature set, and we know that people rely on us to get their work done. We won’t stop until Slack feels like magic.


Thank you to Scott Sandler and Jamie Scheinblum for supporting this feature on the API side of the house, to Patrick Kane and Johnny Rodgers for endless code reviews, to Hubert Florin for design support, and to Caitlyn Burke, Tomi Eng, and Kristina Rudzinskaya in QA Land for ensuring everything was ship-shape.

Want to help Slack solve tough problems and join our growing team? Check out all our engineering jobs and apply today.