Software performance is like a series of card tricks:

  • Do less up front.
  • Be really lazy.
  • Prepare in the background.
  • Be one step ahead of the user.

Whether doing magic with cards or a browser, it doesn’t hurt to have an ace up your sleeve. ♠️

This two-part series is about our work refactoring part of the Slack desktop client for performance. Topic highlights include avoiding and deferring work, adding smart preloading capabilities, the pitfalls of LocalStorage, and lessons learned while refactoring.

This round of improvements went live in mid-2016, but as performance is a continuous area of focus, there are always more plans in the pipeline.

The Slack Webapp: A 30,000-foot view

The Slack desktop client* builds and maintains a data model for the duration of your session. There are many elements to the model, but here are a few:

  • Channels (Public, Private, and Direct Messages)
  • Members (humans and bots comprising your team)
  • Messages (from members, or from the server — e.g., join/leave events, profile updates, etc.)

* The Slack real-time messaging app works in web browsers, and our installable desktop app also uses a web-based view of the same; hence, “Desktop” is an interchangeable term for both.

The client starts with an empty model, which gets populated by a combination of local cache (as applicable) and API calls. On startup, the client makes some initial API calls and connects to a Message Server via WebSocket. The WebSocket connection is the transport layer for receiving real-time messages — whether messages sent by other users on the team, or model updates that the client uses to keep its state current (e.g., a user joins or leaves a channel, updates their profile, uploads or shares a file, and so on).
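
To make that flow concrete, here is a minimal TypeScript sketch of the startup loop described above: an empty model, plus a WebSocket handler that applies real-time updates as they arrive. The update types, field names, and model shape are illustrative assumptions, not Slack’s actual wire format.

    // Minimal sketch of the client's data model and WebSocket transport.
    // The "type" values and payload shapes below are assumptions for
    // illustration, not Slack's real message format.
    type ModelUpdate =
      | { type: 'message'; channel: string; user: string; text: string }
      | { type: 'channel_joined'; channel: string }
      | { type: 'user_change'; user: string };

    const model = {
      channels: new Map<string, { messages: ModelUpdate[] }>(),
      members: new Map<string, object>(),
    };

    function connect(wssUrl: string): void {
      const socket = new WebSocket(wssUrl);

      socket.onmessage = (event: MessageEvent) => {
        const update = JSON.parse(event.data) as ModelUpdate;
        switch (update.type) {
          case 'message':
            // Append to the channel's message array, if we have one.
            model.channels.get(update.channel)?.messages.push(update);
            break;
          case 'channel_joined':
            model.channels.set(update.channel, { messages: [] });
            break;
          case 'user_change':
            // Refresh the cached member profile here.
            break;
        }
      };
    }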

Early Days: Loading The Whole World (Or Almost All Of It) Up Front

Originally, Slack’s browser-based web app could effectively load the entire model up front without any notable performance issues. Most teams had fewer than 100 people, channels were relatively few in number, message traffic was light, and client load time was quite fast.

As the size of the average Slack team increased, hotspots impacting performance in various areas became more evident. While it can be convenient to have the world at your fingertips, at some point the client cannot — and should not — know about, or need to render, absolutely everything right away.

On The Subject Of Laziness

The Slack desktop client builds and maintains an array of messages for each channel that you are a member of. The client receives new messages over the WebSocket as they are sent, and it can also fetch older messages via the Slack API.

Previously, we would make a channels.history API call and fetch the most recent messages for each channel that you are a member of, as soon as you opened Slack. For a small team with a limited number of channels, this was convenient; however, this pattern (sketched after the list below) clearly does not scale well:

  • One API call for each channel = more time spent making requests at client start.
  • Flood of requests overwhelming the client’s self-imposed API queue: even while staying under browsers’ own limits on parallel requests, the queue would take a while to drain and would block other API calls.
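
For illustration, the pre-refactor pattern looked roughly like the sketch below: one queued channels.history request per channel, all enqueued at startup. The queue, endpoint path, and helper functions are hypothetical stand-ins; the point is the shape of the problem (N channels = N requests).

    // Hypothetical sketch of the old "fetch everything up front" pattern.
    // apiQueue, fetchHistory, and addMessagesToModel are stand-ins, not
    // Slack's real internals.
    declare function addMessagesToModel(channelId: string, messages: object[]): void;

    const apiQueue: Array<() => Promise<void>> = [];

    async function fetchHistory(channelId: string): Promise<void> {
      // One HTTP request per channel; with hundreds of channels, this floods
      // the client's self-imposed queue and delays other API calls.
      const res = await fetch(`/api/channels.history?channel=${channelId}&count=200`);
      const data = await res.json();
      addMessagesToModel(channelId, data.messages);
    }

    function onClientStart(channelIds: string[]): void {
      for (const id of channelIds) {
        apiQueue.push(() => fetchHistory(id)); // N channels = N queued requests
      }
    }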

Building rapidly and for scale are not necessarily at odds with each other, but it’s an interesting balance: building out many features simply and quickly, and refactoring for scale as you grow. Even when scale is considered up front, changes in usage patterns can impact the performance of your app on both backend and frontend — sometimes in interesting, and unexpected, ways.

Doing Less Up Front: “Stopping Fetch From Happening”

Refactoring the Slack web app to fetch messages for only the current, active channel (instead of all channels) seemed straightforward enough.

In the past, we needed to load message history for every channel in order to determine the state of each channel: i.e., whether you had any unread messages (channel name shown in bold) or mentions (channel name shown with a numbered “badge” overlay on the right).

An example of the channel list UI. #coffee has unread messages, and #canada has a mention (badge) highlight.

A flood of API calls could mean waiting a while for the channel list to be populated with initial state. The more channels you are a member of, the longer the wait for unread / mention state to show. If the channel at the bottom of your list had a mention, it could take quite a while before that state loaded in. Seeing the channels in your sidebar light up one by one made Slack feel slow for large teams.

Some time after Slack’s public launch, a users.counts API method was implemented for Slack’s mobile apps. This method provides a simple list of unread and mentions counts for each channel that your client needs to know about.

By moving to users.counts on desktop, we are able to “light up” the entire channel list with unread / badge state in a single request. This improves the perceived performance of the client: as far as the user is concerned, everything looks ready to go immediately.

At this point, we know which channels have unreads and/or mentions, but we do not have any actual messages loaded in the client. It makes sense to fetch messages for the channel currently in view, but no more than that.
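
A rough sketch of the post-refactor startup, with the same caveats (the payload shape and helper functions are assumptions): one users.counts-style call paints the whole sidebar, and only the channel in view gets its history fetched up front.

    // Hypothetical shape of a users.counts-style response.
    interface ChannelCounts {
      id: string;
      unread_count: number;
      mention_count: number;
    }

    declare function renderSidebarItem(counts: ChannelCounts): void;
    declare function fetchHistory(channelId: string): Promise<void>;

    async function initializeClient(activeChannelId: string): Promise<void> {
      // One request covers unread/mention state for every channel.
      const res = await fetch('/api/users.counts');
      const { channels } = (await res.json()) as { channels: ChannelCounts[] };

      for (const channel of channels) {
        renderSidebarItem(channel); // bold and/or badge; no messages needed yet
      }

      // Only the channel actually in view needs message history right away.
      await fetchHistory(activeChannelId);
    }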

Being Really Lazy (with Channel History)

We could do nothing further at this point, but there are times when it’s necessary (or simply opportunistic) to make a channels.history API call, fetch messages for a given channel, and add them to the client model.

Common cases where we may call the channels.history API:

  • You view a channel for the first time.
  • You receive a new message on a channel not presently in view.

Previously, the Slack desktop client would define a “page” of messages as small as 30 and as large as 200, based on the number of channels you were in. The intent was to balance the number of messages fetched and cached by the client. As a recurring theme, this approach scales only up to a point.

While loading 200 messages per channel sounds useful if the user is only in a few channels, it is unlikely that users will regularly scroll back that far. If you don’t need it, don’t fetch it!

We experimented a little, and settled on the ever-magical 42 for a page of history. 42 messages covers a reasonable amount of conversation without going overboard, and is enough to fill the view on a large monitor. Additionally, most users have less than a full page of unreads on a per-channel basis when connecting or reconnecting. Whether browsing or catching up on a busy channel, older messages can always be fetched as the user scrolls back through history.
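
In code terms, the change boils down to replacing a channel-count-based page size with a constant, roughly like this (the old thresholds shown are illustrative, not the exact values we used):

    // Old approach (illustrative thresholds): shrink the page size as the
    // number of channels grows, anywhere from 200 messages down to 30.
    function legacyPageSize(channelCount: number): number {
      if (channelCount < 10) return 200;
      if (channelCount < 50) return 100;
      return 30;
    }

    // New approach: one page size that fills a large monitor and covers most
    // users' unreads, no matter how many channels they are in.
    const HISTORY_PAGE_SIZE = 42;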

Preparing In The Background: Pre-fetching Messages At Scale

Thanks to the users.counts API, we know if a channel has unreads and/or mentions. Since you’re more likely to view unread channels first, we can pre-fetch that message history in advance.

It is easy enough to queue up channels.history API calls, or perhaps batch them to fetch messages for multiple channels with a single request. However, we can be a little smarter about this if we know a bit more about your usage habits.

Taking scale into consideration again, we don’t necessarily want to pre-load everything; a client could have a lot of channels with unreads (e.g., when you return from a week-long vacation). However, it’s reasonable to expect that unread Direct Messages (DMs) will be treated as high priority and read before other channels, given the 1:1 conversation context. Every unread DM counts as a “mention” and gets badged, so it is highlighted in the client.

Given that, we prefetch unread messages with the following priority:

  • Direct Messages (DMs)
  • Group DMs
  • Channels with badges (direct mentions or highlight words)
  • Channels with unreads, but no badges (up to a limit)

At present, the client will prefetch history for all unread DMs, group DMs, and badged channels because you are likely to be interested in those messages.
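
As a sketch (the flags, names, and the cap on unbadged channels are assumptions), the prefetch queue is simply built bucket by bucket in the order above before being handed to the history fetcher:

    // Hypothetical unread-channel summary derived from users.counts.
    interface UnreadChannel {
      id: string;
      isDM: boolean;
      isGroupDM: boolean;
      mentionCount: number; // badged if > 0
    }

    const MAX_UNBADGED_PREFETCH = 10; // illustrative cap, not the real number

    function buildPrefetchQueue(unread: UnreadChannel[]): string[] {
      const dms = unread.filter((c) => c.isDM);
      const groupDMs = unread.filter((c) => c.isGroupDM);
      const badged = unread.filter((c) => !c.isDM && !c.isGroupDM && c.mentionCount > 0);
      const plain = unread
        .filter((c) => !c.isDM && !c.isGroupDM && c.mentionCount === 0)
        .slice(0, MAX_UNBADGED_PREFETCH); // unreads without badges, up to a limit

      return [...dms, ...groupDMs, ...badged, ...plain].map((c) => c.id);
    }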

Considering very large teams, there is a point where pre-fetching should stop, as it can do more harm than good. For example: if you’re a member of 50 channels and return from vacation to find that most of them have unreads, we do not want to load all of that state up front into the client — it could negatively impact performance (e.g., memory use). Even if you are a fast reader, it will take you time to get through each individual channel, and if there is a lot of volume, you may not even view all of them in one sitting. In this case especially, it is smarter for us to rely on other tricks to stay one step ahead of you.

Being One Step Ahead: Simple Preloading Tricks

Not only do we know which channels to pre-fetch, we also learn which channels you view the most often (frequency), and most recently (recency) — hence, “frecency”. Frecency is informed by features like the Quick Switcher, and it becomes smarter during your normal use of Slack.

When we get the list of unread channels, we cross-reference the unreads with frecency data and sort the preloading queue, so the places you’re most likely to visit get loaded first.
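
One way to picture that step (frecencyScore and the queue are hypothetical stand-ins; the real scoring is maintained by the client as you use Slack): re-order the prefetch queue so higher-frecency channels are fetched first.

    // Hypothetical frecency lookup: a higher score means the channel is
    // visited more often and more recently.
    declare function frecencyScore(channelId: string): number;

    function sortByFrecency(prefetchQueue: string[]): string[] {
      return [...prefetchQueue].sort(
        (a, b) => frecencyScore(b) - frecencyScore(a),
      );
    }

In practice, a sort like this could also be applied within each of the priority buckets from the earlier sketch, so DMs and badged channels still come first.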

For the curious, our engineering group has written previously about frecency as used in Slack: “A faster, smarter Quick Switcher.”

Keyboard shortcuts = pre-fetch cues

Both events and user actions can make great hints for preloading message history in channels.

If you use alt + up/down keys to navigate through the channel list, we can preload one ahead of you as applicable. If you use keyboard shortcuts, there’s a reasonable assumption you’re going to repeat that action.
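
A sketch of that idea (channelOrder, currentIndex, and prefetchHistory are stand-ins for the client’s own sidebar state and fetch logic):

    declare const channelOrder: string[];   // channel ids in sidebar order
    declare const currentIndex: number;     // index of the channel in view
    declare function prefetchHistory(channelId: string): void;

    document.addEventListener('keydown', (event: KeyboardEvent) => {
      if (!event.altKey) return;
      const step = event.key === 'ArrowDown' ? 1 : event.key === 'ArrowUp' ? -1 : 0;
      if (step === 0) return;

      // The client's own handler switches to channelOrder[currentIndex + step];
      // warming up one channel further keeps us a step ahead of repeat presses.
      const oneAhead = channelOrder[currentIndex + step * 2];
      if (oneAhead) prefetchHistory(oneAhead);
    });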

Fetch on new messages

If at any time you get a new message in a channel, we can pre-fetch its history and practically guarantee the channel will be synced before you view it.
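
In WebSocket-handler terms, that can be as simple as the following (again a sketch with assumed helper names):

    // When a message arrives for a channel whose history is not yet loaded,
    // prefetch it so the channel is ready before the user clicks through.
    declare function hasHistory(channelId: string): boolean;
    declare function prefetchHistory(channelId: string): void;

    function onIncomingMessage(update: { type: string; channel: string }): void {
      if (update.type !== 'message') return;
      if (!hasHistory(update.channel)) {
        prefetchHistory(update.channel);
      }
    }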

Performance Timeline: Before / After Optimization

The following is a rough comparison of timelines from Chrome DevTools, showing the amount of script, rendering, and layout work performed during the load of the Slack desktop client on an active team.

While we of course measure and compare individual metrics like “time to load” and “model set-up”, it’s useful to be able to visualize your app’s lifecycle and compare performance holistically.

The two graphs below are dense with information and will be broken down further, so don’t compare them too closely. At a high level, you can see there is less “noise” post-refactor due to the elimination, avoidance and deferral of work.

Before optimization: “noisy” flame chart activity

Before: Scripting and style and layout, oh my! Lots of blocking JS with deep call stacks (orange/teal), and layout (purple) due to DOM updates. Memory (heap) trends quickly upward, with major garbage collection at ~13 seconds.

After optimization: reduced flame chart activity

After: Reduction in blocking JS, DOM updates, and style / layout. Memory is less “spiky” from reduced churn (e.g., fewer loops creating small, short-lived objects). Major GC now at ~33 seconds.

Style and layout calculations

JavaScript-initiated DOM updates inevitably mean the browser has to spend time calculating style changes, and then perform layout work as part of updating the UI.

Before refactoring, there was a notable pattern of “thrashing” involving the channel list. During start-up, the client would repeatedly redraw the channel list with unread state (bold and/or badges) because it would fetch messages and then redraw the channel list for every channel you were a member of, one at a time.

Before: repeated channel history calls contribute to more UI thrashing — one fetch = one channel list redraw.

Post-refactor, the channel list’s initial state is fetched by a single users.counts call and applied all at once, eliminating a good deal of redundant DOM and style/layout work.

After: Notably-reduced style + layout work following improvements to avoid channel history calls en masse.
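
One way to picture the difference (helper names are hypothetical): instead of touching the DOM once per channels.history response, the sidebar state is collected first and written out in a single batched pass.

    interface SidebarState {
      id: string;
      unread: boolean;
      badge: number;
    }

    declare function applySidebarItem(state: SidebarState): void;

    // Before: one redraw per channels.history response = N style/layout passes.
    // After: collect all channel state first, then write to the DOM in one frame.
    function updateSidebar(states: SidebarState[]): void {
      requestAnimationFrame(() => {
        for (const state of states) {
          applySidebarItem(state); // all DOM writes batched into a single pass
        }
      });
    }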

Script activity

There is a certain fixed cost to “booting” the Slack client itself: We must load and parse our core JavaScript files from our CDN, register our JS modules and get the client started.

It is the repeated channel history fetches and related work following boot, however, which presented optimization opportunities.

Before: Client load, parse and boot, followed by repeated channel history calls and subsequent updates.

Post-refactor, there is a notable reduction of JS activity again thanks to users.counts — we know which channels are unread in a single call, and don’t have to call the API for every single channel (nor update the UI after each API call) to determine what’s new.

After: Reduced scripting work, thanks largely to the users.counts API and reduced channel history calls.

Memory usage: allocation and garbage collection

While memory profiles are somewhat hand-wavy and difficult to test reliably and repeatably, it’s good to have a “healthy-looking” JS memory profile. In this case, both graphs show similar patterns of gradual allocation and garbage collection; there is no evidence of “GC thrashing” (i.e., a sawtooth pattern) from hot loops allocating a lot of memory or creating thousands of short-lived objects. This is generally good.

All of these factors — rendering, script activity, and memory use — contribute to improving the bottom line of client load time, start-up, and performance over time. Slack also aggregates some performance metrics (e.g., load times) from live clients. Performance data helps inform how our app behaves in the wild and at scale, above and beyond what we can infer from our own local tests.

Per our own performance metrics, clients got a 10% load time improvement across the board following this work. In the most extreme cases (e.g., our own super-size “stress test” teams), load time was reduced by 65%.

Conclusions

Presenting users with a fast, responsive UI for large Slack teams requires more planning, increased technical complexity and a little trickery, but is a worthwhile investment — whether scaling an existing implementation, or building from the ground up.

  • Do less up front.
    Don’t load all channels’ unread messages right away.
  • Be really lazy.
    Don’t fetch more than needed for channels with unreads. Prefer mentions and avoid prefetching too much, to keep the client model lightweight.
  • Prepare in the background.
    New messages received can present an opportunity to prefetch history.
  • Be one step ahead of the user.
    Use “frecency” and user actions to inform and prefetch the user’s next move.

To Be Continued …

This is just the first piece of our journey to increase client performance.

Part 2 will cover lessons learned from this refactoring in more depth — trade-offs with caching, LocalStorage, LZString compression, and more. Stay tuned.

In the meantime — while you’re here — we are always looking for people who like digging into the details, and working on interesting problems.