SCALABILITY MONITOR

An internal tool used to allocate and monitor a scalable bank of processing machines

Role

Lead UX designer (solo designer) - owned end-to-end research, design, and testing of a critical internal infrastructure tool

Challenge

The company’s infrastructure expansion was overwhelming the outdated specifications of a vital legacy tool, which needed to be redesigned quickly to account for scaling far into the future.

Solution

A new, scalable tool that was rolled out within a matter of months and immediately improved users’ awareness and utilization of their processing assets.

THE PROCESS

Business Goal

As the bank of shared processing machines grew, the limitations of the existing monitoring software made it intolerably slow and inefficient. It was time for a redesign to better serve the employees tasked with maintaining vital computing resources.

Understanding Users

  • Interviewed all internal operators about their experiences with the tool, looking for gaps in the experience
  • Observed the operators at work to understand daily use
  • Researched other existing job-monitoring interfaces
  • Created and refined wireframes with the operators
  • Validated high-fidelity designs with the team before development began

Users Needed To...

  • Configure the coordinator machine's settings and connections
  • Monitor processing machines' allocation and activity on a dashboard
  • Monitor and manage prioritization of a long list of queued jobs
  • Find failures so they could be diagnosed and resolved quickly
  • Retain all functionality from the existing implementation

Design Challenges

  • Designing a high-contrast alert system for critical failures within a dashboard displaying 50+ simultaneous data streams
  • Designing the layout to emphasize the most valuable information first
  • Implementing bulk-action workflows that reduced job management time

Development Story

The company needed to expand its bank of job-processing machines, but as capacity grew, it became clear that the old monitoring tool hadn’t been designed to handle so many machines effectively. Navigating from one machine’s details to the next was too slow, and operators couldn’t find errors quickly because each job had to be opened individually. A small team was formed to develop a custom monitoring tool, which needed to roll out quickly to make the most of the newly added processing power.

Having known nothing about sharing computing resources among jobs before this project, I onboarded rapidly by conducting deep-dive interviews with operators to map the legacy workflow. I worked around access limitations by orchestrating live walkthroughs with operators, during which they took screenshots for me and pointed out opportunities for improvement. I watched as they used the existing software, noting points of frustration and inefficiencies in the interface.

After a few sessions had revealed opportunities for improvement, I drew up low-fidelity wireframes and a workflow diagram to confirm that I understood the functionality and common use cases well enough. I sat down with each operator and the assigned developers to gather feedback and negotiate for any new functionality that could be added to improve the operators’ efficiency. There wasn’t much time allocated for creating the tool, so not all usability recommendations would make it in. Despite the cuts, some of the most requested features would fit thanks to the reusable assets in the UX style guide, which had already been implemented by another development team.

OUTCOMES

Outcomes

  • Reduced design and development time by leveraging the existing component library and UX style guide
  • Coordinated with developers on a team that had already implemented the style guide so their reusable code could be shared with the new project
  • Delivered a stable, future-proof solution that required zero major redesigns for 4+ years

Key Takeaway

For internal tools, speed of adoption is as critical as feature completeness. By leveraging an existing design system and focusing on the top 20% of user pain points, we delivered a tool that operators adopted immediately, proving that even "boring" infrastructure needs human-centered design.