Meta

Changing practice and perspective in data center repairs

I led the design of the Meta Server Repair Console—a new platform built to streamline knowledge sharing, enhance collaboration across teams, and make global server repairs faster and more efficient.

My Role

Product Design Lead

Skills

Product Design

User Testing & Research

Hackathon & Workshop Facilitation

Impact

~$10 million

in monthly cost reductions for server repairs

75.82%

Decrease of Closed Tickets Without Repeats

6%

Reduction in Employee Churn

Background

At Meta, we’re constantly pushing the boundaries of Generative AI and VR, but scaling these innovations requires a rock-solid server infrastructure. Reliability and capacity aren’t just goals—they’re necessities.

That’s where we faced a challenge. Our servers were struggling to keep up, leading to frequent crashes and disruptions. Engineers were caught in an endless loop of fixes, slowing progress and creating frustration across teams.

To break the cycle and build a future-ready system, we took a deep dive into the problem, uncovering the root causes and designing a solution to keep our infrastructure running smoothly at scale.

Finding the problem

To discover the perfect solution, we first needed to define the right problem.

83

Internal Facebook Group Posts Read

8

Engineers Interviewed

16

On-Site Technicians Interviewed

The Problem

Engineers are constantly reinventing the wheel when troubleshooting server issues due to a lack of easily accessible and searchable documentation of past problems and solutions.


The lack of a structured system for capturing and sharing knowledge creates inefficiencies and discourages contributions, resulting in repeated work and lost expertise.

Wasted Time

Server Downtime

Engineer Burnout

Higher Costs

Objective

Create a platform that ensures seamless knowledge sharing across teams, breaks down silos, promotes collaboration, and supports solutions that are easy to scale globally.

The Process

Over three months of discovery, we explored solutions for the core challenge and ways to seamlessly connect multiple systems—ticketing, FSA, and the Repair Console. Throughout this process, we engaged directly with our users, who were also key stakeholders, to gain deeper insights and refine our approach.

Challenge 1 | Action Plan Repository

Consistency, standards, and self-service

How might we help repair experts enforce best practices and create workflows that can quickly evolve and scale with our operations teams?

Metrics

Deprecate wiki runbooks

Decrease number of undiagnosed tickets

Challenge 2 | Ticketing Platform

The Great Jira Escape: We're Plotting Our Freedom

How can we eliminate Jira while ensuring tickets are efficiently assigned to the right person?

Metrics

Decrease mean time to resolve

Decrease number of undiagnosed tickets

Increase the accuracy of repair

Challenge 3 | Global Monitoring

Creating a global perspective

How might we bring the data on every problem together in one place so that we can recognize patterns earlier and fix problems faster?

Metrics

Decrease mean time to resolve

Decrease number of undiagnosed tickets

Lessons

Adapt, adapt, adapt

Constant change in the business taught me the value of quick thinking and flexibility: staying agile so that our vision and design could adapt to shifting priorities.

Do more than meet in the middle

I made sure we understood each stakeholder's and team member's expectations, meeting them where they were so that we could achieve our shared goals together.

Remember to pause

Even with tight deadlines and frequent changes in direction, I often had to remind myself that taking a step back helps me perform better.

Let's talk

Good design moves fast. Great design moves people.
Let’s build something that matters.

Contact me

Adrian Mucha (FruitBat Design, LLC)

Working Remote From LA, © 2025

