Justin Marks
Staff Content Designer
August 15, 2025
6 mins

Measuring UX Content: Using Signals and Qualitative Insights

This is the third blog post in the “Measuring UX Content” series, where we explore how content design teams use data and user insights to track, prove, and improve the impact of their work.

Editor’s note from Frontitude:

This article is part of our blog series, “Measuring UX Content,” where we explore how content design teams track, prove, and improve the impact of their work. We’re speaking with UX writers, content designers, and product leaders who are building the case for content with data, and doing it in thoughtful, user-centered ways.

In this guest post, we’re excited to feature Justin Marks, a senior content designer and strategist with experience at companies like Intuit (Mailchimp) and JPMorgan Chase. At Mailchimp, he led efforts to simplify complex systems like audience management, using content to reduce confusion, support load, and friction. Justin shares his thoughtful approach to measuring UX content, blending qualitative insights, data signals, and AI tools to build scalable, user-centered content systems.


If you’ve worked in UX writing long enough, you’ve probably run into this moment: you rewrite an error message or refine a CTA, and the product team nods. It “feels better.” But how do you actually prove it’s better?

That’s the question I’ve wrestled with throughout my time as a staff content designer, most recently at Mailchimp, where I helped lead efforts to improve core audience features. We weren’t just naming features; we were trying to reduce user confusion, improve data structure usability, and scale clearer content across a decades-old platform.

This post shares what I’ve learned about measuring UX content. Not as a one-off exercise, but as a system-wide practice.

Language Confusion Is a UX Problem (and a Data Problem)

One of the biggest challenges I faced was reworking some of Mailchimp’s core taxonomy and terminology. Our users were getting tripped up on terms like audience, tags, segments, and groups. Each had a clear internal definition. But in practice, the distinctions were blurry, and users were paying the price.

When language creates confusion, it’s tempting to slap on a tooltip or tweak a headline. But what I learned is that no amount of content polish can fix broken architecture. In this case, we had to work closely with product and engineering to revise how the underlying data structure worked before we could truly simplify the UI copy.

Measuring UX Content by Correlation

UX content is hard to measure because it doesn’t live in a vacuum. It’s part of a broader system, shaped by UI, logic, user behavior, and backend constraints. So while we’d love to run clean experiments that isolate copy alone, the reality is that this rarely happens. What we get instead are signals: patterns that suggest the content is playing a meaningful role, even if we can’t scientifically prove it.

That’s why I lean on correlation more than causation. In one case, we rewrote an onboarding flow after digging into support ticket trends. A few months later, error rates for that flow dropped by 20%. Was it all because of the copy? Can’t prove it. But the timing and focus of the change pointed in a strong direction.

In another project, we surfaced help content inside the UI using contextual drawers. As those help clicks increased, support calls on the same topics decreased. Again, not definitive proof, but the inverse trend made a compelling case that the content was working.

In these situations, I look for converging signals: a content intervention followed by a shift in user behavior. It’s not perfect science, but it’s often enough to guide smart decisions and get others on board.
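
To make that concrete, here’s a rough Python sketch of the kind of signal-checking described above. The numbers are invented for illustration: it compares average error rates before and after a copy change, then checks whether help clicks and support calls trend in opposite directions.

```python
# Illustrative sketch only: hypothetical weekly metrics around a copy change.
# Numbers are made up; in practice they'd come from analytics or support tooling.
import numpy as np

# Weekly error rate (%) for the onboarding flow, before and after the rewrite
errors_before = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2]
errors_after  = [4.4, 4.1, 3.9, 4.0, 3.8, 4.0]

drop = (np.mean(errors_before) - np.mean(errors_after)) / np.mean(errors_before)
print(f"Error rate after the copy update: {drop:.0%} lower on average")

# Inverse-trend check: do weeks with more in-product help clicks
# tend to have fewer support calls on the same topics?
help_clicks   = [120, 150, 180, 210, 260, 300]
support_calls = [ 95,  90,  80,  72,  60,  55]

r = np.corrcoef(help_clicks, support_calls)[0, 1]
print(f"Correlation between help clicks and support calls: {r:.2f}")
# A strongly negative r is a converging signal, not proof of causation.
```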

Why I Prioritize Qualitative Signals

When it comes to measuring UX content, not everything shows up in a dashboard metric like open support tickets or conversion rates. Some of the most powerful content insights I’ve found come from direct user feedback, especially during early-stage testing.

In one project, we needed to update the terminology system we used for audience management. I ran a series of card sort exercises, both closed and open, with users in our target industries. I asked them to match terms with definitions, and then tested how well they could apply those terms in real-world flows.

What struck me was how much context influenced interpretation. A term that made sense in isolation got completely misread when placed next to similar words in the UI. These tests helped us validate the language we used and avoid terminology choices that looked good in a spreadsheet but failed in practice.

To keep things efficient, I typically test with 10-15 participants through platforms like UserTesting. It’s enough to spot patterns, especially when testing microcopy or help content.
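
Here’s a small, hypothetical sketch of how those card sort results can be tallied. The terms, definitions, and responses are invented; the point is simply to show how quickly low-agreement terms surface once you count the matches.

```python
# Hypothetical closed card sort results: which definition each participant
# matched to each term. Terms, definitions, and responses are invented;
# real data would come from a platform like UserTesting.
from collections import Counter

intended = {
    "audience": "everyone you can contact",
    "segment": "a filtered slice of contacts",
    "tag": "a label you apply manually",
}

responses = [
    {"audience": "everyone you can contact", "segment": "a label you apply manually", "tag": "a label you apply manually"},
    {"audience": "everyone you can contact", "segment": "a filtered slice of contacts", "tag": "a label you apply manually"},
    {"audience": "a filtered slice of contacts", "segment": "a filtered slice of contacts", "tag": "a label you apply manually"},
    # ...one dict per participant, typically 10-15 in total
]

for term, correct in intended.items():
    picks = Counter(p[term] for p in responses)
    agreement = picks[correct] / len(responses)
    top_pick = picks.most_common(1)[0][0]
    print(f"{term}: {agreement:.0%} matched the intended definition; most common pick: {top_pick!r}")
```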

And here’s the kicker: even when I already “know” the right answer, testing often surprises me. It keeps me honest and ensures that I’m designing for real users, not just content team consensus.

Measuring Is a Team Sport

One thing I’ve learned is that measurement won’t happen unless you drive it, especially as a content designer.

There’s a lot of organizational inertia around tracking features and flows, but not individual content decisions. That means you have to advocate for your copy to be included in analytics tagging. You need to build relationships with your PMs and product analysts. And sometimes, you need to be the one asking: Can we track this CTA? Can we log interactions with this tooltip?
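
To give a purely illustrative picture of what that ask can look like, the sketch below logs content interactions with a content identifier and copy variant. The event names and fields are hypothetical, and the track function stands in for whatever analytics client your team already uses.

```python
# Hypothetical sketch of "tagging content" in practice: emitting events that
# carry a content identifier and copy variant, so copy changes are visible
# in analytics. Event names and fields are invented for illustration.
import json
from datetime import datetime, timezone

def track(event: str, properties: dict) -> None:
    # Stand-in for your real analytics client (Segment, Amplitude, etc.)
    payload = {
        "event": event,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": properties,
    }
    print(json.dumps(payload))

# A CTA click, tagged with the copy variant so writers can compare versions
track("cta_clicked", {"content_id": "audience_export_cta", "copy_variant": "v2_simplified"})

# A tooltip open, so help-content engagement sits alongside product metrics
track("tooltip_opened", {"content_id": "segments_vs_tags_tooltip"})
```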

At Mailchimp, I spent a lot of time building trust across functions. That meant translating content performance into metrics product managers cared about. It meant working with UX research to align our language testing with broader usability studies. And it meant sometimes doing my own scrappy data analysis when the analytics team was stretched too thin.

The more I showed how content decisions tied to business outcomes, like reducing support load or increasing task success rates, the more stakeholders started seeing content design as a strategic partner, not a last-mile polish layer.

How I Use AI

One of the most exciting (and unexpected) shifts in my process came when I started using AI tools to scale research synthesis.

We were working on content patterns for several business verticals: nonprofits, e-commerce, entertainment, and so on. Each had unique language preferences and value props. Normally, building persona-specific content guidance would have taken weeks.

Instead, I prompted an AI model with our customer research and had it generate draft tone guidelines, value messages, and do/don’t examples for each segment. It wasn’t perfect, but it was a strong head start. I then validated the outputs with our researchers and built out vertical-specific content kits for the design team.
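
Mechanically, that workflow can be as simple as looping over verticals with a prompt template. The sketch below assumes the OpenAI Python client, with a placeholder model name and placeholder research notes; any LLM interface would work, and the drafts still go through researcher review as described above.

```python
# Rough sketch of prompt-driven research synthesis. Model name, verticals,
# and research notes are placeholders; outputs need human review before
# they inform any content guidance.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

verticals = ["nonprofit", "e-commerce", "entertainment"]
research_notes = "…summarized customer research for each vertical…"  # placeholder

for vertical in verticals:
    prompt = (
        f"Using the customer research below, draft tone guidelines, three key "
        f"value messages, and two do/don't copy examples for the {vertical} vertical.\n\n"
        f"Research:\n{research_notes}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- Draft guidance: {vertical} ---")
    print(response.choices[0].message.content)
```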

That project showed me that AI isn’t just a copy generator: it can be a research accelerator, a pattern spotter, and a tool for personalized guidance at scale. I don’t ship AI-generated content, but I absolutely use it to move faster and more confidently in the early stages.

Final Thoughts

If there’s one thing I’ve learned, it’s that measuring UX content is about building a thoughtful, ongoing practice. The real magic happens when you combine signals from data with insights from users, and when you work across disciplines to connect the dots.

Whether it’s validating language through early testing, spotting behavior shifts after a content change, or using AI to accelerate research, the goal is the same: create content that actually helps people.

It’s not always clean or measurable in isolation, but when you push for it, advocate for it, and keep users at the center, content becomes a powerful driver of clarity, trust, and impact.
