Robots & Runes

Robots & Runes had great content (book reviews, award histories, reading recommendations) buried under a navigation structure that didn't reflect how users actually think.

My Role

UX Researcher/Designer

Methods

Content inventory & audit · Heuristic evaluation · Card sort · Tree testing

Tools

UXtweak, Figma, Zoom, Google Sheets

The Problem

Robots & Runes is a community site for science fiction and fantasy literature, centered on works that were nominated for or won the Hugo and Nebula Awards. For genre fans and researchers, it's a genuinely valuable resource. But the navigation structure had grown organically over time rather than being designed around how users look for things, and the result was excellent content that was hard to reach.

Our task was to redesign the information architecture from the ground up, using iterative empirical methods to build a structure that reflected real user mental models rather than internal site logic.

Research Questions

How do users mentally categorize the types of content on a science fiction and fantasy literature site?

What labels are intuitive versus confusing for key navigation categories?

Does the revised architecture allow users to successfully locate specific content?

What We Did

Stage 1 - Content Inventory

I created a full content inventory of the existing site, cataloguing all pages, categorizing content by type, and flagging categories that were ambiguous or overlapping. This gave us the baseline we needed to understand what the site actually contained, and it generated the specific content items we'd use in the card sort.

The audit revealed seven top-level navigation categories, several of which overlapped conceptually or used labels that didn't clearly communicate what was inside them.

Stage 2 - Heuristic Evaluation

I led a structured heuristic evaluation with three evaluators who each independently scored the existing site across Nielsen's heuristics for two task scenarios: finding specific book details, and filtering by multiple criteria simultaneously.

A few issues stood out consistently across all three evaluators:

Navigation labels reflected the site's own organizational logic rather than user goals

Category boundaries were inconsistent — some sections mixed multiple content types with no clear organizing principle

Award-related content wasn't surfaced prominently enough in the navigation

Seven top-level categories created cognitive overhead, with several having overlapping scopes

These findings shaped what we included in the card sort and which tasks we prioritized in tree testing.

Stage 3 - Closed Card Sort (3 Rounds, n=12 per round)

We ran three rounds of closed card sorting, adjusting categories and labels between rounds based on what the data told us.

Round 1 revealed significant disagreement around most of the existing structure; the one exception was Awards. Across Rounds 2 and 3, we consolidated seven categories into four, replaced descriptive labels with action-oriented ones, and separated author information from book listings, a distinction participants consistently treated as meaningful.
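As a rough illustration of how agreement in a closed card sort can be quantified, here is a minimal sketch that reports, for each card, the most common category and the share of participants who chose it. The cards, category names (including "Authors"), and placements are hypothetical and invented for this example; they are not the study data, and this is only one simple way to define agreement.

```python
from collections import Counter

# Hypothetical placements: card -> category chosen by each participant.
# Invented for illustration; not the study data.
placements = {
    "Hugo Award winners by year": ["Awards", "Awards", "Awards", "Awards"],
    "Author biography": ["Books", "Authors", "Authors", "About"],
    "Reading recommendations": ["Books", "Blog", "Books", "Books"],
}

def agreement(choices):
    """Return (most common category, share of participants who chose it)."""
    category, count = Counter(choices).most_common(1)[0]
    return category, count / len(choices)

for card, choices in placements.items():
    category, share = agreement(choices)
    print(f"{card}: {category} ({share:.0%})")
```

Cards with a low share are the ones whose category or label is worth revisiting before the next round.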

Stage 4 - Tree Testing (3 Rounds, n=16 per round)

We ran three rounds of tree testing against our evolving architecture. The numbers tell the story:

Round 1: 70.6% overall success - the filtering task alone had a 25% success rate

Round 2: 91.7% overall - the filtering task jumped to 100%

Round 3: 87.5% overall - a slight decline, but no new failure patterns emerged

The primary failure pattern in Round 1 was participants navigating to plausible-sounding categories that didn't contain what they expected. A labeling and structure problem, not a participant problem.
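For readers curious how figures like these are derived, here is a minimal sketch that computes per-task and overall success rates from tree test records. The records, participant IDs, and task names are hypothetical and invented for illustration; they are not our study data.

```python
from collections import defaultdict

# Hypothetical records: (participant_id, task, reached_correct_node).
# Invented for illustration; not the study data.
results = [
    ("p1", "find_book_details", True),
    ("p1", "filter_by_criteria", False),
    ("p2", "find_book_details", True),
    ("p2", "filter_by_criteria", False),
    ("p3", "find_book_details", True),
    ("p3", "filter_by_criteria", True),
    ("p4", "find_book_details", False),
    ("p4", "filter_by_criteria", False),
]

def success_rates(records):
    """Return (dict of per-task success rates, overall success rate)."""
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for _participant, task, ok in records:
        attempts[task] += 1
        successes[task] += int(ok)
    per_task = {task: successes[task] / attempts[task] for task in attempts}
    overall = sum(successes.values()) / sum(attempts.values())
    return per_task, overall

per_task, overall = success_rates(results)
for task, rate in per_task.items():
    print(f"{task}: {rate:.1%}")
print(f"overall: {overall:.1%}")
```

Reporting the per-task rates alongside the overall figure matters because a reasonable-looking overall number can hide a single task that is failing badly, as the filtering task did in Round 1.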

Site Map

The site map went through three drafts before reaching its final form. Each was a direct response to what the research was telling us.

Draft 1 - Too Many Doors

Our first draft tried to surface everything. The result was a flat, wide hierarchy with too many top-level entry points. Award content was split across multiple sections. The core problem: too many options and no obvious starting point for book discovery.

Draft 2 - Consolidation and Clarity

The second draft consolidated the top level into four categories: Books, Awards, Blog, and About, with clearer, bounded page types within each. The remaining gap: the filtering system still wasn't clearly defined.

Final - Built from Testing

The final architecture addressed what the tree test data showed most clearly. The filtering system was centralized under book discovery. Award content was consolidated under a single "Hugo & Nebula Awards" section. Labels were refined based directly on what we observed in testing.

The result was a top-level structure of four categories (Browse & Filter Books, Hugo & Nebula Awards, Blog, and About This Site), each holding one clearly defined content type with no overlap.

The labeling shift across drafts was as consequential as the structural one. "Find Books By" versus "Browse & Filter Books" may seem like a minor difference, but the former implies a lookup while the latter matches the mental model of someone exploring options. That difference was measurable in tree test success rates.

Deliverables

The final deliverables included a content inventory, a heuristic evaluation scorecard, three rounds of card sort and tree test findings, and an annotated final site map documenting the full structural evolution.

Reflection

This was my first experience running an iterative IA research process, and the structure — card sort, revise, tree test, revise, repeat — was genuinely illuminating.

Quantitative data doesn't replace qualitative interpretation.

A 70.6% success rate tells you something is wrong. It doesn't tell you why. Finding out why meant looking at navigation paths, analyzing where participants went when they failed, and cross-referencing with card sort agreement scores. The numbers guide the inquiry; they don't complete it.

Labels matter as much as structure.

Some of our biggest usability improvements came from changing labels rather than restructuring the hierarchy, and those gains showed up directly in tree test success rates.

Round 3 humbled me a little.

I expected our third round, after two rounds of careful iteration, to validate that we'd gotten it right. The slight decline was a reminder that every round of user testing samples from a distribution of possible users, and that good IA design is rarely "finished." It's something you converge on.
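To make the sampling point concrete, here is a minimal sketch of a Wilson score interval for a success rate measured on a small sample. The counts used (14 of 16 and 11 of 12) are hypothetical, chosen only because they yield 87.5% and 91.7%; they are not the study's underlying counts, and z = 1.96 assumes a 95% interval.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical counts that happen to match the reported percentages.
for label, successes, n in [("91.7%", 11, 12), ("87.5%", 14, 16)]:
    low, high = wilson_interval(successes, n)
    print(f"{label}: {low:.1%} to {high:.1%}")
```

Under these assumed counts the two intervals are wide and overlap heavily, which is consistent with reading a drop of a few points between rounds as sampling variation rather than a regression in the design.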

If I extended this project, I'd start with an open card sort where participants create their own category labels rather than sorting into predefined ones. That would reveal more about the language users naturally reach for, and might surface label choices we hadn't considered.