The Jagged Frontier of AI
The chapter most AI guides skip, and the one that keeps you from shipping broken work.
Most AI guides for marketing teams will tell you how AI helps. They won't tell you when it hurts. This chapter does both.
The reason matters. The teams that use AI well aren't the ones with the most enthusiasm. They're the ones who know where the edge is. Where Cowork makes the team faster, and where it quietly produces worse work that nobody catches until the campaign is already running.
That boundary has a name now. The jagged frontier.
The Study That Should Change How You Think About AI
In 2023, Harvard Business School and BCG ran a randomized field experiment with 758 BCG consultants, roughly seven percent of the firm's individual contributor workforce. Researchers gave them eighteen realistic consulting tasks, randomly assigned access to GPT-4, and measured the results.
The headline numbers got most of the press attention.
- Consultants using AI completed 12.2% more tasks.
- They worked 25.1% faster on average.
- Their output was rated 40% higher quality by evaluators.
- Lower-performing consultants saw the biggest gains, closing much of the skill gap with top performers.
These are the numbers everyone quotes. They're real. They're also half the story.
The other half is what happened on tasks designed to sit outside AI's capability zone. The research team built one specific task to test what happens when AI can't reliably solve the problem. Without AI, human consultants got it right 84% of the time. With AI, they only got it right 60 to 70% of the time.
On those out-of-frontier tasks, AI users performed 19 percentage points worse than the group with no AI access at all.
The AI didn't refuse to help. It produced confident, well-written, completely wrong answers. The consultants who trusted it shipped worse work than the consultants who never opened the chat window.
Why It's Called Jagged
The frontier between "AI helps here" and "AI hurts here" doesn't follow obvious patterns. Two tasks that look identical to a human can sit on opposite sides of the line. The HBS researchers chose the word "jagged" deliberately. The boundary isn't a clean wall. It's irregular and undefined, and it shifts every time a new model ships.
For CPG marketing, that means a Cowork pipeline that nails a content brief on Monday might quietly produce a misleading competitive analysis on Tuesday. The output looks the same. The model sounds equally confident. The damage is invisible until someone checks.
Three patterns hit the jagged frontier hard in marketing work.
- Tasks requiring proprietary, non-public context. AI can't know that your VP of marketing hated the last brief because it leaned too quant. That context lives in your team's heads, not the training data.
- Tasks where being wrong looks the same as being right. Ask for a confident-sounding analysis of why DTC conversion dropped last quarter and you'll get one, whether the underlying reasoning is sound or not.
- Tasks where the data behind the claim matters more than the claim itself. If you can't verify the source, you can't ship the output.
The marketing team that skips this lesson keeps shipping confident-sounding work that quietly underperforms. The team that learns it builds a checklist for what goes to Cowork and what stays close.
Three Ways People Actually Work With AI
The same HBS research team published a follow-up study in late 2025 with 244 BCG consultants, looking at how people actually collaborate with AI day to day. Three working modes emerged. The names matter because they predict who gets better at their job and who slowly gets worse.
Centaurs (clear division of labor)
A Centaur splits work between human and AI based on what each does best. Like the mythical half-human, half-horse, a Centaur keeps a clean line between the two parts. The human decides strategy and reviews output. The AI handles drafting, formatting, and assembly.
In marketing terms, a Centaur uses Cowork for the brief, then writes the strategic recommendation themselves. They run a PDP audit through Cowork, then translate the findings into the client conversation in their own words.
Roughly 14% of the BCG consultants worked this way. They kept their domain expertise sharp because they kept doing the strategic work themselves. Cowork was a force multiplier for everything around the judgment calls, but not a replacement for the judgment.
Cyborgs (integrated back-and-forth)
A Cyborg blends with the AI. Tasks aren't divided so much as interleaved. The human writes a sentence, the AI completes it. The AI drafts a paragraph, the human edits and reframes. The boundary is fuzzy by design.
In marketing terms, a Cyborg uses Cowork inside the work itself. They start drafting an email, hand off to Cowork to expand a section, edit the result, run it through a Skill to tighten the language, then revise the closing themselves.
About 60% of the BCG consultants worked this way, making it the most common mode. Cyborgs developed entirely new AI-fluency skills (the research calls this "newskilling") while keeping their domain knowledge intact. The trade-off is that Cyborg work depends heavily on knowing when to push back on the AI's output, which is harder than it sounds.
Self-Automators (full handoff)
A Self-Automator throws the whole task over the wall. They write a prompt, take the output, and ship it. There's no review loop. There's no editing. The AI does the work and the human does the delivery.
In marketing terms, this is the team member who asks Cowork to write the strategic memo and sends it to the client without reading it carefully. Or the one who runs the PDP audit and forwards the report without verifying the scores.
The research found Self-Automators were neither improving their domain expertise nor building AI fluency. They were getting worse at both jobs over time. The work was getting done, but the human was getting weaker. For a marketing team, that's the worst possible long-term outcome.
Matching Mode to Task
The most useful insight from the research isn't that one mode is better than another. It's that different work calls for different modes, and the team that picks the wrong mode for the task pays for it.
Here's the rough mapping for CPG marketing work.
- High-stakes strategic work (positioning, competitive analysis, brand decisions): go Centaur. Use Cowork for the inputs, do the synthesis yourself.
- Routine production work (briefs, drafts, formatting, audits, assembly): go Cyborg. Interleave with Cowork throughout. This is where the productivity numbers from the HBS study actually show up.
- Pattern-matching with verifiable outputs (data extraction, sorting, summarization, repeatable Skills): you can run Self-Automator carefully. Verify the first three runs, then trust the pipeline. This is where Cowork shines because the output is checkable.
- Anything client-facing without review: never Self-Automator. The work that goes out the door represents your team. AI-generated copy with no human review is a category-five risk to your brand voice and your client relationships.
The teams getting 5-10x leverage from Cowork are running all three modes. They're just running the right mode for the right task. That selection is the actual skill.
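The mapping above can be sketched as a simple decision rule. This is a hedged illustration, not a Cowork feature: the task attributes (stakes, client-facing, verifiable, routine) and the thresholds are assumptions a team would tune for itself.

```python
# A minimal sketch of the mode-selection rule described above.
# The attribute names are hypothetical labels, not part of any Cowork API.

def choose_mode(stakes: str, client_facing: bool,
                verifiable: bool, routine: bool) -> str:
    """Map a task's attributes to a collaboration mode."""
    if client_facing and not verifiable:
        # Client-facing work always needs a human review loop.
        return "centaur"
    if stakes == "high":
        # Positioning, competitive analysis, brand decisions:
        # AI gathers inputs, the human does the synthesis.
        return "centaur"
    if verifiable and routine:
        # Checkable, repeatable output: automate, but
        # spot-check the first few runs before trusting the pipeline.
        return "self-automator (verify first runs)"
    # Everything else: interleave drafting and editing with the AI.
    return "cyborg"

print(choose_mode("high", False, False, False))  # centaur
print(choose_mode("low", False, True, True))     # self-automator (verify first runs)
print(choose_mode("low", False, False, True))    # cyborg
```

The point of writing it down this way isn't automation. It's that the team agrees on the rule before the task arrives, instead of defaulting to full handoff under deadline pressure.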
The Persuasion Trap
There's one more research finding that matters before you start delegating work to Cowork. It comes from a 2025 paper by the same HBS team called "GenAI as a Power Persuader."
When BCG consultants tried to validate AI outputs by pushing back ("are you sure?", "this seems wrong"), the AI didn't soften or admit uncertainty. It escalated. It generated more confident arguments. It produced more sources. It convinced the consultant the original wrong answer was right.
The researchers call this "persuasion bombing." It's the reason "knowing when not to trust" is harder than it sounds. The model that gave you the bad answer is also the model defending the bad answer when you challenge it.
The practical implication for Cowork is straightforward. Never validate AI's work with the same AI. If you're auditing a Cowork output, take it to a human, take it to a different model, or take it to source documents. Trusting Claude to grade Claude is how confident-sounding mistakes get shipped.
This is also why high-stakes work should never go Self-Automator. The whole point of having a human in the loop is that the human catches the persuasive-but-wrong output before it leaves the building. If the human isn't actually reading the work, the loop is closed but empty.
A CPG Marketing Cheat Sheet
Here's the actionable version of everything in this chapter.
Send to Cowork (high leverage, low risk):
- Brief generation from approved inputs.
- Formatting and copyediting to your style guide.
- PDP audits against a defined rubric.
- Content production where you'll review the draft.
- Data extraction from structured sources.
- Recurring reports with consistent inputs.
Use Cowork as a starting point, then take over (medium leverage, medium risk):
- Competitive analysis that requires industry context.
- Persona drafts you'll refine with real customer voice.
- Campaign concepts that need brand judgment.
- Strategic recommendations to leadership.
Don't send to Cowork (low leverage, high risk):
- Decisions about brand positioning or category strategy.
- Client conversations about sensitive issues.
- Statistical claims you can't independently verify.
- Anything where being confidently wrong is worse than being slow.
The pattern isn't complicated. The cost of getting it wrong is just much higher than the AI marketing trade press lets on. Most guides treat Cowork as a black box that produces value. The work that produces lasting leverage is the work of knowing where the box stops.
What's Next
Now that you know the jagged frontier exists, and you have the language to think about your collaboration mode, the next question is practical. For any given task in front of you, which Claude surface do you use? Chat, Code, or Cowork? They overlap. The right answer depends on the task type, the stakes, and where the output needs to land.
The next chapter is the decision framework. Five minutes of reading. Years of avoided mistakes.