Key takeaways:
- I built eight free GTM diagnostic tools at sandsdx.com/lab/ using AI-assisted coding, with zero prior experience writing React or TypeScript
- The process took weeks of iteration, not the “build an app in 20 minutes” you see on Twitter. It’s real work. It’s just a different kind of work.
- Vibe coding is best understood as a collaboration pattern: you bring the domain expertise and product thinking, AI brings the implementation
- The tools generate real inbound leads and demonstrate GTM expertise in a way no blog post or PDF ever could
- B2B marketing teams sitting on ideas that require engineering support should be experimenting with this right now
The problem I was trying to solve
I run a fractional GTM practice for B2B SaaS companies. My site needed to do two things: explain what I do, and prove I can actually do it.
The “explain” part is straightforward. A homepage, a services page, some written perspectives. Standard stuff.
The “prove” part is harder. Every consultant has a website that says they’re strategic and experienced. Nobody reads those pages and thinks “yes, this person clearly knows what they’re doing.” They think “this person has a website.”
I wanted something different. I wanted people to interact with my thinking before they ever talk to me. Run their homepage through a diagnostic. See their positioning gaps mapped out. Get a buying committee analysis they could share with their team.
Free tools. Real analysis. Useful output.
The catch: I’m not a developer. I’ve never shipped a React component. I can read code well enough to follow along, but writing a TypeScript application from scratch was not in my skill set.
A year ago, this idea would have required hiring a developer or finding a technical co-founder. In early 2025, I decided to try building it myself.
What “vibe coding” actually looks like in practice
The term gets thrown around a lot, usually alongside a screen recording of someone building a to-do app in 90 seconds. That’s misleading.
Here’s what my process actually looked like.
I started with the Hero Checker. The concept was simple: paste a URL, scrape the hero section, and run it through Claude’s API with a detailed scoring rubric. Five categories, five points each, 25-point scale. Specific criteria for each score level so the analysis would be consistent and useful.
The scoring rubric was the easy part. I’ve audited hundreds of B2B homepages. I know what separates a clear hero section from a vague one. Translating that into a structured prompt took maybe an hour.
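The mechanical half of that rubric is easy to picture in code. Here is a minimal sketch of validating and totaling the five category scores on the 25-point scale; the category names and types are hypothetical, not the actual schema the tool uses.

```typescript
// Hypothetical shape for the five-category, 25-point rubric output.
// Category names are illustrative, not the tool's actual schema.
type HeroScores = {
  clarity: number;
  relevance: number;
  differentiation: number;
  credibility: number;
  callToAction: number;
};

const CATEGORIES: (keyof HeroScores)[] = [
  "clarity", "relevance", "differentiation", "credibility", "callToAction",
];

// Validate that the model returned an integer 1-5 for every category,
// then total the result on the 25-point scale.
function totalScore(scores: HeroScores): number {
  for (const c of CATEGORIES) {
    const v = scores[c];
    if (!Number.isInteger(v) || v < 1 || v > 5) {
      throw new Error(`Invalid score for ${c}: ${v}`);
    }
  }
  return CATEGORIES.reduce((sum, c) => sum + scores[c], 0);
}
```

Validating the model's output like this matters because an LLM occasionally returns a score outside the rubric's range, and a silent bad total would undermine trust in the whole report.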
Building the actual tool took considerably longer.
I described what I wanted to Claude Code: a React component with a URL input, a loading state, an API route that scrapes the page and sends the content to Claude, and a results display with scores and recommendations. Claude Code scaffolded the whole thing. First version worked on the third try.
“Worked” is generous. It scraped pages inconsistently. The loading state was janky. The results layout looked like a homework assignment. The error handling was nonexistent.
This is the part the Twitter demos skip. The first output gets you maybe 60% of the way there. The last 40% is where all the real time goes.
I spent days iterating. “Move the score cards into a grid.” “Add a loading tip that rotates every few seconds.” “Handle the case where the scraper can’t find a hero section.” “Rate limit to five scans per day per user.” Each prompt got me closer. Each one also occasionally broke something that was working before.
Building all eight tools
After the Hero Checker, I had a rough mental model for how these tools should work. Paste input, hit an API, display structured results. The pattern repeated across all eight tools, but each one had its own complexity.
The Structured Data Detector was the simplest. Scan a URL for JSON-LD schema, meta tags, Open Graph data, Twitter Cards. Mostly parsing, minimal AI. It generates a developer-ready email with exactly what’s missing, which turned out to be the feature people use most. Marketers don’t want to learn schema markup. They want to hand their developer a specific list.
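The detection step itself is simple enough to sketch. This version assumes the page HTML has already been fetched and uses plain regexes rather than a real HTML parser, which illustrates the idea but isn't production-robust; the function name is illustrative.

```typescript
// Scan raw HTML for JSON-LD blocks, Open Graph tags, and Twitter Cards.
// A regex sketch, not a real parser: enough to show the detection idea.
function detectStructuredData(html: string) {
  const jsonLdBlocks: object[] = [];
  const scriptRe =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptRe)) {
    try {
      jsonLdBlocks.push(JSON.parse(match[1]));
    } catch {
      // Malformed JSON-LD would count as "present but broken" in a real tool.
    }
  }
  const hasOpenGraph = /<meta[^>]+property=["']og:/i.test(html);
  const hasTwitterCard = /<meta[^>]+name=["']twitter:card["']/i.test(html);
  return { jsonLdBlocks, hasOpenGraph, hasTwitterCard };
}
```

The developer-ready email is then just a template filled in from whichever of these checks came back empty.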
The Competitor Analyzer was more ambitious. A 60-point scoring framework across positioning, messaging, credibility signals, conversion design, and content strategy. I built the framework first as a document, then translated each section into scoring criteria for the AI prompt. The output needed to be specific enough that a product marketer could hand it to their team and say “here’s where they’re weak, here’s where we should attack.”
The AI Citation Tracker was the most technically interesting. It checks whether ChatGPT and Claude mention your brand when asked about your category. This required multiple API calls, careful prompt construction, and a results format that shows not just whether you’re cited but who gets cited instead. The competitive intelligence angle made this one spread the fastest.
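Once the model's answers are collected, the core check reduces to something small. This sketch assumes the answer text is already in hand; the multi-call orchestration and prompt wording are omitted, and the function name is hypothetical.

```typescript
// Given a model's answer about a category, report whether the brand is
// mentioned and which competitors are mentioned instead.
function checkCitation(answer: string, brand: string, competitors: string[]) {
  // Word-boundary match, case-insensitive, with regex metacharacters escaped.
  const mentions = (name: string) =>
    new RegExp(`\\b${name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")}\\b`, "i")
      .test(answer);
  return {
    cited: mentions(brand),
    citedInstead: competitors.filter(mentions),
  };
}
```

The "who gets cited instead" list is what gives the results their competitive sting: a `cited: false` alone is a shrug, but `cited: false` next to three competitor names is a problem someone will act on.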
The Buying Committee Analyzer scans your site for coverage across six personas: champion, economic buyer, technical evaluator, end user, legal/compliance, and coach. Most B2B sites talk to one or two of these roles. The tool shows you exactly which ones you’re ignoring.
The Brand Narrative Analyzer cross-references multiple pages from your site to detect contradictions in your origin story, problem framing, voice, and audience targeting. This one requires 2-5 URLs as input, which makes it higher friction but more valuable. You’d be surprised how often a company’s homepage tells a different story than its about page.
The Positioning Differentiator takes your positioning statement and checks it against real competitor sites. It finds claim overlaps, shared language, and undifferentiated messaging. If you and three competitors all say “unified platform for modern teams,” this tool will show you that.
The Messaging Drift Analyzer compares live website copy against your approved messaging framework. Paste your positioning doc, give it your URLs, and it shows where the published copy has drifted from the strategy. This one is useful for companies whose website has been edited by 15 different people over two years, until nobody’s sure what it says anymore.
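The shared-language checks behind the Positioning Differentiator and the Messaging Drift Analyzer both reduce to the same mechanical idea: break two texts into short word sequences and intersect them. The real analysis in the tools is AI-driven; this is just a sketch of the overlap mechanic with an illustrative function name.

```typescript
// Find word-for-word phrases (n-word "shingles") that two texts share.
function sharedPhrases(a: string, b: string, n = 3): string[] {
  const shingles = (text: string) => {
    const words = text
      .toLowerCase()
      .replace(/[^a-z0-9\s]/g, "")
      .split(/\s+/)
      .filter(Boolean);
    const out = new Set<string>();
    for (let i = 0; i + n <= words.length; i++) {
      out.add(words.slice(i, i + n).join(" "));
    }
    return out;
  };
  const setB = shingles(b);
  return [...shingles(a)].filter((s) => setB.has(s));
}
```

Run your positioning statement against three competitor homepages and the overlapping shingles are exactly the “unified platform for modern teams” problem, made visible.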
Each tool took between two days and a week of focused iteration. The AI analysis prompts were the most important part. Getting Claude to produce consistent, specific, actionable output required detailed system prompts with explicit scoring criteria and examples. I rewrote those prompts dozens of times.
What I actually learned
Domain expertise is the bottleneck, not coding ability. The hardest part of building these tools was never the code. It was knowing what a good competitive analysis looks like. It was defining the scoring criteria for hero section clarity. It was deciding which buying committee roles matter and what “coverage” means for each one.
AI can write a React component. It cannot tell you what a B2B positioning audit should measure. If I didn’t have years of experience running these analyses manually, the tools would produce generic output that nobody would use twice.
Prompts are products. Each tool’s value lives almost entirely in the system prompt that drives the AI analysis. The Hero Checker’s prompt is a detailed rubric with specific criteria for each score from 1 to 5. “A score of 5 for clarity means the headline alone states what the product does AND who it’s for. Zero jargon. One single clear message.” That level of specificity is what makes the output trustworthy.
I spent more time writing and refining prompts than I spent on anything else. The prompts are the product. The React components are just the delivery mechanism.
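To make the "prompts are products" point concrete, here is an illustrative excerpt of what a rubric-style system prompt looks like. This is not the actual Hero Checker prompt; the level-5 criterion is quoted from above, and the lower levels are paraphrased examples of the same pattern.

```typescript
// Illustrative rubric-style prompt fragment. Not the production prompt:
// only the level-5 criterion comes from the real rubric; 1-4 are examples.
const CLARITY_RUBRIC = `
Score CLARITY from 1 to 5:
5 - The headline alone states what the product does AND who it is for.
    Zero jargon. One single clear message.
4 - What the product does is clear; who it is for requires the subhead.
3 - The message is understandable but leans on category jargon.
2 - A reader cannot tell what the product does without scrolling.
1 - Pure abstraction ("unlock growth", "empower teams"); no concrete claim.

Return JSON:
{"clarity": {"score": <1-5>, "evidence": "<quoted copy>", "fix": "<specific rewrite>"}}
`.trim();
```

Asking for evidence and a fix alongside each score is what turns the output from a grade into a recommendation someone can act on.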
You will break things constantly. I lost count of how many times a change to one component broke something in another. AI coding tools don’t maintain a mental model of your full application. They work on what you show them. If you ask Claude Code to modify your API route, it might change a response format that three different components depend on.
I learned to describe changes carefully. “Update the loading state in the Hero Checker component. Don’t change the API response format or any other component.” Being explicit about what should stay the same is just as important as describing what should change.
Debugging is still debugging. When the Competitor Analyzer started returning empty results for certain URLs, I had to figure out why. The scraper was timing out on sites with heavy JavaScript rendering. The fix involved adjusting timeout settings and adding a fallback scraping method.
I didn’t write the fix myself. I described the problem and Claude Code wrote the solution. But diagnosing the problem required reading error logs, testing with different URLs, and understanding the difference between server-side rendering and client-side rendering. You don’t need to write code. You do need to think like someone who understands systems.
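The fix Claude Code wrote followed a pattern worth knowing even if you never type it yourself: bound the primary scraper with a timeout and fall back to a simpler method when it fails. A minimal sketch, with hypothetical names; `primary` and `fallback` stand in for whichever fetching strategies the real tool uses.

```typescript
// Reject a promise if it hasn't settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Try the heavy scraper first; on timeout or error, use the simpler one
// (e.g. a plain fetch that skips JavaScript rendering).
async function scrapeWithFallback(
  primary: () => Promise<string>,
  fallback: () => Promise<string>,
  ms = 10_000
): Promise<string> {
  try {
    return await withTimeout(primary(), ms);
  } catch {
    return fallback();
  }
}
```

Diagnosing that the timeout was the problem, as opposed to writing this code, was the part that required a human reading logs.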
The architecture decisions are yours. Where should the API routes live? How should rate limiting work? Should the AI analysis happen server-side or client-side? What’s the right error state when Claude’s API is down?
Claude Code will make these decisions for you if you let it. Sometimes its choices are fine. Sometimes they’re not. I caught it storing rate limit data in a way that wouldn’t persist across Vercel’s serverless functions. I caught it making client-side API calls that exposed my Anthropic API key. These aren’t coding problems. They’re thinking problems.
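The rate-limit bug is a good example of why these are thinking problems. The decision logic itself is a pure sliding-window check, sketched below with illustrative names; the architectural catch is that on serverless, the timestamp list has to live in persistent storage (a KV store or database), because module-level memory resets between invocations.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Given the timestamps of a user's previous scans, decide whether a new
// scan is allowed, and return the still-relevant timestamps to persist.
function allowScan(
  previousScans: number[], // epoch ms of prior scans for this user
  now: number,
  limit = 5,
  windowMs = DAY_MS
): { allowed: boolean; recent: number[] } {
  const recent = previousScans.filter((t) => now - t < windowMs);
  return { allowed: recent.length < limit, recent };
}
```

The same separation applies to the API key: the Anthropic call has to happen in a server-side route that reads the key from an environment variable, so the key never ships to the browser.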
Ship something small, then iterate. The Hero Checker was the first tool I launched. It’s also the simplest: one URL input, one API call, one results screen. I deliberately started there because it let me validate the pattern before building seven more tools on top of it.
If I’d tried to build all eight tools before launching any of them, I’d still be building. Each launched tool generated feedback that improved the next one. The Buying Committee Analyzer’s persona framework was directly informed by how people used the Competitor Analyzer.
What this can’t do
I want to be specific about the boundaries.
These tools run on Vercel’s serverless infrastructure. They handle real traffic. But they’re not the same as software built by a team of engineers. There’s no test suite. The error handling covers the common cases but not every edge case. If Anthropic’s API has an outage, the tools show an error message and that’s it.
Client-side rate limiting works for preventing casual abuse. It wouldn’t survive someone determined to hammer the API. For a set of free diagnostic tools, that trade-off is fine. For a paid product, you’d need real infrastructure.
I also can’t modify the tools as fast as a developer could. When I want to change something, I describe the change and iterate until it works. A skilled React developer would make the same change in a fraction of the time. The advantage of vibe coding is that I can do it at all, not that I can do it faster than a professional.
The AI analysis itself has limits too. Claude is good at structured evaluation against a rubric, but it’s working from scraped text. It can’t see your visual design. It can’t assess page load speed. It can’t tell you whether your site “feels” trustworthy. The tools are useful because the analysis is specific and actionable, not because it’s comprehensive.
What this means for other GTM teams
The lab tools have done what I built them to do. People find them through search. They run their sites through the diagnostics. They see the quality of the analysis. Some of them reach out to work together.
That’s a better demonstration of GTM expertise than any case study or testimonial page. A prospect who has seen their own positioning gaps mapped out in a Competitor Analyzer report already understands what I do. The sales conversation starts in a different place.
But the bigger point is about what’s now possible for any B2B marketing team.
If you have an interactive tool that would help your buyers but it’s sitting in an engineering backlog, you can build a functional version this week. If you want a diagnostic that qualifies prospects before they talk to sales, you can build it. If you need an internal tool that your team uses daily but nobody will prioritize, you can build it.
The constraint has shifted. It used to be: can we build this? Now it’s: do we know what to build?
That second question is where GTM expertise matters. A marketing team that deeply understands their buyers, their competitors, and their value proposition can build tools that demonstrate that understanding. The code is the easy part. The thinking is the hard part.
It always was.