Open Knowledge Format: Google's Quiet AI Search Shift

I have a habit that probably drives my family crazy at dinner: when Google ships something new, I don't read the press release first. I read the product page. Then I go find the GitHub repo. Then I look at what got quietly rebranded or reorganized in the weeks around the announcement. It's a rule I've followed for years because I've found that what Google builds tells you far more about where search is heading than anything they say in a blog post.

That habit is what put the Open Knowledge Format on my radar on June 12.

On the surface, it looked like a routine enterprise data announcement. Google rebranded Dataplex, a tool it had spent years positioning as back-end data infrastructure that nobody outside a data engineering team ever thought about, as Knowledge Catalog. The new positioning: an “always-on context engine for AI agents.” The product barely moved. The positioning moved completely. And buried inside that launch, in a post written by Google Cloud tech leads Sam McVeety and Amir Hormati, was a new open specification called OKF that I think has real implications for anyone running a content-driven website.

Here's what I've found after going deep on the spec, the GitHub repo, and the practical tooling that's already appeared around it.

TL;DR: Google launched the Open Knowledge Format (OKF) on June 12. It's a simple markdown-plus-YAML spec for representing website content in a format AI agents can read directly. It layers on top of sitemap.xml and LLMs.txt rather than replacing them, and it preserves the internal link relationships that flat scraping throws away. It's v0.1, nothing crawls for it yet, and skipping it today carries no cost. But the implementation overhead is low, the structural upside compounds over time, and Google putting their name on a spec is how the roadmap gets written. For WordPress site owners already managing AI readability, the emerging challenge is fragmentation across multiple machine-readable layers, which is exactly the problem an integrated approach to LLMs.txt and content context files is designed to solve.

What OKF Actually Is (And Why the Simplicity Is the Point)

The Open Knowledge Format, or OKF v0.1 as it currently stands, is not a complex new technology. It is about as simple as a spec can get, and that simplicity is the whole point.

At its core, OKF is a directory of markdown files. Each file represents a piece of content, whether an article, a concept, or a data set, and carries a short block of YAML at the top. That YAML block contains a handful of structured fields: the type of content it is, a title, a description, a link back to the original resource, some tags, and a timestamp. Then the body of the file is just clean markdown. No special syntax, no proprietary runtime, no SDK to install. An AI agent can read an OKF bundle directly. No scraping, no browser renderer, no JavaScript to navigate.

Google frames OKF as formalizing what's been called the “LLM-wiki” pattern, the markdown-and-frontmatter approach that AI researcher Andrej Karpathy helped popularize over the past year. If you've spent time in Obsidian, Notion, or any of the agent-readable file conventions that have cropped up recently, the shape will feel familiar.

Here's what a single OKF file actually looks like in practice:

---
type: Article
title: Why LLMs.txt Matters for WordPress Site Owners
description: How to make your site readable for AI agents, and why the signals matter now.
resource: https://brianwinum.com/llms-txt-wordpress/
tags: [llms.txt, ai-search, wordpress, seo]
---

# Why LLMs.txt Matters for WordPress Site Owners

The body of the article, as clean markdown...

That's it. A bundle is just a folder of files like that, plus an index.md that lists them so an agent can see what's available before opening everything. Google's full specification fits on a single GitHub page. (One quick note to avoid confusion: a couple of unrelated specs share the OKF acronym, including a long-running open-data foundation. This is specifically Google's Open Knowledge Format.)
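To make the "folder of files plus an index" idea concrete, here is a minimal Python sketch of how a bundle could be read and indexed. Treat it as an illustration under assumptions rather than Google's reference tooling: the function names are mine, the parser only handles the flat key/value frontmatter shown in the example above (a real implementation would use a proper YAML library), and the index.md layout is my guess, since the listing format isn't spelled out here.

```python
from pathlib import Path


def parse_okf_file(text: str) -> tuple[dict, str]:
    """Split an OKF-style file into (frontmatter fields, markdown body).

    Hypothetical helper: handles only flat `key: value` lines and
    simple `[a, b, c]` lists, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # opening marker without a closing one
    fields: dict = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        fields[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


def build_index(bundle_dir: Path) -> str:
    """Build index.md text listing every file in the bundle.

    The one-line-per-file layout is an assumption, not taken from the spec.
    """
    entries = []
    for path in sorted(bundle_dir.glob("*.md")):
        if path.name == "index.md":
            continue  # the index does not list itself
        fields, _ = parse_okf_file(path.read_text(encoding="utf-8"))
        title = fields.get("title", path.stem)
        description = fields.get("description", "")
        entries.append(f"- [{title}]({path.name}): {description}")
    return "# Index\n\n" + "\n".join(entries) + "\n"
```

Calling build_index(Path("okf")) on a folder of files like the one above would return the text you could write to okf/index.md. The folder name is just a placeholder.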

One thing to be clear about before we go further: this is a v0.1 draft spec, published June 12, 2026. Nothing is actively crawling the web for OKF bundles yet. Google built this originally for enterprise data teams sharing tables and metrics across organizations, not for content marketers publishing blog posts. Applying it to a content website is a repurposing of the spec. A sensible one, but a repurposing. Google itself calls v0.1 a starting point rather than a finished standard, and says the format will evolve as more producers and consumers come online. Anyone telling you this is a drop-everything emergency is overselling it.

That said, the early-mover case here is strong, and I'll explain exactly why below.

One more thing worth addressing directly: the GitHub repo carries a disclaimer that reads “this is not an official Google product.” I've seen people use that to dismiss the spec entirely, and I think that's a misread. Google uses that same disclaimer on nearly all of their open-source repos, including major AI and developer tooling. What it means is that the code comes without enterprise support guarantees. It does not mean the format is unofficial. Google wrote the spec, Google published it on the Google Cloud blog, and Google built it into a live product. The disclaimer is lawyers doing their job.

The Layered Machine-Readable Web: Where OKF Sits

To understand why OKF matters, it helps to think about the machine-readable web as a stack of layers that have been assembling themselves quietly over the past few years.

SEO practitioner Suganthan Mohanadasan articulated this layering cleanly in his breakdown of the OKF spec, and it's the framing I keep coming back to. The first layer is sitemap.xml, which tells a crawler which URLs exist. The second is LLMs.txt, which points an AI agent at the handful of pages you most want read. Think of it as the signal that says if you only have time for a few things, start here. OKF is a third layer on top of both. It hands over the content itself, every page converted to clean markdown and cross-linked into a graph an agent can actually walk.

These three things stack rather than compete. They serve different functions at different stages of an AI agent's interaction with your site.

It wasn't only the data-engineering crowd that took notice when this dropped. SEO analyst Marie Haynes called it “really big news” and pointed out that the interlinked markdown files essentially function as a living wiki that agents can read and navigate. When recognizable names in our field react to an enterprise data announcement, that's usually a signal the implications run wider than the original audience.

What I find most interesting about OKF, and this connects directly to things I've been writing about around Internal Knowledge Graph architecture, is what the spec does with internal linking. A flat scrape of your website throws away the relationship layer. It grabs the text on each page but loses how the pages connect to each other. OKF preserves those connections. When you build your bundle, each markdown file links to related files the way your pages link to each other, and an agent reading your bundle sees your content along with the architecture of how your ideas relate to one another. That relationship layer is where topical authority actually lives.
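For readers who want to see what "walking the graph" might look like, here is a rough Python sketch that reads a bundle folder and records which files link to which. It assumes the files reference each other with ordinary relative markdown links such as [text](other-file.md). That link convention, the function names, and the flat folder layout are my assumptions for illustration, not details confirmed by the spec.

```python
import re
from pathlib import Path

# Matches the target of a markdown link: [text](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def extract_internal_links(markdown: str) -> list[str]:
    """Return relative .md link targets found in a markdown body, in order."""
    targets = []
    for target in LINK_PATTERN.findall(markdown):
        target = target.split("#", 1)[0]  # drop any #section anchor
        # Keep only links to other bundle files, not external URLs.
        if target.endswith(".md") and "://" not in target:
            targets.append(target)
    return targets


def build_link_graph(bundle_dir: Path) -> dict[str, list[str]]:
    """Map each file name in the bundle to the bundle files it links to."""
    graph = {}
    for path in sorted(bundle_dir.glob("*.md")):
        graph[path.name] = extract_internal_links(path.read_text(encoding="utf-8"))
    return graph
```

The resulting dictionary is the relationship layer in its simplest form: file names as nodes, links as edges. A flat scrape gives an agent the nodes only.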

If you've been building content with a structured approach to topic clusters and internal linking, you already have most of what a well-formed OKF bundle requires. The bundle is essentially an expression of the knowledge structure you've built. If you haven't been building that way, the OKF question is secondary. The underlying structure is the thing to solve first.

Why This Validates Bets You've Already Been Making

Here's where I think this gets interesting for practitioners who have been paying attention.

The YAML frontmatter fields in an OKF document (type, title, description, resource, tags) are structured authorship and expertise signals. Publishing an OKF bundle is, at the machine level, a formal declaration of what you know, what you've written, and how it all connects. That overlaps closely with what we've been working toward using E-E-A-T authorship signals and Trust Stacking. The difference is that OKF makes it legible to AI agents in a format they can consume directly, rather than leaving them to infer it from page structure and anchor text.

I've been using a framework I call Trust Stacking in my Authority Amplifier Pro course. The idea is that authority signals compound across layers, and that the practitioners who win in AI-influenced search are the ones who have built credibility that's visible at multiple levels: the domain level, the authorship level, the content structure level. An OKF bundle adds one more layer to that stack: the content graph level.

Google is open about the ambition here. The team describes the format as designed to be “the lingua franca it can be exchanged for tomorrow.” That's a big claim for a v0.1 spec, and whether it holds depends entirely on adoption outside Google. But it tells you how Google is thinking about it, and that thinking is what I pay attention to.

The analogy I keep reaching for is schema markup. Schema took the better part of a decade to become genuinely consequential in search results. The practitioners who implemented it early didn't see overnight ranking improvements. What they built was a structural advantage that compounded as Google's ability to parse and use that data matured. The early adopters weren't lucky. They were paying attention to where Google was investing. OKF is the same shape of bet. Google has now named it and standardized it, which means the infrastructure to consume it will follow. Being early here is reading the roadmap, not gambling on a hunch.

What This Means for WordPress Site Owners Specifically

If you run a WordPress site, the practical implementation question gets interesting, and I want to lay out the full landscape of options here, including the ones that have nothing to do with me.

There are already free tools available. Suganthan Mohanadasan, whose layered-stack framing I referenced earlier, has built both a free web tool and a free WordPress plugin that generates an OKF bundle from your published posts and pages and serves it at yoursite.com/okf/. It's a solid free option, and I'd rather tell you it exists than pretend it doesn't.

The problem I'd point you toward, and the one I've been chewing on a lot in the context of my own LLMS Amplifier plugin, is fragmentation. Most WordPress sites that are paying attention to AI readability are now managing multiple machine-readable layers independently: a robots.txt that may or may not be properly configured for AI crawlers, an LLMs.txt file that needs to be kept current as content changes, and now potentially an OKF bundle on top of that. Each of those can and does go stale independently. Your LLMs.txt can be pointing to articles you've revised or removed. Your OKF bundle, if generated manually or updated separately, can fall out of sync with what's actually on your site.
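Here is one way the staleness problem could be checked, as a small Python sketch. It compares the URLs referenced in an LLMs.txt file against the URLs in sitemap.xml and reports anything LLMs.txt points to that the sitemap no longer lists. The assumptions: LLMs.txt entries are written as markdown links with absolute URLs, the sitemap is a single standard urlset file, and the function names are hypothetical. This is not how LLMS Amplifier or any other plugin works internally, and it only catches removed pages, not revised ones.

```python
import re
import xml.etree.ElementTree as ET


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Collect every <loc> URL from a sitemap, ignoring the XML namespace."""
    root = ET.fromstring(sitemap_xml)
    return {
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    }


def llms_txt_urls(llms_txt: str) -> set[str]:
    """Collect absolute URLs used as markdown link targets in LLMs.txt."""
    return set(re.findall(r"\]\((https?://[^)\s]+)\)", llms_txt))


def stale_entries(llms_txt: str, sitemap_xml: str) -> list[str]:
    """Return LLMs.txt URLs that no longer appear in the sitemap."""
    return sorted(llms_txt_urls(llms_txt) - sitemap_urls(sitemap_xml))
```

The same comparison could be run between an OKF bundle's resource fields and the sitemap, which is the point: every extra layer is one more list of URLs that has to agree with the others.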

That's the problem LLMS Amplifier is designed to solve at the LLMs.txt layer: automated generation and maintenance that stays synchronized with your WordPress content without requiring you to remember to update it manually after every publish. The direction I'm working toward with the plugin is exactly this kind of unified AI context layer, where your LLMs.txt, your supplemental context files, and your structured content representation all update together when your site changes, rather than running three separate maintenance workflows.

If you're already using LLMS Amplifier, this is the direction I'm building toward. If you're not, and you're a WordPress site owner who is serious about AI search readiness, it's worth a look at what a more integrated approach to this layer looks like.

How to Think About OKF Right Now, and What to Actually Do

Let me give you the practical version of the prioritization question, because most of the coverage of OKF is going to fall into one of two unhelpful camps: either breathless “implement this immediately” urgency or dismissive “it's too early, ignore it” skepticism. Neither one serves you.

The straight answer is this: skip it today and nothing breaks. There is no penalty for waiting. No agent is going to penalize your site for the absence of an OKF bundle this week or this month.

What tips the calculation toward acting earlier is the gap between cost and upside. The cost of implementation is low. The free tools make it a day's work or less for most sites. The structural upside, while not immediate, is real: you end up with a cleaner, machine-readable representation of your content graph, and the internal-linking audit that comes out of generating a bundle is valuable on its own, even if no AI agent ever opens a single file.
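The audit I have in mind can be as simple as the Python sketch below. It takes a link graph in the form of a dictionary mapping each bundle file to the files it links to, then flags two things: files nothing else links to, and links that point at files that don't exist. The input shape and the function names are assumptions for illustration. Links from index.md are ignored when looking for orphans, since an index lists everything by design.

```python
def find_orphans(
    graph: dict[str, list[str]], ignore: tuple[str, ...] = ("index.md",)
) -> list[str]:
    """Return files that no other content file links to."""
    linked = {
        target
        for source, targets in graph.items()
        if source not in ignore  # index links don't count as real inbound links
        for target in targets
    }
    return sorted(name for name in graph if name not in linked and name not in ignore)


def find_broken_links(graph: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where the target file is not in the bundle."""
    return sorted(
        (source, target)
        for source, targets in graph.items()
        for target in targets
        if target not in graph
    )
```

An orphaned file usually means a page on the live site that nothing else links to, and that is worth fixing whether or not an agent ever reads the bundle.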

Who should act sooner: practitioners who have already invested in topical authority content, WordPress site owners who are already managing LLMs.txt, and anyone whose competitive advantage depends on being visible in AI-generated responses rather than just traditional search results. If you've built a content library with real depth and structure, turning it into an OKF bundle is mostly a matter of surfacing what's already there.

Who can reasonably wait: sites still working on content fundamentals, businesses where the content is thin or unfocused, and anyone for whom the cognitive overhead of adding another layer right now would distract from higher-priority work.

The frame I come back to is this: Google named it. They published the spec. They built it into a live product. That sequence of events tells me the infrastructure to support and consume OKF will follow, even if the timeline is measured in months or years rather than weeks. I've been in this industry long enough to know that the practitioners who benefit most from these inflection points aren't the ones who wait for certainty. They're the ones who move early, while the cost is low and the field is still uncrowded.

The machine-readable web has been building itself layer by layer for years. OKF is the newest floor, and I'd rather know my way around it before the crowd shows up.


Brian Winum
