Content Inventory: What It Is and How to Build One
A content inventory is a structured catalog of every piece of content on your website, with metadata for each item. Before you can improve your content, run a meaningful audit, or prepare for a site migration, you need to know exactly what you have. That’s what a content inventory gives you.
This guide explains what a content inventory is, what to include in one, how to build it step by step and what to do with it once it’s complete.
What Is a Content Inventory?
A content inventory is a comprehensive, quantitative record of all assets on a website, including pages, documents, images and videos, typically organized in a spreadsheet or database. Each row represents one URL or asset. Each column captures a specific attribute: title, format, topic, primary keyword, performance data, owner. Built before site redesigns, content audits or migrations, it serves as the foundational map for identifying, organizing and managing your existing digital assets.
The goal is simple visibility. You can’t make smart decisions about content you can’t see clearly.
Content inventory vs. content audit
These two terms are often used interchangeably, but they mean different things. A content inventory is quantitative: it catalogs what exists. A content audit is qualitative: it evaluates how well that content is performing and whether it should stay, be improved, or go.
The inventory comes first. The audit follows from it. You can run a basic audit alongside your inventory work, but the catalog itself is the foundation everything else builds on.
Why You Need a Content Inventory
A content inventory is essential for auditing, identifying SEO gaps, streamlining website migrations, improving content accuracy and reducing redundant, outdated or trivial content (ROT). Most content teams underestimate how much they’ve published. Sites that have been running for a few years routinely hold hundreds or thousands of URLs, many of which nobody has reviewed recently. A content inventory makes the full picture visible so you can act on it.
Site migrations
A comprehensive URL list is non-negotiable before any CMS migration or domain change. Without it, pages get dropped, redirects get missed and organic traffic disappears. The inventory becomes your migration checklist.
Content strategy and resource planning
Knowing what you already have prevents you from creating duplicate content and helps you identify real gaps. If you’re planning a content calendar, start with the inventory so new pieces fill actual gaps rather than retreading topics you’ve already published.
Content decay and orphaned pages
Content decay is gradual. Pages that ranked well three years ago may now sit at position 40 with zero clicks, invisible to your team because nobody checked. Orphaned pages, those with no internal links pointing to them, often go completely unnoticed. A content inventory surfaces both. You can flag decaying pages for a refresh and add internal links to orphaned ones before they hurt your crawl efficiency.
Keyword cannibalization and duplication
When multiple pages target the same keyword, they compete against each other in search. The inventory reveals which pages overlap so you can consolidate or differentiate them intentionally.
Internal linking gaps
An inventory with incoming internal link counts shows you which high-value pages are undersupported. Fixing those links is one of the fastest ways to improve page authority without writing a single new word.
What to Include in a Content Inventory
A content inventory spreadsheet typically includes URLs, page titles, content types, owners and last-modified dates as core fields, often extended with topic, primary keyword, performance data (page views, organic clicks) and a status column for audit decisions. The fields you track depend on your goals. An SEO-focused team will want different data than a UX team planning an information architecture review. That said, these columns give you a solid baseline most teams can build from.
| Field | What it captures | Why it matters |
|---|---|---|
| URL | Full page address | Primary identifier for every row |
| Page title / H1 | On-page title or heading | Signals topic focus; flags missing or duplicate titles |
| Meta description | Current meta description text | Flags missing or truncated descriptions |
| Content type | Blog post, landing page, guide, case study, etc. | Helps segment and filter for different reviews |
| Topic / category | Subject area or site section | Groups related pages for thematic analysis |
| Primary keyword | Main target search term | Identifies cannibalization and coverage gaps |
| Organic clicks (30d) | GSC click data for the last 30 days | Shows which pages are earning traffic now |
| Organic impressions (30d) | GSC impression data | Flags pages with visibility but no clicks |
| Page views (GA) | All-channel traffic from Google Analytics | Captures non-organic traffic value |
| Publish date | When the page first went live | Identifies old content that may have decayed |
| Last updated | Most recent edit date | Flags pages that haven’t been touched in years |
| Word count | Approximate length | Helps spot thin pages and length outliers |
| Author / owner | Who wrote or owns the page | Routes feedback and update tasks correctly |
| Incoming internal links | Number of internal pages linking here | Shows link equity distribution and orphaned pages |
| Status | Keep / improve / merge / redirect / remove | Turns the inventory into an actionable plan |
| Notes / next action | Free text for team comments | Keeps decisions tied to the data row |
You don’t need every column from day one. Start with URL, page title, primary keyword, organic clicks and status. Add fields as your review deepens. For sites with documents, PDFs, or media assets, add a content type column early so those files are clearly separated from HTML pages in your inventory.
How to Create a Content Inventory (Step by Step)
Building an inventory takes a few hours for a small site and a few weeks for a large one. The steps below apply at either scale.
Step 1: Define your scope and goals
Before pulling a single URL, decide why you’re doing this and what you’ll do with the output. A team migrating to a new CMS needs a different inventory than a team running a quarterly SEO review.
Common goals:
- Prepare for a site migration
- Find underperforming pages to refresh
- Identify content gaps before building a new content calendar
- Diagnose crawl inefficiency or duplicate content
Decide on scope: the full domain, a subdomain, a specific section, or a content type. For large sites, work in sections. Trying to inventory everything at once usually stalls partway through.
Step 2: Export your URLs
Start with your XML sitemap as a baseline. Then run a crawl to catch pages that aren’t in the sitemap, including orphaned pages your sitemap missed.
Good options for the crawl:
- Screaming Frog SEO Spider: crawls every URL on the site, exports title tags, meta descriptions, H1s, word count, response codes and internal link counts in one CSV
- Ahrefs Site Audit or SEMrush Site Audit: adds organic performance data and technical issue flagging
- CMS export: WordPress and most CMSes let you export a post/page list as CSV, useful as a quick starting point
Export everything to CSV. You’ll clean it up in the next step.
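If you’d rather script the sitemap baseline than copy it by hand, the `<loc>` entries can be pulled out with a few lines of standard-library Python. This is a sketch, not a feature of any tool above; the inline sitemap string and the `sitemap_urls` helper are illustrative, and in practice you would read the file you downloaded from `/sitemap.xml`:

```python
import xml.etree.ElementTree as ET

# Sitemaps use this XML namespace, so element lookups must include it.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Inlined sample sitemap for illustration; replace with your real file.
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/seo-tips/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
</urlset>"""

def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL from a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc")]

urls = sitemap_urls(sitemap_xml)
print(urls)  # ['https://example.com/', 'https://example.com/blog/seo-tips/', 'https://example.com/about/']
```

The resulting list becomes column A of your inventory sheet; the crawl export then fills in everything the sitemap missed.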
Step 3: Build your spreadsheet
Open a new Google Sheets file. Paste your URL list into column A. Add column headers across row 1 using the field list above or a subset that matches your goals. Freeze row 1, apply filters to every column and you have a working inventory structure.
If you’re working with a team, share the sheet with edit access and assign column ownership so multiple people can contribute without overwriting each other.
Step 4: Collect metadata and performance data
The crawl export from Screaming Frog will pre-fill title tags, meta descriptions, H1s, word counts and internal link counts. Copy those columns into your sheet.
For performance data:
- Export URL-level data from Google Search Console (Performance report, export to CSV) for organic clicks and impressions per page
- Export page-level data from Google Analytics (GA4: Reports > Pages and screens) for overall page views
Use VLOOKUP or XLOOKUP in Sheets to join the GSC and GA data to your URL list by URL column. This gives you a single row per page with all the signals you need.
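If you prefer to do the join before the data ever reaches Sheets, the same lookup can be scripted with the standard-library `csv` module. This is a minimal sketch of what the XLOOKUP join does; the inlined CSV contents are illustrative samples, and real GSC exports will have more columns and different headers:

```python
import csv
import io

# Illustrative stand-ins for your exported files.
inventory_csv = "url\nhttps://example.com/a/\nhttps://example.com/b/\n"
gsc_csv = ("url,clicks,impressions\n"
           "https://example.com/a/,120,3400\n"
           "https://example.com/c/,5,90\n")

# Index the GSC rows by URL so each inventory row is a dict lookup.
gsc = {row["url"]: row for row in csv.DictReader(io.StringIO(gsc_csv))}

joined = []
for row in csv.DictReader(io.StringIO(inventory_csv)):
    perf = gsc.get(row["url"], {})  # URL missing from GSC -> blank cells
    joined.append({"url": row["url"],
                   "clicks": perf.get("clicks", ""),
                   "impressions": perf.get("impressions", "")})

print(joined)
# /a/ picks up clicks=120; /b/ has no GSC row, so its cells stay blank
```

In Sheets itself, the equivalent per-row formula is something like `=XLOOKUP(A2, 'GSC Export'!A:A, 'GSC Export'!B:B, "")`, where the sheet and range names are assumptions about how you laid out your pasted export.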
Step 5: Categorize and flag
With the data populated, go through the list and fill in the qualitative columns: content type, topic, primary keyword, author and status. This part requires human judgment. You can’t automate whether a page should be kept or removed.
Use the status column to flag each page:
- Keep: performing well, no changes needed
- Improve: has traffic potential but needs a refresh or expansion
- Merge: overlaps with another page; consolidate and redirect
- Redirect: outdated or duplicate; redirect traffic to a stronger page
- Remove: no traffic, no strategic value, no path to either
Add a notes column for specific next actions: “update stats,” “merge with /related-page/,” “add internal links from /topic-hub/.”
Tools for Building a Content Inventory
Building a content inventory takes a combination of automated crawling, analytics data and a spreadsheet to organize it all. No single tool does everything. Most teams combine two or three to get a complete picture.
Screaming Frog SEO Spider is the most efficient starting point for URL collection. It crawls your entire site and exports a spreadsheet of every URL with title tags, meta descriptions, H1s, response codes, word counts, and internal link counts. The free version handles up to 500 URLs. The paid version (~$259/year) removes that limit.
Ahrefs Site Audit and SEMrush Site Audit combine crawl data with organic performance data and technical issue detection. They’re stronger than Screaming Frog for spotting cannibalization, content decay, and page-level traffic trends, but they cost more and aren’t always necessary for smaller sites.
Google Search Console is the most accurate source for organic click and impression data per URL. It’s free. Export the full Performance report filtered to the domain to get a complete URL-level traffic snapshot.
Google Analytics 4 gives you all-channel traffic data per page. Use the Pages and Screens report and export to CSV.
Google Sheets or Airtable work well as inventory home bases. Sheets has the advantage of being free and widely familiar. Airtable offers more visual filtering and gallery views if your team prefers that format.
For most teams running their first inventory, Screaming Frog plus Google Search Console plus Google Sheets is enough to start.
Content Inventory Template
Copy the structure below into a new Google Sheets or Excel file. Fill in the first few rows manually to validate the format, then batch-fill the rest from your crawl export and GSC data.
| Field | Example value | Source |
|---|---|---|
| URL | https://example.com/blog/seo-tips/ | Crawl / sitemap export |
| Page title | 10 SEO Tips for 2025 | Crawl export (title tag) |
| H1 | 10 SEO Tips That Still Work in 2025 | Crawl export |
| Meta description | Here are the SEO tips that move rankings… | Crawl export |
| Content type | Blog post | Manual |
| Topic / category | SEO | Manual |
| Primary keyword | seo tips | Manual / GSC top query |
| Organic clicks (30d) | [from GSC export] | Google Search Console |
| Organic impressions (30d) | [from GSC export] | Google Search Console |
| Page views (30d) | [from GA4 export] | Google Analytics |
| Publish date | YYYY-MM-DD | CMS / crawl |
| Last updated | YYYY-MM-DD | CMS / crawl |
| Word count | [from crawl export] | Crawl export |
| Author / owner | Jane Smith | CMS / manual |
| Incoming internal links | 7 | Crawl export |
| Status | Improve | Manual review |
| Notes | Refresh stats section; add internal links from /seo-guide/ | Manual |
What to Do After the Inventory
The inventory itself is not the destination. It’s a map that shows you where the problems are and where the opportunities are. Once it’s complete, the next step is to act on what you found.
Run a content audit
Take the pages flagged as “Improve” or “Merge” and evaluate them qualitatively. Does the content still answer the current search intent? Is it thin compared to what’s ranking now? Does it have factual errors or outdated information? This is the audit layer, and the inventory makes it structured rather than arbitrary.
Apply the keep/improve/merge/redirect/remove framework
Prioritize by traffic potential first. Pages with strong impressions but low click-through rates are often the fastest wins. Pages with zero clicks and no keyword targeting are candidates for removal or consolidation.
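The high-impressions, low-CTR filter is easy to run mechanically over the inventory data. A small sketch, with illustrative rows and thresholds (500 impressions, 1% CTR) that are assumptions to tune against your own site’s baseline, not a standard:

```python
# Sample inventory rows with GSC data already joined in.
pages = [
    {"url": "/seo-tips/",  "clicks": 4,   "impressions": 2200},
    {"url": "/seo-guide/", "clicks": 180, "impressions": 4100},
    {"url": "/old-news/",  "clicks": 0,   "impressions": 12},
]

def quick_wins(rows, min_impressions=500, max_ctr=0.01):
    """Flag pages with plenty of visibility but a weak click-through rate."""
    flagged = []
    for r in rows:
        ctr = r["clicks"] / r["impressions"] if r["impressions"] else 0.0
        if r["impressions"] >= min_impressions and ctr < max_ctr:
            flagged.append(r["url"])
    return flagged

print(quick_wins(pages))  # ['/seo-tips/']
```

Here only /seo-tips/ is flagged: it earns 2,200 impressions but a CTR under 1%, so a better title and meta description are likely the fastest fix.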
Assign each status action to an owner and set a deadline. An inventory without assigned next steps stays a spreadsheet, not a plan.
Update your internal linking
Use the incoming links column to find your strongest pages, those with high traffic and many internal links, and make sure they link out to related pages you want to grow. Use it to find orphaned pages with zero incoming links and connect them to relevant hubs or categories.
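Both ends of that distribution can be pulled straight from the incoming-links column. A sketch over illustrative inventory rows (the field name mirrors the inventory columns above):

```python
# Sample inventory rows with the crawl's internal link counts filled in.
inventory = [
    {"url": "/seo-guide/", "incoming_links": 42},
    {"url": "/seo-tips/",  "incoming_links": 7},
    {"url": "/forgotten/", "incoming_links": 0},
]

# Orphans: no internal page links here at all.
orphans = [r["url"] for r in inventory if r["incoming_links"] == 0]

# Hubs: the most heavily linked pages, best placed to pass authority on.
hubs = sorted(inventory, key=lambda r: r["incoming_links"], reverse=True)[:2]

print(orphans)                    # ['/forgotten/']
print([r["url"] for r in hubs])   # ['/seo-guide/', '/seo-tips/']
```

The orphan list becomes a to-do: link each page from a relevant hub or category, or decide it shouldn’t exist at all.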
Maintaining Your Content Inventory
Maintaining a content inventory means treating it as a living document rather than a one-time project. Sites change constantly: new pages get published, old ones decay and the status of any given URL shifts over time. The inventory only stays useful if it stays current.
For most sites, a quarterly update cycle works well. Every three months, run a new crawl, pull fresh GSC and GA data, and update the status column for pages that have changed. Flag newly published URLs. Mark pages whose traffic has dropped significantly as candidates for review.
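The quarterly refresh reduces to a set difference between last quarter’s URL list and the new crawl. A sketch with illustrative URLs:

```python
# URL sets from the previous inventory and the fresh crawl.
previous = {"https://example.com/a/", "https://example.com/b/"}
current  = {"https://example.com/a/", "https://example.com/c/"}

new_urls     = sorted(current - previous)   # add as fresh inventory rows
removed_urls = sorted(previous - current)   # verify redirects are in place

print(new_urls)      # ['https://example.com/c/']
print(removed_urls)  # ['https://example.com/b/']
```

New URLs get appended as rows to categorize; removed URLs get checked against your redirect map so no traffic quietly 404s.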
Assign one person as the inventory owner. Without ownership, updates get skipped and the spreadsheet drifts into irrelevance within a year. That person doesn’t need to do all the work, but they’re responsible for keeping the cycle going and looping in content owners when pages need attention.
Use the “last updated” column in your inventory itself as a freshness signal. If a page hasn’t been touched in two or more years and traffic has declined, it’s likely a content decay candidate. Catching those pages before they drop completely is far easier than recovering them after the fact.
Frequently Asked Questions
How long does a content inventory take?
Small sites with fewer than 200 URLs can often be inventoried in a day or two. Sites with 500 to 2,000 pages typically take one to three weeks when done alongside other work. For sites over 5,000 URLs, plan for at least a month and work in sections. Crawl tools automate data collection; the time cost is mostly in categorization and status flagging.
Do I need a special tool, or is a spreadsheet enough?
A spreadsheet is enough for most teams. Screaming Frog (free up to 500 URLs) plus Google Sheets covers the full workflow without any paid subscriptions. Dedicated tools like Ahrefs or SEMrush add value for larger sites or teams that want organic data built into the inventory automatically, but they’re not required to get started.
What’s the difference between a content inventory and a content audit?
A content inventory is a quantitative list of what exists. A content audit evaluates the quality and performance of what’s on that list. The inventory answers “what do we have?” The audit answers “is it any good, and what should we do with it?” You need the inventory before you can run a meaningful audit.
How often should I update my content inventory?
Quarterly works well for most active content teams. If your site publishes daily or runs frequent campaigns, monthly updates keep the data more accurate. For smaller sites that publish less often, twice a year is usually sufficient. The important thing is that updates happen on a schedule, not only when something goes wrong.
What should I do with low-performing pages I find?
First, check why the page is underperforming. Low traffic on a new page is normal. Low traffic on a two-year-old page that used to rank is a decay signal. Check the GSC impression trend: if impressions are falling, the page is losing ranking position. If impressions are stable but clicks are low, the title or meta description may need improvement. Pages with no impressions and no internal links are candidates for removal or consolidation into a stronger related page.