How to Analyze Your Substack Data Using Perplexity or Claude
Below I take you through a series of prompts that you can use in LLMs like Claude or Perplexity. I prefer Perplexity myself because I find it more ethical and private, but Claude works too. Each prompt gives you different information, and I explain what that information means as I go.
Part one analyzes your dashboard statistics for articles, lives, video, and audio: everything you post and send out to subscribers from your Substack.
Part two analyzes your growth.
PART I: Substack Posting Data Analysis Prompts for Perplexity or Claude
These prompts are built to analyze your Substack posting stats. Each one is written to paste directly into a fresh Claude or Perplexity chat along with your CSV.
Subscribes and estimated_value are tied to revenue.
Open_rate and engagement_rate are diagnostic. They tell you whether the work itself is landing.
Views and signups reflect overall subscriber growth.
The prompts below are ordered to build from diagnostic to strategic, ending in one that pulls it all into a plan.
You’ll need the CSV file from your dashboard:
Stats → Posts → Display (check all boxes on the right) → three dots → Download CSV
1. Open Rate
Diagnoses subject line and send-time performance—the levers most directly in your control.
Here is my Substack post-level stats CSV. Rank every post by open_rate, high to low, and pull out the top and bottom quartile. For each group, look at the title text, post_date (day of week and time), and audience segment. Tell me what the high-open-rate posts have in common in their titles and timing that the low ones don’t. Then give me three specific things to test in my next five subject lines based on what you find. Don’t guess at causes that aren’t supported by the data — if a pattern is weak or the sample is too small to trust, say so.
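If you want to double-check the quartile split the LLM reports, the ranking is easy to reproduce yourself. Here is a minimal Python sketch, not Substack's own method. The sample rows are made up, and the column headers (title, post_date, audience, open_rate) are assumptions based on the names used in the prompt, so match them to your actual export.

```python
import csv
import io

# Made-up sample rows for illustration only. The header names are assumptions
# and may differ from the headers in your real export.
SAMPLE_CSV = """title,post_date,audience,open_rate
Post A,2024-01-02T09:00:00,everyone,0.52
Post B,2024-01-09T09:00:00,everyone,0.31
Post C,2024-01-16T18:00:00,only_paid,0.47
Post D,2024-01-23T18:00:00,everyone,0.28
Post E,2024-01-30T09:00:00,everyone,0.40
Post F,2024-02-06T09:00:00,only_paid,0.36
Post G,2024-02-13T18:00:00,everyone,0.44
Post H,2024-02-20T09:00:00,everyone,0.33
"""

def open_rate_quartiles(rows):
    """Return (top, bottom) quartiles of posts ranked by open_rate, high to low."""
    # Skip rows with a blank open_rate (for example, posts that were never emailed).
    usable = [r for r in rows if r.get("open_rate")]
    ranked = sorted(usable, key=lambda r: float(r["open_rate"]), reverse=True)
    size = max(1, len(ranked) // 4)  # at least one post per group
    return ranked[:size], ranked[-size:]

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
top, bottom = open_rate_quartiles(rows)
print("Top quartile:", [r["title"] for r in top])
print("Bottom quartile:", [r["title"] for r in bottom])
```

To run it on your own file, replace the sample with csv.DictReader(open("your_file.csv", newline="")). With only a handful of posts, each quartile holds one or two rows, which is exactly the small-sample situation the prompt asks the LLM to flag.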
2. Views
Separates reach from open rate—views capture web and app readers beyond the email open, so this prompt surfaces patterns rather than prescribing fixes.
Here is my Substack post-level stats CSV. Rank posts by views and identify which topics, formats, or titles consistently land above my median views versus which fall below. Note whether view counts spike around specific posts (possible shares or restacks) or stay flat and proportional to open_rate. Compare views across the audience segments (everyone vs only_paid). Just show me the pattern — don’t tell me what to do with it yet.
3. Engagement Rate
Surfaces what content resonates, without forcing a prescriptive fix. Engagement is closest to voice and craft.
Here is my Substack post-level stats CSV. Rank posts by engagement_rate rather than views or open_rate. Identify the posts with high engagement relative to their reach — meaning the ones that hit hardest with the audience they got, even if that audience was small. Look for common threads in the titles or topics of the high-engagement posts. Present the pattern in plain terms. Don’t recommend a content strategy from this alone; I want to see what the data shows first.
4. Free vs. Paid Content Performance
Diagnoses whether paywalled content is earning its placement, and whether free posts are pulling people toward converting to paid.
Here is my Substack post-level stats CSV. Split the posts by audience (everyone vs only_paid) and compare open_rate, engagement_rate, views, and estimated_value across the two groups. Tell me whether my paid-only content is outperforming my free content on the metrics that matter, or whether the paywall is sitting in front of work that isn’t actually differentiated enough to justify it. Be direct about what the numbers show, even if it’s not the answer I’d want.
5. Free Subscriber Growth (Signups)
Identifies which posts function as subscriber-generating engines and why.
Here is my Substack post-level stats CSV. Rank posts by signups. Identify the posts that generated free subscribers most efficiently relative to their views (a high signups-to-views ratio matters more here than raw signup count). Look at what those posts have in common — topic, audience segment, title pattern. Recommend what to replicate in upcoming posts to keep growing the free list.
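The signups-to-views ratio in this prompt is simple arithmetic, so it is worth spot-checking a few posts by hand or with a short script. The Python sketch below is one way to do that under stated assumptions: the posts are invented, and the keys title, views, and signups are assumed to match your CSV headers. The min_views cutoff is a hypothetical parameter I added so that posts with almost no views do not top the ranking by accident.

```python
def rank_by_signup_efficiency(rows, min_views=1):
    """Rank posts by signups divided by views, highest ratio first."""
    scored = []
    for r in rows:
        views = float(r["views"])
        if views < min_views:
            continue  # too few views for the ratio to mean anything
        scored.append((r["title"], float(r["signups"]) / views))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented example posts, values kept as strings the way a CSV reader returns them.
posts = [
    {"title": "Post A", "views": "1000", "signups": "10"},
    {"title": "Post B", "views": "200", "signups": "8"},
    {"title": "Post C", "views": "0", "signups": "0"},
]

for title, ratio in rank_by_signup_efficiency(posts):
    print(f"{title}: {ratio:.1%} of views became free signups")
```

In this made-up example, Post B has fewer signups than Post A in raw terms but converts four times as efficiently, which is the distinction the prompt is built to surface. The same function works for the paid-conversion and revenue-per-view prompts if you swap signups for subscribes or estimated_value.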
6. Paid Conversion (Subscribes)
The highest-leverage metric for revenue growth.
Here is my Substack post-level stats CSV. Rank posts by subscribes. Identify which posts converted readers to paid most effectively, and check whether those posts share a topic, structure, audience segment, or position in a series. Compare the subscribes-to-views ratio across posts rather than raw counts, since that tells me which content actually persuades versus which just gets seen. Give me a clear recommendation on what kind of post to run more of if paid conversion is the goal.
7. Revenue Per Post (Estimated Value)
Checks for a mismatch between what performs well and what actually produces revenue—often the most uncomfortable and most useful finding.
Here is my Substack post-level stats CSV. Rank posts by estimated_value. Then check whether the highest-revenue posts are also the highest in views, open_rate, or engagement_rate, or whether there’s a mismatch — meaning some posts perform well on attention metrics but don’t convert to money, or vice versa. Name the mismatch clearly if one exists. Tell me which two or three posts in this dataset generated the most revenue per view, since that ratio is more useful to me than total revenue alone.
8. Title and Headline Patterns
Surfaces the correlation between title construction and performance, with a light recommendation—not a formula.
Here is my Substack post-level stats CSV. Look at the titles of my top-quartile posts by open_rate and views versus my bottom-quartile posts. Identify any structural patterns — length, use of numbers, questions versus statements, specificity versus abstraction, emotional versus informational framing. Tell me honestly if the sample is too small or too mixed to draw a real pattern from. If a real pattern exists, name it plainly, without turning it into a generic headline formula.
9. Cadence and Timing
Checks whether publishing frequency and gaps affect post performance.
Here is my Substack post-level stats CSV. Look at post_date across all posts and calculate the gap in days between each post and the one before it. Check whether longer gaps between posts correlate with lower open_rate or views on the post that follows the gap. Also note any day-of-week pattern in performance. Show me the pattern plainly — I want to know if inconsistency is actually costing me before I decide whether to change my schedule.
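The gap calculation this prompt describes can be reproduced in a few lines if you want to verify what the LLM tells you. The following Python sketch is illustrative only. It assumes post_date is an ISO 8601 timestamp (your export may use a different format) and that open_rate is numeric. The four posts are invented, and the helper names are my own.

```python
from datetime import datetime

def parse_date(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def gaps_with_open_rate(rows):
    """Pair each post's gap (days since the previous post) with its own open_rate."""
    ordered = sorted(rows, key=lambda r: parse_date(r["post_date"]))
    pairs = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = parse_date(cur["post_date"]) - parse_date(prev["post_date"])
        pairs.append((delta.total_seconds() / 86400, float(cur["open_rate"])))
    return pairs

def pearson(pairs):
    """Pearson correlation of (x, y) pairs, or None if it cannot be computed."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5

# Invented posts, deliberately out of order to show that sorting happens first.
posts = [
    {"post_date": "2024-01-29T09:00:00", "open_rate": "0.35"},
    {"post_date": "2024-01-01T09:00:00", "open_rate": "0.50"},
    {"post_date": "2024-02-01T09:00:00", "open_rate": "0.46"},
    {"post_date": "2024-01-08T09:00:00", "open_rate": "0.48"},
]

pairs = gaps_with_open_rate(posts)
print("Gap in days, open rate of the post after the gap:", pairs)
print("Correlation:", pearson(pairs))
```

A negative correlation would be consistent with longer gaps hurting open rates, but with only a few posts the number is not trustworthy. For the day-of-week check, parse_date(r["post_date"]).strftime("%A") gives the weekday name for each post.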
10. Full Synthesis and Growth Plan
Ties every metric together, prioritized by revenue leverage. Run this one last, after the others.
Here is my Substack post-level stats CSV. Using views, open_rate, engagement_rate, signups, subscribes, and estimated_value together, tell me which single lever — reach, open rate, engagement, free growth, or paid conversion — would move my revenue the most if I improved it by 10 percent, based on where the current numbers are weakest relative to the others. Rank the five levers by leverage, not by how easy they’d be to fix. Then lay out a 90-day plan built around the top two levers, specifying what to test, what to measure, and what would count as evidence it’s working. Don’t pad the plan with generic content advice — every recommendation should trace back to something in this specific dataset.
PART II: Substack Growth Analysis Prompts for Perplexity or Claude
These prompts will help you analyze your Substack’s growth sources. The export contains these columns: Date, Source, Category, Unique visitors, New subscribers, and New revenue.
Go to your Dashboard → Growth → three dots. Download an all-time file and a last-30-days file.
Each prompt below specifies which one to use.
The most valuable numbers here aren’t the raw traffic counts; they’re the ratios. A source that brings in a lot of visitors but converts almost none of them into subscribers is weaker than a small source that converts at a high rate. So most of these prompts ask the LLM to calculate conversion efficiency (new subscribers ÷ unique visitors) and revenue efficiency (new revenue ÷ unique visitors) per source, rather than just ranking by volume.
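To make those two ratios concrete, here is a minimal Python sketch that computes them per source. Treat it as an illustration under assumptions: the rows are invented, the source and category values are placeholders, and the header spellings follow the column names listed above, which may not match your file character for character. It also assumes the revenue column holds plain numbers with no currency symbol.

```python
import csv
import io
from collections import defaultdict

# Invented rows for illustration only.
SAMPLE_CSV = """Date,Source,Category,Unique visitors,New subscribers,New revenue
2024-03-01,google.com,Search,400,4,0
2024-03-01,substack.com,Substack,100,10,50
2024-03-02,google.com,Search,600,6,20
2024-03-02,substack.com,Substack,100,15,30
"""

def efficiency_by_source(rows):
    """Total each source, then compute subscribers and revenue per unique visitor."""
    totals = defaultdict(lambda: {"visitors": 0.0, "subscribers": 0.0, "revenue": 0.0})
    for r in rows:
        t = totals[r["Source"]]
        t["visitors"] += float(r["Unique visitors"] or 0)
        t["subscribers"] += float(r["New subscribers"] or 0)
        t["revenue"] += float(r["New revenue"] or 0)
    result = {}
    for source, t in totals.items():
        if t["visitors"] == 0:
            continue  # no visitors recorded, so neither ratio is defined
        result[source] = {
            "conversion_rate": t["subscribers"] / t["visitors"],
            "revenue_per_visitor": t["revenue"] / t["visitors"],
        }
    return result

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
stats = efficiency_by_source(rows)
ranked = sorted(stats, key=lambda s: stats[s]["conversion_rate"], reverse=True)
for source in ranked:
    s = stats[source]
    print(f"{source}: {s['conversion_rate']:.1%} conversion, "
          f"{s['revenue_per_visitor']:.2f} revenue per visitor")
```

In the invented data, the smaller source sends a fifth of the traffic but converts far better, which is the pattern the ratios reveal and raw visitor counts hide.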
1. Channel Conversion Efficiency
Identifies which sources are actually worth the effort, independent of how much traffic they send.
Here is my all-time Substack growth source CSV. Group the rows by Source, and for each one calculate total unique visitors, total new subscribers, and the conversion rate (new subscribers divided by unique visitors). Rank sources by conversion rate, not by raw visitor count. Tell me which sources are converting well even at low volume, and which sources are sending a lot of traffic but converting almost no one. Recommend which two or three sources deserve more of my time based on conversion rate alone.
2. Channel Revenue Efficiency
Goes one layer past subscriber growth to ask which channels actually pay.
Here is my all-time Substack growth source CSV. Group by Source and calculate total new revenue, total unique visitors, and revenue per visitor for each source. Rank sources by revenue per visitor. Tell me clearly if there’s a source that’s excellent at generating subscribers but weak at generating revenue, or the reverse. Recommend where I should be putting effort if the goal is revenue growth specifically, not just list growth.
3. Category Mix and Dependency Risk
Rolls channels up to the category level to check for over-reliance on one type of traffic.
Here is my all-time Substack growth source CSV. Group by Category and calculate what percentage of total unique visitors, total new subscribers, and total new revenue each category represents. Tell me if my growth is concentrated in one category — for example, almost entirely from Substack’s internal discovery versus external search or direct traffic. If there’s meaningful concentration, name the risk plainly: what happens to my growth if that category slows down. Recommend one category I’m underusing that’s worth testing.
4. 30-Day Momentum Check
A real-time read on what’s currently working, using the recent export rather than the full history.
Here is my most recent 30-day Substack growth source CSV. Group by Source and identify which channels are currently driving the most unique visitors, new subscribers, and new revenue. Flag any source with a sudden spike or a sharp drop within the 30 days. Tell me what’s working right now, in plain terms, without comparing it to historical data yet.
5. All-Time Baseline vs. Recent Performance
Compares the long-term pattern to the current 30 days to catch shifts before they become a trend.
I’m giving you two files: my all-time Substack growth source CSV and my most recent 30-day growth source CSV. Calculate each source’s average monthly performance from the all-time file, then compare that to what the 30-day file shows. Tell me which sources are currently outperforming their historical average, which are underperforming it, and which are new sources that didn’t show up in the longer history at all. I want to know what’s changing, not just what’s true on average.
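"Average monthly performance" can be calculated more than one way, so it helps to know which definition you are comparing against. The Python sketch below uses one possible definition, which is my assumption rather than anything Substack specifies: each source's all-time total divided by the number of 30-day periods between the first and last date in the file. It also assumes the Date column is formatted as YYYY-MM-DD. All rows and source names are invented.

```python
from datetime import date

def totals_by_source(rows, metric):
    """Sum one metric column per source."""
    totals = {}
    for r in rows:
        totals[r["Source"]] = totals.get(r["Source"], 0.0) + float(r[metric] or 0)
    return totals

def compare_to_recent(all_time_rows, recent_rows, metric="New subscribers"):
    """Map each source to (baseline per 30 days, recent 30-day total, status)."""
    if not all_time_rows:
        raise ValueError("the all-time file has no rows")
    dates = [date.fromisoformat(r["Date"]) for r in all_time_rows]
    # Assumed definition: number of 30-day periods covered by the all-time file.
    periods = ((max(dates) - min(dates)).days + 1) / 30
    baseline = {s: v / periods for s, v in totals_by_source(all_time_rows, metric).items()}
    recent = totals_by_source(recent_rows, metric)
    report = {}
    for source in sorted(set(baseline) | set(recent)):
        now = recent.get(source, 0.0)
        if source not in baseline:
            report[source] = (None, now, "new")
        elif now > baseline[source]:
            report[source] = (baseline[source], now, "above")
        elif now < baseline[source]:
            report[source] = (baseline[source], now, "below")
        else:
            report[source] = (baseline[source], now, "flat")
    return report

# Invented rows: the all-time file spans 90 days, i.e. three 30-day periods.
all_time = [
    {"Date": "2024-01-01", "Source": "google.com", "New subscribers": "40"},
    {"Date": "2024-02-15", "Source": "twitter.com", "New subscribers": "30"},
    {"Date": "2024-03-30", "Source": "google.com", "New subscribers": "20"},
]
recent = [
    {"Date": "2024-04-05", "Source": "google.com", "New subscribers": "12"},
    {"Date": "2024-04-10", "Source": "twitter.com", "New subscribers": "15"},
    {"Date": "2024-04-12", "Source": "reddit.com", "New subscribers": "5"},
]

report = compare_to_recent(all_time, recent)
for source, (base, now, status) in report.items():
    print(source, base, now, status)
```

If the LLM's comparison disagrees with a check like this, ask it to state how it defined a month before trusting its conclusions.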
6. Search Traffic Breakdown
Separates traditional search engines from AI-driven search referrals, since that split increasingly matters for where visibility work should go.
Here is my all-time Substack growth source CSV. Filter to rows where Category is Search, then break down performance by the individual Source within that category (for example, Google, Bing, Brave, or any AI-driven search referrers that appear). Tell me which specific search sources are sending visitors, and which of those are converting to subscribers or revenue. If any source names suggest AI search or answer engines rather than traditional search, call that out separately, since that traffic behaves differently than classic search.
7. Dead Weight Sources
Finds channels that are consuming attention or cross-posting effort without producing results.
Here is my all-time Substack growth source CSV. Group by Source and identify any source with a meaningful number of unique visitors but zero or near-zero new subscribers and zero new revenue over the full period. List those sources plainly. Tell me if any of them are worth continued investment based on volume alone, or if they’re safe to deprioritize.
8. Revenue Concentration
Checks how dependent total revenue is on a small number of channels.
Here is my all-time Substack growth source CSV. Calculate total new revenue across all sources, then determine what percentage of that total comes from the top three sources versus everything else combined. Tell me plainly how concentrated my revenue is. If it’s heavily concentrated, name what that means for risk if one of those top sources declines.
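The concentration figure is a single percentage, so it is quick to confirm on your own. Here is a short Python sketch of the calculation with invented sources and amounts. The Source and New revenue keys are assumed to match your CSV headers, and revenue is assumed to be a plain number.

```python
def top_n_revenue_share(rows, n=3):
    """Return (top n sources with totals, their share of all new revenue)."""
    totals = {}
    for r in rows:
        totals[r["Source"]] = totals.get(r["Source"], 0.0) + float(r["New revenue"] or 0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return [], None  # no revenue recorded, so a share is undefined
    top = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return top, sum(value for _, value in top) / grand_total

# Invented rows for illustration only.
rows = [
    {"Source": "a.com", "New revenue": "30"},
    {"Source": "b.com", "New revenue": "30"},
    {"Source": "a.com", "New revenue": "20"},
    {"Source": "c.com", "New revenue": "10"},
    {"Source": "d.com", "New revenue": "6"},
    {"Source": "e.com", "New revenue": "4"},
]

top, share = top_n_revenue_share(rows)
print("Top sources:", top)
print(f"Share of total new revenue: {share:.0%}")
```

In this made-up data the top three sources account for 90 percent of revenue, which would count as heavy concentration by any reasonable reading.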
9. Full Synthesis and Channel Investment Plan
Ties the all-time and 30-day pictures into a prioritized plan. Run this one last.
I’m giving you two files: my all-time Substack growth source CSV and my most recent 30-day growth source CSV. Using both, identify the sources with the strongest combination of conversion rate, revenue per visitor, and recent momentum. Rank the top three channels by overall leverage — meaning where continued investment is most likely to move subscribers and revenue, not just traffic. Then lay out a 90-day channel plan: what to do more of, what to test, and what to stop spending time on. Every recommendation should trace back to a specific number in these files, not general advice about growing on Substack.
All my best,



