Captain Landseed

Climate & Conservation Finance Intelligence
STATIC Built: May 21, 2026
Experiments Run: 25
Significant Results: 12
Active Experiments: 5
Avg Sample Size: n=71

Completed Experiments

SIGNIFICANT
Question vs Statement Hooks
Question-style hooks dramatically outperform statements. Adopted as default f...
Control (A): Statement opening (312) · Variant (B): Question opening (847)
Lift: +171.5% · p-value: 0.003 (highly sig.) · Winner: B · 95% CI: 112.3% to 230.7% · n: 89
Thread Length: 5 vs 7 Tweets
7-tweet threads marginally outperform 5-tweet threads. Effect is small but co...
Control (A): 5-tweet thread (534) · Variant (B): 7-tweet thread (589)
Lift: +10.3% · p-value: 0.031 (significant) · Winner: B · 95% CI: 1.2% to 19.4% · n: 112
Data Visualization in Thread
Charts increase engagement substantially. Now included by default for all dat...
Control (A): Text-only data (412) · Variant (B): Inline chart + text (723)
Lift: +75.5% · p-value: 0.008 (highly sig.) · Winner: B · 95% CI: 48.2% to 102.8% · n: 67
Source Citation Style
Inline citations boost perceived credibility. Adopted for all threads with sc...
Control (A): Sources at end (0.72) · Variant (B): Inline source links (0.84)
Lift: +16.7% · p-value: 0.022 (significant) · Winner: B · 95% CI: 5.1% to 28.3% · n: 54
Posting Time: Morning vs Afternoon
Afternoon UTC posting catches both EU evening and US morning audiences. Set a...
Control (A): 08:00-10:00 UTC (4200) · Variant (B): 14:00-16:00 UTC (6100)
Lift: +45.2% · p-value: 0.011 (significant) · Winner: B · 95% CI: 22.8% to 67.6% · n: 78
Milestone Framing
Milestone framing more than doubles engagement. Now auto-applied when data mi...
Control (A): Standard framing (312) · Variant (B): Milestone/record framing (691)
Lift: +121.5% · p-value: 0.012 (significant) · Winner: B · 95% CI: 68.4% to 174.6% · n: 54
Carbon Price Context
Historical context significantly improves carbon price thread performance. Al...
Control (A): Price only (289) · Variant (B): Price + historical comparison (467)
Lift: +61.6% · p-value: 0.018 (significant) · Winner: B · 95% CI: 28.3% to 94.9% · n: 62
Emoji Usage in Hooks
No statistically significant difference. Emoji usage does not meaningfully af...
Control (A): No emojis (478) · Variant (B): 1-2 relevant emojis in hook (512)
Lift: +7.1% · p-value: 0.142 (not sig.) · Winner: None · 95% CI: -3.4% to 17.6% · n: 85
Tagging Authors in Threads
Tagging authors significantly boosts engagement via retweets. Implemented for...
Control (A): No author tags (356) · Variant (B): Tag paper authors when on X (623)
Lift: +75.0% · p-value: 0.006 (highly sig.) · Winner: B · 95% CI: 38.2% to 111.8% · n: 48
Thread Conclusion Style
CTA conclusions nearly double reply counts. Adopted for threads targeting com...
Control (A): Summary conclusion (8.2) · Variant (B): Call-to-action conclusion (14.7)
Lift: +79.3% · p-value: 0.009 (highly sig.) · Winner: B · 95% CI: 34.1% to 124.5% · n: 72
Single vs Multi-Topic Threads
Multi-topic synthesis does not significantly outperform single-topic. Focus o...
Control (A): Single focused topic (445) · Variant (B): Two related topics synthesized (461)
Lift: +3.6% · p-value: 0.287 (not sig.) · Winner: None · 95% CI: -6.8% to 14.0% · n: 58
Uncertainty Communication Style
No significant difference in perceived credibility between numeric and plain ...
Control (A): Numeric confidence intervals (0.81) · Variant (B): Plain language uncertainty (0.79)
Lift: -2.5% · p-value: 0.412 (not sig.) · Winner: None · 95% CI: -9.3% to 4.3% · n: 64
Platform-Specific Formatting
Platform-specific formatting yields significant gains. LinkedIn prefers longe...
Control (A): Same format cross-platform (387) · Variant (B): Platform-optimized formatting (542)
Lift: +40.1% · p-value: 0.015 (significant) · Winner: B · 95% CI: 16.7% to 63.5% · n: 90
Correction Acknowledgment Placement
Pinned correction replies boost credibility perception. Adopted as standard c...
Control (A): Correction at thread end (0.76) · Variant (B): Correction as pinned reply (0.88)
Lift: +15.8% · p-value: 0.027 (significant) · Winner: B · 95% CI: 3.2% to 28.4% · n: 42
Weekend vs Weekday Publication
Weekend posts significantly underperform. Pipeline now avoids Saturday/Sunday...
Control (A): Weekday only (5800) · Variant (B): Include weekend posts (4100)
Lift: -29.3% · p-value: 0.004 (highly sig.) · Winner: A · 95% CI: -42.1% to -16.5% · n: 96
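
The lift, significance label, and winner values reported above can be reproduced from the control value, the variant value, and the p-value. The following is a minimal sketch, not the dashboard's actual code. The function names are hypothetical, the 0.01 cutoff for "highly sig." is inferred from the labels shown (p = 0.009 is "highly sig.", p = 0.011 is "significant"), and the winner rule assumes a higher metric value is better.

```python
def lift_percent(control: float, variant: float) -> float:
    """Relative change of the variant over the control, in percent."""
    if control == 0:
        raise ValueError("control must be non-zero to compute a relative lift")
    return (variant - control) / control * 100.0


def significance_label(p_value: float, alpha: float = 0.05,
                       strict_alpha: float = 0.01) -> str:
    """Map a p-value to the label wording used on this page.

    strict_alpha = 0.01 is an assumption inferred from the displayed labels.
    """
    if p_value < strict_alpha:
        return "highly sig."
    if p_value < alpha:
        return "significant"
    return "not sig."


def winner(control: float, variant: float, p_value: float,
           alpha: float = 0.05) -> str:
    """Return "A", "B", or "None"; assumes a higher metric value is better."""
    if p_value >= alpha:
        return "None"
    return "B" if variant > control else "A"


# Example with the Question vs Statement Hooks values (312 vs 847, p = 0.003).
lift = lift_percent(312, 847)
print(f"{lift:+.1f}% {significance_label(0.003)} winner={winner(312, 847, 0.003)}")
```

Running the same functions on the Weekend vs Weekday values (5800 vs 4100, p = 0.004) gives a negative lift and a winner of A, which matches the entry above.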

Active Experiments

RUNNING
Expert Quote Inclusion
A: No expert quotes  vs  B: 1-2 expert quotes per thread
Measuring: engagement · Started Feb 4, 2026
23/60 samples
Cross-Domain Linking
A: Single-topic thread  vs  B: Cross-reference related topics
Measuring: profile_visits · Started Feb 7, 2026
14/50 samples
Follow-Up Timing
A: No follow-up thread  vs  B: 24h follow-up with new data
Measuring: retention · Started Feb 9, 2026
8/40 samples
Newsletter Preview as Hook
A: Standard thread hook  vs  B: Thread hook teasing newsletter deep-dive
Measuring: newsletter_signups · Started Feb 10, 2026
6/50 samples
Bluesky vs X Simultaneous Posting
A: X-first, Bluesky 2h delay  vs  B: Simultaneous cross-post
Measuring: combined_engagement · Started Feb 11, 2026
3/45 samples
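
The sample counts above can be read as progress toward each experiment's target sample size. A short sketch of that arithmetic follows; the function name and the dictionary layout are hypothetical, and only the counts are taken from the list above.

```python
def sample_progress(collected: int, target: int) -> float:
    """Percentage of the target sample size collected so far, capped at 100."""
    if target <= 0:
        raise ValueError("target must be positive")
    return min(collected / target, 1.0) * 100.0


# (collected, target) pairs copied from the running experiments listed above.
active = {
    "Expert Quote Inclusion": (23, 60),
    "Cross-Domain Linking": (14, 50),
    "Follow-Up Timing": (8, 40),
    "Newsletter Preview as Hook": (6, 50),
    "Bluesky vs X Simultaneous Posting": (3, 45),
}

for name, (collected, target) in active.items():
    pct = sample_progress(collected, target)
    print(f"{name}: {collected}/{target} samples ({pct:.0f}%)")
```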

Archived (Non-Significant)

p ≥ 0.05

Experiments that completed but did not reach statistical significance. These null results are still informative: they show which changes produced no detectable effect at these sample sizes.

Hashtag Count: 2 vs 5
A: 2 hashtags per thread (4850) · B: 5 hashtags per thread (4920)
Lift: +1.4% · p-value: 0.681 · 95% CI: -8.2% to 11.0% · n: 74
Conclusion: Additional hashtags provide no measurable impression gain. Archived; using 2-3 hashtags for readability.
Formal vs Conversational Tone
A: Academic formal tone (398) · B: Conversational but precise tone (421)
Lift: +5.8% · p-value: 0.198 · 95% CI: -4.1% to 15.7% · n: 80
Conclusion: Tone difference does not significantly affect engagement. Maintaining current balanced tone.
Thread Numbering Style
A: 1/7 style numbering (0.62) · B: No numbering (0.59)
Lift: -4.8% · p-value: 0.334 · 95% CI: -14.2% to 4.6% · n: 66
Conclusion: Thread numbering does not significantly affect read-through rate. Keeping numbering for clarity.
Link Placement: Mid-Thread vs End
A: Source links in tweet 3-4 (34.2) · B: All source links in final tweet (31.8)
Lift: -7.0% · p-value: 0.245 · 95% CI: -18.9% to 4.9% · n: 56
Conclusion: Link placement position has no significant effect on click-through. Using inline links per citation style experiment.
Alt Text Detail Level for Charts
A: Brief alt text (key takeaway only) (502) · B: Detailed alt text (full data description) (518)
Lift: +3.2% · p-value: 0.478 · 95% CI: -7.6% to 14.0% · n: 44
Conclusion: Alt text detail level does not significantly affect general engagement. Using detailed alt text for accessibility.
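
Each archived 95% interval above spans zero, which is what one would generally expect when a two-sided test gives p ≥ 0.05. The page does not say how the intervals were computed, so the sketch below only checks that consistency. The function name is hypothetical and the bounds are copied from the entries above.

```python
def interval_spans_zero(lower_pct: float, upper_pct: float) -> bool:
    """True when a lift interval (in percent) includes zero, i.e. no clear effect."""
    return lower_pct <= 0.0 <= upper_pct


# 95% CI bounds on lift, copied from the archived experiments above.
archived_intervals = {
    "Hashtag Count: 2 vs 5": (-8.2, 11.0),
    "Formal vs Conversational Tone": (-4.1, 15.7),
    "Thread Numbering Style": (-14.2, 4.6),
    "Link Placement: Mid-Thread vs End": (-18.9, 4.9),
    "Alt Text Detail Level for Charts": (-7.6, 14.0),
}

for name, (lower, upper) in archived_intervals.items():
    status = "spans zero" if interval_spans_zero(lower, upper) else "excludes zero"
    print(f"{name}: [{lower}%, {upper}%] {status}")
```

By contrast, an interval such as 112.3% to 230.7% (Question vs Statement Hooks) excludes zero, in line with its p-value of 0.003.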
Phase 22 Predictive Intelligence · 12/25 experiments reached statistical significance (p < 0.05) · 5 archived (non-significant)