
⬛️ a year of collecting ai reports, so you don't have to... 📚

no time to read? there’s an audio 30-min recap of everything you need to know

i just spent the last year consuming what i think is every high-signal report on genai, agents, autonomous business adoption, innovation, eval, tooling...

from obscure x threads to enterprise strategy decks to the weirdest ai conference whitepapers, to things shared in our invite-only community… 

here's some of the intel and standout insights that keep surfacing:


want all the reports?

here’s a link to the full folder with all the reports and a few papers (will probably be adding more on a regular basis)

now for some of the insights that stood out to me:

→ for most organizations, the top benefit achieved through genai is something other than efficiency, productivity, or cost reduction, such as increased innovation [deloitte-moving from potential to performance-stateofgenai-q3-report.pdf].

→ despite established carbon footprint and water consumption goals, most companies haven't quantified genai's environmental impact [deloitte-moving from potential to performance-stateofgenai-q3-report.pdf].

→ agent struggles on agentbench are potentially due to limitations in long-term reasoning, decision-making, and instruction-following [hai_2024_ai-index-report.pdf].

→ many organizations are increasing genai investments because they've already seen strong value [deloitte-moving from potential to performance-stateofgenai-q3-report.pdf].

→ in some cases, open retrieval models outperform proprietary embedding models from major labs [hallucination index report.pdf].

→ a significant portion of the most popular benchmarks might contain errors, potentially underestimating model capabilities [hai_2024_ai-index-report.pdf].

→ simply spending more time does not improve human detection of audio deepfakes [hai_2024_ai-index-report.pdf].

→ anthropic's "contextual embeddings" can substantially reduce retrieval failure rates in rag [state of ai report - 2024 online (1).pdf].

→ ai agent performance can vary significantly across different tasks within the same benchmark [hai_2024_ai-index-report.pdf].

→ "fully grokked" models, despite good performance on comparison tasks, can struggle with out-of-distribution generalization in composition tasks [hai_2024_ai-index-report.pdf].

→ human crowd performance in detecting audio deepfakes is comparable to top automated detectors [hai_2024_ai-index-report.pdf].

→ llms are poor self-correctors, highlighting a limitation in improving their own reasoning [hai_2024_ai-index-report.pdf].

→ industry continues to dominate frontier ai research, significantly outpacing academia in producing notable models [hai_2024_ai-index-report.pdf].

→ risks from ai, particularly around privacy, data security, and reliability, are becoming a top concern for businesses globally [wef_future_of_jobs_report_2025.pdf].

→ chatgpt ranked itself at no. 25 among history's most influential technologies, between railroads and 5g [hai_2024_ai-index-report.pdf].

→ a significant percentage of respondents believe generative ai is overhyped [gallup_culture_of_ai_benchmark_report.pdf].

→ "taker" use cases, where companies use off-the-shelf genai software, are a primary approach to utilizing the technology [deloitte-moving from potential to performance-stateofgenai-q3-report.pdf].

→ even with efforts to encourage it, the most severe negative behaviors from reward misspecification in models were rare in one study [hai_2024_ai-index-report.pdf].

→ hugging face built a 15t token dataset from commoncrawl, and llms trained on it outperform those trained on other open pre-training datasets [hai_2024_ai-index-report.pdf].

→ the more confident people are in ai doing a task, the less critical thinking they report engaging in while using genai [lee_2025_ai_critical_thinking_survey (1).pdf].

→ ai agents using the react framework break down complex questions into smaller steps and adjust their approach based on findings [mastering ai agents - galileo.pdf].

→ ai agents get better if you literally ask them to self-reflect. think about that for a second.

→ the real edge? deeply embedding ai into core processes, not just experiments on the side.

→ low-cost models like gemini 1.5 flash are winning at rag. price ≠ performance anymore.

→ open-source retrieval sometimes beats big lab stuff. wild.
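
to make the react point above a bit more concrete, here's a minimal sketch of the think → act → observe loop. it's not taken from any of the reports: the tools, the population numbers, and `scripted_policy` (a stand-in for a real llm call) are all hypothetical and only there so the loop runs end to end.

```python
from typing import Callable, Dict, List, Tuple

# one step of the trace: (thought, action, argument, observation)
Step = Tuple[str, str, str, str]


def lookup_population(city: str) -> str:
    # hypothetical toy tool with made-up round numbers, not real data
    data = {"paris": "2100000", "berlin": "3700000"}
    return data.get(city.lower(), "unknown")


def calculator(expression: str) -> str:
    # hypothetical toy tool: only handles "a - b" or "a + b"
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return str(a - b if op == "-" else a + b)


TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_population": lookup_population,
    "calculator": calculator,
}


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """stand-in for an llm: picks the next (thought, action, argument)
    based on what has been observed so far."""
    observations = [obs for _, _, _, obs in history]
    if len(history) == 0:
        return ("i need berlin's population first", "lookup_population", "berlin")
    if len(history) == 1:
        return ("now i need paris", "lookup_population", "paris")
    if len(history) == 2:
        expr = f"{observations[0]} - {observations[1]}"
        return ("subtract the two observations", "calculator", expr)
    return ("i have the answer", "finish", observations[-1])


def react_loop(question: str, policy, max_steps: int = 5) -> Tuple[str, List[Step]]:
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, history)  # think
        if action == "finish":
            return arg, history
        observation = TOOLS[action](arg)                   # act
        history.append((thought, action, arg, observation))  # observe
    return "no answer within step budget", history


answer, trace = react_loop(
    "how many more people live in berlin than paris?", scripted_policy
)
for thought, action, arg, observation in trace:
    print(f"thought: {thought} | action: {action}({arg}) | observation: {observation}")
print("answer:", answer)
```

in a real agent the policy would be a model call that reads the full trace, so each new observation can change what it tries next. the step budget is just a guard against looping forever.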

little bonus

this one isn’t included in the folder but it’s also pretty wild

stripe’s 2024 annual letter shows a mind-blowing shift in real numbers:

→ $1m arr in 10 months (vs 37 months for traditional saas)
→ $5m arr in 24 months (vs 3+ years for traditional saas)... and there's more.

then there are the outliers:

→ lovable hit $17m arr in 3 months
→ bolt reached $20m arr in 2 months
→ cursor crossed $100m+ arr in 3 years

these are no longer outliers. it seems this is the new normal. it’s a new cursor.

ai-native startups launch with product-market fit baked in:

→ infra is ready. they spin up on open-source models, cloud apis, and cheap inference: smaller backend team required
→ demand is built-in. buyers want ai, understand it, and have budget waiting
→ teams are small. 3 people shipping global software, hitting milestones that used to take 30

the traditional startup arc we're so used to (mvp, iterate, find pmf, scale) is seriously put into question here

you launch, you grow, you monetize. all at once.

getting similar echoes out of yc batches: build product, talk to customers, sell... but 20x faster

this changes how things work:

→ go-to-market looks more like tiktok growth hacking than enterprise sales
→ we’re entering an era where micro-saas can become mega-saas fast?
→ capital efficiency actually means something now, because infra is cheap and distribution is fast
→ vc models will adapt. rounds will happen earlier, faster, smaller, or much later, once something’s already winning
→ solo founders with the right agent stack and a good idea can build $10m+ businesses without ever hiring

i wanna call this the 'consumerization of enterprise'.

it’s definitely the compression of the startup lifecycle.

pretty wild times.

and hope you enjoyed this piece!

⬛️

david

👋 by the way, we run a community called genai ⚫️ circle. it’s where i learn everything i share. it’s invite-only. as a subscriber of this newsletter you can apply to join. check it out here: www.thegenaicircle.com
