Rami's Readings #94 - 🤖 5 AI Predictions for 2025 ✨
5 AI predictions for 2025, the latest on AI, LLMs, DeepSeek, New Tools, Papers, VC, Hardware, and more.
Welcome to Rami's Readings #94 - a weekly digest of interesting articles, papers, videos, and X threads from my various sources across the Internet. Expect a list of reads covering AI, technology, business, culture, fashion, travel, and more. Learn about what I do at ramisayar.com/about.
Happy New Year! I hope y'all had a fantastic start to the year! My year started with crazy travel, even by my standards. For the curious, I traveled to Boston, NYC, Montréal, Milan, Paris, Milan, NYC, Boston, Seattle, Boston, and now I am back home in Redmond (Seattle).
AI started off with a ‼️ banger ‼️ thanks to DeepSeek. Happy Lunar New Year! Stay tuned as we dive into DeepSeek's R1, and much more from the past few weeks, later in the newsletter.
First, grab a mug. I'd like to share 5 AI predictions I collected from builders who are shipping AI systems at scale in 2025.
🤖 5 AI Predictions for 2025 ✨
Priya Sundararaman
Priya is a Digital & AI leader with 20+ years of experience driving enterprise-scale innovation, including AI solutions for Amazon, Walmart, and State Farm, and holds 13 AI patents.
In the last hundred years, humanity mastered the art of communicating with machines through code. Today, we're witnessing a remarkable reversal: machines are speaking the language of humans. This shift, driven by the rapid advancement of Generative AI, is transforming our relationship with AI systems.
The challenge now lies not in instructing AI, but in forging a symbiotic partnership with it. We must learn to collaborate with these intelligent systems, leveraging their strengths while preserving our unique human qualities and values.
I anticipate that 2025 will be a pivotal year in this journey. We will likely establish a comprehensive blueprint for what we call "human-AI interaction" (HAII). This blueprint will address ethical considerations, define boundaries, and create guidelines that ensure humans not only partner with AI but also take the driver's seat. This approach will be crucial not just for technological advancement, but for the sustainable progress of humanity.
James Villarrubia
James is a White House Presidential Innovation Fellow at NASA, renowned public speaker and CTO.
2025 will be the year of Agents taking center stage as they empower non-technical users to build systems through meta-logic business layers. These new businesses will create opportunities but also new security challenges, as CISOs must account for additional middlemen in their backend processes. On the consumer side, smaller, on-device models will gain traction, reducing dependency on network traffic. Meanwhile, I'm hoping that API specs will start to be rewritten to be, by default, consumed by both developers and AI agents.
Nidhi Verma
Nidhi Verma, Senior Director of Engineering at JPMorganChase, specializes in system design, architecture, and product development.
Edge AI and Real Time AI Processing - Edge AI will revolutionize real-time decision-making across industries, driving advancements in trading and market analysis within the financial sector, grid simulation and monitoring in the energy industry, and smart cities and healthcare. Powered by advanced AI chips and 5G, it enables faster, localized data processing, reducing reliance on cloud infrastructure.
Andrei Oprisan
Andrei Oprisan is a technology leader at agent.ai with 15+ years of experience delivering scalable eCommerce, marketing, and AI-driven products, a Columbia alumnus, published author, speaker, and mentor in machine learning and tech innovation.
In 2025, businesses will deploy AI employees: AI-powered synthetic professionals with specialized domain expertise, capable of devising strategic roadmaps, engaging in vendor negotiations, and even mentoring junior staff. These autonomous agents will transcend routine task automation to shape boardroom decisions and influence corporate culture. As AI shifts from a reactive tool to a proactive collaborator, organizations may see a tension between machine-driven insights and established executive intuition. Companies that embrace this friction and incorporate AI's emergent creativity will outpace competitors. But the real differentiator will be rigorous oversight - building governance frameworks that balance high-impact AI contributions with accountability for potential ethical and legal missteps.
Yours Truly, Rami Sayar
Rami Sayar is an accomplished engineering leader delivering generative AI-powered experiences at Microsoft AI from zero to one to worldwide scale.
Throughout 2024, I cataloged through my newsletter, Rami's Readings, a series of increasingly powerful open source LLMs optimized to run on devices with consumer-grade hardware. Thanks to engineering prowess, my own desktop equipped with an Nvidia RTX 4090 is now overkill for running most cutting-edge models. At CES 2025, Edge AI was all the buzz, with nearly every OEM marketing local AI solutions designed to run on their hardware. Let's not forget that Apple's M4 chips are equally capable of running state-of-the-art LLMs with incredible speed and energy efficiency thanks to MLX. In 2025, Edge AI isn't just a trend; it will go mainstream.
🤖 AI Reads
This week's newsletter is exceptionally long, so I clustered the links and limited comments to the most important reads. My comments and notes are in parentheses and italicized.
DeepSeek V3
Notes: DeepSeek V3 paper and model on HuggingFace. V3 is a 671B-parameter MoE that outperforms GPT-4o and Claude-Sonnet-3.5. It is incredibly fast, cost very little to train relatively speaking, is fairly open-source, and demonstrates what I keep repeating in this newsletter: Chinese AI talent is underappreciated. I have highlighted DeepSeek's first LLMs over several of my newsletters, and DeepSeek-Coder-v2 was my go-to for a while. Now, this model played a huge role in the next release: R1.
DeepSeek R1 🔥🔥🔥
Notes: DeepSeek R1 paper and models on HuggingFace. This release from the DeepSeek team is huge and rightly has everyone in industry and government ablaze. (I read so many jokes - my favorite was "DeepSeek released open AI"). 🤣 This release is actually multiple models: R1-Zero, R1, and then a series of distilled models. The reasoning performance matched OpenAI's o1 model for a fraction of the training cost. Why do you need billions in GPUs and funding anymore?
How did they get there? DeepSeek applied reinforcement learning (RL) to DeepSeek V3 to get R1-Zero. R1-Zero showed that you can develop reasoning performance through RL alone. No need for human feedback or supervised fine-tuning (SFT) at first… shocking! But R1-Zero suffered issues, so DeepSeek did it again, this time starting with high-quality chain-of-thought (CoT) data to get reasoning capabilities faster, then applying SFT and more RL to get R1. R1 is the game changer, but it is again very large (671B). So… they distilled it: DeepSeek-R1-Distill-Llama-70B outperforms OpenAI's o1-mini. AND it is all MIT-licensed.
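How can RL work without human feedback? One answer is a reward that a program can check. The R1 paper describes rule-based rewards for answer accuracy and output format. Below is a minimal sketch of that idea, not DeepSeek's implementation: the function name `rule_based_reward`, the exact tag pattern, and the 0.5 / 1.0 weights are all illustrative assumptions.

```python
import re

# Assumed output format: reasoning inside <think> tags, then the final
# answer inside <answer> tags. The exact pattern is hypothetical.
PATTERN = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL
)


def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Score a completion with simple, automatically checkable rules."""
    match = PATTERN.match(completion.strip())
    if match is None:
        # Output does not follow the expected format: no reward at all.
        return 0.0
    format_reward = 0.5  # illustrative weight for following the format
    answer = match.group(2).strip()
    # Exact string match stands in for a real answer checker.
    accuracy_reward = 1.0 if answer == reference_answer.strip() else 0.0
    return format_reward + accuracy_reward


if __name__ == "__main__":
    good = "<think>2 + 2 is 4</think><answer>4</answer>"
    print(rule_based_reward(good, "4"))
```

A reward like this only works for tasks with verifiable answers (math, code with tests), which is one reason reasoning benchmarks are a natural fit for this style of training.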
There is true engineering prowess on display, showcasing that mathematics, computer engineering fundamentals, and computer science still reign supreme. Case in point: llama.cpp is driving AI to the edge without requiring data centers of GPUs. Just a reminder, DeepSeek's engineers mostly come from a quant hedge fund.
China's cheap, open AI model DeepSeek thrills scientists. (Nature)
China's DeepSeek Shows Why Trade Wars Will Be Hard to Win (From Tyler Cowen)
So… I have DeepSeek-R1-Distill-Qwen-32B running on my 4090, giving me quasi-o1 performance as I write!
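For the curious, here is a rough back-of-the-envelope sketch of why a 32B model can fit on a single RTX 4090 with 24 GB of VRAM. It counts weights only and assumes the model is quantized. The bit widths below are illustrative, not a statement of which build is running, and a real setup also needs room for the KV cache and activations.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


PARAMS_32B = 32e9    # nominal parameter count implied by the "32B" name
VRAM_4090_GIB = 24   # RTX 4090 memory

for bits in (16, 8, 4):
    need = weight_memory_gib(PARAMS_32B, bits)
    verdict = "fits" if need < VRAM_4090_GIB else "does not fit"
    print(f"{bits:>2}-bit weights: {need:5.1f} GiB -> {verdict} in {VRAM_4090_GIB} GiB")
```

Under these assumptions, only the 4-bit variant (roughly 15 GiB of weights) leaves headroom on a 24 GB card. Quantization is what makes the edge story work.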
If you need a video:
New Models:
Modern BERT. (Replacement for old BERT.)
SmallThinker-3B-preview. (Fine-tuned Qwen 3B model.)
New Infrastructure Startups / Products / Platforms:
Helicone (YC-backed.)
Unsloth.ai (Open source fine-tuning platform, picking up steam.)
opik (Open source LLM evaluation framework.)
Kiln AI (Another open source fine-tuning platform.)
New Tools / Libraries / Apps:
OpenAI Introduced Operator (An important release, but not as ground-breaking as anticipated.)
MarkItDown (Microsoft-backed Document to Markdown library for RAG.)
Project Mariner (Google's agent in the browser similar to OpenAI's Operator.)
Aider (AI pair programming in terminal. I will give it a try with my local LLM setup.)
New Papers / Articles to Read:
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
New LLM optimization technique slashes memory costs up to 75%. (From Tokyo, Sakana is doing something interesting.)
Fast LLM Inference From Scratch. (Building an inference engine using C++ and CUDA from scratch, no libraries.)
Monolith: Real Time Recommendation System With Collisionless Embedding Table. (Oldie, relevant for those interested in TikTok.)
Byte Latent Transformer: Patches Scale Better Than Tokens. (Meta showing that byte transformers are feasible.)
New Hardware:
Nvidia Project Digits: The World's Smallest AI Supercomputer. (I signed up and will try to get one the minute it comes out.)
💼 Business Reads
Ghosn Says Honda Deal Talks Indicate Nissan in 'Panic Mode'
Notes: Not an incorrect take, but obviously biased.
AI Power Needs Threaten Billions in Damages for US Households
Notes: Is this a buy signal for APC (Schneider Electric)?
AI Startup Funding Hit a Record $97 Billion in 2024
Notes: There are billions in dry powder left.
OpenAI, Oracle and SoftBank's Stargate. Mukesh Ambani Plans World's Biggest Data Center. Meta Will Spend Up to $65 Billion.
Notes: Sigh…
Rolex, Patek Used Watch Prices Fell to Three-Year Low in 2024
Notes: For my friends in NYC.
Zuckerberg Wears $900,000 Watch
Notes: Meanwhile…
Candy Crush, Tinder, MyFitnessPal: See the Thousands of Apps Hijacked to Spy on Your Location
Notes: Incredibly concerning given the recent telecom hacks.
There is so much more, but I will share it in next week's newsletter. Signing off from Redmond, WA. Happy Sunday and Lunar New Year!