Rami's Readings #102 - LLM Post-Training and Nvidia DGX Spark
The latest on AI, LLMs, Deep Learning, LLM Post-Training, Open Source Inference Servers, AI Investment, Nvidia DGX Spark (previously called Project Digits), and more.
Welcome to Rami’s Readings #102 - a weekly digest of interesting articles, papers, videos, and X threads from my various sources across the Internet. Expect a list of reads covering AI, technology, business, culture, fashion, travel, and more. Learn about what I do at ramisayar.com/about.
Apologies for the late-night delivery—I'm back stateside and experiencing the reverse of the jet lag from last weekend. 😭
👋🏼 Welcome New Subscribers
Hello! A hearty thank you for subscribing to Rami's Readings! There are quite a few new subscribers this week, thanks to a recommendation from The AI Ethics Brief. I am thrilled to have you on board! In this newsletter, I curate the best papers, tweets, and articles I have read during the week focusing on LLMs, AI, economics, business, and technology news. You can learn more about me on my website.
🤖 AI Reads
Deep Learning is Not So Mysterious or Different
Notes: From NYU professor Andrew Gordon Wilson. This paper shows that deep learning isn’t magic or immune to faults found in other models.
LLM Post-Training: A Deep Dive into Reasoning Large Language Models
Notes: LLM post-training is attracting a lot of attention, especially after R1’s release. I find survey papers a fantastic way to build a mental map of a field, and they are underappreciated. This one does a good job and has an accompanying GitHub repository.
The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models
Notes: Another great paper showing that the first few tokens have an outsize impact.
How Well Do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach
Notes: From Columbia Business School.
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Notes: Supposedly outperforms GRPO, the algorithm used by DeepSeek.
Honouring Abhishek Gupta: Memorial on April 10, 2025
Notes: I miss you buddy. 😭 Abhishek was the founder of the
tensorlakeai / indexify: A Realtime Serving Engine for GenAI Applications
Notes: Neat open source project!
predibase / LoRAX: Multi-LoRA Inference Server That Scales to 1000s of Fine-tuned LLMs
Notes: I'm not sure who this open-source project is aimed at, but it's neat!
What I Learned Building a Free Semantic Search Tool for GitHub and Why I Failed
Notes: Excellent long read showing how hard it is to build semantic search engines. Includes hot tips for pgvector and embedding models.
State-of-the-Art Text Embedding via the Gemini
Notes: Speaking of embedding models, Gemini released a new SOTA embedding model.
💼 Business Reads
Why Organizations Must Continue AI Investment Despite Budget Pressures
Notes: From my colleague at MIT, James Villarrubia. He shared his AI predictions for 2025 in #94. My favorite quote from the article:
Imagine telling a 2032 graduate, "We don't use AI here" – it would be like telling a graduate today, "We don't use computers for spreadsheets."
Nvidia Looks to Expand AI Reign With New Chips, Personal Supercomputers
Notes: I was excited for Project Digits, but the hardware specs and pricing disenchanted me. M4-based machines provide faster inference speeds, thanks to higher memory bandwidth, and double as useful consumer computers. Nvidia’s software stack might eke out some additional benefits, and it obviously supports CUDA… but it’s too expensive for what you’re getting. Asus’ version has a better price point at $2,999.
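The memory bandwidth point in the note above can be made concrete with a common rule of thumb: when generating one token at a time, a model is usually memory-bound, so decode speed is capped at roughly memory bandwidth divided by the size of the weights. Here is a minimal sketch of that estimate. The function name and all the numbers are my own illustrative assumptions, not specs from the article or from any vendor.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billion: float,
                             bytes_per_param: float) -> float:
    """Rough upper bound on single-stream decode speed for a memory-bound model.

    Assumes every weight is read from memory once per generated token and
    ignores KV-cache traffic, compute limits, and software overhead.
    """
    model_gb = params_billion * bytes_per_param  # total weight size in GB
    return bandwidth_gb_s / model_gb


# Hypothetical figures for illustration only (not real device specs):
# a 70B-parameter model quantized to about 4 bits (0.5 bytes per parameter).
slow = decode_tokens_per_second(bandwidth_gb_s=250.0, params_billion=70.0, bytes_per_param=0.5)
fast = decode_tokens_per_second(bandwidth_gb_s=500.0, params_billion=70.0, bytes_per_param=0.5)

print(f"lower-bandwidth machine:  ~{slow:.1f} tokens/s at best")
print(f"higher-bandwidth machine: ~{fast:.1f} tokens/s at best")
```

Under these assumptions, doubling memory bandwidth doubles the ceiling on tokens per second, which is why bandwidth matters more than raw compute for local inference on large models.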
Dubai Luxury Frenzy Has Buyers Snapping Up $240,000 Watches
Notes: For my friends in NYC, there is always a luxury market open somewhere in the world.
🔀 Other Reads
HTTP/3 is Everywhere but Nowhere
Notes: Great analysis explaining why HTTP/3 support is missing from many programming languages and frameworks, despite its advantages for handling mobile traffic. HTTP/3 also brings a promising successor to WebSockets: WebTransport. However, without Safari support, it’s not useful for production scenarios yet.
Apple’s Long-Lost Hidden Recovery Partition From 1994 Has Been Found
Notes: Software history is real history.
Signing off from Redmond.
Get on the Referral Leaderboard!
If you enjoy Rami’s Readings, it would be incredible, amazing, star-worthy if you invited friends to subscribe and read with us. If you refer friends, you will receive fantastic benefits reflecting my gratitude for your contributions.
How to participate:
1. Share Rami’s Readings. You'll get credit for new subscribers when you use the referral link below or the “Share” button on any post. Send the link in a text, email, or share it on social media with friends.
2. Earn benefits. When more friends use your referral link to subscribe, you’ll receive the following.
Get a 1 Hour Virtual Coffee Chat for 5 referrals - We can chat about anything you fancy, including LLMs, AI, Tech, Business, Economics, etc.
Get a 4 Hour In-Person Coffee Chat for 20 referrals - I will ✈️ fly out to meet you anywhere in the continental United States for an in-person chat at your preferred coffee shop.
Get 8 Hours of LLM and AI Mentorship for 50 referrals - I will mentor you and help you grow into an LLM and AI expert. I would be happy to ✈️ fly out and meet with you for 8 hours, or, if you prefer virtual sessions, I am glad to do that as well.
To learn more about how Substack operates referrals, check out Substack’s FAQ.