Rami's Readings #78 - LLMs, Game Theory & Persuasion
The latest on AI, LLMs, Persuasion, Game Theory, Smaller Models & Synthetic Data, Mamba, Text2SQL is Not Enough, tinybox, AI Ethics, and more.
Welcome to Rami’s Readings #78 - a weekly digest of interesting articles, papers, videos, and X threads from my various sources across the Internet. Expect a list of reads covering AI, technology, business, culture, fashion, travel, and more. Learn about what I do at ramisayar.com/about.
Thank you to everyone who asked about my hand surgery last week. I’m feeling better this week, though I still have about six weeks left for recovery. I’m happy to say that I was able to type out this entire newsletter!
🤖 AI Reads
Conversational AI Powered by Large Language Models Amplifies False Memories in Witness Interviews
Notes: From the MIT Media Lab. The body of research on LLMs & Game Theory and on LLMs & Persuasion is expanding fast, and it generally worries me. 😨 (See the next paper too.)
Persuasion Games using Large Language Models
Notes: In #39, I shared a paper on how LLMs (GPT-4 in that case) can play repeated games (Game Theory). This paper from IIT & Tata Consulting is an interesting experiment in structuring multiple persuasive agents.
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Notes: From Google DeepMind, showing that smaller, weaker models can generate high-quality synthetic training data. The evidence continues to suggest that companies panicked in 2023-2024 and over-invested in massive GPU farms (see previous newsletters about Edge AI).
Our findings reveal that models finetuned on WC-generated data consistently outperform those trained on SE-generated data across multiple benchmarks and multiple choices of WC and SE models. These results challenge the prevailing practice of relying on SE models for synthetic data generation, suggesting that WC may be the compute-optimal approach for training advanced LM reasoners.
* WC = Weaker but cheaper models. SE = Stronger but expensive models.
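A rough sketch of what "compute-optimal" means here, assuming sampling cost scales linearly with parameter count (about 2 x parameters FLOPs per generated token) and that both models produce answers of similar length. Under those assumptions, a fixed budget buys proportionally more samples from the cheaper model. The function name and the 9B/27B figures are illustrative, not taken from the quote above.

```python
import math


def compute_matched_samples(se_samples: int, se_params_b: float, wc_params_b: float) -> int:
    """Samples per question the WC model can draw for the same FLOPs budget.

    Assumption: cost per generated token is proportional to parameter count,
    so the budget ratio reduces to se_params_b / wc_params_b.
    """
    if se_samples < 0 or se_params_b <= 0 or wc_params_b <= 0:
        raise ValueError("sample count must be >= 0 and parameter counts > 0")
    return math.floor(se_samples * se_params_b / wc_params_b)


# Illustrative sizes (in billions of parameters): a 27B "SE" model vs a 9B "WC" model.
wc_samples = compute_matched_samples(se_samples=10, se_params_b=27, wc_params_b=9)
print(wc_samples)
```

The extra samples are the whole point: the weaker model gets more attempts per question for the same spend, which is how it can end up producing the more useful training set.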
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Notes: Two weeks ago I shared FalconMamba. This is another paper built on the Mamba architecture, this time from Peking University.
Text2SQL is Not Enough: Unifying AI and Databases with TAG
Notes: From Berkeley and Stanford. This paper is very interesting and shows LLMs have a long way to go with tabular data.
We propose Table-Augmented Generation (TAG), a unified and general-purpose paradigm for answering natural language questions over databases. The TAG model represents a wide range of interactions between the LM and database that have been previously unexplored and creates exciting research opportunities for leveraging the world knowledge and reasoning capabilities of LMs over data. We systematically develop benchmarks to study the TAG problem and find that standard methods answer no more than 20% of queries correctly, confirming the need for further research in this area.
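A minimal sketch of the TAG idea as I understand it: the LM writes a query, the database executes it, and the LM then reasons over the returned rows to produce the answer. Everything here is an assumption for illustration. The two "LM" functions are hard-coded stand-ins, not real model calls, and the table and data are made up.

```python
import sqlite3


def synthesize_query(request: str) -> str:
    # Hypothetical stand-in for an LM call that turns the request into SQL.
    return "SELECT title, review FROM movies WHERE year >= 2000"


def generate_answer(request: str, rows: list) -> str:
    # Hypothetical stand-in for an LM call that reasons over the returned rows.
    # A keyword check fakes the sentiment judgment that plain SQL cannot express.
    positive = [title for title, review in rows if "loved" in review.lower()]
    return ", ".join(sorted(positive))


def tag_answer(conn: sqlite3.Connection, request: str) -> str:
    sql = synthesize_query(request)          # step 1: query synthesis
    rows = conn.execute(sql).fetchall()      # step 2: query execution
    return generate_answer(request, rows)    # step 3: answer generation


# Toy in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER, review TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("Up", 2009, "Loved every minute"),
        ("Arrival", 2016, "I loved it"),
        ("Cats", 2019, "Hard to sit through"),
        ("Heat", 1995, "Loved the heist"),
    ],
)

print(tag_answer(conn, "Which movies since 2000 have positive reviews?"))
```

The question needs both an exact filter (the year) and a judgment call (is the review positive?), which is the kind of request that Text2SQL alone struggles with.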
Salesforce Released Multiple Large Action Models: xLAM-8x22b-r
Notes: If you are building GenAI or AI Agent applications and looking for Salesforce-specific function calling, investigate these models.
Tinygrad’s tinybox - Portable GPU Cluster (AMD & Nvidia)
Notes: I built my own mini-GPU cluster with off-the-shelf hardware. BUT if anyone is looking to get me a birthday present, my birthday is coming up! 😄
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Notes: Open-source video generation model from Tsinghua University and Zhipu AI. Videos look as good as Pika, Luma and others. Anyone have more info about Zhipu AI?
Qwen2-VL: To See the World More Clearly
Notes: Video understanding model from the Alibaba group.
💼 Business Reads
The World’s Call Center Capital Is Gripped by AI Fever — and Fear
Notes: Please read. Afterwards, reflect on the persuasion papers above. AI Ethics continues to be underrated, and we need more of it. Why I recommend:
A Humble Proposed Chicago School of Economics Canon
Notes: From X.
Strava and Letterboxd Surge as Users Crave Social-Media Refuge
Notes: I love Strava, but don’t have many friends on it. Reply to connect!
CrowdStrike VP Set to Testify to Congress on IT Outage
Notes: AFAIK most businesses are back to normal. Is anyone still impacted?
Signing off from 203° F Coffee Co.