Mixtures of Experts Unlock Parameter Scaling for Deep RL
about 7 hours ago
I provide various consulting and advisory services. If you’d like to explore how we can work together, reach out to me through any of my socials over here or reply to this email.
7 Days of agent framework anatomy from first-principles: Day 1
about 7 hours ago
I love the no-framework, code sharing, and insights. I have to re-read this a bunch of times to fully integrate w/ the insights, but there were lots of great things to think about. I’ll be reading through the series with a lot of excitement. This is a series of articles on interacting with large lan
What it’s Like to Work in AI and Advice from 10 AI Professionals
about 7 hours ago
An excellent community effort by Logan Thorneloe to poll the AI community on Substack on what they do (of course I also shared my thoughts). I really loved Sergei Polevikov, ABD, MBA, MS, MA 🇮🇱🇺🇦’s comment in particular: What advice would you give to someone wanting to work in ML/what other impor
The Artificial Investor—Issue 37: Is this the end of the current AI wave?
about 7 hours ago
Interesting analysis of the whole scaling debates, from a market/investor perspective. Wonderful work by Aristotelis Xenofontos.
Distinguishing Ignorance from Error in LLM Hallucinations
about 7 hours ago
Large language models (LLMs) are susceptible to hallucinations: outputs that are ungrounded, factually incorrect, or inconsistent with prior generations. We focus on closed-book Question Answering (CBQA), where previous work has not fully addressed the distinction between two possible kinds of halluci
AI Data Centers, Part 2: Energy
about 7 hours ago
Meant to share this earlier, but here is an excellent overview of the market by Eric Flaningam. Among the many bottlenecks for AI data centers, energy might be the most important and the most difficult to address. If estimates of data center energy consumption turn out to be true (or even in the vic
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
about 7 hours ago
Large language models (LLMs) are expensive to deploy. Parameter sharing offers a possible path towards reducing their size and cost, but its effectiveness in modern LLMs remains fairly limited. In this work, we revisit “layer tying” as a form of parameter sharing in Transformers, and introduce novel m
Google drops new Gemini model and it goes straight to the top of the LLM leaderboard
about 10 hours ago
Google's latest AI model, Gemini-Exp-1114, has topped the LMArena Chatbot Arena leaderboard, surpassing OpenAI's GPT-4o and o1-preview reasoning model. The leaderboard, previously known as the LMSys arena, lets AI labs pit their models against each other in a blind head-to-head competition, with users voting
Elon Musk's xAI raising up to $6 billion to purchase 100,000 Nvidia chips for Memphis data center
about 10 hours ago
Elon Musk's artificial intelligence company, xAI, is reportedly raising up to $6 billion at a $50 billion valuation to acquire 100,000 Nvidia chips for a new supercomputer in Memphis. The funding, expected to close next week, is a combination of $5 billion from Middle Eastern sovereign funds and $1
Mistral unleashes Pixtral Large and upgrades Le Chat into full-on ChatGPT competitor
about 10 hours ago
French startup Mistral has launched Pixtral Large, a 124-billion-parameter model, and upgraded its chatbot, Le Chat, to compete directly with OpenAI's ChatGPT. Pixtral Large, an open-source multimodal AI, excels in text and visual data processing, and can handle up to 30 high-resolution images per i
Extism is bringing Wasm to the unwashed masses
about 15 hours ago
Remember the days before jQuery where you had to sniff out browser support to do basic things like make an XMLHttpRequest or manipulate a DOM node? If you do, then that pain in your lower back is most likely due to a weakness in your hamstrings, and has nothing to do with your back – but also, you m
Sanity isn’t just another CMS
about 15 hours ago
While most CMS platforms have been building increasingly “drag-and-droppy” tools to help marketers spin up cookie cutter landing pages, Sanity is creating a true developer platform to help companies implement strategic content operations at scale. And their brand new Winter Release introduced 4 majo
Mistral has entered the chat
about 15 hours ago
The Mistral team introduced new powers to le Chat, including web citation, a canvas for ideation, and many more.
OpenAI's imminent launch of Operator
about 15 hours ago
OpenAI is planning to move beyond simply answering questions with the imminent launch of “Operator”.
Introducing Prompt Canvas: a Novel UX for Developing Prompts
about 15 hours ago
LangChain introduced Prompt Canvas, a way to collaborate with an AI agent to build and optimize your prompts.
IBM and Llama: Working to enable AI builder creativity globally
about 15 hours ago
IBM and Meta are implementing the combined power of IBM’s watsonx AI platform and Llama to help businesses reach their AI goals.
Amazon Puts $110M Into Academic Generative AI Research
about 15 hours ago
Amazon stated that they will invest $110 million in university-led research into generative AI to help drive breakthroughs in the field.
Nvidia launches H200 NVL high-performance GPU to power AI supercomputing
about 15 hours ago
Nvidia announced the availability of its newest data center-grade GPU to power AI and high-performance computing.
Google Kubernetes Engine supports 65,000-node clusters
about 15 hours ago
In anticipation of even larger language models, Google introduced support for 65,000-node clusters in Google Kubernetes Engine.