
WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 5–11 May

15 min read · May 12, 2025
Photo by Roman Kraft on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. All the Weekly News stories are also collected here:

Weekly AI and ML news - each week the best of the field


Artificial intelligence is transforming our world, shaping how we live and work. Understanding how it works and its implications has never been more crucial. If you’re looking for simple, clear explanations of complex AI topics, you’re in the right place. Hit Follow or subscribe for free to stay updated with my latest stories and insights.

Research

  • Chain of Draft for Efficient Reasoning. Chain of Draft is a concise reasoning strategy that significantly reduces token usage while matching or exceeding Chain-of-Thought accuracy across complex tasks.
  • RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation. This paper reveals that adding too many tools to AI agents can backfire, causing prompt overload and reduced accuracy. To fix this, RAG-MCP uses a retrieval-based method that selects only the most relevant tool schemas from a large external index, keeping prompts concise and effective. It cuts prompt size by over half and triples tool-selection accuracy, enabling scalable, efficient multi-tool agents without retraining (a toy sketch of the retrieval step follows this list).
  • Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models. This paper introduces LS-Mixture SFT, a method that fine-tunes LLMs on both long and trimmed chain-of-thought reasoning to reduce verbosity without sacrificing accuracy. By training on a 50/50 mix of detailed and concise reasoning paths and prompting for balanced outputs, the s1-mix-32B model achieves up to 6.7 points higher accuracy with 47% shorter responses across tasks like MATH500 and AIME24 — proving efficient reasoning doesn’t require overthinking.
  • Absolute Zero: Reinforced Self-play Reasoning with Zero Data. Absolute Zero introduces a self-supervised learning approach where an LLM generates and solves its own reasoning tasks without human data, using only code execution for feedback. By evolving task difficulty and optimizing for learnability, a unified model trained with Task-Relative REINFORCE++ achieves state-of-the-art results in coding and math benchmarks, outperforming models trained on human-curated examples and demonstrating strong generalization and scalability.
  • Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions. “Rethinking Memory in AI” introduces a unified taxonomy of memory in LLM agents, dividing it into parametric, contextual-structured, and contextual-unstructured types, with six core operations: consolidation, indexing, updating, forgetting, retrieval, and compression. Analyzing 30,000+ papers, the framework guides when to store, graph, or edit memory, offering a precise toolkit for building more reliable, long-lived AI systems that adapt across sessions and domains.
  • HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking. HyperTree Planning (HTP) replaces linear chains of thought with hierarchical hypertrees to improve LLM planning accuracy by up to 3.6x. It decomposes complex queries into subtasks using a top-down approach, expands branches with rule libraries, and prunes candidates using model-based scoring. Without hand-crafted examples, HTP outperforms chain, tree, and agent methods on benchmarks like TravelPlanner and Blocksworld, pointing to hypertrees as a scalable future for LLM-driven planning.
  • L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning. A new method from Carnegie Mellon, Length Controlled Policy Optimization (LCPO), trains models to reason accurately within user-specified length limits, addressing inefficiencies from overly long or prematurely short outputs. Their 1.5B L1 model balances accuracy and compute use, outperforming previous methods by up to 20% and even rivaling GPT-4o at matched reasoning lengths — despite being 30× smaller. LCPO points to length control as a key advance for efficient, lightweight AI reasoning (a toy reward sketch follows this list).
  • Code Retrieval using LoRA. Researchers introduce a LoRA-based fine-tuning method for code search that reduces trainable parameters below 2% while improving retrieval accuracy by up to 9.1% for Code2Code tasks.
  • IDInit: A Universal and Stable Initialization Method for Neural Network Training. A new initialization technique, IDInit, ensures stable convergence in deep neural networks by maintaining identity transitions in both main and sub-stem layers.
  • The Leaderboard Illusion. Chatbot Arena’s benchmarking shows bias stemming from hidden private tests and unequal data access. Companies like Google and OpenAI have broad access, while open-source models get far less, leading to overfitting instead of real model progress.
  • Actor-Critics Can Achieve Optimal Sample Efficiency. A new actor-critic RL algorithm has achieved near-optimal sample efficiency using offline data and targeted exploration, addressing long-standing challenges in hybrid RL settings.
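
To make the RAG-MCP item above concrete, here is a minimal Python sketch of retrieval-based tool selection: tool schemas are embedded once, only the schemas most similar to the user query are retrieved, and the prompt is built from those instead of the full tool catalog. The `ToolSchema` fields, the toy hashing embedder, and `retrieve_tools` are illustrative assumptions, not the paper's implementation, which indexes real MCP tool schemas with a proper semantic encoder.

```python
# Toy sketch of RAG-MCP-style tool selection: embed tool schemas, retrieve only
# the top-k most relevant to the query, and prompt with just those schemas.
# All names here (ToolSchema, retrieve_tools) are illustrative, not from the paper.
from dataclasses import dataclass

import numpy as np


@dataclass
class ToolSchema:
    name: str
    description: str


def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words hashing embedding; a real system would use a sentence encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve_tools(query: str, tools: list[ToolSchema], k: int = 3) -> list[ToolSchema]:
    """Return the k tool schemas most similar to the query (cosine similarity)."""
    q = embed(query)
    ranked = sorted(tools, key=lambda t: float(q @ embed(t.description)), reverse=True)
    return ranked[:k]


tools = [
    ToolSchema("weather_lookup", "get the current weather forecast for a city"),
    ToolSchema("stock_price", "fetch the latest stock price for a ticker symbol"),
    ToolSchema("unit_convert", "convert between metric and imperial units"),
]

selected = retrieve_tools("what is the weather forecast for Milan tomorrow?", tools, k=1)
prompt = "You may call these tools:\n" + "\n".join(
    f"- {t.name}: {t.description}" for t in selected
)
print(prompt)  # only the most relevant schema reaches the model, keeping the prompt small
```

Only the selected schema ends up in the prompt, which is the mechanism RAG-MCP credits for cutting prompt size and improving tool-selection accuracy as the tool catalog grows.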
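
As a companion to the L1/LCPO item above, here is a toy sketch of the kind of scalar reward a length-controlled RL setup can optimize: answer correctness minus a penalty for deviating from a user-specified token budget. The linear penalty shape and the `alpha` weight are assumptions for illustration and may not match the paper's exact formulation.

```python
# Toy length-controlled reward in the spirit of LCPO: correct answers are
# rewarded, and deviation from the user-requested token budget is penalized.
# The alpha value and the linear penalty are illustrative assumptions.
def length_controlled_reward(
    is_correct: bool,
    num_generated_tokens: int,
    target_tokens: int,
    alpha: float = 0.001,
) -> float:
    """Correctness reward minus a penalty for missing the target length."""
    correctness = 1.0 if is_correct else 0.0
    length_penalty = alpha * abs(num_generated_tokens - target_tokens)
    return correctness - length_penalty


# A correct answer that overshoots a 512-token budget by 300 tokens scores
# lower than one that stays close to the budget.
print(length_controlled_reward(True, 812, 512))  # ~0.7
print(length_controlled_reward(True, 500, 512))  # ~0.988
```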

News

Resources

  • Towards multimodal foundation models in molecular cell biology. This perspective envisions multimodal foundation models, pretrained on diverse omics datasets, that could unravel the intricate complexities of molecular cell biology.
  • These are the most-cited research papers of all time. Some studies have received hundreds of thousands of citations, Nature’s updated analysis shows.
  • Which programming language should I use? A guide for early-career researchers. Computer scientists and bioinformaticians address four key questions to help rookie coders make the right choice.
  • MCP is Unnecessary. MCP primarily handles advertising and calling functions, much as OpenAPI does, but with a simpler design. Though both can deliver comparable results, MCP stands out for its smaller scale and ease of use. Its adoption is driven more by social factors than by technical needs.
  • Empowering LLMs with DeepResearch ability. WebThinker is a deep research framework fully powered by large reasoning models (LRMs). It enables LRMs to autonomously search, deeply explore web pages, and draft research reports.
  • Efficient Federated Unlearning. FUSED introduces sparse unlearning adapters to selectively remove knowledge in federated learning, making unlearning reversible and cost-efficient.
  • Attention Distillation for Diffusion-Based Image Stylization. This approach improves image generation by utilizing self-attention features from pretrained diffusion models and applying an attention distillation loss to refine stylization and speed up synthesis.
  • Google SpeciesNet. Google’s SpeciesNet is an open-source AI model designed to identify animal species from camera trap photos. Previously used in Wildlife Insights, it aims to expand biodiversity monitoring efforts.
  • Cognition KEVIN-32B. KEVIN-32B is a reinforcement learning-based model for multi-turn code generation that surpasses current models in generating CUDA kernels. It improves kernel accuracy and performance by refining intermediate feedback and applying effective reward distribution. Its multi-turn training setup enhances problem-solving, especially for complex tasks, compared to single-turn methods.
  • How to train an AI model without falling into GDPR pitfalls? AI model developers can meet GDPR requirements during development by using anonymous data or applying pseudonymization. When full anonymization isn’t possible, they should strengthen data security and uphold individuals’ rights. Publicly communicating how data is used is also advised for greater transparency (a small pseudonymization sketch follows this list).
  • Quantization with AutoRound. AutoRound is a post-training quantization method that boosts low-bit model accuracy while preserving performance and efficiency.
  • LLMs for Time Series: A Survey. This survey examines how cross-modality methods adapt large language models for time series analysis, emphasizing data alignment, integration, and effectiveness in downstream tasks across various fields.
  • Synthetic Data QA Framework. This evaluation toolkit offers unified metrics to measure the quality and privacy of synthetic data across different data types, utilizing distributional and embedding-based approaches.
  • DDT: Decoupled Diffusion Transformer. An encoder/decoder Transformer implementation with a diffusion model as the decoder. It seems to work reasonably well on ImageNet generation.
  • Nvidia Radio Embedding Models. Nvidia has a suite of text and image embedding models that match SigLIP in many cases.
  • Pathology with DINOv2. The Mahmood Lab, using Meta’s DINOv2, has developed open-source AI models for pathology, improving disease detection and diagnostics.
  • PyTorch Role in the AI Stack. PyTorch has grown from a research-focused framework into a core platform driving generative AI. The PyTorch Foundation has broadened its scope to include related projects and promote scalable AI development.
  • Osmosis self-improvement via real-time reinforcement learning. Osmosis is a platform enabling AI self-improvement through real-time reinforcement learning. The team has open-sourced a compact model that matches state-of-the-art performance for MCP and can be run locally.
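
To illustrate the pseudonymization step mentioned in the GDPR item above, here is a small Python sketch that replaces direct identifiers with keyed hashes before records enter a training corpus. The field names and the hard-coded key are purely illustrative; real compliance also requires proper key management, broader security measures, and a documented legal basis.

```python
# Minimal pseudonymization sketch: direct identifiers are replaced with keyed
# hashes (HMAC-SHA256), so re-identification requires the secret key.
# Field names and key handling here are illustrative only.
import hashlib
import hmac

SECRET_KEY = b"store-this-in-a-secrets-manager"  # placeholder; never hard-code in practice


def pseudonymize(value: str) -> str:
    """Deterministic keyed hash so the same person always maps to the same token."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


record = {"name": "Jane Doe", "email": "jane@example.com", "query": "pasta recipes"}
safe_record = {
    "user_id": pseudonymize(record["email"]),  # replaces the direct identifiers
    "query": record["query"],                  # non-identifying payload is kept
}
print(safe_record)
```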

Perspectives

Meme of the week

What do you think? Did any of this news catch your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles, and you can also connect or reach me on LinkedIn. Check this repository containing weekly updated ML & AI news. I am open to collaborations and projects. You can also subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

Written by Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence
