WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

ML news: Week 18–24 March

Stability AI maker leaves the company, Microsoft devours Inflection AI, and much more

Salvatore Raieli

20 min readMar 25, 2024

Photo by Priscilla Du Preez 🇨🇦 on Unsplash

The most interesting news, repository, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

GitHub — SalvatoreRa/ML-news-of-the-week: A collection of the the best ML news every week…

A collection of the the best ML news every week (research, news, resources) — GitHub — SalvatoreRa/ML-news-of-the-week…

github.com

You will find the news first in GitHub. Single posts are also collected here:

Salvatore Raieli

Weekly AI and ML news - each week the best of the field

View list

63 stories

Research

ScoreHMR: Score-Guided Diffusion for 3D Human Recovery. We present Score-Guided Human Mesh Recovery (ScoreHMR), an approach for solving inverse problems for 3D human pose and shape reconstruction. ScoreHMR mimics model fitting approaches, but alignment with the image observation is achieved through score guidance in the latent space of a diffusion model. Here, we show the application of our approach on videos, utilizing keypoint detections and score guidance with keypoint reprojection and temporal smoothness terms.
Cappy: Outperforming and boosting large multi-task language models with a small scorer. A little model called Cappy has been taught to accept instructions and a candidate’s completion, then calculate how well the completion satisfies the instructions by returning a score. It performs better on this job than significantly bigger models, indicating that it may be applied as a generation and training feedback mechanism.

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation. demonstrates how LLM reasoning and generation in long-horizon generation tasks can be greatly enhanced by iteratively revising a chain of thoughts with information retrieval; the key idea is that each thought step is revised with pertinent retrieved information to the task query, the current and past thought steps; Retrieval Augmented Thoughts (RAT) is a zero-shot prompting approach that offers notable improvements over baselines that include vanilla RAG, zero-shot CoT prompting, and other baselines. RAT can be applied to various models such as GPT-4 and CodeLlama-7B to improve long-horizon generation tasks (e.g., creative writing and embodied task planning).
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. outlines Quiet-STaR, a generalization of STaR that enables language models (LMs) to acquire reasoning skills that are more scalable and general; Quiet-STaR gives LMs the ability to produce justifications for each token to explain the future text; it suggests a token-wise parallel sampling approach that enhances LM predictions by producing internal thoughts effectively; REINFORCE is used to improve the rationale creation.

*The demonstration of the instruction-following pre-training of multi-task LLMs, e.g., FLAN. Pre-training tasks under this paradigm improves the performance for unseen tasks.* *source*

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM. suggests combining expert LLMs with a mixture of expert LLMs as a more computationally efficient way to train LLMs. This method, called BTX, is shown to be more effective than training a single specialized LLM or a larger generalist LLM. It works by first training (in parallel) multiple copies of a seed LLM with specialized knowledge in different domains (i.e., expert LLMs), then combining them into a single LLM using MoE feed-forward layers. Finally, the entire unified model is fine-tuned.
Large language models surpass human experts in predicting neuroscience results. suggests using BrainBench as a benchmark to assess LLMs’ capacity to forecast neuroscience outcomes; discovers that LLMs outperform experts in forecasting the results of experiments; an LLM that has been modified based on neuroscience literature has been demonstrated to do even better.
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer. Comprehensive literature analysis faces a problem due to the scientific literature’s constant increase. Because of their ability to summarize, LLMs present a viable option; yet, they are not well-suited to the multimodal aspects that are common in scientific information. Uni-SMART (Universal Science Multimodal Analysis and Research Transformer) was created to fill this vacuum by understanding and analyzing the intricate multimodal data found in scientific publications.
Mechanics of Next Token Prediction with Self-Attention. Predicting the next token is a straightforward goal that triggers complex actions. This work discovered that the problem could be divided into two parts: soft composition and hard retrieval. This allowed for good overall performance and in-context learning, and the single self-attention layer was trained using gradient descent.

Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection. By combining visual transformers with knowledge distillation, YOLOX-ViT presents a novel method for object recognition in underwater robots.GroupContrast.GroupContrast combines semantic-aware contrastive learning with segment grouping to redefine self-supervised 3D representation learning.
Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification. With an emphasis on object-centric information, this study presents a novel approach to object detection in photos captured from a variety of spectrums, including RGB, near-infrared, and thermal imaging. The goal is to increase recognition accuracy by mitigating the effects of background noise.
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation. Stable Diffusion 3 is a potent model for creating images. Latent Adversarial Diffusion Distillation, which keeps picture production quality constant while reducing the number of diffusion stages to 4, is shown in this study.

Distilling Datasets Into Less Than One Image. Poster Dataset Distillation (PoDD): We propose PoDD, a new dataset distillation setting for a tiny, under 1 image-per-class (IPC) budget. In this example, the standard method attains an accuracy of 35.5% on CIFAR-100 with approximately 100k pixels, and PoDD achieves an accuracy of 35.7% with less than half the pixels (roughly 40k)
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control . MineDreamer is an artificial intelligence (AI) bot that uses cutting-edge language and vision models creatively to obey intricate commands in the Minecraft universe.
DreamDA: Generative Data Augmentation with Diffusion Models. DreamDA presents a novel method of data augmentation by creating high-quality, diversified synthetic visuals that closely resemble the original data distribution using diffusion models.
Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models. The Interactive Reasoning method known as Chain-of-Spot (CoS) greatly improves the way Large Vision-Language Models (LVLMs) analyze and comprehend pictures. With CoS, LVLMs may obtain precise visual information without sacrificing picture quality by concentrating on specific regions of interest inside images in response to predetermined inquiries or commands.
StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On. A new method for image-based virtual try-on is called StableVITON. This approach takes advantage of the creative capacity of diffusion models that have already been trained while paying attention to garment details. StableVITON discovers semantic correspondences in the latent space of a pre-trained model between clothing and the human body.

Diffusion-based Video Translation. FRESCO is a unique method that greatly enhances the spatial-temporal consistency in video translation tasks by combining intra-frame and inter-frame correspondences.
Generalized Consistency Trajectory Models. With the introduction of Generalized Consistency Trajectory Models (GCTMs), this effort improves the capabilities of diffusion models for tasks such as image restoration and editing. By translating between any two distributions in a single step, these models simplify the procedure and enable remarkably accurate and efficient image modification.
Introducing SceneScript, a novel approach for 3D scene reconstruction. A model developed by Meta Reality Labs may convert visual input into a three-dimensional (3D) representation of a scene. The 70m parameter model has exceptional stability and operates rapidly on the device.
Scalable Diffusion Models with State Space Backbone. A novel kind of diffusion model known as Diffusion State Space Models (DiS) uses a state space backbone for image data instead of the conventional U-Net. These models are effective at producing high-quality photos with little computing work and can manage long-range relationships.
PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns. PuzzleVQA is a dataset created to evaluate big multimodal models such as GPT-4V’s capacity for abstract thinking.

Is Apple ready to launch its own AI?

MM1 appears to be a sign that Apple is intent on accelerating on AI

levelup.gitconnected.com

News

Open Release of Grok-1. We are releasing the weights and architecture of our 314 billion parameter Mixture-of-Experts model, Grok-1.
Did OpenAI just accidentally leak the next big ChatGPT upgrade?OpenAI may have accidentally leaked details about a new AI model called GPT-4.5 Turbo. The leak suggests that GPT-4.5 Turbo will be faster, more accurate, and have a larger knowledge base than its predecessor.

Claude 3 Haiku: our fastest model yet. Today we’re releasing Claude 3 Haiku, the fastest and most affordable model in its intelligence class. With state-of-the-art vision capabilities and strong performance on industry benchmarks
Midjourney debuts feature for generating consistent characters across multiple gen AI images. The popular AI image-generating service Midjourney has deployed one of its most oft-requested features: the ability to recreate characters consistently across new images.
Apple researchers achieve breakthroughs in multimodal AI as company ramps up investments. Apple researchers have developed new methods for training large language models on both text and images, enabling more powerful and flexible AI systems, in what could be a significant advance for artificial intelligence and for future Apple products.
Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images. Today we are releasing Stable Video 3D (SV3D), a generative model based on Stable Video Diffusion, advancing the field of 3D technology and delivering greatly improved quality and view-consistency.

Google researchers unveil ‘VLOGGER’, an AI that can bring still photos to life. Google researchers have developed a new artificial intelligence system that can generate lifelike videos of people speaking, gesturing, and moving — from just a single still photo. The technology, called VLOGGER, relies on advanced machine learning models to synthesize startlingly realistic footage, opening up a range of potential applications while also raising concerns about deepfakes and misinformation.
Microsoft has added the GPT-4 Turbo LLM to the free version of Copilot. Microsoft is boosting the performance of its Copilot generative AI chatbot today. It has been confirmed that all free Copilot users can now access the GPT-4 Turbo large language model from OpenAI.
Korean researchers power-shame Nvidia with new neural AI chip — claim 625 times less power draw, 41 times smaller. The new C-Transformer chip is claimed to be the world’s first ultra-low power AI accelerator chip capable of large language model (LLM) processing.
Inflection co-founders leave for Microsoft AI. Karén Simonyan and Mustafa Suleyman are leaving Inflection to launch Microsoft AI. The next CEO will be Sean White. Additionally, a few Inflection senior team members are joining Microsoft AI.Lilac acquired by Databricks.Lilac is a scalable, user-friendly tool for data scientists to search, cluster, and analyze any kind of text dataset with a focus on generative AI.

IBM and NASA build language models to make scientific knowledge more accessible. In a new collaboration, IBM and NASA created a suite of efficient language models by training on scientific literature. Based on the transformer architecture, these models can be used in a variety of applications, from classification and entity extraction to question-answering and information retrieval. These models achieve high performance across a variety of domains and can respond promptly. We have open-sourced the models on Hugging Face for the benefit of the scientific and academic community.
Introducing RAG 2.0. A technique for adding knowledge to a language model that can become stale is called retrieval augmented generation, or RAG. Unfortunately, outside of demonstrations, the current paradigm of “frozen RAG,” in which just a portion of the pipeline is trained and the model itself is not updated, performs badly. This blog describes the next generation of RAG, where all the components are fine-tuned for the job at hand. In this system, an open model such as Mistral 7B can perform better than the conventional GPT-4 RAG.
Fitbit Using Google Gemini for New AI That Could Become Your Fitness Coach. Google is training Gemini on health data, and it’s creating a new AI model for the Fitbit app that can give advice tailored to your needs.

https://sites.google.com/view/chain-of-spot/

Stable Diffusion maker leaves Stability AI. Robin Rombach helped build the tech that made Stability AI famous, now he’s leaving the company
Introducing Copilot4D: A Foundation Model for Self-Driving. Waabi’s Copilot4D is a ground-breaking foundation model that advances the capabilities of autonomous machines by using LiDAR data to comprehend and forecast the 3D dynamics of the environment across time.
NLX Raises $15M in Series A Funding. In March 2024, NLX extended its Series A funding to $15M, adding Comcast Ventures.
Triton Puzzles. Triton is an alternative open-source language that allows you to code at a higher level and compile to accelerators like GPU. This set is puzzles is meant to teach you how to use Triton from first principles in an interactive fashion. You will start with trivial examples and build your way up to real algorithms like Flash Attention and Quantized neural networks. These puzzles do not need to run on GPU since they use a Triton interpreter.

https://rlawjdghek.github.io/StableVITON/

New Breakthrough Brings Matrix Multiplication Closer to Ideal. Researchers from Tsinghua University and UC Berkeley have made great strides in matrix multiplication, introducing a novel method that has already inspired improvements. Significant time, power, and cost savings in a variety of applications could result from this development in a fundamental computer procedure. Since the previous milestone in 2010, this is the most significant advancement in lowering the computational cost of matrix multiplication.
OpenAI could release GPT-5 in a few months: Report. OpenAI could release GPT-5, the next generation of its groundbreaking large language model, in a few months, according to a new report.
Beijing court’s ruling that AI-generated content can be covered by copyright eschews US stand, with far-reaching implications on tech’s use. The Beijing Internet Court ruled that an AI-generated image in an intellectual property dispute was an artwork protected by copyright laws. That decision is expected to have far-reaching implications for future AI copyright disputes, which could eventually benefit Chinese Big Tech companies.

https://www.mmlab-ntu.com/project/fresco/

Japan’s premier AI lab launches its first model. Sakana AI develops cutting-edge models for Japanese language, vision, and picture production. In order to evolve foundation models without the need for costly retraining, it introduced an evolutionary model merging. The model merging and a description of the process are now available.
Cohere’s Command-R Enterprise Model Coming to ai.nvidia.com. The RAG-optimized Command-R model from Cohere, which is intended to help enterprises transition to large-scale production, will soon be available in the freshly released NVIDIA API catalog.
Biden-Harris Administration Announces Deal with Intel for AI Chips. Biden-Harris Administration Announces Preliminary Terms with Intel to Support Investment in U.S. Semiconductor Technology Leadership and Create Tens of Thousands of Jobs
Apple’s AI ambitions could include Google or OpenAI. The iPhone maker is in ‘active’ talks to bring Gemini to the iPhone and has also considered using ChatGPT.
World’s first major act to regulate AI passed by European lawmakers. The European Union’s parliament on Wednesday approved the world’s first major set of regulatory ground rules to govern the mediatized artificial intelligence at the forefront of tech investment. Born in 2021, the EU AI Act divides the technology into categories of risk, ranging from “unacceptable” — which would see the technology banned — to high, medium, and low hazard.

Tabula Rasa: Large Language Models for Tabular Data

Tabular data are everywhere, why and how you can use LLMs for them

levelup.gitconnected.com

Resources

tlm — Local CLI Copilot, powered by CodeLLaMa. tlm is your CLI companion which requires nothing except your workstation. It uses the most efficient and powerful CodeLLaMa in your local environment to provide you with the best possible command line suggestions.
Multi-node LLM Training on AMD GPUs. The whole stack of technologies, including schedulers, model training software, and more, that Lamini employs to train models on AMD GPUs is described in this blog article.

https://ai.meta.com/blog/scenescript-3d-scene-reconstruction-reality-labs-research/

clarity-upscaler. A state-of-the-art image upscaling tool.
musiclang_predict. Music Lang is an API and set of models that generate music.
Optimizing Technical Docs for LLMs. Capa.ai provides guidance on how to organize LLM documentation, including how to include troubleshooting FAQs, self-contained code snippets, segmentation into sub-products, and community forum creation.
lamini/earnings-calls-qa. This dataset contains transcripts of earning calls for various companies, along with questions and answers related to the companies’ financial performance and other relevant topics.
Knowledge Conflicts for LLMs: A Survey. A summary of the prevalent problem of knowledge conflict that arises while working with LLMs; the survey article divides these conflicts into three categories: intra-memory, inter-context, and context-memory conflict. It also offers insights into the sources of these conflicts and possible solutions.
Enhancing RAG-based application accuracy by constructing and leveraging knowledge graphs. A practical guide to constructing and retrieving information from knowledge graphs in RAG applications with Neo4j and LangChain

How to Evaluate Your RAG System? Retrieval Augmented Generation (RAG) is a powerful technique that enhances output quality by retrieving relevant context from an external vector database. However, building and evaluating an RAG system can be challenging, especially when it comes to measuring performance. In this post, we’ll explore the most effective metrics for each stage of your RAG pipeline and how to use them to evaluate your whole system.
Anthropic Prompt Library. Although Claude 3 has been widely used, these models use a somewhat different prompting technique. Anthropic has compiled a list of user prompts that are effective for a wide range of assignments and subjects.
Pretraining 16 language models on different tokenizers. One peculiarity of contemporary language modeling is that the model is not trained until the tokenizer has been trained. The second peculiar truth is that, on vast scales, vocabulary size doesn’t appear to matter all that much.
LLM4Decompile. Reverse Engineering: Decompiling Binary Code with Large Language Models

https://www.androidauthority.com/chat-gpt-4-5-turbo-3425326/

Under The Hood: How OpenAI’s Sora Model Works. In this blog post, we dive into some of the technical details behind Sora. We also talk about our current thinking around the implications of these video models. Finally, we discuss our thoughts around the compute used for training models like Sora and present projections for how that training compute compares to inference, which has meaningful indications for estimated future GPU demand.
Quiet-STaR. A reasoning framework called Quiet-Star enhances language models’ capacity to produce accurate results. An eight-step model per token has been given along with the code.
MoE-Adapters4CL. Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset. Through extensive experiments across various settings, our proposed method consistently outperforms previous state-of-the-art approaches while concurrently reducing parameter training burdens by 60%.
LlamaGym. Fine-tune LLM agents with online reinforcement learning

https://www.anthropic.com/news/claude-3-haiku

Stylized image binning algorithm. This is a tutorial on utilizing a JavaScript binning method to create an image processing application that looks like pixel art and has customizable interactive web features like sliders. By averaging pixel brightness inside bins, the binning technique transforms photos into stylized, pixelated artwork by utilizing parameters like bin size and spacing. The approach entails efficiently optimizing looping structures and modifying pixel data on HTML canvas components.
TorchTune. TorchTune is a native-Pytorch library for easily authoring, fine-tuning and experimenting with LLMs.MVFA-AD.Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

Does it Really Matter Grok?

Musk claims he has open-source Grok, but it does matter or is it just another move in larger play?

levelup.gitconnected.com

Perspectives

What I learned from looking at 900 most popular open source AI tools. By examining the GitHub stars of well-known AI models, we can uncover some fascinating patterns. The majority of open-source AI tools appear to be geared toward apps and infrastructure.
LLM inference speed of light. This article explores the “speed of light” theoretical limit for transformer-based language model inference and emphasizes the significance of memory bandwidth over computational power, showing that the ability to read data from memory rather than perform calculations is the primary constraint on inference speed and that this is an important factor to optimize and comprehend the performance of AI.

AI is bad/good actually. This article’s author suggests eschewing the nebulous good/bad continuum and instead using terminology like “harmful,” “helpful,” “capable,” and “incapable” to distinguish AI conversations. For them, AI is capable yet possibly dangerous because of unresolved problems like bias exaggeration and copyright infringement. Using these more precise phrases, the author asks readers to explain their own opinions on AI
Captain’s log: the irreducible weirdness of prompting AIs. A wealth of free AI and machine learning tools can be found on the new companion website, More Useful Things. These resources highlight the amusing and useful ways in which AI-generated prompts, such as creative scenarios, can surpass human-crafted ones in tasks like solving mathematical puzzles. For more consistent prompting outcomes, the experiment emphasizes the value of adding context, few-shot learning, and chain-of-thought strategies. Though organized prompting is still an evolving art with considerable potential benefits, prompting as a talent may become less important as AI models advance and get better at inferring user intent.
AI Prompt Engineering Is Dead, Long live AI prompt engineering. According to recent studies, as AI and machine learning models get better at optimizing their own prompts, human prompt engineers might become outdated. Prompts produced by algorithms can be strange but powerful; they exceed those created by humans and significantly cut down on optimization time. Despite the potential of automatically adjusted prompts, experts predict that the need for occupations related to prompts will change rather than vanish, maybe taking the form of new positions like LLMOps (Large Language Model Operations).

https://sakana.ai/evolutionary-model-merge/

The Road to Biology 2.0 Will Pass Through Black-Box Data. This year marks perhaps the zenith of expectations for AI-based breakthroughs in biology, transforming it into an engineering discipline that is programmable, predictable, and replicable. Drawing insights from AI breakthroughs in perception, natural language, and protein structure prediction, we endeavor to pinpoint the characteristics of biological problems that are most conducive to being solved by AI techniques. Subsequently, we delineate three conceptual generations of bio AI approaches in the biotech industry and contend that the most significant future breakthrough will arise from the transition away from traditional “white-box” data, understandable by humans, to novel high-throughput, low-cost AI-specific “black-box” data modalities developed in tandem with appropriate computational methods.
”AI, no ads please”: 4 words to wipe out $1tn. AI poses a huge threat to ad-based platforms by slashing how many ads we see
OpenAI’s “Own Goal”. And why it is becoming increasingly difficult to take them at their word
What if it isn’t happening, AGI is not coming? No matter what appears to be happening, we always have to consider what if it isn’t. What If LLMs fail to turn into AGIs? Has our quest for intelligence simply unveiled our demonstrable lack thereof? Will trillions of dollars turn unpredictable hallucination machines into reliable universal productivity tools that can do anything?
How OpenAI’s text-to-video tool Sora could change science — and society. OpenAI’s debut of its impressive Sora text-to-video tool has raised important questions.
Chatbot AI makes racist judgements on the basis of dialect. Some large language models harbor hidden biases that cannot be removed using standard methods.

Could AI-designed proteins be weaponized? Scientists lay out safety guidelines. AI tools that can come up with protein structures at the push of a button should be used safely and ethically, say researchers in the field.
Three reasons why AI doesn’t model human language. Artificial intelligence (AI) is being used to develop large language models (LLMs) with considerable success. But they should not be seen as being models of how human language works and is acquired.
So … you’ve been hacked. Research institutions are under siege from cybercriminals and other digital assailants. How do you make sure you don’t let them in?
8 Google Employees Invented Modern AI. Here’s the Inside Story. They met by chance, got hooked on an idea, and wrote the “Transformers” paper — the most consequential tech breakthrough in recent history.
Using LLMs to Generate Fuzz Generators. Claude and other LLMs are capable of producing efficient fuzzes for code parsing, automating a task that has historically required a great deal of human labor. Given that fuzzing is stochastic, LLMs seem to be a good fit for producing fuzzes, even if they are usually not exact enough for static analysis. To find and exploit code vulnerabilities, a hybrid approach that combines targeted fuzzing and LLM-driven static analysis may be promising.

https://docs.anthropic.com/claude/prompt-library

First Impressions of Early-Access GPT-4 Fine-Tuning. A few weeks ago we finally got access to the GPT-4 fine-tuning API (in limited early access), and were super excited to check out how well it works. We’d been a user of OpenAI’s fine-tuned models since fine-tuning the original GPT-3 Davinci model first became available.
AI and the Future of Work. High Mensa exam scores for Anthropic’s most recent AI, Claude, indicate that self-improving AI is not far off and presents both prospects and existential concerns. As seen at Klarna, where a customer support AI replaced 700 workers, machine learning is already eliminating jobs. This suggests that automation is becoming more and more common. Recent layoffs at Duolingo as a result of AI’s translation capabilities highlight this change and the increasing influence of AI on the nature of work in the future.
Two years later, deep learning is still faced with the same fundamental challenges. Gary Marcus revisits his forecasts two years after writing a pessimistic AI paper, and he maintains his original mistrust. Even with breakthroughs like GPT-4, basic problems like true understanding and reliable AI are still unsolved. Marcus draws the conclusion that multidisciplinary cooperation is essential to achieving AGI and that increasing data and processing capacity alone won’t be enough.
From 0 to 10 million users in four years.In just four years, the AI-powered writing tool Copy.ai has amassed an amazing 10 million users.

The AI worm and the LLM leaf

New research warns how a LLM can be poisoned and spread around

levelup.gitconnected.com

Medium articles

A list of the Medium articles I have read and found the most interesting this week:

Cameron R. Wolfe, Ph.D., The Basics of AI-Powered (Vector) Search, link
Alessandro S. Capezza, Sam Altman Predicts: GPUs, The Currency of Tomorrow’s Tech Economy, link
Ignacio de Gregorio, Google Just Created an Olympic Gold Medalist, link
Yogesh Haribhau Kulkarni (PhD), Tuning for Geometry, link
Dr. Mandar Karhade, MD. PhD., Opensource Grok-1: A New Frontier in AI by xAI, link
___, How To Do RAG Without Vector Databases, link
Michał Oleszak, Designing RAGs, link
Douglas Rushkoff, The Model Isn’t the Territory, Either, link

DeepMind’s SIMA: Rule the Simulated World Before Take Over the Real One

A new agent by DeepMind shows impressive new generalization skills in videogames

levelup.gitconnected.com

Meme of the week

Cosine Similarity and Embeddings Are Still in Love?

Cosine similarity is the most used method, but it is really the best?

levelup.gitconnected.com

What do you think about it? Some news that captured your attention? Let me know in the comments

If you have found this interesting:

You can look for my other articles, and you can also connect or reach me on LinkedIn. Check this repository containing weekly updated ML & AI news. I am open to collaborations and projects and you can reach me on LinkedIn. You can also subscribe for free to get notified when I publish a new story.

Get an email whenever Salvatore Raieli publishes.

Get an email whenever Salvatore Raieli publishes. By signing up, you will create a Medium account if you don’t already…

salvatore-raieli.medium.com

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

GitHub — SalvatoreRa/tutorial: Tutorials on machine learning, artificial intelligence, data science…

Tutorials on machine learning, artificial intelligence, data science with math explanation and reusable code (in python…

github.com

or you may be interested in one of my recent articles:

PlanGPT: LLM domain specific to revolutionizing industries

Knowledge and planning give the power to reshape industries

levelup.gitconnected.com

Tabula Rasa: Fill in What Is Missing

Missing values are a known problem, why and how we can solve it

levelup.gitconnected.com

Dividi et Impera: A Practical Guide to BLOB Analysis and Extraction with Python

Simple yet powerful techniques to extract objects.

ai.plainenglish.io

Bring Your AI Agents from Virtual to Reality

AI agents are the new frontier, but how they are doing in the real world?

levelup.gitconnected.com

Learning to Learn: How AI and Humans Learn

Understanding learning to create better AI and understand ourselves

levelup.gitconnected.com

WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

ML news: Week 18–24 March

Stability AI maker leaves the company, Microsoft devours Inflection AI, and much more

GitHub — SalvatoreRa/ML-news-of-the-week: A collection of the the best ML news every week…

A collection of the the best ML news every week (research, news, resources) — GitHub — SalvatoreRa/ML-news-of-the-week…

Weekly AI and ML news - each week the best of the field

Research

Is Apple ready to launch its own AI?

MM1 appears to be a sign that Apple is intent on accelerating on AI

News

Tabula Rasa: Large Language Models for Tabular Data

Tabular data are everywhere, why and how you can use LLMs for them

Resources

Does it Really Matter Grok?

Musk claims he has open-source Grok, but it does matter or is it just another move in larger play?

Perspectives

The AI worm and the LLM leaf

New research warns how a LLM can be poisoned and spread around

Medium articles

DeepMind’s SIMA: Rule the Simulated World Before Take Over the Real One

A new agent by DeepMind shows impressive new generalization skills in videogames

Meme of the week

Cosine Similarity and Embeddings Are Still in Love?

Cosine similarity is the most used method, but it is really the best?

What do you think about it? Some news that captured your attention? Let me know in the comments

If you have found this interesting:

Get an email whenever Salvatore Raieli publishes.

Get an email whenever Salvatore Raieli publishes. By signing up, you will create a Medium account if you don’t already…

GitHub — SalvatoreRa/tutorial: Tutorials on machine learning, artificial intelligence, data science…

Tutorials on machine learning, artificial intelligence, data science with math explanation and reusable code (in python…

PlanGPT: LLM domain specific to revolutionizing industries

Knowledge and planning give the power to reshape industries

Tabula Rasa: Fill in What Is Missing

Missing values are a known problem, why and how we can solve it

Dividi et Impera: A Practical Guide to BLOB Analysis and Extraction with Python

Simple yet powerful techniques to extract objects.

Bring Your AI Agents from Virtual to Reality

AI agents are the new frontier, but how they are doing in the real world?

Learning to Learn: How AI and Humans Learn

Understanding learning to create better AI and understand ourselves

Written by Salvatore Raieli

No responses yet