WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 8–14 April

Salvatore Raieli
20 min read · Apr 15, 2024

Google and Mistral release new models; Apple lays off its autonomous car engineers

Photo by Myznik Egor on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field


Research

  • Smartphone app could help detect early-onset dementia cause, study finds. App-based cognitive tests were found to be proficient at detecting frontotemporal dementia in those most at risk. Scientists have demonstrated that cognitive tests delivered via a smartphone app are at least as sensitive as clinic-based medical evaluations at detecting early signs of frontotemporal dementia in people with a genetic predisposition to the condition.
  • Unsegment Anything by Simulating Deformation. A novel strategy called “Anything Unsegmentable” aims to prevent digital photos from being carved into discrete segments by powerful AI models, potentially addressing copyright and privacy concerns.
https://arxiv.org/pdf/2404.02585v1.pdf
  • Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions. This extensive survey examines the potential of Healthcare Foundation Models (HFMs) to transform medical services. Because they are pre-trained on diverse data sets, these models can adapt to many different healthcare tasks, which could improve intelligent healthcare services across a wide range of scenarios.
  • SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing. A new algorithm called SwapAnything can swap objects in an image for other objects of your choosing without disturbing the overall composition. It improves on other tools because it can replace any object, not only the focal point, and it excels at blending the replacement seamlessly into the original image. It relies on pretrained diffusion models, concept vectors, and inversion.
https://swap-anything.github.io/
https://arxiv.org/pdf/2404.03602v1.pdf
https://arxiv.org/pdf/2404.04095v1.pdf
  • Measuring the Persuasiveness of Language Models. An Anthropic study of model persuasiveness found that arguments written by Claude 3 Opus come close to matching human persuasiveness: humans were marginally more convincing, but not by a statistically significant margin, and the results highlight a trend where larger, more capable models become more persuasive. Claude 3 Opus was the most persuasive model tested, and a control condition with undisputed facts showed predictably low persuasiveness, supporting the study’s methodology. The comparison relied on statistical tests with multiple-comparison adjustments (a minimal sketch of this kind of analysis follows this list).
https://www.anthropic.com/news/measuring-model-persuasiveness
  • DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation. DreamView presents a novel method for turning text descriptions into 3D objects that may be extensively customized from various angles while maintaining the object’s overall consistency.
https://arxiv.org/pdf/2404.04050v1.pdf
https://arxiv.org/pdf/2404.03264v1.pdf
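
As a rough illustration of the kind of analysis the persuasiveness study above describes (comparing human-written and model-written arguments with significance tests and a multiple-comparison correction), here is a minimal sketch on synthetic data. The ratings, model names, and numbers are invented; nothing here reproduces Anthropic's actual data or code.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

# Synthetic persuasiveness scores: shift in agreement after reading an argument.
human_shift = rng.normal(0.45, 1.0, size=300)        # human-written arguments
model_shifts = {                                      # hypothetical model conditions
    "model_small": rng.normal(0.25, 1.0, size=300),
    "model_medium": rng.normal(0.35, 1.0, size=300),
    "model_large": rng.normal(0.43, 1.0, size=300),
}

# Compare each model condition against the human baseline.
pvals = []
for name, shift in model_shifts.items():
    t, p = stats.ttest_ind(human_shift, shift, equal_var=False)
    pvals.append(p)
    print(f"{name}: mean shift {shift.mean():.2f} vs human {human_shift.mean():.2f}, p={p:.3f}")

# Holm correction guards against false positives across the multiple comparisons.
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
print("significant after Holm correction:", dict(zip(model_shifts, reject)))
```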

News

  • Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 Language Model Series. A growing consensus in the field points to roughly 30 billion parameters as the sweet spot between strong performance and manageable resource requirements. In line with this trend, the Qwen team has released the latest additions to the Qwen1.5 series: Qwen1.5-32B and Qwen1.5-32B-Chat (a minimal loading example follows this list).
https://qwenlm.github.io/blog/qwen1.5-32b/
  • Nvidia Tops Llama 2, Stable Diffusion Speed Trials. Now that we’re firmly in the age of massive generative AI, MLPerf has added two such behemoths, Llama 2 70B and Stable Diffusion XL, to its inference tests. Version 4.0 of the benchmark collected more than 8,500 results from 23 submitting organizations. As has been the case from the beginning, systems with Nvidia GPUs came out on top, particularly those with the H200, but AI accelerators from Intel and Qualcomm were in the mix as well.
https://spectrum.ieee.org/ai-benchmark-mlperf-llama-stablediffusion
https://uni-fl.github.io/
  • Hugging Face TGI Reverts to Open Source License. Hugging Face had temporarily moved its popular and powerful inference server to a non-commercial license in an effort to deter bigger companies from running a rival offering. Community involvement decreased while business outcomes remained unchanged, so the project is now back under a more permissive license.
  • Securing Canada’s AI advantage. To support Canada’s AI industry, Prime Minister Justin Trudeau unveiled a $2.4 billion investment package beginning with Budget 2024. The package comprises tools to enable responsible AI adoption, support for AI start-ups, and funding for computing capacity and skills. These policies are intended to maintain Canada’s competitive advantage in AI globally, boost productivity, and accelerate job growth. The money will also be used to strengthen enforcement of the Artificial Intelligence and Data Act and to establish a Canadian AI Safety Institute.
https://arxiv.org/pdf/2312.12133v1.pdf
  • Yahoo is buying Artifact, the AI news app from the Instagram co-founders. Instagram’s co-founders built a powerful and useful tool for recommending news to readers — but could never quite get it to scale. Yahoo has hundreds of millions of readers — but could use a dose of tech-forward cool to separate it from all the internet’s other news aggregators.
  • Now there’s an AI gas station with robot fry cooks. There’s a little-known hack in rural America: you can get the best fried food at the gas station (or in the case of a place I went to on my last road trip, shockingly good tikka masala). Now, one convenience store chain wants to change that with a robotic fry cook that it’s bringing to a place once inhabited by a person who may or may not smell like a recent smoke break and cooks up a mean fried chicken liver.
  • Elon Musk predicts superhuman AI will be smarter than people next year. His claim comes with a caveat: shortages of training chips and growing demand for power could limit those plans in the near term.
  • Gemma Family Expands with Models Tailored for Developers and Researchers. Google announced the first round of additions to the Gemma family, expanding the possibilities for ML developers to innovate responsibly: CodeGemma for code completion and generation tasks as well as instruction following, and RecurrentGemma, an efficiency-optimized architecture for research experimentation.
https://developers.googleblog.com/2024/04/gemma-family-expands.html
https://github.com/FoundationVision/VAR
https://www.tomshardware.com/pc-components/cpus/intel-details-guadi-3-at-vision-2024-new-ai-accelerator-sampling-to-partners-now-volume-production-in-q3
  • Apple’s new AI model could help Siri see how iOS apps work. Apple’s Ferret LLM could help Siri understand the layout of apps on an iPhone display, potentially expanding the capabilities of Apple’s digital assistant. Apple has been working on numerous machine learning and AI projects that it could tease at WWDC 2024, and a just-released paper suggests some of that work could let Siri understand what apps, and iOS itself, look like.
https://appleinsider.com/articles/24/04/09/apples-new-ai-model-could-help-siri-see-how-ios-apps-work
  • Aerospace AI Hackathon Projects. Together, 200 AI and aerospace experts created an amazing array of tools, including AI flight planners, AI air traffic controllers, and Apple Vision Pro flight simulators, as a means of prototyping cutting-edge solutions for the aviation and space industries.
https://github.com/hetailang/squeezeattention
  • Introducing Rerank 3: A New Foundation Model for Efficient Enterprise Search & Retrieval. Rerank 3, the newest foundation model from Cohere, was developed for enterprise search and Retrieval Augmented Generation (RAG) systems. The model can be integrated into any legacy application with built-in search functionality and is compatible with any database or search index. With a single line of code, Rerank 3 can improve search quality or lower the cost of running RAG applications with minimal effect on latency (a minimal usage sketch follows this list).
https://txt.cohere.com/rerank-3/
  • Meta to broaden labeling of AI-made content. Meta admits its current labeling policies are “too narrow” and that a stronger system is needed to deal with today’s wider range of AI-generated content and other manipulated content, such as a January video that appeared to show President Biden inappropriately touching his granddaughter.
https://arxiv.org/pdf/2404.06119v1.pdf
https://adamdad.github.io/hash3D/
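
For the Qwen1.5-32B item above, here is a minimal sketch of loading the chat variant with Hugging Face transformers. It assumes a recent transformers release and enough GPU memory (on the order of 65 GB in bf16 for a 32B model); the prompt is just an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-32B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # shard across available GPUs
)

messages = [{"role": "user", "content": "Summarize this week's AI news in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```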
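
For the Rerank 3 item above, a minimal sketch of the "single line of code" integration using Cohere's Python SDK; the model name and response fields follow Cohere's launch announcement, so check them against the current SDK documentation. The documents and query are invented.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder; use your own key

documents = [
    "Rerank 3 targets enterprise search and RAG pipelines.",
    "Llama 2 70B was added to the MLPerf inference benchmark.",
    "Schedule-free optimizers remove the need for a learning-rate schedule.",
]

# One call re-orders candidate documents by relevance to the query.
response = co.rerank(
    model="rerank-english-v3.0",
    query="Which model is built for enterprise search?",
    documents=documents,
    top_n=2,
)
for result in response.results:
    print(result.index, round(result.relevance_score, 3), documents[result.index])
```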

Resources

  • SWE-agent. SWE-agent turns LMs (e.g., GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories.
  • Schedule-Free Learning. Faster training without learning-rate schedules: no need to specify the stopping time or step count in advance (a minimal usage sketch follows this list).
  • State-of-the-art Representation Fine-Tuning (ReFT) methods. ReFT is a parameter-efficient approach to language model fine-tuning that intervenes on hidden representations rather than weights. It achieves strong performance at a significantly lower cost than even PEFT methods (a conceptual sketch of the core intervention follows this list).
  • The Top 100 AI for Work — April 2024. Following our AI Top 150, we spent the past few weeks analyzing data on the top AI platforms for work. This report shares key insights, including the AI tools you should consider adopting to work smarter, not harder.
https://www.flexos.work/learn/top-100-ai-for-work
  • LLocalSearch. LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.
https://arxiv.org/pdf/2404.06842v1.pdf
  • llm.c. LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of CPython. For example, training GPT-2 (CPU, fp32) is ~1,000 lines of clean code in a single file. It compiles and runs instantly, and exactly matches the PyTorch reference implementation.
  • AIOS: LLM Agent Operating System. AIOS, a Large Language Model (LLM) Agent operating system, embeds a large language model into Operating Systems (OS) as the brain of the OS, enabling an operating system “with soul” — an important step towards AGI. AIOS is designed to optimize resource allocation, facilitate context switch across agents, enable concurrent execution of agents, provide tool service for agents, maintain access control for agents, and provide a rich set of toolkits for LLM Agent developers.
  • Anthropic Tool use (function calling). Anthropic has released a public beta that lets Claude call customized client-side tools supplied in API requests. To use the feature, developers include the ‘anthropic-beta: tools-2024-04-04’ header and describe each tool with a JSON schema, which Claude uses to decide when and how to call it, expanding what the model can do (a minimal sketch follows this list).
https://txt.cohere.com/command-r-plus-microsoft-azure/
  • Flyflow. Flyflow is API middleware for optimizing LLM applications: the same response quality with 5x lower latency, added security, and much higher token limits.
  • ChemBench. LLMs are gaining importance across domains. To guide improvement, benchmarks have been developed; one of the most popular, BIG-bench, currently includes only two chemistry-related tasks. This project aims to add more chemistry benchmark tasks in a BIG-bench-compatible way and to develop a pipeline for benchmarking frontier and open models.
https://github.com/lamalab-org/chem-bench
  • Long-context Alpaca Training. Train with context windows of more than 200k tokens on a single H100 using a new gradient accumulation offloading technique.
  • attorch. attorch is a subset of PyTorch’s NN module, written purely in Python using OpenAI’s Triton. Its goal is to be an easily hackable, self-contained, and readable collection of neural network modules whilst maintaining or improving upon the efficiency of PyTorch.
  • Policy-Guided Diffusion. Policy-guided diffusion offers a novel approach to training agents in offline settings: it generates synthetic trajectories that closely match both the target policy and the behavior data. By producing more realistic training data, the method substantially improves the performance of offline reinforcement learning models.
https://github.com/emptyjackson/policy-guided-diffusion
  • Ada-LEval. Ada-LEval is a pioneering benchmark to assess the long-context capabilities with length-adaptable questions. It comprises two challenging tasks: TSort, which involves arranging text segments into the correct order, and BestAnswer, which requires choosing the best answer to a question among multiple candidates.
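
For the Schedule-Free Learning entry above, here is a minimal sketch of a training loop with the schedulefree package. The toy model, data, and hyperparameters are invented; the important detail is the explicit optimizer.train()/optimizer.eval() switches the optimizer expects.

```python
import torch
import schedulefree  # pip install schedulefree

model = torch.nn.Linear(10, 1)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)

optimizer.train()  # schedule-free optimizers maintain an iterate average and need mode switches
for step in range(200):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

optimizer.eval()  # evaluate (and save) with the averaged weights
print("final training loss:", loss.item())
```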
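
For the ReFT entry, a conceptual sketch of the low-rank representation intervention (LoReFT) described in the paper, which edits a hidden state h as h + R^T(Wh + b - Rh). This is the core idea only, written as a standalone PyTorch module with invented sizes, not the pyreft library API.

```python
import torch
import torch.nn as nn

class LoReFTIntervention(nn.Module):
    """Edit a hidden state h as h + R^T (W h + b - R h), with low-rank R."""

    def __init__(self, hidden_size: int, rank: int):
        super().__init__()
        self.R = nn.Parameter(torch.empty(rank, hidden_size))
        nn.init.orthogonal_(self.R)               # rows of R span the edited subspace
        self.proj = nn.Linear(hidden_size, rank)  # computes W h + b

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + (self.proj(h) - h @ self.R.T) @ self.R

# Only the intervention's parameters are trained; the base model stays frozen.
intervention = LoReFTIntervention(hidden_size=4096, rank=8)
hidden = torch.randn(2, 16, 4096)  # (batch, sequence, hidden) activations from some layer
print(intervention(hidden).shape)
```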
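
For the Anthropic tool use entry, a minimal sketch of defining a tool with a JSON schema and reading back Claude's tool call. It assumes a recent anthropic Python SDK where tools are passed to messages.create (the April 2024 beta additionally required the 'anthropic-beta: tools-2024-04-04' header), and the weather tool itself is hypothetical.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A hypothetical client-side tool, described by name, purpose, and a JSON schema.
weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=512,
    tools=[weather_tool],
    messages=[{"role": "user", "content": "What's the weather in Rome right now?"}],
)

# Claude answers with a tool_use block when it decides to call the tool.
for block in response.content:
    if block.type == "tool_use":
        print("tool:", block.name, "input:", block.input)  # run the tool client-side here
```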

Perspectives

https://cnbc.com/2024/04/03/waymo-self-driving-cars-are-delivering-uber-eats-orders-for-first-time.html
  • Can Demis Hassabis Save Google? Demis Hassabis, the founder of DeepMind, now leads Google’s unified AI research division and hopes to keep the tech giant ahead of the competition with innovations like AlphaGo and AlphaFold. Despite those achievements, obstacles remain in integrating AI into physical products and in competition from the likes of OpenAI’s ChatGPT. Having already contributed substantially to AI, Hassabis must now work within Google’s product strategy to make use of DeepMind’s research breakthroughs.
  • Is ChatGPT corrupting peer review? Telltale words hint at AI use. A study of review reports identifies dozens of adjectives that could indicate text written with the help of chatbots.

Medium articles

A list of the Medium articles I have read and found the most interesting this week:

Meme of the week

What do you think about it? Did some news capture your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles, and you can also connect with or reach me on LinkedIn. Check this repository containing weekly updated ML & AI news. I am open to collaborations and projects. You can also subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

or you may be interested in one of my recent articles:


Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence