WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 24–30 June

OpenAI on an acquisition spree, Anthropic's new model, Amazon developing its own LLM, and much more

Salvatore Raieli
19 min read · Jul 1, 2024
Photo by Mattias Diesel on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field


Research

  • Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? First presents a benchmark of real-world tasks requiring up to 1M tokens of context, then performs a thorough performance analysis of long-context LLMs on in-context retrieval and reasoning. Reports that long-context LLMs can compete with state-of-the-art retrieval and RAG systems without explicit training on the tasks, finds that compositional reasoning (needed in SQL-like tasks) is still challenging for these LLMs, and encourages further research on advanced prompting strategies.
  • PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers. Improves decision-making with the iterative plan-then-RAG (PlanRAG) technique, which consists of two steps: 1) an LM creates the plan for decision-making by reviewing the question and data schema, and 2) the retriever creates the queries for data analysis. A final phase then determines whether a new plan for additional analysis is required, repeating the earlier steps, or makes a decision based on the data. PlanRAG is found to perform better than iterative RAG on the proposed Decision QA tasks (a minimal sketch of this loop appears at the end of this section).
https://arxiv.org/pdf/2406.12430
  • Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs. Proposes the goldfish loss, a modification of the next-token prediction objective that mitigates verbatim generation of memorized training data: a simple technique excludes a pseudorandom subset of training tokens from the loss at training time. The goldfish loss resists memorization while keeping the model useful, though models may need to train longer to learn as effectively from the training data.
  • Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B. Reports GPT-4-level performance on mathematical Olympiad problems using an approach that combines an LLM with Monte Carlo Tree Search. The approach improves mathematical reasoning by enabling systematic exploration, self-refinement, and self-evaluation.
  • Meta Large Language Model Compiler. Meta released LLM Compiler, a family of models built on Code Llama and trained on a large corpus of compiler intermediate representations and assembly code, targeting compiler optimization tasks.
https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization/
https://arxiv.org/pdf/2406.17711
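To make the PlanRAG loop described above concrete, here is a minimal Python sketch of the plan, retrieve, then re-plan-or-decide cycle. The `llm` and `retrieve` callables are hypothetical stand-ins for a language-model call and a data-analysis backend; this illustrates the control flow only, not the authors' implementation.

```python
# Minimal sketch of an iterative plan-then-RAG (PlanRAG) loop.
# `llm` and `retrieve` are hypothetical placeholders: `llm` maps a
# prompt string to a completion, `retrieve` executes data queries.

def plan_rag(question: str, schema: str, llm, retrieve, max_iters: int = 5) -> str:
    # Step 1: the LM drafts an analysis plan from the question and data schema.
    plan = llm(f"Plan the analysis needed to decide:\n{question}\nSchema:\n{schema}")
    evidence = []
    for _ in range(max_iters):
        # Step 2: turn the current plan into concrete data-analysis queries
        # and retrieve their results.
        queries = llm(f"Write data-analysis queries for this plan:\n{plan}")
        evidence.append(retrieve(queries))
        # Final phase: decide, or produce a new plan and iterate again.
        verdict = llm(
            "Given the question, plan, and evidence, reply either "
            "'DECIDE: <answer>' or 'REPLAN: <revised plan>'.\n"
            f"Question: {question}\nPlan: {plan}\nEvidence: {evidence}"
        )
        if verdict.startswith("DECIDE:"):
            return verdict.removeprefix("DECIDE:").strip()
        plan = verdict.removeprefix("REPLAN:").strip()
    # Fall back to a best-effort decision if the iteration budget runs out.
    return llm(f"Decide now, given this evidence:\n{evidence}")
```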

News

  • OpenAI acquires Rockset. OpenAI announced the acquisition of Rockset, a real-time analytics database company, whose technology will help power retrieval infrastructure across OpenAI's products.
https://openai.com/index/openai-acquires-rockset/
  • Anthropic launches Claude 3.5 Sonnet. Anthropic released Claude 3.5 Sonnet, a new model the company says outperforms its previous flagship, Claude 3 Opus, while running faster and at lower cost.
https://www.theverge.com/2024/6/20/24181961/anthropic-claude-35-sonnet-model-ai-launch
  • Multi is joining OpenAI. The team behind Multi, a multiplayer screen-sharing and collaboration app, announced it is joining OpenAI and winding down its product.
https://multi.app/blog/multi-is-joining-openai
  • Stability AI secures significant new investment. Stability AI announced a significant new round of investment.
https://stability.ai/news/stability-ai-secures-significant-new-investment
  • Snap Lens Studio 5.0. The GenAI suite, which Snap introduced with Lens Studio 5.0, is a fantastic development and a huge help for creating augmented-reality apps.
  • Instagram Launching An AI Studio. Instagram's "AI Studio" lets creators build custom AI chatbots, including AI versions of themselves. An early test is currently underway in the US.
  • ChatGPT desktop app for macOS. OpenAI's ChatGPT desktop app is now available to all macOS users.
https://openai.com/chatgpt/mac/
https://arxiv.org/pdf/2406.16853v1
https://arxiv.org/pdf/2406.14848v1
https://arxiv.org/pdf/2406.14537v1

Resources

  • Open-Sora 1.2 Report. A 1.1B-parameter open-source video generation model, trained on over 30 million data points, that can produce 16-second 720p videos. The release also features an improved diffusion model and a video compression network for both temporal and spatial compression, which lowers training costs and improves the controllability of the generations.
https://github.com/hpcaitech/Open-Sora/blob/main/docs/report_03.md
  • LLM101n: Let’s build a Storyteller. A new repository contains the outline of a course Andrej Karpathy is working on: building an aligned language model capable of telling stories. Code, video lectures, and other learning resources are included.
https://github.com/karpathy/LLM101n
  • AutoCodeRover: Autonomous Program Improvement. AutoCodeRover combines sophisticated code-search methods with large language models to automate software improvements such as bug fixes and feature additions.
https://github.com/nus-apr/auto-code-rover
https://arxiv.org/pdf/2406.10209
  • NLUX. NLUX is an open-source React and JavaScript library for building conversational AI interfaces. It makes it simple to build web applications powered by large language models: with just a few lines of code, you can add conversational AI capabilities and interact with your favorite AI models.
https://github.com/nlkitai/nlux
  • Claudette. Claudette is a higher-level and easier-to-use way to interact with Claude.
  • top CVPR 2024 papers. Computer Vision and Pattern Recognition is a massive conference. In 2024 alone, 11,532 papers were submitted, and 2,719 were accepted. I created this repository to help you search for the crème de la crème of CVPR publications.
https://github.com/SkalskiP/top-cvpr-2024-papers
  • TTS in 7000 Languages. Toucan recently published a collection of new text-to-speech models that now cover every language in the ISO 639-3 standard.
https://github.com/DigitalPhonetics/IMS-Toucan/releases/tag/v3.0
  • ParaLLM: 1300+ tok/s on a MacBook. Implementing a batch-parallel KV cache in MLX significantly speeds up inference for model completions and synthetic-data generation.
  • Train vision models in TRL. TRL is a Hugging Face library for training transformers with reinforcement learning; this example shows how to apply the same procedure to vision-language models such as LLaVA.
  • Rethinking Remote Sensing Change Detection With A Mask View. Two new models for remote sensing change detection — CDMask and CDMaskFormer — are presented in this study.
  • llama.ttf. This article explains how a small Llama language model can be run from inside a font file.
  • OpenGlass — Open Source Smart Glasses. Turn any glasses into hackable smart glasses with less than $25 of off-the-shelf components. Record your life, remember people you meet, identify objects, translate text, and more.
  • An Intuitive Explanation of Sparse Autoencoders for LLM Interpretability. Golden Gate Claude was a potent illustration of how SAEs can be used to steer and analyze models. This post offers an easy-to-understand explanation of how sparse autoencoders work, along with sample code for training them; a generic training sketch also appears at the end of this section.
  • RES-Q. RES-Q is a new benchmark designed to evaluate how well large language models can modify code repositories from natural-language instructions.
  • Balancing Old Tricks with New Feats: AI-Powered Conversion From Enzyme to React Testing Library at Slack. Using a hybrid method that pairs large language models with abstract-syntax-tree transformations, Slack developers automated the conversion of more than 15,000 unit tests from Enzyme to React Testing Library. The team combined Anthropic’s Claude 2.1 with DOM-tree capture of rendered React components to reach an 80% success rate in automatic conversions, part of Slack’s ongoing effort to use AI to improve developer productivity and keep pace with a fast-moving frontend landscape.
  • Tree Search for Language Model Agents. Proposes a best-first tree-search algorithm that lets language-model agents explore and evaluate multiple candidate trajectories in interactive web environments, improving success rates over purely reactive baselines.
https://jykoh.com/search-agents/paper.pdf
  • R2R. R2R was designed to bridge the gap between local LLM experimentation and scalable, production-ready Retrieval-Augmented Generation (RAG). R2R provides a comprehensive and SOTA RAG system for developers, built around a RESTful API for ease of use.
  • Internist.ai 7b. Internist.ai 7b is a medical-domain large language model trained by medical doctors to demonstrate the benefits of a physician-in-the-loop approach. The training data was carefully curated by medical doctors to ensure clinical relevance and the quality required for clinical practice.
  • Point-SAM: Promptable 3D Segmentation Model for Point Clouds. Point-SAM, a transformer-based 3D segmentation model, has been introduced by researchers in response to the increasing demand for comprehensive 3D data.
  • GenIR-Survey. This survey explores generative information retrieval (GenIR), a novel approach to information retrieval that shifts from conventional search techniques to ones that generate results dynamically.
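As a companion to the sparse-autoencoder explainer in the list above, here is a generic PyTorch sketch of the standard SAE recipe used for LLM interpretability: an overcomplete ReLU encoder, a linear decoder, and a reconstruction objective with an L1 sparsity penalty. This is a minimal illustration under common assumptions, not the exact code from the linked article.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder trained to reconstruct LLM activations
    while keeping most hidden features at zero (sparsity)."""

    def __init__(self, d_model: int = 768, expansion: int = 8):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_model * expansion)
        self.decoder = nn.Linear(d_model * expansion, d_model)

    def forward(self, acts: torch.Tensor):
        features = F.relu(self.encoder(acts))  # sparse, interpretable features
        recon = self.decoder(features)         # reconstructed activations
        return recon, features

def sae_loss(recon, acts, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparsity.
    return F.mse_loss(recon, acts) + l1_coeff * features.abs().mean()

# Toy usage: in practice `acts` would be residual-stream activations
# collected from a language model, not random noise.
sae = SparseAutoencoder()
acts = torch.randn(32, 768)
recon, feats = sae(acts)
loss = sae_loss(recon, acts, feats)
loss.backward()
```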

Perspectives

  • The Long View on AI. Historical data suggests AI could drive tremendous growth rates and technological improvements, and society will probably be able to adjust to these rapid changes just as it has in the past.
  • AI’s Hidden Opportunities: Shawn “swyx” Wang on New Use Cases and Career. Well-known developer Shawn “swyx” Wang discusses the untapped potential for conventional software professionals wishing to move into artificial intelligence, examining in particular how to enhance existing tools, use AI for summarization, and more.
  • Apple Intelligence. Rather than developing stand-alone AI products, Apple has incorporated generative AI into its core apps, improving features like Mail classification, Safari summaries, and Siri’s functionality. This demonstrates the company’s focus on user control and privacy.
  • Apple intelligence and AI maximalism. Apple has shown a bunch of cool ideas for generative AI, but more importantly, it is pointing at most of the big questions and proposing a different answer: that LLMs are commodity infrastructure, not platforms or products.
  • How To Solve LLM Hallucinations. Lamini has created Memory Tuning, a method for embedding particular facts into models without sacrificing general knowledge; the company reports that it reduces hallucinations by 95%.
https://arxiv.org/pdf/2406.14508
  • How I’m using AI tools to help universities maximize research impacts. Artificial intelligence algorithms could identify scientists who need support with translating their work into real-world applications and more. Leaders must step up.
  • The Future of LLM-Based Agents: Making the Boxes Bigger. This post discusses two essential strategies for moving agents from the playground into the real world: higher-level, long-term planning, which gives agents the adaptability to revise a plan mid-episode, and system-level resilience, which intelligently orchestrates the models for increased performance and accuracy.
https://arxiv.org/pdf/2406.13930v1
  • Apple, Microsoft Shrink AI Models to Improve Them. Tech companies are shifting their focus from large language models to more efficient small language models (SLMs): Apple and Microsoft have introduced models with far fewer parameters that nonetheless perform comparably or even better on benchmarks. OpenAI’s CEO has suggested that the era of ever-larger models may be over; SLMs offer benefits including greater accessibility for smaller entities, local on-device operation, and potential insights into human language acquisition. Even though SLMs are narrower in scope, their performance is enhanced by training them on high-quality, “textbook-quality” data.
  • Are Tech-Enabled Vertical Roll-Ups the Future or the Past? Roll-up strategies require the ability to generate excess cash flows through operational efficiencies, and the development of AI may offer a new lever that fully unlocks them. Are roll-ups for SMBs and verticals the future? This post presents two different perspectives on the question.

Meme of the week

What do you think about it? Was there any news that captured your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles or subscribe for free to get notified when I publish a new story. Check this repository, which contains weekly updated ML & AI news. I am open to collaborations and projects, and you can reach me on LinkedIn.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

Or you may be interested in one of my recent articles:


Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence