WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 15–21 July

OpenAI and Mistral new models, Andrej Karpathy’s new company, and much more

Salvatore Raieli
19 min read · Jul 22, 2024
Photo by Abhijith S Nair on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first in GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field

Research

  • RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs. demonstrates how a Llama3-RankRAG significantly outperforms Llama3-ChatQA-1.5 and GPT-4 models on nine knowledge-intensive benchmarks. It also introduces a new instruction fine-tuning framework that performs effective context ranking and answer generation to enhance an LLM’s RAG capabilities. This framework uses only a small ranking dataset to outperform existing expert ranking models.
  • Mixture of A Million Experts. aims to decouple computational cost from parameter count by efficiently routing to a large number of tiny experts through a learned index structure. It introduces a parameter-efficient expert retrieval mechanism that uses the product key technique for sparse retrieval from a million tiny experts, and it shows superior efficiency compared to dense FFW, coarse-grained MoEs, and Product Key Memory (PKM) layers.
https://arxiv.org/pdf/2407.02485v1
  • Reasoning in Large Language Models: A Geometric Perspective. investigates the reasoning of LLMs from a geometrical perspective. It establishes a relationship between the expressive power of LLMs and the density of their self-attention graphs; the analysis shows that the density of these graphs defines the intrinsic dimension of the inputs to the MLP blocks, and that a higher intrinsic dimension implies greater expressive capacity of the LLM.
  • Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps. This paper presents a novel approach that both detects and reduces contextual hallucinations in LLMs (e.g., it reduces hallucinations by 10% on the XSum summarization task). It does this by building a hallucination detection model whose input features are the ratio of attention weights on the context vs. newly generated tokens (for each attention head). The idea behind this approach is that contextual hallucinations are related to the degree to which an LLM attends to the contextual information provided. Additionally, the authors suggest a decoding strategy that mitigates contextual hallucinations based on their detection method, and the detector can be applied to other models without retraining.
https://mistral.ai/news/mistral-nemo/
  • RouteLLM. uses human preference data and data augmentation techniques in its training framework to improve performance and reduce costs by over two times in some cases, all while maintaining response quality. It suggests effective router models to dynamically choose between stronger and weaker LLMs during inference to achieve a balance between cost and performance.
  • Learning to (Learn at Test Time): RNNs with Expressive Hidden States. suggests new layers for sequence modeling that have linear complexity and an expressive hidden state; the hidden state is itself an ML model that keeps updating at test time; two instantiations, whose hidden state is a linear model and a two-layer MLP respectively, are found to match or outperform baseline models such as Mamba, Transformers, and contemporary RNNs; the linear variant is faster than Transformer at 8k context and matches Mamba in wall-clock time.
  • Physicochemical graph neural network for learning protein-ligand interaction fingerprints from sequence data. Predicting the binding affinity between small-molecule ligands and proteins is a key task in drug discovery; however, sequence-based methods are often less accurate than structure-based ones. Koh et al. develop a graph neural network using physicochemical constraints that discovers interactions between small molecules and proteins directly from sequence data and that can achieve state-of-the-art performance without the need for costly, experimental 3D structures.
https://arxiv.org/pdf/2407.04153
https://blog.fal.ai/auraflow/
https://arxiv.org/pdf/2407.02678
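The attention-ratio feature described in the Lookback Lens item above can be illustrated with a small sketch. This is only an assumption-laden toy, not the authors' code: the function names `lookback_ratio` and `lookback_features` are hypothetical, the attention weights are made-up numbers, and the exact normalization (mean attention on the context divided by the sum of mean attention on context and on generated tokens) is one plausible reading of "ratio of attention weights on the context vs. newly generated tokens".

```python
def lookback_ratio(attn_weights, num_context_tokens):
    """Share of one head's attention that falls on the context.

    attn_weights: attention weights of one head at one decoding step,
        ordered as [context tokens..., previously generated tokens...].
    num_context_tokens: how many leading entries belong to the context.
    """
    context = attn_weights[:num_context_tokens]
    generated = attn_weights[num_context_tokens:]
    if not context:
        raise ValueError("need at least one context token")
    mean_context = sum(context) / len(context)
    # With no generated tokens yet, all attention is on the context.
    mean_generated = sum(generated) / len(generated) if generated else 0.0
    total = mean_context + mean_generated
    return mean_context / total if total > 0 else 0.0


def lookback_features(per_head_weights, num_context_tokens):
    """One lookback ratio per attention head, usable as classifier input."""
    return [lookback_ratio(w, num_context_tokens) for w in per_head_weights]


# Toy attention weights for two heads (illustrative values only).
heads = [
    [0.4, 0.4, 0.1, 0.1],  # head that mostly looks at the context
    [0.1, 0.1, 0.4, 0.4],  # head that mostly looks at generated tokens
]
features = lookback_features(heads, num_context_tokens=2)
```

In a setup like this, the per-head ratios would be fed to a simple classifier that flags spans where the model attends too little to the provided context.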

News

https://arxiv.org/pdf/2407.07061v2
  • Meet the AI Agent Engineer. At his company, Sierra, Bret Taylor, the Chairman of the Board of OpenAI, has created a new position called Agent Engineer. One of the first people in the role recently wrote a blog post describing the Sierra team’s view of agent engineering as a new field inside AI engineering.
  • OpenAI Revenue. An estimated $3.4 billion in revenue for OpenAI comes from its ChatGPT services.
https://arxiv.org/pdf/2407.07071
  • Taming the tail utilization of ads inference at Meta scale. Meta’s machine learning inference services saw a two-thirds decrease in failure rates, a 35% increase in computing efficiency, and a halving of p99 latency due to changes made to tail utilization. With these improvements, Meta’s ad delivery systems can handle growing workloads without requiring more resources while upholding service level agreements. Predictive scaling and managing the machine learning model lifecycle with Meta’s unified platform, IPnext, are examples of continuous improvement techniques.
  • Meta to reportedly launch largest Llama 3 model on July 23. Meta Platforms will release its largest Llama 3 model on July 23, The Information reported on Friday, citing an employee of the company. The new model, boasting 405 billion parameters, will be multimodal and capable of understanding and generating both images and text.
  • Quora’s Poe now lets users create and share web apps. Poe, Quora’s subscription-based, cross-platform aggregator for AI-powered chatbots like Anthropic’s Claude and OpenAI’s GPT-4o, has launched a feature called Previews that lets people create interactive apps directly in chats with chatbots.
https://arxiv.org/pdf/2407.06204
https://arxiv.org/pdf/2406.18665v2
https://github.com/jianghaiscu/lightendiffusion
https://arxiv.org/pdf/2407.04620
  • SciCode: A Research Coding Benchmark Curated by Scientists. HumanEval has long been the target benchmark for coding models, but it is now essentially solved. This benchmark is the next step forward, built around difficult scientific programming problems.
  • SmolLM — blazingly fast and remarkably powerful. This blog post introduces SmolLM, a family of state-of-the-art small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset. It covers data curation, model evaluation, and usage.
  • Benchmarking results for vector databases. Redis has released updated information on the best vector databases, measuring throughput and latency with the help of the industry-recognized Qdrant framework. Key findings include Redis achieving much higher queries per second and lower latency than Qdrant, Milvus, and Weaviate, and outperforming competitors by 62% for low-complexity datasets and by 21% for high-dimensional datasets.
https://mistral.ai/news/mathstral/
https://arxiv.org/pdf/2407.07614v1
https://mistral.ai/news/codestral-mamba/
https://github.com/ikeyang/vitime

Resources

  • A Survey on Mixture of Experts. a survey study on the Mixture of Experts (MoE), covering its technical specifications, open-source implementations, assessment methods, and practical uses.
  • Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence. a new framework to address several limitations in multi-agent frameworks such as integrating diverse third-party agents and adaptability to dynamic task requirements; introduces an agent integration protocol, instant messaging architecture design, and dynamic mechanisms for effective collaboration among heterogeneous agents.
https://arxiv.org/pdf/2407.09025
  • Meta 3D Gen. a new pipeline that can generate 3D assets from text in less than a minute, from start to finish. It incorporates cutting-edge components like TextureGen and AssetGen to represent objects in three ways: in view space, in volumetric space, and in UV space. It also achieves a 68% win rate compared to the single-stage model.
  • Challenges, evaluation and opportunities for open-world learning. Here we argue that designing machine intelligence that can operate in open worlds, including detecting, characterizing, and adapting to structurally unexpected environmental changes, is a critical goal on the path to building systems that can solve complex and relatively under-determined problems.
  • Machine learning-aided generative molecular design. Data-driven generative methods have the potential to greatly facilitate molecular design tasks for drug design.
https://github.com/robustnlp/derta
https://arxiv.org/pdf/2407.08966v1
  • Open-Canopy. A high-resolution (1.5 m) publicly available dataset called Open-Canopy is used to estimate canopy height over France.
  • crawlee-python. Crawlee — A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless modes. With proxy rotation.
https://github.com/fangyuanmao/pid
https://x.com/mutableai/status/1813815706783490055
  • Praison AI. Using prior agent frameworks as a springboard, Praison AI is a low-code, centralized framework with customizable features and human-agent interaction that makes it easier to create and manage multi-agent systems for a range of LLM applications.
  • Video Object Segmentation with World Knowledge. Reasoning Video Object Segmentation (ReasonVOS) is a new task that uses implicit text queries to generate segmentation masks. It requires complex reasoning and world knowledge.
  • Enhancing Class Learning Without Forgetting. In order to enhance Class-Incremental Semantic Segmentation (CISS), this project presents a background-class separation framework.
  • Leapfrogging traditional vector-based RAG with language maps. When developing a chat application over data, retrieval plays a major role. But systems are frequently sensitive to the format of the data being accessed. Chat performance is greatly enhanced by creating a language map (e.g., a Wikipedia-style entry) of the material and using that for retrieval. This is how mutable AI handles code-based question answering.
  • Removing Inappropriate Content from Diffusion Models. Using a novel technique called Reliable and Efficient Concept Erasure (RECE), inappropriate content can be removed from diffusion models in only three seconds without requiring additional fine-tuning.
  • LLM2sh. A command-line tool called LLM2sh uses LLMs to convert requests written in plain English into shell commands.
https://github.com/randombk/llm2sh
  • GraphMuse. GraphMuse is a Python Library for Graph Deep Learning on Symbolic Music. This library intends to address Graph Deep Learning techniques and models applied specifically to Music Scores.
  • E5-V: Universal Embeddings with Multimodal Large Language Models. A novel framework called E5-V modifies Multimodal Large Language Models (MLLMs) to provide multimodal embeddings that are universal. With prompts, it bridges the gap between various input formats and achieves remarkable results in multimodal activities without the need for fine-tuning.
  • Strategizing Your Preparation for Machine Learning Interviews. Interviews for machine learning might be difficult. You may greatly increase your chances by being aware of the range of machine learning positions and adjusting your preparation to fit particular job duties and specializations. To approach interviews with confidence, concentrate on learning the fundamentals, investigating technology unique to the organization, and regularly monitoring your progress.
  • Uncensor Any LLM With Abliteration. For safety, Llama models are heavily restricted, which reduces their versatility. By identifying and removing the refusal mechanism, the “abliteration” technique uncensors them, enabling models to respond to all prompts without requiring retraining.
  • SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers. SPIQA is a question-answering dataset created to help users rapidly locate answers within scientific research publications by interpreting intricate figures and tables.
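As a rough illustration of the "identify and remove" idea in the abliteration item above, here is a minimal sketch. It assumes the refusal mechanism can be approximated by a single direction in activation space, estimated as the difference between mean activations on refused and answered prompts; that assumption is not spelled out in the summary above. All function names are hypothetical and the 3-dimensional activations are made-up numbers, not values from a real model.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def unit_direction(refused_acts, answered_acts):
    """Normalized difference of mean activations between two prompt sets."""
    diff = [
        a - b
        for a, b in zip(mean_vector(refused_acts), mean_vector(answered_acts))
    ]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("the two prompt sets have identical mean activations")
    return [x / norm for x in diff]


def ablate(activation, direction):
    """Remove the component of `activation` that lies along `direction`."""
    dot = sum(a * d for a, d in zip(activation, direction))
    return [a - dot * d for a, d in zip(activation, direction)]


# Toy 3-dimensional activations (illustrative values only).
refused = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
answered = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
direction = unit_direction(refused, answered)
cleaned = ablate([5.0, 2.0, 1.0], direction)
```

Here the estimated direction is the first axis, so ablation zeroes that component and leaves the others untouched. Applying the same projection to a real model's activations or weights is considerably more involved than this toy suggests.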

Perspectives

  • AI’s ‘Oppenheimer moment’: autonomous weapons enter the battlefield. The military use of AI-enabled weapons is growing, and the industry that provides them is booming.
  • Will generative AI transform robotics? In the current wave of excitement about applying large vision–language models and generative AI to robotics, expectations are running high, but conquering real-world complexities remains challenging for robots.
  • Introducing: The Managed-Service-as-Software (M-SaS) Startup. AI-driven, service-oriented firms are creating Managed-Service-as-Software (M-SaS) enterprises, which follow a new business model blueprint in building their businesses. Startups need to adopt a fundamentally different attitude to use AI instead of selling it. These firms start off labor-intensive with low gross margins and then use automation and artificial intelligence (AI) to progressively move to greater SaaS-like gross margins.
https://huggingface.co/blog/smollm
  • Could AIs become conscious? Right now, we have no way to tell. The debate over whether artificial intelligence could become conscious is heating up, with opinions divided on whether developments in machine learning and neuromorphic computing can result in sentient computers. Integrated Information Theory holds that current hardware limits make AI consciousness implausible, while computational functionalist theories such as Global Neuronal Workspace Theory and Attention Schema Theory suggest that AI awareness is inevitable. Neuroscience is trying to arrive at a single theory of consciousness in order to better understand how it might show up in AI.
  • Generative AI makes for better scientific writing — but beware the pitfalls. As researchers who have sometimes struggled with articulating intricate concepts, we find his suggestions for using ChatGPT to improve the clarity and coherence of academic papers compelling. But potential pitfalls warrant further discussion.
https://github.com/manoskary/graphmuse
  • My trip to the frontier of AI education. First Avenue Elementary School in Newark is using Khanmigo, an AI-powered tutor and teacher assistant created by Khan Academy, to bring AI tools into the classroom. Teachers can use the technology to customize instruction and cut down on work time. Making the tool more responsive and inclusive is an ongoing effort. Through increased teacher-student engagement, this Gates Foundation-backed project seeks to level the playing field in education.
  • AI-Driven Behavior Change Could Transform Health Care. Thrive AI Health is being funded by OpenAI and Thrive Global to create a customized AI health coach that addresses everyday health-related behaviors like nutrition and sleep. AI’s hyper-personalization powers the mobile app and corporate solution by fusing individual data with peer-reviewed science. The project intends to manage chronic diseases, democratize healthy behavior modification, and show how effectively AI can be integrated into healthcare while maintaining robust privacy protections.
  • GraphRAG Analysis, Part 1: How Indexing Elevates Knowledge Graph Performance in RAG. Analysis of Microsoft’s GraphRAG research suggests that knowledge graphs like Neo4j may not significantly beat FAISS in context retrieval for RAG applications. While Neo4j without its indexing can reach a better answer relevancy, the minor advantages may not justify the cost given ROI limits. Neo4j’s indexing, on the other hand, significantly improves answer faithfulness, lowering the possibility of false information.
https://redis.io/blog/benchmarking-results-for-vector-databases/

Meme of the week

What do you think about it? Some news that captured your attention? Let me know in the comments

If you have found this interesting:

You can look for my other articles, and you can also connect with or reach me on LinkedIn, where I am open to collaborations and projects. Check this repository containing weekly updated ML & AI news. You can also subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

or you may be interested in one of my recent articles:

Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence