WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 19–25 August

Google’s upgraded AI image generator is now available, Waymo is developing a roomier robotaxi with less-expensive tech, authors sue Anthropic for copyright infringement over AI training, and much more

Salvatore Raieli
20 min read · Aug 26, 2024

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field


Research

  • The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. a novel artificial intelligence (AI) agent that, for less than $15, can develop and write a full conference-level scientific paper; it automates scientific discovery by empowering frontier LLMs to conduct independent research and summarize findings; it also uses an automated reviewer to assess the papers it generates, reportedly with near-human accuracy; according to that reviewer, some of the generated papers surpass the acceptance threshold at a premier machine learning conference.
  • LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs. proposes AgentWrite, which lets off-the-shelf LLMs produce coherent outputs longer than 20K words; AgentWrite takes a divide-and-conquer approach, first planning the piece as a sequence of smaller writing subtasks and then generating each one and concatenating the outputs (i.e., plan + write; see the sketch after this list). This method is then used to create SFT datasets, which are used to tune LLMs to produce coherent longer outputs automatically; a 9B-parameter model, further enhanced through DPO, achieves state-of-the-art performance on their benchmark and outperforms proprietary models.
  • EfficientRAG: Efficient Retriever for Multi-Hop Question Answering. trains an auto-encoder LM to label and tag retrieved chunks, marking each as either continue or terminate and annotating the chunks that need further processing; a filter model then formulates the next-hop query from the original question and the previous annotations. This repeats iteratively until every chunk is tagged terminate or the maximum number of iterations is reached; once enough information has been gathered to answer the initial question, a final generator (an LLM) produces the answer.
  • RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation. a fine-grained assessment methodology for diagnosing the retrieval and generation modules of RAG systems; demonstrates that RAGChecker exhibits superior correlation with human judgment; presents multiple illuminating patterns and trade-offs in RAG architecture design decisions.
  • HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction. integrates VectorRAG and GraphRAG into a HybridRAG system that performs better than either one separately; tested on a set of transcripts from financial earnings calls, combining the two retrieval methods yields more accurate answers (a minimal sketch follows this list).
  • Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers. introduces self-play mutual reasoning (rStar) to enhance small language models’ reasoning powers without stronger models or fine-tuning; MCTS is augmented with human-like reasoning actions derived from SLMs to create richer reasoning trajectories; a second SLM provides unsupervised feedback on the trajectories, and the target SLM selects the final reasoning trajectory as its solution; for LLaMA2–7B, rStar raises GSM8K accuracy from 12.51% to 63.91% while steadily improving the accuracy of other SLMs.
  • Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. explores how inference-time computation in LLMs scales; specifically, it examines how much an LLM can be improved given a fixed amount of inference-time compute; it finds that the efficacy of different scaling strategies varies with prompt difficulty, and proposes an adaptive compute-optimal strategy that is more than 4x more efficient than a best-of-N baseline (a sketch of best-of-N and the adaptive idea follows this list); it reports that optimally scaling test-time compute can outperform a 14x larger model in a FLOPs-matched evaluation.
  • Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation. a graph-based framework for the medical domain that improves LLMs and produces evidence-based results; chunks documents using a hybrid static-semantic approach to enhance context capture; represents entities and medical knowledge with graphs, creating an interconnected global graph; this method outperforms state-of-the-art models and increases precision across several medical Q&A metrics.
  • BAM dense to MoE Upcycling. This technique upcycles the FFN and attention layers of dense models into a Mixture of Experts (MoE) model for additional training, preserving downstream performance while saving a significant amount of compute (see the upcycling sketch after this list).
  • BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning. Backdoor attacks can be incorporated into medical foundation models using the BAPLe technique during the prompt learning stage.
  • ShortCircuit: AlphaZero-Driven Circuit Design. AI-powered automation and optimization of chip design can lower costs while meeting the demand for more powerful chips. Tested on numerous circuits, this AlphaZero-based approach produced compact, effective designs with an 84.6% success rate.
  • Automated Design of Agentic Systems. This study examines the fragility of current agent systems and explores future directions for the design of learned agentic systems. The authors use code as their testbed, allowing new agents to be defined programmatically and then executed without supervision.
  • Loss of plasticity in deep continual learning. The pervasive problem of artificial neural networks losing plasticity in continual-learning settings is demonstrated, and a simple remedy, the continual backpropagation algorithm, is described to prevent this issue (a simplified sketch appears after this list).
  • Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model. Impressive new model from Meta that performs diffusion and next-token prediction over interleaved text and images. It performs comparably to earlier-generation models such as DALL-E 2 and Llama 2 on text and image benchmarks.
  • To Code, or Not To Code? Exploring Impact of Code in Pre-training. Industry labs rarely publish the details, but pretraining models on code is widely believed to aid generalization to other reasoning-intensive tasks. This Cohere study investigates the question in detail and shows that code can serve as a foundational building block for reasoning in a variety of contexts.
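
To make the plan + write idea from the LongWriter entry concrete, here is a minimal sketch of an AgentWrite-style loop. It is an illustration under assumptions, not the paper’s code: `call_llm` is a hypothetical helper standing in for any chat-completion client, and the prompts are invented.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around an off-the-shelf LLM API client."""
    raise NotImplementedError


def agent_write(instruction: str, n_sections: int = 10) -> str:
    # Plan: break the long writing task into titled sections with target lengths.
    plan_prompt = (
        f"Break this writing task into {n_sections} sections, one per line, "
        f"formatted as '<title> -- <target word count>'.\n\nTask: {instruction}"
    )
    plan = [line for line in call_llm(plan_prompt).splitlines() if line.strip()]

    # Write: generate each section conditioned on the plan and the text so far,
    # then concatenate the pieces into one long-form output.
    written: list[str] = []
    for section in plan:
        write_prompt = (
            f"Task: {instruction}\nPlan:\n" + "\n".join(plan) +
            f"\n\nText so far (tail):\n{''.join(written)[-4000:]}\n\n"
            f"Write only this section now: {section}"
        )
        written.append(call_llm(write_prompt).strip() + "\n\n")
    return "".join(written)
```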
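
The HybridRAG entry’s core move is simply to merge two context sources before generation. A minimal sketch, assuming hypothetical `vector_store`, `knowledge_graph`, and `call_llm` objects with the interfaces shown:

```python
def hybrid_rag_answer(question: str, vector_store, knowledge_graph,
                      call_llm, k: int = 5) -> str:
    # VectorRAG: the k most similar chunks by embedding similarity.
    vector_ctx = vector_store.similarity_search(question, k=k)

    # GraphRAG: entities and relations connected to those in the question.
    graph_ctx = knowledge_graph.query_subgraph(question)

    # Ground the generator in both context sources at once.
    prompt = (
        "Answer using only the context below.\n\n"
        f"Vector context:\n{vector_ctx}\n\n"
        f"Graph context:\n{graph_ctx}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```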
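
For the test-time compute entry, the best-of-N baseline and the adaptive idea are easy to state in code. The allocation rule below is illustrative only, and `generate`, `score`, and `estimate_difficulty` are assumed callables (a sampler, a verifier or reward model, and a difficulty estimator):

```python
import math


def best_of_n(prompt, generate, score, n: int = 16):
    # Spend extra inference-time compute: sample n candidates,
    # keep the one the verifier scores highest.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)


def adaptive_best_of_n(prompt, generate, score, estimate_difficulty,
                       budget: int = 64):
    # Compute-optimal in spirit: allocate more samples to prompts
    # estimated to be harder, fewer to easy ones.
    difficulty = estimate_difficulty(prompt)  # assumed to return a value in [0, 1]
    n = max(1, math.ceil(budget * difficulty))
    return best_of_n(prompt, generate, score, n=n)
```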
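
The BAM entry’s upcycling step can be sketched in a few lines of PyTorch: each expert starts as a copy of the pretrained dense FFN, and a fresh router is added; further training then lets the experts specialize. This is a generic dense-to-MoE upcycling sketch, not BAM’s exact recipe (the paper also upcycles attention layers):

```python
import copy

import torch.nn as nn


def upcycle_ffn_to_moe(dense_ffn: nn.Module, d_model: int, num_experts: int = 8):
    # Each expert is initialized from the pretrained dense FFN, so the MoE
    # starts training from the dense model's learned behavior.
    experts = nn.ModuleList([copy.deepcopy(dense_ffn) for _ in range(num_experts)])
    # The router/gating layer is new and trained from scratch.
    router = nn.Linear(d_model, num_experts)
    return experts, router
```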
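
Finally, the continual backpropagation remedy from the plasticity paper: keep a running utility for each hidden unit and periodically reinitialize the least-useful few so the network retains its ability to learn. The utility measure here is simplified relative to the paper’s:

```python
import torch
import torch.nn as nn


@torch.no_grad()
def continual_backprop_step(layer: nn.Linear, next_layer: nn.Linear,
                            utility: torch.Tensor, activations: torch.Tensor,
                            decay: float = 0.99, replace_fraction: float = 0.01):
    # Simplified utility: how much each hidden unit contributes downstream.
    contrib = activations.abs().mean(dim=0) * next_layer.weight.abs().mean(dim=0)
    utility.mul_(decay).add_((1 - decay) * contrib)

    # Reinitialize the lowest-utility units to restore plasticity.
    n_replace = max(1, int(replace_fraction * layer.out_features))
    worst = torch.topk(utility, n_replace, largest=False).indices
    new_w = torch.empty(n_replace, layer.in_features, device=layer.weight.device)
    nn.init.kaiming_uniform_(new_w, a=5 ** 0.5)
    layer.weight[worst] = new_w          # fresh incoming weights
    layer.bias[worst] = 0.0
    next_layer.weight[:, worst] = 0.0    # zero outgoing weights: outputs unchanged
    utility[worst] = utility.mean()      # reset utility of the replaced units
```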

News

https://levelup.gitconnected.com/can-ai-replace-human-researchers-50fcc43ea587

Resources

Perspectives

  • ‘Threads is just deathly dull’: have Twitter quitters found what they are looking for on other networks? There’s been an exodus of users from X, propelled by Elon Musk’s lurch to the far right, but the alternatives have drawbacks too.
  • Five ways the brain can age: 50,000 scans reveal possible patterns of damage. Results raise hopes that methods could be developed to detect the earliest stages of neurodegenerative disease.
  • An AI Empire. As AI develops, it may surpass humankind as the most intelligent “species” on Earth. AGI may not be far off, since it could allow AI research itself to be replicated at unprecedented scale. The exponential rise in computing suggests that humans may soon become significantly less relevant as AI takes over. Despite possible roadblocks in AI development, society might not be prepared for such a significant transformation.
  • What does Bitcoin smell like? AI startup wants to ‘teleport’ digital scents. Osmo, an AI startup, is developing technology that lets computers detect and reproduce smells, which could help with disease detection and digital scent communication. Unlike audiovisual AI, scent lacks a defined “smell map,” which makes it harder for the team to build a database linking molecular structure to scent. Osmo’s applications, which integrate olfactory sensations, could transform digital marketing and medical diagnostics.
  • Eric Schmidt’s AI prophecy: The next two years will shock you. Former Google CEO Eric Schmidt believes AI will evolve quickly in the coming years and could soon make it possible to build significant apps, such as TikTok rivals, in a matter of minutes. He highlights the unpredictable and rapid pace of AI progress, noting the potential for massive technological and economic disruption as agent-based systems, text-to-action capabilities, and large language models converge. Schmidt’s outlook, which reflects the significant investments and energy requirements expected for cutting-edge AI development, points to a revolutionary era ahead.
  • Why Neuralink’s Blindsight and Brain Implants to restore sight won’t work like human eyesight. This piece highlights the difficulties of using AI-powered cortical implants to restore vision: neurons in the visual cortex do not behave like pixels on a screen. Although high-resolution simulations are promising, cortical implants cannot deliver genuine vision, since doing so would require reproducing intricate neural patterns far beyond the capabilities of present technology; the result would be pixelated, subpar images.
  • A Personalized Brain Pacemaker for Parkinson’s. Researchers have created an adaptive method of deep brain stimulation that tailors electrical pulses to the fluctuating symptoms of people with Parkinson’s, greatly shortening the time symptoms persist.
  • Why Diffusion could help LLMs reason. Present-day language models predict one word at a time, leaving little room for reasoning and planning; techniques like chain-of-thought prompting work around this. Diffusion models, which can spend more diffusion steps per token, might be used to enhance model reasoning.
  • AI companies are pivoting from creating gods to building products. Good. AI businesses have overstated the readiness of generative AI for broad commercial application, which has resulted in expensive errors in product development and market integration. To change direction they must overcome five major obstacles, including keeping systems affordable, boosting security and safety, protecting privacy, and optimizing user interfaces. These challenges highlight the gap between AI’s potential and the practical difficulty of deploying systems that satisfy user expectations and fit into current workflows. Rather than the quick timeframe some have projected, broad adoption will probably take a decade or longer.
  • Has your paper been used to train an AI model? Almost certainly. Artificial intelligence developers are buying access to valuable data sets that contain research papers — raising uncomfortable questions about copyright.
  • The testing of AI in medicine is a mess. Here’s how it should be done. Hundreds of medical algorithms have been approved on the basis of limited clinical data. Scientists are debating who should test these tools and how best to do it.
  • Light bulbs have energy ratings — so why can’t AI chatbots? The rising energy and environmental cost of the artificial intelligence boom is fuelling concern. Green policy mechanisms that already exist offer a path towards a solution.
  • How the human brain creates cognitive maps of related concepts. Neural activity in human brains rapidly restructures to reflect hidden relationships needed to adapt to a changing environment. Surprisingly, trial-and-error learning and verbal instruction induce similar changes.
  • Switching between tasks can cause AI to lose the ability to learn. Artificial neural networks become incapable of mastering new skills when they learn them one after the other. Researchers have only scratched the surface of why this phenomenon occurs — and how it can be fixed.
  • Markov chains are funnier than LLMs. This article explores LLM predictability and its limits for producing humor. It argues that although LLMs excel at producing text appropriate to the context, their predictive nature makes them poorly suited to comedic writing, which depends on unexpectedness (see the sketch after this list).
  • AI at Work Is Here. Now Comes the Hard Part. In the last six months, the use of generative AI has almost doubled globally, with 75% of knowledge workers currently using it.
  • AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work. This is a lengthy and comprehensive overview of the research that DeepMind is doing on AGI safety and alignment.
  • The newest weapon against mosquitoes: computer vision. Developments in computer vision are helping combat malaria by enabling applications such as VectorCam, which facilitates fast identification of mosquito species and data gathering. The Gates Foundation helped develop the app, which can identify species that transmit malaria and aid in improving disease control tactics. Innovative mosquito surveillance techniques are essential for the tactical use of pesticides and other mitigating actions.
  • Fields that I reference when thinking about AI takeover prevention. This article compares fields battling insider threats with AI control, offering ideas on developing and assessing strong AI safety measures. It emphasizes how much more control developers have over AIs than they do over people, but it also points out that, in contrast to humans, AI dishonesty can be endemic. AI control is different mainly because it is adversarial and doesn’t involve complicated system interactions, even though it is influenced by different domains such as physical security and safety engineering.
  • ‘Never summon a power you can’t control’: Yuval Noah Harari on how AI could threaten democracy and divide the world. Forget Hollywood depictions of gun-toting robots running wild in the streets; the reality of artificial intelligence is far more dangerous, warns the historian and author in an exclusive extract from his new book.
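
Since the Markov-chains item above turns on decoding strategy, a word-level Markov chain is easy to build for contrast. A generic illustration, not the article’s code:

```python
import random
from collections import defaultdict


def train_markov(text: str, order: int = 2) -> dict:
    # Map each order-gram of words to the words observed to follow it.
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def generate(chain: dict, length: int = 30) -> str:
    # Take a random observed continuation at each step: the output stays
    # locally plausible while remaining globally surprising.
    state = random.choice(list(chain))
    out = list(state)
    for _ in range(length):
        followers = chain.get(tuple(out[-len(state):]))
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)
```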

Meme of the week

What do you think? Was there any news that caught your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles, and you can also connect with or reach me on LinkedIn. Check this repository, which contains weekly updated ML & AI news. I am open to collaborations and projects, and you can subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

or you may be interested in one of my recent articles:


Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence