WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 5–11 August

Microsoft Says OpenAI is Now a Competitor in AI and Search, Google Broke the Law to Maintain Online Search Monopoly, US Judge Rules, and much more

Salvatore Raieli
19 min read · Aug 13, 2024
Photo by Jorge Gardner on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field


Research

  • Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge. Simple self-improvement aimed only at producing better responses (the act role) saturates quickly; this work therefore also enhances the LLM's ability to judge itself (the judge role), which helps avoid issues like reward hacking. On top of the act and judge roles, a third role, the meta-judge, evaluates the model's own judgments. The resulting meta-rewarding scheme is a self-improving alignment technique (no human supervision) in which the LLM judges its own judgments and uses that feedback to sharpen both its judging ability and its instruction following.
  • MindSearch: Mimicking Human Minds Elicits Deep AI Searcher. MindSearch is an LLM-based multi-agent framework for complex web information seeking and integration. A web planner efficiently decomposes complex queries, while a web searcher performs hierarchical information retrieval on the Internet to improve the relevance of what is retrieved. The planner builds a graph iteratively to better model complex problem-solving, and splitting retrieval and reasoning across specialized agents makes the framework better suited to long-context problems.
  • Improving Retrieval Augmented Language Model with Self-Reasoning. Enhances RAG through self-reasoning: the reasoning trajectories produced by the LLM itself are used in an end-to-end framework that improves the reliability and traceability of RAG systems. The LLM carries out three procedures: 1) relevance-aware: evaluates the relevance between the retrieved documents and the question; 2) evidence-aware selective: selects and cites relevant documents, then automatically extracts key sentence snippets from them as evidence; and 3) trajectory analysis: generates a concise analysis of all the self-reasoning trajectories from the previous two steps and produces the final answer. This makes the model more selective and better at distinguishing relevant from irrelevant documents, improving the accuracy of the RAG system as a whole. Using only 2,000 training examples (generated by GPT-4), the framework outperforms GPT-4.
  • Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost. Constrained-CoT (CCoT) is a prompting approach that caps the length of the model's reasoning without hurting performance: constraining LLaMA2-70B's reasoning to 100 words raises GSM8K accuracy from 36.01% (plain CoT) to 41.07% (CCoT) while cutting the average output length by 28 words (a minimal prompting sketch appears after this list).
  • ThinK: Thinner Key Cache by Query-Driven Pruning. ThinK targets inefficiencies in KV cache memory consumption for long-context inference: it proposes a query-dependent KV cache pruning method that selectively prunes the least important channels while minimizing the loss in attention weights (see the pruning sketch after this list).
  • Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. The authors find that, provided you have adequate coverage and a verification tool, repeatedly sampling from small models can significantly improve benchmark performance at roughly 3x lower cost than using a larger model (a best-of-n sketch with a verifier appears after this list).
  • Boosting Audio Visual Question Answering via Key Semantic-Aware Cues. Researchers propose a Temporal-Spatial Perception Model (TSPM) that improves a model's ability to answer questions about the auditory and visual signals in videos.
  • No learning rates needed: Introducing SALSA — Stable Armijo Line Search Adaptation. This work improves line search strategies for stochastic gradient descent, removing the need to hand-tune a learning rate (an Armijo line search sketch appears after this list).
  • Automated Review Generation Method Based on Large Language Models. Researchers use LLMs to build an automated review-generation pipeline that helps manage the massive volume of scientific literature.
  • CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning. CLEFT is a contrastive learning method for medical imaging that aims to overcome the drawbacks of current, resource-intensive CLIP-like approaches.
  • Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. There is growing interest in leveraging computation at inference time to boost model performance. This work explores the trade-offs between various approaches and presents several that work well, pointing to a broader trend of getting more performance out of smaller models.
  • An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion. By treating 3D objects as UV-wrapped 64x64 images, a DiT model can be used to generate novel 3D objects from text prompts.
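
For the Constrained-CoT item above, here is a minimal prompting sketch of the idea as I read it (add an explicit word budget to the usual chain-of-thought instruction). The exact prompt wording in the paper may differ, and llm_generate is a hypothetical stand-in for whatever model API you use.

```python
# Sketch of Constrained-CoT (CCoT) prompting, assuming a generic
# llm_generate(prompt) -> str completion function (hypothetical placeholder).
# The only change vs. plain CoT is an explicit cap on the reasoning length.

def plain_cot_prompt(question: str) -> str:
    """Standard chain-of-thought prompt."""
    return f"Q: {question}\nLet's think step by step.\nA:"

def ccot_prompt(question: str, word_limit: int = 100) -> str:
    """Chain-of-thought prompt with an explicit word budget on the answer."""
    return (
        f"Q: {question}\n"
        f"Let's think step by step and limit the answer to {word_limit} words.\n"
        "A:"
    )

# Usage (hypothetical):
# answer = llm_generate(ccot_prompt("A farmer has 17 sheep ...", word_limit=100))
```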
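For the ThinK item, below is a rough sketch of query-driven key-cache channel pruning. The shapes and the channel-scoring rule (channel-wise |Q|·|K| energy) are illustrative assumptions, not the paper's exact criterion.

```python
import torch

def prune_key_channels(keys: torch.Tensor, queries: torch.Tensor, keep_ratio: float = 0.6):
    """Prune the least important head-dim channels of a key cache.

    keys:    [batch, heads, seq_len, head_dim]
    queries: [batch, heads, q_len, head_dim] (recent queries used for scoring)
    Returns the pruned keys and the indices of the kept channels.
    """
    head_dim = keys.shape[-1]
    k_keep = max(1, int(head_dim * keep_ratio))
    # Score each channel by a proxy for its contribution to attention logits:
    # (sum over query positions of |q_d|) * (sum over key positions of |k_d|).
    contrib = (queries.abs().sum(dim=2, keepdim=True) *
               keys.abs().sum(dim=2, keepdim=True)).squeeze(2)   # [B, H, D]
    keep_idx = contrib.topk(k_keep, dim=-1).indices              # [B, H, k]
    # Gather the kept channels from the key cache.
    gather_idx = keep_idx.unsqueeze(2).expand(-1, -1, keys.shape[2], -1)
    pruned_keys = keys.gather(-1, gather_idx)                    # [B, H, S, k]
    return pruned_keys, keep_idx  # keep_idx is needed to slice queries at use time
```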
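For the repeated-sampling item, this is a best-of-n sketch with a verifier. Both llm_generate and verify are hypothetical placeholders (e.g. unit tests for code, an answer checker for math), not APIs from the paper.

```python
# Repeated sampling with verification: draw many cheap samples from a small
# model and keep the first one a task-specific verifier accepts.

def sample_until_verified(prompt, llm_generate, verify, n_samples=256):
    """Return the first verified candidate out of up to n_samples draws,
    or None if coverage was insufficient."""
    for _ in range(n_samples):
        candidate = llm_generate(prompt)
        if verify(candidate):
            return candidate
    return None
```

The cost argument is that many samples from a small model can be cheaper than one sample from a much larger model, as long as the verifier is reliable and the small model's coverage (the chance that at least one sample is correct) is high enough.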
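For the SALSA item, here is a textbook Armijo (backtracking) line search applied to a stochastic gradient step, in the spirit of the paper. The sufficient-decrease constant, backtracking factor, and reset rule are generic choices, not necessarily SALSA's exact schedule.

```python
import numpy as np

def armijo_step(params, loss_fn, grad, eta0=1.0, c=1e-4, backtrack=0.5, max_tries=20):
    """One SGD step whose step size is chosen by an Armijo line search.

    Shrink eta until the sufficient-decrease condition
        loss(x - eta * g) <= loss(x) - c * eta * ||g||^2
    holds on the current mini-batch.
    """
    base_loss = loss_fn(params)
    g_sq = float(np.sum(grad * grad))
    eta = eta0
    for _ in range(max_tries):
        if loss_fn(params - eta * grad) <= base_loss - c * eta * g_sq:
            break
        eta *= backtrack
    return params - eta * grad, eta  # returning eta allows warm-starting the next step
```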

News

Resources

Perspectives

  • Existential risk probabilities are too unreliable to inform policy. This essay criticizes the use of AI existential risk probability estimates in policymaking, arguing that such estimates are too erratic and lack a solid inductive or deductive foundation, often amounting to educated guesses rather than evidence-based projections. The authors dispute the validity of using these projections to justify public policy, particularly costly or restrictive measures, and advocate an evidence-based approach that accounts for the uncertainty of AI development. They advise against using speculative existential risk probabilities in high-impact decisions and suggest focusing instead on well-specified AI milestones for more meaningful policy choices.
  • Is AI judging the future of gymnastics or just a surveillance tool? The International Gymnastics Federation (FIG) and Fujitsu have partnered to deploy an AI-assisted judging support system at the World Gymnastics Championships, aiming for fairer and more transparent scoring. The Judging Support System (JSS) will not replace judges; it provides 3D model-based second opinions in challenging cases and scoring inquiries, with room for further development and wider use. Despite worries that it could displace human judgment, the JSS may improve scoring accuracy and consistency, which matters in a sport where even small point differences shape standings and athletes' careers.
  • Why AI’s Tom Cruise problem means it is ‘doomed to fail’. LLMs’ ‘reversal curse’ leads them to fail at drawing relationships between simple facts. It’s a problem that could prove fatal.
  • Sound clashes are a thrilling reggae tradition. Will AI ruin them? The use of fake AI vocals — including those of Donald Trump — is sending shockwaves through this historic scene. At a Montego Bay clash, performers debate their culture’s future.
  • Replacing my Right Hand with AI. An Anthropic scientist broke their hand while riding a bike. They stayed remarkably productive by leaning on Claude and voice input.
  • TPU transformation: A look back at 10 years of our AI-specialized chips. Thanks to its investment in bespoke TPU chips, Google is one of the few companies able to train massive models without depending on Nvidia.
  • I’m Switching Into AI Safety. Alex Irpan left Google’s robotics team after eight years to join Google DeepMind’s AI safety team, motivated by a personal desire to address safety concerns as AI systems approach superhuman capability. Though the area is difficult and fraught with controversy, he voices concerns about the effectiveness of current AI safety measures and the growing risks of unmanaged AI growth, and describes his commitment to contributing to AI safety.
  • As Regulators Close In, Nvidia Scrambles for a Response. With a 90 percent share of the A.I. chip market, the company is facing antitrust investigations into the possibility that it could lock in customers or hurt competitors.
  • How GitHub harnesses AI to transform customer feedback into action. GitHub is using AI and machine learning to aggregate and analyze user feedback at scale, producing actionable insights that drive feature prioritization and product improvements. The automated approach supports multilingual feedback collection and data-driven decision-making, making GitHub more responsive to developer needs and underscoring its developer-centric approach to product development.
  • How Does OpenAI Survive? The piece expresses strong doubt about OpenAI’s sustainability, given the exorbitant costs of building and running large language models and the lack of broad business utility for generative AI. The author questions OpenAI’s long-term viability absent substantial technological advances or sustained, extraordinary fundraising. Even though OpenAI has had a significant impact on the AI sector, the business still faces problems with profitability, a high operational burn rate, and reliance on key partnerships, most notably with Microsoft.
  • How neurons make a memory. Loosely packaged DNA might make these nerve cells better able to encode memories.
  • DeepMind hits milestone in solving maths problems — AI’s next grand challenge. AlphaProof showed its prowess on questions from this year’s Mathematical Olympiad — a step in the race to create substantial proofs with artificial intelligence.
  • Dirty talk: how AI is being used in the bedroom — and beyond. Analysis of more than 200,000 chatbot conversations shows how the new tech is actually being used. Turns out quite a lot of it is ‘racy role play’
  • Scientists are falling victim to deepfake AI video scams — here’s how to fight back. Cybercriminals are increasingly singling out researchers, alongside politicians and celebrities. Targeted scientists share tips on how to silence them.
  • What lies beneath: the growing threat to the hidden network of cables that power the internet. Last month large parts of Tonga were left without internet when an undersea cable was broken. It’s a scenario that is far more common than is understood
  • Why AI hasn’t shown up in the GDP statistics yet. Even though LLMs have made remarkable strides on complicated tasks, they still cannot reliably complete work at human scale, which limits their current potential as direct substitutes for people in business processes. LLMs require extensive prompt engineering and iteration to reach acceptable accuracy. OpenAI’s recent JSON output controls and cost reductions may help with some of this, but the careful integration LLMs need in corporate settings points to gradual productivity gains rather than a sudden economic revolution.
  • AI Is Coming for India’s Famous Tech Hub. AI integration is threatening jobs in India’s technology outsourcing sector, particularly in routine operations like contact centers, and has caused a sea change in the industry. Hiring is slowing and companies are finding it hard to move up the value chain, though some are optimistic that AI tools may open new opportunities in fields like programming. As automation continues to reshape traditional employment, higher-order cognitive skills will be essential for the sector going forward.
  • Inside the company that gathers ‘human data’ for every major AI company. Advances in AI pre-training let models absorb large amounts of online data, and supervised fine-tuning with specialists afterward helps the models become more specialized while remaining general. Turing’s approach aims to improve AI reasoning by leveraging “input and output pairs” created by subject-matter experts. Anticipating the “agentic” future of artificial intelligence, such models might integrate specialized knowledge across domains to accomplish complicated tasks independently.

Meme of the week

What do you think about it? Did any news catch your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles, and you can also connect with or reach me on LinkedIn. Check this repository containing weekly updated ML & AI news. I am open to collaborations and projects. You can also subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

or you may be interested in one of my recent articles:


Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence