WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

ML news: Week 20–26 November

Salvatore Raieli

9 min read · Nov 28, 2023

OpenAI’s infinite saga, a new Orca in the LLM sea, and much more

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

GitHub: SalvatoreRa/ML-news-of-the-week. A collection of the best ML news every week (research, news, resources).

You will find the news first on GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field

Research

MILA: Memory-Based Instance-Level Adaptation for Cross-Domain Object Detection. Cross-domain object detection is challenging because it involves aligning a labeled source domain with an unlabeled target domain. The authors propose a memory-based instance-level domain adaptation framework that aligns each target instance with the most similar source instance of the same category retrieved from memory storage. Official code.

TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning. TopoMLP is a system that detects and analyzes traffic elements and road centerlines to understand road scenes and identify drivable paths for self-driving cars. Official code.
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model. This study examines several data optimization strategies that enable knowledge transfer across models with less computational overhead.

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. StyleTTS 2 is a text-to-speech model that combines large speech language models with adversarial training and style diffusion to produce human-level voice synthesis. Official code.
Orca 2: Teaching Small Language Models How to Reason. A few months ago, Microsoft introduced Orca, a 13-billion-parameter language model that demonstrated strong reasoning abilities by imitating the step-by-step reasoning traces of more capable LLMs. Orca 2 significantly surpasses models of similar size (including the original Orca model) and attains performance levels similar to or better than models 5–10 times larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings.

Proving Test Set Contamination in Black Box Language Models. A thorough examination of the data used to train language models. Its findings imply that a large number of closed-source models most likely did not train on widely used benchmarks.
Amazon Reportedly Training AI With Twice As Many Parameters As GPT-4. The model will have a whopping 2 trillion parameters, which are the variables that determine the output of a given model, making it one of the largest currently in development.

News

Discord is shutting down its AI chatbot Clyde. Discord users won’t be able to chat with Clyde from December 1st onwards.
OpenAI has put ChatGPT Plus sign-ups on pause. After announcing that premium-tier users can build their own chatbots, CEO Sam Altman says the Plus subscription has exceeded capacity.
OpenAI Staff Threatens Exodus, Jeopardizing Company’s Future. A board member who was part of Sam Altman’s ouster as chief executive joined a majority of the company’s staff in calling for the decision’s reversal.
Sam Altman is still trying to return as OpenAI CEO. Altman’s move to Microsoft isn’t a done deal, and Ilya Sutskever’s flip to supporting Altman means two board members need to change their minds.

Salesforce looks to poach outbound OpenAI staff with “full cash” compensation offer. OpenAI researchers leaving the firm in protest could be offered a lifeline at Salesforce.
Amazon’s offering free courses on generative AI. From the company that brought you AWS certification comes a new ‘AI Ready’ education track to help train aspiring professionals on Amazon’s AI tech.
Eye On AI: Bain Capital Ventures Launches BCV Labs In Search Of New AI Deals. BCV Labs is a new AI incubator and technical community founded by Bain Capital Ventures that provides money, office space, events, GPU credits, a fellowship program, and recruiting help.
Microsoft rebrands its AI-powered Bing Chat as Copilot. The company has also announced more Copilot AI features for its 365 apps.
Sam Altman to return as CEO of OpenAI. After an attempted coup by OpenAI’s board that lasted five days, Altman is returning alongside co-founder Greg Brockman.
Microsoft and Nvidia are making it easier to run AI models on Windows. Microsoft’s new Windows AI Studio lets developers access and configure AI models, such as Microsoft’s Phi, Meta’s Llama 2, and Mistral.
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. Autoregressive language model inference can be accelerated in several ways. One method that has generated excitement is the use of draft models, but this requires running two models. Lookahead decoding instead removes the need for a draft model: it generates candidate n-grams from the same model and verifies them, which reduces the number of sequential decoding steps.
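To make the draft-free idea in the lookahead decoding entry more concrete, here is a minimal sketch in Python. It is not the authors' algorithm (which also generates fresh n-grams in parallel); it is a simplified guess-and-verify loop that only reuses n-grams the model has already produced. The names `toy_next_token`, `greedy_decode`, and `ngram_decode` are hypothetical, the "model" is a toy function that cycles through five tokens, and a "step" stands in for one batched forward pass of a real model.

```python
from typing import Callable, Dict, List, Tuple

Token = int
NextFn = Callable[[List[Token]], Token]


def toy_next_token(context: List[Token]) -> Token:
    """Hypothetical stand-in for one greedy LLM step: tokens cycle 0, 1, 2, 3, 4."""
    return (context[-1] + 1) % 5


def greedy_decode(next_token: NextFn, prompt: List[Token], n_new: int) -> Tuple[List[Token], int]:
    """Baseline: one sequential model step per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(next_token(seq))
    return seq, n_new


def ngram_decode(next_token: NextFn, prompt: List[Token], n_new: int, n: int = 3) -> Tuple[List[Token], int]:
    """Guess a continuation from an n-gram pool, then keep only verified tokens."""
    seq = list(prompt)
    target = len(prompt) + n_new
    pool: Dict[Token, List[Token]] = {}  # token -> last continuation seen after it
    steps = 0
    while len(seq) < target:
        guess = pool.get(seq[-1], [])
        steps += 1  # assumption: a real model verifies the whole guess in one pass
        for tok in guess:
            # Accept a guessed token only if it matches the greedy choice.
            if len(seq) >= target or next_token(seq) != tok:
                break
            seq.append(tok)
        if len(seq) < target:
            # The same pass also yields one guaranteed-correct next token.
            seq.append(next_token(seq))
        # Refresh the pool with every n-gram in the text generated so far.
        for i in range(len(seq) - n + 1):
            pool[seq[i]] = seq[i + 1 : i + n]
    return seq, steps


fast, fast_steps = ngram_decode(toy_next_token, [0], 10)
slow, slow_steps = greedy_decode(toy_next_token, [0], 10)
print(fast == slow, fast_steps, slow_steps)
```

Because every accepted token is checked against the greedy choice, the output is identical to plain greedy decoding; the only thing that changes is how many sequential steps are needed once the pool starts producing correct guesses.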

OpenAI drops a big new ChatGPT feature with a joke about its CEO drama. ChatGPT’s voice feature lets you ask it a question by saying it aloud, and now it’s available for free.
Emmett Shear threatening to leave OpenAI if the board can’t prove Sam Altman’s wrongdoing. Former Twitch CEO Emmett Shear took a role at OpenAI following the ousting of Sam Altman but is reportedly threatening to leave unless the board can show evidence of Altman’s wrongdoing.
Artificial intelligence finds ways to develop new drugs. A new AI model developed by chemists at ETH Zurich can not only predict where a pharmaceutically active molecule can be chemically modified but also how best to do it. This makes it possible to identify new pharmaceutical ingredients more quickly and improve existing ones in a targeted manner.
OpenAI researchers warned the board of AI breakthrough ahead of CEO ouster, sources say. Ahead of OpenAI CEO Sam Altman’s four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity.

Resources

Neural-Cherche. Neural-Cherche is a library designed to fine-tune neural search models such as Splade, ColBERT, and SparseEmbed on a specific dataset.
The Data Engineering Handbook. This repo has all the resources you need to become an amazing data engineer.
tensorli. Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).
The Rise of “Wet” Artificial Intelligence. Combining AI with traditional wet lab work creates a virtuous circle from lab to data and back to the lab.
Video-LLaVA. Video-LLaVA exhibits remarkable interactive capabilities between images and videos, despite the absence of image-video pairs in the dataset. It achieves state-of-the-art performance in video summarization and captioning.

make-real-starter. tldraw recently released a popular tool that lets people quickly design software using a paint-like interface. GPT-4V is then used to write code for a web version of the design. It produces reliable, functional code, works remarkably well, and also accepts commands in plain language.
AI Exploits. A collection of real-world AI/ML exploits for responsibly disclosed vulnerabilities.
Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation. The recently proposed CoWPiRec method enhances recommender systems by combining text-based item representations with collaborative filtering information. Using word graphs for item interactions, this approach has demonstrated better performance in a range of recommendation scenarios, including the cold-start problem.

RustGPT. A web ChatGPT clone entirely crafted using Rust and HTMX.
Stable Video Diffusion Image-to-Video Model Card. Stable Video Diffusion (SVD) Image-to-Video is a diffusion model that takes in a still image as a conditioning frame and generates a video from it.
LangChain for Go. Building applications with LLMs through composability, with Go.
Reinforcement Learning for Generative AI: A Survey. A comprehensive review across application areas such as NLP, computer vision, and other emerging domains, with insights into RL’s flexibility in introducing new training approaches and future directions for the evolution of generative AI.

Perspectives

OpenAI’s identity crisis and the battle for AI’s future. Following last weekend’s events at OpenAI, this blog post discusses some open questions.
A Data-Driven Look at the Rise of AI. 2023, The AI Revolution: Coatue’s Sri Viswanath breaks down this year’s developments in AI.
AI: The Coming Revolution. Coatue highlights four points for the future: AI has the potential to break through the hype and meaningfully improve our world; open source is the heartbeat of AI, but not all open source is created equally; builders and investors need to understand the new, AI-centric tech stack; and the best of AI is yet to come.
OpenAI’s Misalignment and Microsoft’s Gain. After Sam Altman was ousted and co-founder Greg Brockman resigned from OpenAI amid internal conflict and the company’s failing non-profit strategy, the article argues that Microsoft effectively gained key staff and intellectual property from OpenAI, significantly changing the AI field.
AGI’s Impact on Tech, SaaS Valuations. Thought experiments on how AGI affects SaaS companies of all shapes and sizes.
Oops! We Automated Bullshit. ChatGPT is a bullshit generator. To understand AI, we should think harder about bullshit.
Explaining the SDXL latent space. Using a smaller latent space for diffusion was one of the advances of the original Stable Diffusion model. This means that the diffusion occurs on a compressed image representation rather than on pixels. This article explores several interpretations of that space for SDXL.

Sudden Disturbances in Rapidly Moving Objects: The Implications of the OpenAI Fiasco. The unexpected threat to OpenAI’s dominant position in the developer ecosystem creates a chance for smaller businesses to step in and take advantage of a fresh opening. Microsoft will probably emerge victorious in the AI race, but Anthropic and other model-layer businesses may capitalize on the disruption.
AI should focus on equity in pandemic preparedness. Over-reliance on AI could inadvertently prioritize certain viruses or populations, leading to inequities in vaccine and disease research.
How AI is expanding art history. From identifying disputed artworks to reconstructing lost masterpieces, artificial intelligence is enriching how we interpret our cultural heritage.
How AI shapes the life sciences: an interview with Oliver Stegle. Oliver Stegle explains how AI-based tools have the potential to transform our ability to better understand the complexity of life and how these tools will shape the future of scientific exploration.

Meme of the week

If you have found this interesting:

You can look for my other articles and connect with or reach me on LinkedIn, where I am open to collaborations and projects. Check this repository containing weekly updated ML & AI news.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

GitHub: SalvatoreRa/tutorial. Tutorials on machine learning, artificial intelligence, data science with math explanation and reusable code (in python…