ML news: Week 13–19 November

Salvatore Raieli
13 min read · Nov 20, 2023

OpenAI’s CEO Sam Altman found another job (Microsoft), new models, and much more

ML news
Photo by Headway on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first in GitHub. Single posts are also collected here:

Weekly AI and ML news - each week the best of the field



3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models.
  • 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models. To provide more control over appearance and geometry, this work proposes 3DStyle-Diffusion, a technique that brings 2D diffusion models into the fine-grained stylization of 3D meshes. It first employs implicit MLP networks to parameterize the texture of a 3D mesh into reflectance and illumination. A pre-trained 2D diffusion model is then used to keep the results geometrically consistent and to align the rendered images with the text prompt.
  • official code. Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Task. Introduces a Dual-Guided Spatial-Channel-Temporal (DG-SCT) attention mechanism that enhances pre-trained audio-visual models for multi-modal tasks.
  • Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom. RoseTTAFold All-Atom (RFAA) is a deep network that addresses the limitations of current protein structure modeling tools by accurately representing complete biological assemblies, including covalent modifications and interactions with small molecules. RFAA demonstrates accuracy comparable to AlphaFold2 in protein structure prediction, excels in flexible small molecule docking, and predicts covalent modifications and assemblies involving nucleic acids and small molecules. The authors also present RFdiffusion All-Atom (RFdiffusionAA), a fine-tuned model for generating binding pockets around small and non-protein molecules, with experimental validation on proteins that bind therapeutic, enzymatic, and optically active molecules.
  • FinGPT: Large Generative Models for a Small Language. This study tackles the challenges of creating large language models (LLMs) for Finnish, a language spoken by less than 0.1% of the world population.
  • Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service. VLPMarker is a secure and robust backdoor-based embedding watermarking method for vision-language pre-trained models (VLPs). It injects triggers into VLPs without interfering with model parameters, which provides high-quality copyright verification with minimal impact on performance. It also improves resilience against various attacks through a collaborative copyright verification strategy based on both backdoor triggers and embedding distribution.
Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service.
MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model
image source: here


LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills.
Deepmind’s animation gallery.


The Alignment Handbook.
  • The Alignment Handbook. Hugging Face’s Alignment Handbook aims to provide the community with a series of robust training recipes that span the whole pipeline.
  • versatile_audio_super_resolution. Pass your audio in, AudioSR will make it high fidelity!
  • tarsier. Vision utilities for web interaction agents. Thanks to capable new vision models, a number of teams are building agents that interact with web page elements through vision. Tarsier introduces a standard toolset for this (e.g., element tagging). Any vision system can be used with it to navigate a website and take actions. It also provides browsing utilities for language models without vision capabilities.
  • Extra-fast Bark for generating long texts. In this notebook, we’ll show you how to generate very long texts very quickly using Bark, Flash Attention 2, and batching.
  • OpenGPTs. This is an open-source effort to create a similar experience to OpenAI’s GPTs. It builds upon LangChain, LangServe and LangSmith.
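
The Bark item in the list above mentions batching as one ingredient for handling long inputs quickly. As a rough illustration of the batching idea only, the sketch below splits a long text into sentences and groups them into fixed-size batches. It does not call Bark or Flash Attention 2, and the names `split_sentences` and `make_batches`, as well as the sample text, are hypothetical choices for this example rather than anything taken from the notebook.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def make_batches(sentences, batch_size):
    """Group sentences into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [sentences[i:i + batch_size] for i in range(0, len(sentences), batch_size)]


# Hypothetical sample input for illustration.
text = (
    "Bark is a text-to-speech model. Long inputs are slow one at a time! "
    "Splitting helps. Batches run together."
)
sentences = split_sentences(text)
batches = make_batches(sentences, batch_size=2)

# Each batch would then be passed to the model in a single call (not shown here).
for batch in batches:
    print(batch)
```

In a real pipeline each batch would be tokenized and sent to the model together, and the resulting audio segments concatenated in order.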
AMBER: An Automated Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
Accelerating Generative AI with PyTorch: Segment Anything, Fast.
  • ai-exploits. This repository, ai-exploits, is a collection of exploits and scanning templates for responsibly disclosed vulnerabilities affecting machine learning tools.
  • Music ControlNet. ControlNet was an innovative approach to giving image synthesis models fine-grained control. There is now a fairly similar model for music generation that lets you control several time-varying aspects such as melody, dynamics, and rhythm.
  • GPT-4 Turbo Note Taker. Fast and simple, Tactiq’s AI Note Taker with GPT-4 Turbo lets you turn your meetings into actionable notes — so that you’re always taking the right action and getting more out of your meetings.
  • Chroma. Chroma is a generative model for designing proteins programmatically.
  • A Survey on Language Models for Code. Gives a summary of LLMs for code, covering 500 relevant works, more than 30 assessment tasks, and more than 50 models.
Music ControlNet.


An overview of threats to LLM-based applications. (Image source: Greshake et al. 2023)
  • Adversarial Attacks on LLMs. This blog post discusses the many new attacks that language model systems face. It gives good detail on several attack types and on some mitigations that teams have found effective.
  • AI and Open Source in 2023. A comprehensive review of the major developments in the AI research, industry, and open-source space that happened in 2023.
  • How investors see your start up? A general partner at Angular Ventures divides the application concepts we are seeing into three major categories in an attempt to make sense of all the nascent AI firms. This exclusively examines application-layer businesses; it ignores model-layer companies.
  • Retool’s State of AI 2023. Retool surveyed 1,500 tech workers.
  • Language models can use steganography to hide their reasoning, study finds. Large language models (LLMs) can master “encoded reasoning,” a form of steganography. This intriguing phenomenon allows LLMs to subtly embed intermediate reasoning steps within their generated text in a way that is undecipherable to human readers.
Example of encoded reasoning (source: arXiv.org)
image source: here
  • AI Doomers Are Finally Getting Some Long Overdue Blowback. Those who predicted that AI will bring about our collective extinction must now reconsider their claims. The “AI doom” narrative mainly benefited the large players, and there are plenty of opportunities for the open-source AI movement.
  • There’s a model for democratizing AI. OpenAI’s request for recommendations on integrating democratic procedures into AI decision-making comes across as restrictive and appears to sidestep delicate political questions without accepting accountability, which could limit the reach and effectiveness of democracy in AI governance.
  • Copilot is an Incumbent Business Model. The Copilot AI business model improves existing workflows for efficiency without creating new markets or disrupting the low end of existing ones. Its real disruptive potential lies in redesigning workflows, a harder challenge that could open substantially larger market opportunities.
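
The encoded-reasoning item in the list above describes hiding information in text that looks ordinary to a human reader. A toy sketch of that general idea follows: each sentence slot has two interchangeable phrasings, and the choice between them carries one hidden bit. This is an illustration under stated assumptions, not the method from the study; the phrase table `SLOTS` and the functions `encode` and `decode` are hypothetical.

```python
# Hypothetical phrase table: each slot has two near-synonymous sentences.
# Choosing index 0 or 1 in a slot hides one bit in otherwise innocuous text.
SLOTS = [
    ("Let us look at this.", "Let us examine this."),
    ("The question is simple.", "The question is easy."),
    ("Here is my answer.", "Here is my reply."),
]


def encode(bits):
    """Pick one phrasing per slot according to the hidden bits."""
    if len(bits) > len(SLOTS):
        raise ValueError("not enough slots to carry all bits")
    return [SLOTS[i][bit] for i, bit in enumerate(bits)]


def decode(sentences):
    """Recover the hidden bits by checking which phrasing was used in each slot."""
    return [SLOTS[i].index(sentence) for i, sentence in enumerate(sentences)]


hidden = [1, 0, 1]
cover_text = encode(hidden)
print(" ".join(cover_text))
print(decode(cover_text))
```

A reader without the phrase table sees only three unremarkable sentences, while anyone who knows the table can read the bits back.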

Meme of the week

ML news

If you have found this interesting:

You can look for my other articles, and you can also connect or reach me on LinkedIn, where I am open to collaborations and projects. Check this repository containing weekly updated ML & AI news.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.



Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence