ML news: Week 20–26 November

Salvatore Raieli
9 min read · Nov 28, 2023

OpenAI’s infinite saga, a new Orca in the LLM sea, and much more

Photo by Todd Cravens on Unsplash

The most interesting news, repositories, articles, and resources of the week

MILA: Memory-Based Instance-Level Adaptation for Cross-Domain Object Detection.
TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning.
Orca 2: Teaching Small Language Models How to Reason.


Break the Sequential Dependency of LLM Inference Using Lookahead Decoding.


  • Neural-Cherche. Neural-Cherche is a library designed to fine-tune neural search models such as Splade, ColBERT, and SparseEmbed on a specific dataset.
  • The Data Engineering Handbook. This repo has all the resources you need to become an amazing data engineer.
  • tensorli. An absolutely minimal implementation of a GPT-like transformer using only NumPy (<650 lines).
  • THE RISE OF “WET” ARTIFICIAL INTELLIGENCE. Combining AI with traditional wet lab work creates a virtuous circle from lab to data and back to the lab.
  • Video-LLaVA. Video-LLaVA exhibits remarkable interactive capabilities between images and videos, despite the absence of image-video pairs in the dataset. It achieves state-of-the-art performance in video summarization and captioning.
  • make-real-starter. tldraw recently released a popular tool that lets people quickly design software using a paint-like interface. GPT-4V is then used to write code for an online version of the design. It works remarkably well, producing reliable, functional code, and it also accepts commands in plain language.
  • AI Exploits. A collection of real-world AI/ML exploits for responsibly disclosed vulnerabilities.
  • Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation. The recently proposed CoWPiRec method enhances recommender systems by combining text-based item representations with collaborative filtering information. By using word graphs to capture item interactions, the approach shows better performance across a range of recommendation scenarios, including the cold-start problem.
  • RustGPT. A web ChatGPT clone entirely crafted using Rust and HTMX.
  • Stable Video Diffusion Image-to-Video Model Card. Stable Video Diffusion (SVD) Image-to-Video is a diffusion model that takes in a still image as a conditioning frame and generates a video from it.
  • LangChain for Go. Building applications with LLMs through composability, with Go.
  • Reinforcement Learning for Generative AI: A Survey. A comprehensive review across application areas such as NLP and computer vision, as well as emerging domains. It offers insights into RL’s flexibility in introducing new training approaches and outlines future directions for the evolution of generative AI.


  • OpenAI’s identity crisis and the battle for AI’s future. After last weekend’s turmoil at OpenAI, this blog post discusses some of the open questions.
  • A Data-Driven Look at the Rise of AI. 2023, The AI Revolution: Coatue’s Sri Viswanath breaks down this year’s developments in AI.
  • AI: The Coming Revolution. Coatue highlights four points for the future: AI has the potential to break through the hype and meaningfully improve our world; open source is the heartbeat of AI, but not all open source is created equal; builders and investors need to understand the new, AI-centric tech stack; and the best of AI is yet to come.
  • OpenAI’s Misalignment and Microsoft’s Gain. After OpenAI’s board fired CEO Sam Altman and co-founder Greg Brockman resigned, amid internal conflict and what the article describes as the company’s failing non-profit strategy, Microsoft moved to bring key OpenAI staff in-house. The article argues this amounts to Microsoft gaining OpenAI’s talent and intellectual property, significantly changing the AI field.
  • AGI’s Impact on Tech, SaaS Valuations. Thought experiments on how AGI affects SaaS companies of all shapes and sizes.
  • Oops! We Automated Bullshit. ChatGPT is a bullshit generator. To understand AI, we should think harder about bullshit.
  • Explaining the SDXL latent space. Using a smaller latent space for diffusion was one of the advances of the original Stable Diffusion model. This means that diffusion runs on a compressed representation of the image rather than on pixels. This article explores several interpretations of that space for SDXL.
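To get a feel for what "a compressed image representation" means in the SDXL item above, here is a minimal back-of-the-envelope sketch. It assumes the commonly cited figures for the Stable Diffusion family's autoencoder (each spatial side downsampled by 8, with 4 latent channels); the function names are hypothetical and are not taken from the linked article or any library.

```python
# Rough size comparison between pixel space and latent space.
# Assumptions (not from the linked article): the VAE downsamples each
# spatial side by 8 and produces 4 latent channels.

def latent_shape(height, width, downsample=8, latent_channels=4):
    """Return the (channels, height, width) of the latent for an image size."""
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, pixel_channels=3,
                      downsample=8, latent_channels=4):
    """Ratio of pixel-space values to latent-space values."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (pixel_channels * height * width) / (c * h * w)


if __name__ == "__main__":
    # A 1024x1024 RGB image, the resolution SDXL typically targets.
    print(latent_shape(1024, 1024))       # (4, 128, 128)
    print(compression_ratio(1024, 1024))  # 48.0
```

Under these assumptions the diffusion model works with 48 times fewer values than it would in pixel space, which is why latent diffusion is so much cheaper to run.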

