WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

AI & ML news: Week 11–17 November

OpenAI faces AI advancement slowdown, Near plans world’s largest open-source AI model, Microsoft adds AI to Notepad and Paint, AlphaFold3 goes open-source, Google accidentally previews Jarvis AI, and much more

Salvatore Raieli
20 min read
Photo by Fujiphilm on Unsplash

The most interesting news, repositories, articles, and resources of the week

Check and star this repository where the news will be collected and indexed:

You will find the news first on GitHub. All the Weekly News stories are also collected here:

Weekly AI and ML news - each week the best of the field


Research

  • Project Sid: Many-agent simulations toward AI civilization. This work illustrates the behavior and evolution of societies composed of 10–1000+ AI agents. It introduces PIANO, an architecture that allows agents to interact with both humans and other agents in real time. The study reveals that agents can autonomously adopt specialized roles, follow and modify collective rules, and participate in cultural and religious transmissions.
  • Mixtures of In-Context Learners. Partitions demonstrations into subsets that serve as experts via in-context learning; a trainable weighting function then merges the experts’ next-token predictions, with the weights learned on the training set (a minimal sketch follows this list). This method is compatible with black-box LLMs, as it does not require access to their internal parameters. Key advantages include: 1) being competitive with standard ICL while offering much greater efficiency in terms of data, memory, and computation, and 2) demonstrating robustness to noisy demonstrations and label imbalance.
  • Attacking Vision-Language Computer Agents via Pop-ups. Demonstrates that incorporating adversarial pop-ups into current agent testing environments results in an attack success rate of 86%, reducing the agents’ task success rate by 47%. It also notes that simple defense methods, like instructing the agent to ignore pop-ups, prove ineffective.
  • Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models. Enhances LLM responses by simulating multiple experts and combining their outputs; it directs an LLM to complete input instructions by simulating several experts and then choosing the best response from both the individual and aggregated answers (see the orchestration sketch after this list). This approach sets a new state-of-the-art on TruthfulQA-Generation with ChatGPT, surpassing the previous record of 87.97%. Additionally, it improves factuality and usefulness while reducing toxicity and hurtfulness.
  • Number Cookbook: Number Understanding of Language Models and How to Improve It. Offers a thorough analysis of the numerical understanding and processing ability (NUPA) of LLMs; reveals that while naive finetuning significantly boosts NUPA on many tasks, it doesn’t work for all of them. It also finds that methods specifically developed to improve NUPA are ineffective when finetuning pre-trained models. The study examines the application of chain-of-thought techniques to NUPA and notes that these methods encounter scalability issues, limiting their practical use.
  • WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning. Introduces a self-evolving online curriculum RL framework aimed at closing the performance gap between open and proprietary LLM-based web agents. It boosts the success rate of Llama-3.1–8B from 4.8% to 42.4% and GLM4–9B from 6.1% to 43%, with the open models significantly outperforming GPT-4-Turbo (17.6%) and GPT-4o (13.9%). The framework addresses the limited availability of web agent training tasks using a robust outcome-supervised reward model for task success evaluation, while an adaptive RL strategy manages distribution drift in online learning, ensuring steady performance improvements.
  • Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation. Introduces a two-stage fine-tuning method in which LLMs first learn from tool-generated solutions and are then trained to decide when to solve problems independently versus using tools. Evaluations on benchmarks in math, climate science, and epidemiology demonstrate significant gains, with a 28% increase in accuracy and a 14% improvement in tool usage precision over top models like GPT-4 and Claude-3.5. This approach enables the LLM to flexibly handle scientific problems of varying complexity.
  • Google’s Flood Forecasting AI to Reach 700 Million People. Google is expanding riverine flood forecasting coverage to over 100 countries and 700 million people, and is enabling partners and researchers to better understand flood forecasting through more data and the development of a new API.
  • Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. The Mixture-of-Transformers (MoT) architecture features a sparse multi-modal transformer that separates parameters based on modality (text, images, and speech), allowing for efficient processing while preserving performance. In various evaluations, such as Chameleon 7B and Transfusion settings, MoT matches or outperforms dense baselines, utilizing significantly fewer resources — only 37.2% of the FLOPs for speech processing and 47.2% of the wall-clock time for image generation.
  • Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation. This study investigates methods to enhance alignment between LLMs and protein-focused geometric deep models, aiming to improve cross-modal understanding.
  • Can LLMs Follow Threads Through Near-Million-Scale Haystacks? Large Language Models (LLMs) with extended context windows support a wider range of applications. Recent research on 17 top LLMs shows that although many can manage multiple information threads simultaneously, their practical context limits are often shorter than the stated maximum. While several models demonstrate “thread safety” by handling concurrent threads without a drop in performance, accuracy typically decreases as the context window approaches its upper limit.
  • Compressing Mesh Data for 3D Generation. Blocked and Patchified Tokenization (BPT), a mesh compression method, reduces mesh sequence length by about 75% and enables the generation of meshes with more than 8k faces.
  • Successor Feature Matching. Successor Feature Matching is a new non-adversarial method for inverse reinforcement learning that avoids learning an explicit reward function.
  • Oasis: A Universe in a Transformer. Oasis is a fully AI-generated, real-time open-world video game model powered by a 500M-parameter foundation model with no underlying game engine. It uses fast transformer inference to generate gameplay and is tailored for Etched’s Sohu ASIC to achieve high frame-rate efficiency. Despite showing great promise, issues such as long-context consistency and domain generalization remain.
  • OpenAI to present plans for U.S. AI strategy and an alliance to compete with China. OpenAI’s AI infrastructure blueprint suggests establishing AI economic zones and collaborating with the U.S. Navy on nuclear energy to promote AI-driven economic growth and innovation. The proposal features a North American AI alliance and initiatives modeled after the National Interstate and Defense Highways Act to address infrastructure demands. It stresses the importance of investing in U.S. data centers and energy projects to stay competitive with China.
  • Introducing Athene-V2: Advancing Beyond the Limits of Scaling with Targeted Post-training. Athene V2 consists of models built upon Qwen 2.5 72B, optimized for agentic and chat-based workflows, which outperform GPT-4o on several key benchmarks.
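As a companion to the Mixtures of In-Context Learners entry above, here is a minimal sketch of the idea: each expert is a subset of demonstrations placed in context, and a small set of trainable weights mixes the experts’ next-token distributions. The `next_token_logprobs` helper is a hypothetical stand-in for a black-box LLM call, not part of the paper’s code.

```python
# Minimal sketch, assuming a hypothetical black-box helper `next_token_logprobs`
# that returns a [vocab_size] log-probability vector for the next token.
# Only the mixture weights are trainable; the LLM itself is never updated.
import torch

def next_token_logprobs(prompt: str) -> torch.Tensor:
    """Hypothetical black-box LLM call (stand-in, not a real API)."""
    raise NotImplementedError

class MixtureOfInContextLearners(torch.nn.Module):
    def __init__(self, demo_subsets: list[list[str]]):
        super().__init__()
        self.demo_subsets = demo_subsets  # one subset of demonstrations per expert
        self.weight_logits = torch.nn.Parameter(torch.zeros(len(demo_subsets)))

    def forward(self, query: str) -> torch.Tensor:
        # Each expert sees only its own demonstrations in context.
        expert_logprobs = torch.stack([
            next_token_logprobs("\n".join(subset) + "\n" + query)
            for subset in self.demo_subsets
        ])  # [n_experts, vocab_size]
        weights = torch.softmax(self.weight_logits, dim=0)  # learned mixture weights
        # Mixture of the experts' next-token distributions, computed in log space.
        return torch.logsumexp(expert_logprobs + weights.log().unsqueeze(-1), dim=0)
```

The weights would be trained with a standard cross-entropy loss on held-out training examples, which is why the approach stays data- and compute-efficient compared with placing all demonstrations in a single prompt.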

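Similarly, the Multi-expert Prompting item can be pictured as a three-step orchestration: propose expert roles, answer once per role, then aggregate and select. The sketch below assumes a generic `llm(prompt) -> str` callable and only illustrates the flow, not the paper’s exact aggregation procedure.

```python
# Hedged sketch of a multi-expert prompting flow; `llm` is a hypothetical
# black-box chat-completion call, and the aggregation prompt is illustrative.
def llm(prompt: str) -> str:
    """Hypothetical black-box LLM call (stand-in, not a real API)."""
    raise NotImplementedError

def multi_expert_answer(question: str, n_experts: int = 3) -> str:
    # 1) Ask the model to propose relevant expert roles.
    roles_text = llm(f"List {n_experts} distinct expert roles best suited to answer: {question}")
    roles = [r.strip() for r in roles_text.splitlines() if r.strip()][:n_experts]
    # 2) Generate one answer per simulated expert.
    answers = [llm(f"You are {role}. Answer carefully:\n{question}") for role in roles]
    # 3) Aggregate the expert answers, then select the best of the aggregated and individual ones.
    listing = "\n\n".join(f"[{role}]\n{answer}" for role, answer in zip(roles, answers))
    return llm(
        f"Question: {question}\n\nExpert answers:\n{listing}\n\n"
        "Combine these answers, then return whichever of the combined or "
        "individual answers is most truthful and useful."
    )
```
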
News

Resources

  • FrontierMath. Epoch AI has introduced FrontierMath, a benchmark comprising expert-level mathematics problems to assess AI’s mathematical reasoning capabilities. Notably, leading AI models have solved less than 2% of these problems, highlighting the benchmark’s difficulty and the current limitations of AI in advanced mathematical reasoning.
  • BitNet a4.8: 4-bit Activations for 1-bit LLMs. A major challenge with 1.58-bit LLMs has been the absence of hardware acceleration support. This research introduces 4-bit activations to leverage the INT4/FP4 kernels available on new hardware, with no added runtime cost.
  • LLM2CLIP. LLM2CLIP combines CLIP’s visual and textual alignment with the advanced language understanding of LLMs.
  • Torch Compatible Muon Optimizer. Muon is the optimizer that set the GPT-2 training speed record. It is a momentum-based method similar to SGD. This repository provides an implementation that can easily be used as a drop-in replacement for AdamW (a hedged sketch of the core idea follows the Resources list).
  • Mochi video model with optimized inference. Mochi 1, an open-source text-to-video model, initially required eight H100 GPUs for operation. Thanks to community efforts, it can now run on a single 48GB L40 GPU without compromising quality.
  • A trainable PyTorch reproduction of AlphaFold 3. Protenix is a functional and trainable reproduction of AlphaFold 3, DeepMind’s protein folding project, developed by ByteDance’s ‘AI for Science’ team. This open-source initiative aims to advance protein structure prediction by providing a customizable platform for researchers.
  • LlamaPReview. LlamaPReview is an AI assistant for GitHub that provides easy one-click installation and automatically reviews pull requests with context-aware analysis. It supports various programming languages and integrates seamlessly with GitHub Actions, delivering insightful feedback directly on PRs. Offered for free, it improves code quality by detecting issues and recommending optimizations.
  • SmolLM2. Hugging Face’s SmolLM2 is a compact family of language models, ranging from 135M to 1.7B parameters, trained on 11 trillion tokens. These models are designed to run efficiently on-device and support various tasks. The weights are released under the Apache 2.0 license, and quantized versions, such as the 1.7GB and 138MB models, offer flexibility to meet different computational requirements (a loading sketch follows this list).
  • AI for Real-time Fusion Plasma Behavior Prediction and Manipulation. A novel multimodal machine learning approach improves super-resolution data, enabling better analysis of complex fusion plasma phenomena like Edge Localized Modes (ELM), and supports the stabilization of future fusion reactors.
  • A Comprehensive Survey of Small Language Models in the Era of Large Language Models. A review of small language models (SLMs), covering definitions, applications, improvements, reliability, and related concerns.
  • Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks. A new generalist multi-agent system capable of managing complex web and file-based tasks, featuring an Orchestrator agent that coordinates four specialized agents: WebSurfer for browser tasks, FileSurfer for file management, Coder for programming, and ComputerTerminal for console operations. Magentic-One performs competitively on various benchmarks, such as GAIA, AssistantBench, and WebArena, without needing any changes to its core architecture.
  • Personalization of Large Language Models: A Survey. Offers a comprehensive framework for understanding personalized LLMs, introducing taxonomies for various personalization aspects and consolidating existing research in personalized text generation and downstream applications.
  • StdGEN: Semantic-Decomposed 3D Character Generation from Single Images. StdGen is a novel approach for generating 3D characters from a single image. It breaks down the process into distinct components, such as hair and jackets, enhancing the overall quality of the output.
  • alphafold3. DeepMind has open-sourced the code and weights of AlphaFold 3 for academic research, marking a significant advancement in protein structure prediction. This release is expected to accelerate AI applications in scientific research, particularly in molecular biology and drug discovery.
  • Online-LoRA. Online-LoRA is a framework developed to mitigate catastrophic forgetting in online continual learning (OCL) by enabling real-time fine-tuning of pre-trained Vision Transformers (ViTs) without the use of rehearsal buffers.
  • DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions. DeepArUco++ presents a deep learning-based method for enhancing fiducial marker detection, especially in difficult lighting conditions where traditional techniques typically struggle.
  • Hermes 3. Hermes 3, fine-tuned from Llama 3.1, excels in both reasoning and creativity, showcasing outstanding performance across models with 8B, 70B, and 405B parameters. It introduces new possibilities in AI alignment and artificial consciousness.
  • ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis. EfficientNAT is an improved non-autoregressive Transformer model that boosts the speed and quality of token-based image synthesis.
  • UniGAD: Unifying Multi-level Graph Anomaly Detection. A novel framework for graph anomaly detection (GAD), UniGAD simultaneously detects anomalies in nodes, edges, and complete graphs.
  • Object and Attribute Matching in Images with Token Merging. Token Merging tackles a prevalent problem in text-to-image models: semantic binding, i.e., the failure to associate objects with their specific attributes.
  • DataChain. DataChain is a Pythonic data-frame toolkit for AI that enables effective processing and structuring of unstructured data without abstracting away the AI models. It integrates with tools such as PyTorch, TensorFlow, and LLM APIs to facilitate metadata creation, filtering, and vector search. The library also provides vectorized operations on Python object fields, out-of-memory computation, and parallelization.
  • browser-use. This open-source web automation library lets LLMs interact with websites through a streamlined interface. It is compatible with models such as Claude 3.5 Sonnet and GPT-4o. Key features include XPath extraction, customizable actions, and multi-tab management, enabling data extraction and smooth web navigation. One current limitation is message length, which affects task repetition and LLM speed; further development will focus on robustness and cost reduction.
  • CUDA Programming Course — High-Performance Computing with GPUs. A great course from freeCodeCamp on CUDA programming from start to finish.
  • Masked Token Modeling for Zero-Shot Anything-to-Drums Conversion. Zero-shot drum style transfer for any input rhythm presents an exciting music application for artists. This is achieved using a masked token modeling objective, which is particularly effective for audio.
  • HiCoM: Hierarchical Coherent Motion for Streamable Dynamic Scene with 3D Gaussian Splatting. HiCoM is a cutting-edge framework designed to enhance real-time 3D reconstruction from multi-view streaming videos. It effectively addresses key challenges in storage, training speed, and rendering quality, making it a significant advancement in the field.
  • Janus. Janus, DeepSeek’s multimodal model, has a new version incorporating rectified flows, similar to Meta Movie Gen, for image generation and understanding. The results are highly impressive.
  • Link Conversation with Reference Materials. Problem-Oriented Segmentation and Retrieval (POSR) is a method that breaks conversations into meaningful segments and connects each segment to relevant reference materials, such as worksheets or meeting notes.
  • MureObjectStitch: Multi-reference Image Composition. Researchers have presented an improved fine-tuning method for generative image composition, which seamlessly merges a specified foreground object with a new background to generate realistic images.
  • StoryTeller. StoryTeller is a system created to generate coherent descriptions for long videos, tackling issues like plot consistency and character tracking throughout different scenes.
  • SAMPart3D: Segment Any Part in 3D Objects. SAMPart3D, developed by the University of Hong Kong, is a robust method for segmenting 3D objects into semantically meaningful components.
  • Convolutional Differentiable Logic Gate Networks. Researchers have developed a method to train image recognition networks that are 29 times smaller and more efficient than traditional convolutional neural networks (CNNs) by making logic gates differentiable. They also provide efficient CUDA kernels with their paper release.
  • Physics Informed Distillation for Diffusion Models. Physics Informed Distillation (PID) is a method that employs a student model to simplify and accelerate diffusion models by framing them as solutions to differential equations.
  • MinerU: high-quality data extraction tool. MinerU is a robust tool built on StructTable-InternVL2–1B, enabling the extraction of information from PDFs into various machine-readable formats.
  • Isotonic regression. A powerful technique for fitting a monotonic function to data. It can also be differentiated, which makes it useful in a number of applications beyond curve fitting (a short usage example follows this list).
  • Text-to-SQL Query. XiYan-SQL is an innovative framework aimed at enhancing both the accuracy and diversity of SQL queries produced from natural language input.
  • X-Portrait 2: Highly Expressive Portrait Animation. ByteDance’s AI group has unveiled X-Portrait 2, an advanced portrait animation technology that transforms static images into highly expressive, realistic videos. Building upon its predecessor, X-Portrait, this new model excels in capturing subtle facial expressions and complex movements, such as pouting, tongue-out gestures, cheek-puffing, and frowning. It achieves high fidelity in emotion preservation, ensuring the generated videos maintain the subject’s identity and emotional nuances.
  • MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views. The MVSplat360 model offers a new way to create realistic 360° views of real-world scenes, even from just a few sparse images.
  • Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation. This paper presents the leading approach for brain tumor segmentation in the BraTS challenge, demonstrating how synthetic data can improve AI models for medical imaging applications.
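To make the Torch Compatible Muon Optimizer entry above more concrete, here is a hedged sketch of the core idea as commonly described: a momentum buffer per 2-D weight matrix that is approximately orthogonalized with a few Newton–Schulz iterations before being applied. The coefficients and function names are illustrative assumptions, not the repository’s API; for real use, install the linked implementation and swap it in where you would use AdamW.

```python
# Hedged sketch of a Muon-style update, assuming the commonly described recipe:
# momentum SGD whose per-matrix update is approximately orthogonalized.
# Coefficients and names are illustrative, not the repository's actual API.
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately map a 2-D matrix to the nearest (semi-)orthogonal matrix."""
    x = g / (g.norm() + 1e-7)                # scale so singular values are <= 1
    for _ in range(steps):
        x = 1.5 * x - 0.5 * (x @ x.T) @ x    # classic cubic Newton-Schulz iteration
    return x

@torch.no_grad()
def muon_like_step(param: torch.Tensor, grad: torch.Tensor,
                   momentum_buf: torch.Tensor,
                   lr: float = 0.02, momentum: float = 0.95) -> None:
    momentum_buf.mul_(momentum).add_(grad)              # standard momentum accumulation
    update = newton_schulz_orthogonalize(momentum_buf)  # orthogonalize the update direction
    param.add_(update, alpha=-lr)
```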
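For the SmolLM2 entry, a minimal loading sketch with the Hugging Face transformers library is shown below. The checkpoint identifier is my assumption of the hub id for the 1.7B instruct variant; check the SmolLM2 collection on the Hub for the exact names and the smaller variants.

```python
# Minimal sketch with Hugging Face transformers; the model id below is an
# assumption about the hub checkpoint name, so verify it before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain gradient descent in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```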
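Finally, the Isotonic regression entry is easy to try directly with scikit-learn: fit a non-decreasing step function to noisy data, a trick that also shows up in probability calibration.

```python
# Short example using scikit-learn's IsotonicRegression on noisy monotone data.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = np.log1p(x) + rng.normal(scale=0.3, size=x.shape)  # monotone trend plus noise

iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
y_fit = iso.fit_transform(x, y)  # non-decreasing piecewise-constant fit
print(y_fit[:5])
```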

Perspectives

Meme of the week

What do you think? Did any of this week’s news capture your attention? Let me know in the comments.

If you have found this interesting:

You can look for my other articles, and you can also connect with or reach me on LinkedIn. Check this repository, which contains weekly updated ML & AI news. I am open to collaborations and projects. You can also subscribe for free to get notified when I publish a new story.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

Or you may be interested in one of my recent articles:

Written by Salvatore Raieli

Senior data scientist | about science, machine learning, and AI. Top writer in Artificial Intelligence