AI integration

Local LLM

Providers

Libraries

LangChain: doc, Python API, source
GPT4All: doc, code
LangChain4j: API doc, release notes, source
Spring AI
Griptape: doc
LangFlow
CUGA
Mellea

MCP

Model Context Protocol
MCP Inspector
mcptools
GitHub MCP Registry
MCP Servers on GraphQL
- mcp-graphql
- Apollo MCP Server
MCP Bundles

AGENTS.md

AGENTS.md

Skills

Mess

AssemblyAI: YouTube 📡
Abdul Majed Raja: YouTube 📡↓
"Prompt Engineering": YouTube 📡↓
Greg Kamradt: YouTube 📡
Matt Wolfe: YouTube 📡↓

Articles and videos

LLaMA & Alpaca: “ChatGPT” On Your Local Computer 🤯 | Tutorial by Martin Thissen (18 March 2023) ► A short explanation on how to use Dalai and LLaMA.
Hugging Face + Langchain in 5 mins | Access 200k+ FREE AI models for your AI apps by Jason Zhou (11 June 2023) ► A small effective demo of using Hugging Face and LangChain.
$0 Embeddings (OpenAI vs. free & open source)↑ by Greg Richardson (25 June 2023) ► A demo of two ways to compute embeddings: online with Hugging Face and locally in the Browser.
"Next Level Prompts?" - 10 mins into advanced prompting by Jason Zhou (29 August 2023) ► Some tools/sites helping to write prompts: Guidance, FlowGPT, gpt-prompt-engineer, PromptsRoyale.
A developer’s guide to open source LLMs and generative AI — Open source generative AI projects are a great way to build new AI-powered features and apps. by Gwen Davis (5 October 2023) ► Some information on open-source LLMs and a short list of four of them.
🤬 How the #@%$! Do You Use an LLM in a SaaS Platform?↓ by Arjan Egges (6 October 2023) ► Arjan Egges describes his first steps to build learntail.com, using OpenAI and LangChain to generate quizzes.
Pydantic is all you need: Jason Liu by Jason Liu (9 October 2023) ► Jason Liu presents his Instructor library to structure prompting and extraction (for OpenAI).
How I Fine-Tuned An AI Clone - Can You Tell The Difference?↓ by Greg Kamradt (2 November 2023) ► A lengthy yet rushed video that just ends up using HeyGen to create a deepfake video.
LLM: Trust, but Verify — Understand the challenges of developing, testing, and monitoring non-deterministic software; this is a new and significant challenge for observability. by Pratik Daga (3 November 2023) ► The author describes the problem of model drift and proposes a mechanism to detect it.
Wanna RAG? These are your best LLMs!!! by Abdul Majed Raja (16 November 2023) ► A presentation of Galileo’s Hallucination Index.
No, You DON'T NEED OpenAI Function Calling!!!! by Abdul Majed Raja (17 November 2023) ► A quick and dirty presentation of Gorilla OpenFunctions.
Training Your Own AI Model Is Not As Hard As You (Probably) Think by Steve Sewell (22 November 2023) ► Using several steps to generate code from a Figma design: I wonder if what is presented here really works in cases other than this demo.
llamafile is the new best way to run an LLM on your own computer by Simon Willison (29 November 2023) ► A presentation of llamafile: a single file containing the model and its executable which can run on several OSes.
Detect Texts from Documents (even SCANNED)!!! by Abdul Majed Raja (14 January 2024) ► A presentation of Surya: a tool to identify text lines and compute their bounding boxes.
Exploring ColBERT with RAGatouille by Simon Willison (27 January 2024) ► Some experimentation with ColBERT, a fast retrieval model.
Everything WRONG with LLM Benchmarks (ft. MMLU)!!! by Abdul Majed Raja (10 February 2024) ► Presenting a paper ("When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards") which analyses how sensitive models’ scores are to the way benchmarks are structured.
Engineering Practices for LLM Application Development by David Tan and Jessie Wang (13 February 2024) ► Lessons learned from the building of a PoC of a concierge using an LLM.
La recherche sous stéroïdes - une histoire de sémantique↓ by Mathilde Rigabert and Martin Labenne (3 May 2024) ► Some feedback about implementing semantic search on an e-commerce site. This could have been much shorter.
Poorman's ChatGPT-4o Works!! 🤣 by Abdul Majed Raja (15 May 2024) ► A short presentation of KingNish/OpenGPT-4o, a Hugging Face space supporting several modalities by using open models.
The 4 Big Changes in LLMs by Sam Witteveen (1 July 2024) ► Sam Witteveen advises considering four things: models are getting smarter, they are getting faster, they are getting cheaper, and context windows are getting larger.
RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing by Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, and Ion Stoica (1 July 2024) ► The authors propose a router that sends each query either to an expensive powerful model or to a cheaper smaller model, in order to reduce cost while sacrificing little quality.
↪What is an LLM Router? by Sam Witteveen (3 July 2024) ► Nothing more than the previous announcement.
InternLM - A Strong Agentic Model? by Sam Witteveen (5 July 2024) ► A basic presentation of internlm/internlm2_5-7b-chat, a model specialised for JSON and function calling.
Prompt Poet - Character AI's Prompting Framework by Sam Witteveen (2 August 2024) ► A presentation of Prompt Poet, a Python framework to manage prompts.
Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 1) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this first part of a blog series, we'll explore the fundamental principles of LLM caching, delve into the various caching architectures and implementations that can be employed by Uri Rosenberg (7 August 2024) ► Some cache architectures for LLM, classic and RAG. The cache key can be exact or semantic.
↪Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 2) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this second part of a blog series, we'll explore LLM caching implementations. by Uri Rosenberg (7 August 2024) ► How to implement the previous architecture on AWS, with or without LangChain.
How streaming LLM APIs work by Simon Willison (21 September 2024) ► Some experimentation with SSE on GPT-4o Mini, Sonnet 3, and Gemini Pro using curl, Python’s HTTPX, and JavaScript’s fetch().
Is Spring AI Strong Enough for AI? — Explore Spring's capabilities within the AI domain, its potential integration with AI libraries, and its ability to effectively manage AI workflows. by Reza Ganji (27 September 2024) ► This article is comparing very different things: Spring, TensorFlow Serving, Kubernetes, MLflow, and Python. Additionally, it only states some obvious facts.
Explore a New C# Library for AI by Matt Williams (11 October 2024) ► Very little information about Microsoft.Extensions.AI, a set of new .NET packages to integrate AI.
Run a prompt to generate and execute jq programs using llm-jq by Simon Willison (27 October 2024) ► A new llm plugin to generate and execute jq commands.
How Google is helping developers get better answers from AI — Today’s guest is Logan Kilpatrick, a senior product manager at Google, who tells Ben about his journey from software engineering to machine learning to product management, all with an emphasis on reducing developer friction. They talk through the challenges of non-determinism in AI models and how Google is addressing these issues with a new feature: Grounding with Google Search. Plus, what working at the Apple Store taught Logan about product management. by Logan Kilpatrick and Ben Popper (5 November 2024) ► There is no real information in this interview with Logan Kilpatrick, a product manager for Google AI Studio.
Model Compression: Improving Efficiency of Deep Learning Models — Model compression is a key component of real-time deployment of deep learning models. This article explores different approaches to make models more efficient. by Inderjot Singh Saggu (6 November 2024) ► A high-level and clear description of model pruning, quantisation, and knowledge distillation.
ChainForge by Simon Willison (8 November 2024) ► A little information about ChainForge, a tool to evaluate prompts.
Introducing the Model Context Protocol by Simon Willison (25 November 2024) ► Anthropic proposes a protocol to connect an LLM to tools and data sources.
Anthropic's New Agent Protocol! by Sam Witteveen (27 November 2024) ► A presentation of Model Context Protocol and some experimentation with it.
17 Python Libraries Every AI Engineer Should Know by Dave Ebbelaar (12 December 2024) ► The title says it all.
Integrating AI With Spring Boot: A Beginner’s Guide — In this guide, you will learn how to integrate AI into your Spring Boot app using Spring AI and simplify your AI setup with familiar Spring abstractions. by Gunter Rotsaert (27 January 2025) ► An introduction to Spring AI.
files-to-prompt 0.5 by Simon Willison (14 February 2025) ► Simon Willison describes his files-to-prompt tool, used to send some files and a prompt to an LLM.
Emerging Patterns in Building GenAI Products by Bharani Subramaniam and Martin Fowler (25 February 2025) ► A good overview of the common methods for integrating generative AI (mostly LLMs).
Open Deep Research (16 April 2025) ► Together AI explains how they build their open-source deep research.
Learn the Hugging Face Kernel Hub in 5 Minutes by David Holtz, Daniël De Kok, Nicolas Patry, Pedro Cuenca, Simon Pagezy, Merve Noyan, and Vaibhav Srivastav (12 June 2025) ► Hugging Face now hosts optimised kernels that can be easily downloaded and used in our own models.
Building a SNAP LLM eval: part 1 by Dave Guarino (19 June 2025) ► A description of the need to evaluate models and the first step of this evaluation: having a domain expert experimenting with the models to get a feeling of their strengths and weaknesses.
↪Building a SNAP LLM eval: Part 2 - testing and automation by Dave Guarino (19 June 2025) ► How to automate (using promptfoo) the evaluation of the knowledge of the facts.
↪Building a SNAP LLM eval: part 3 - testing nuanced capabilities by Dave Guarino (23 April 2025) ► How to automate the evaluation of nuanced capabilities.
↪Exploring Promptfoo via Dave Guarino’s SNAP evals by Simon Willison (24 April 2025) ► Some information extracted from the previous articles.
How Long Contexts Fail — Managing Your Context is the Key to Successful Agents by Drew Breunig (22 June 2025) ► Throwing everything in a very long context is not the simple solution we may believe it is: there are many problems with these long contexts.
↪How to Fix Your Context — Mitigating & Avoiding Context Failures by Drew Breunig (26 June 2025) ► Some advice to better manage the context.
How to deploy LLMs in 1 click...↓ by "2MinutesPy" (15 July 2025) ► This is simply an advertisement for Novita, a company renting cloud GPUs!
LangExtract - Google's New Library for NLP Tasks↓ by Sam Witteveen (4 August 2025) ► A description of LangExtract, an NLP library from Google. Sam Witteveen is botching the video even more than usual.
Introducing AI Sheets: a tool to work with datasets using open AI models! by Daniel Vila, Ame Vi, Francisco Aranda, Damián Pumar, Leandro von Werra, and Thomas Wolf (8 August 2025) ► Hugging Face proposes a new tool to evaluate models/prompts on a dataset that can be generated or imported.
SDS 917: 8 Steps to Becoming an AI Engineer, with Kirill Eremenko (⧉) by Kirill Eremenko and Jon Krohn (26 August 2025) ► Kirill Eremenko describes the 8-week training course his company, SuperDataScience, is selling: an overview of prompting, RAG, and agents, both for the PoC phase and the production stage.
Build a Local LLM App in Python with Just 2 Lines of Code by Chris Hay (8 October 2025) ► Chris Hay presents his chuk-llm library.
Prompt Engineering for LLMs, PDL, & LangChain in Action by Martin Keen (10 November 2025) ► A short introduction to LangChain and PDL.
AI & Text to SQL: How LLMs & Schema Power Data Analytics by Michael Dobson (13 December 2025) ► An introduction to the writing of SQL queries using an LLM.
How to Use Agentic AI: LLMs, AI Agents & Prompt Engineering in Action↓ by Shad Griffin (27 December 2025) ► This description of replacing a single prompt with a workflow of four prompts is unclear.
2026/01/13 - ParisJug Academy - Spring AI with Docker model runner and Debugging by Sreenu Doosari (13 January 2026) ► A demo of Spring AI and Docker Model Runner.
Open Responses - The NEW Standard API for Open Models by Sam Witteveen (20 January 2026) ► OpenAI proposes a standard payload format for chat completion APIs.
Beating Cowork with Open Source Cowork by Sam Witteveen (21 January 2026) ► A presentation of Eigent and its predecessor CAMEL-AI.
Running Pydantic’s Monty Rust sandboxed Python subset in WebAssembly by Simon Willison (6 February 2026) ► Pydantic created a Rust implementation of a subset of Python. Simon Willison converted it into a WASM file and into a Wheel file runnable in Pyodide.
A2A vs MCP: AI Agent Communication Explained by Martin Keen (2 March 2026) ► A short presentation of A2A and MCP.
What Is Agentic Storage? Solving AI’s Limits with LLMs & MCP by Martin Keen (5 March 2026) ► Using MCP to allow AI agents to store data and how to secure the operations on that storage.
7 new open source AI tools you need right now… by Jeff Delaney (12 March 2026) ► Agency Agents, PromptFoo, MiroFish, Impeccable, OpenViking, Heretic, NanoChat.
What Are Hierarchical AI Agents? Solving Context & Task Challenges by Martin Keen (12 March 2026) ► A basic presentation of agent hierarchies, their advantages, and their problems.
What Is Llama.cpp? The LLM Inference Engine for Local AI by Cedric Clyburn (16 March 2026) ► A presentation of Llama.cpp.
Building Single-User vs Multi-User Agents: What Actually Changes by Sam Witteveen (24 March 2026) ► This video is not really about single-user vs. multi-user, but about the fact that a quick n’ dirty personal tool is not the same as a professional multi-tenant one.
An ADK Java agent powered by Gemma 4 by Guillaume Laforge (2 April 2026) ► Three ways to call a Gemma 4 model: using ADK for AI Studio, ADK and the LangChain4j bridge for vLLM, and the same for Ollama.
LLM Wiki by Andrej Karpathy (4 April 2026) ► Andrej Karpathy describes a "pattern for building personal knowledge bases using LLMs"; the comments are about everyone building their own over-ambitious knowledge repository.
The 7 Skills You Need to Build AI Agents by Bri Kopecki (14 April 2026) ► System Design, Tool and Contract Design, Retrieval Engineering, Reliability Engineering, Security and Safety, Evaluation and Observability, Product Thinking. Bri Kopecki is just listing subjects with little detail. I do not see the value of such a list.
How Claude's Design Agents Work by Sam Witteveen (1 May 2026) ► Sam Witteveen found six design patterns in Claude Design: context grounding, structured memory, multimodal interaction with the user, self-evaluation, generation of multiple versions, and handoff using a common format.
The Agent Harness: Building Secure Sandboxes for Autonomous AI Workloads by Ivan Burazin and Matt Turck (14 May 2026) ► Some information on the sandboxes offered by Daytona.
CAG vs Long Context: How AI Models Use and Remember Information by Martin Keen (21 May 2026) ► Yet another presentation of Cache-Augmented Generation.
Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut (⧉) by Alex Bowcut and Sam Charrington (9 June 2026) ► Some non-technical information about Sphere’s software used to find the tax rules applicable to a product in a country or state.
Fine-tuning
- Fine Tune a model with MLX for Ollama by Matt Williams (30 August 2024) ► How to fine-tune a model with MLX and use it in Ollama.
- ↪Is MLX the best Fine Tuning Framework? by Matt Williams (18 January 2025) ► A detailed introduction to fine-tuning with MLX. This is an expanded version of the previous video.
- Fine-tuning Large Language Models by Zain Hasan, Artem Chumachenko, George, and Max Ryabinin (16 January 2025) ► The basics of LLM and fine-tuning, a demo of Together’s LoRA fine-tuning API, some experiments done by Together, and some pieces of advice.
- Fast Fine Tuning with Unsloth by Matt Williams (24 January 2025) ► A presentation of Unsloth which optimises fine-tuning on NVIDIA GPUs.
- Axolotl is a AI FineTuning Magician↓ by Matt Williams (31 January 2025) ► This presentation of Axolotl is too verbose and it is not very good because Matt Williams does not master the subject.
Skills
- How To Use AI Skills Like A Senior Developer by Kyle Cook (16 June 2026) ► An introduction to skills and some advice on how to write good ones.
RAG
- ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings) by Greg Richardson (7 February 2023) ► Greg Richardson describes in detail how he implemented a chat to answer questions on Supabase: tokenising the doc, finding the paragraph closest to the question, and generating the answer.
- Build RAG Application Using a LLM Running on Local Computer with GPT4All and Langchain — Privacy-preserving LLM without GPU↑ by "(λx.x)eranga" (10 March 2024) ► A clear explanation with working code of how to scrape an Internet doc, to chunk it, to store it in Chroma, and to use GPT4All to generate the answer.
- Ne mettez pas les projets RAG en production trop vite ! by Philippe Prados (3 June 2024) ► Philippe Prados lists some examples of problems that will occur with an overly simplistic implementation of a RAG. But this simply means that you do not design a demo and a scalable application the same way: the latter is much more complex.
- ↪Rendre résilient un projet RAG by Philippe Prados (17 June 2024) ► Philippe Prados suggested many changes to LangChain in order to make it more resilient, e.g. to properly support transactions.
- Breaking up is hard to do: Chunking in RAG applications — A look at some of the current thinking around chunking data for retrieval-augmented generation (RAG) systems. by Ryan Donovan (6 June 2024) ► A high-level presentation of some chunking methods and how to evaluate them.
- Supercharging RAG with Generative Feedback Loops from Weaviate by Letitia Parcalabescu (17 June 2024) ► A presentation of Generative Feedback Loops, which is just about storing LLM-generated text in a vector database, so it can be retrieved quickly rather than regenerated by the LLM.
- Building search-based RAG using Claude, Datasette and Val Town by Simon Willison (21 June 2024) ► The debrief of a live session of implementing a small RAG in Val Town.
- Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained) by Yannic Kilcher (26 June 2024) ► A critique of "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools".
- Gemma 2 - Local RAG with Ollama and LangChain by Sam Witteveen (28 June 2024) ► A simple RAG implementation.
- Practical tips for retrieval-augmented generation (RAG) — Retrieval-augmented generation (RAG) is one of the best (and easiest) ways to specialize an LLM over your own data, but successfully applying RAG in practice involves more than just stitching together pretrained models. by Cameron R. Wolfe PhD (15 August 2024) ► Some high-level advice on how to implement RAG.
- Knowledge Graphs: The Secret Weapon for Superior RAG Applications — Integrating knowledge graphs in RAG applications enhances recommendation accuracy and context-awareness, providing structured, interconnected data.🚫 by Pavan Vemuri, Prince Bose, and Tharakarama Reddy Yernapalli Sreenivasulu (19 August 2024) ► This article is only about the data retrieval. The data needs to be structured, so it can be stored as a semantic graph.
- RAG vs. Fine Tuning by Cedric Clyburn (9 September 2024) ► The basics of RAG vs. fine-tuning, and a description of combining both.
- Introducing Contextual Retrieval↑ (19 September 2024) ► Anthropic experimented with improving RAG by adding context to chunks, combining embeddings with BM25, and reranking.
- ↪Contextual RAG is stupidly brilliant!↓ by Abdul Majed Raja (23 September 2024) ► This presentation of Anthropic’s analysis on how to improve RAG is poorly done.
- Multimodal Document RAG with Llama 3.2 Vision and ColQwen2 by Zain Hasan (8 October 2024) ► A presentation of ColPali design: using a vision language model (PaliGemma or Qwen-2) to transform image patches into vectors, finding the patch vectors nearest to the user query, and providing the corresponding full images and user query to a vision LLM (Llama 3.2 vision).
- Why Your RAG System Is Broken, and How to Fix It with Jason Liu (⧉) by Jason Liu and Sam Charrington (11 November 2024) ► Some advice on RAG implementation: running fast and simple evals (e.g. looking at the length, using regexp…), running them very frequently, reranking…
- Build a document-based question answering system by using Docling with Granite 3.1 by Ash Minhas, Anna Gutowska, and Erika Russi (18 December 2024) ► A small demo of interrogating a document using Granite, Docling, LangChain, and FAISS.
- 2 Methods For Improving Retrieval in RAG by Johannes Jolkkonen (19 December 2024) ► This video seems to be a real usage of RAG, not the usual YouTuber doing the usual demo. The guy improved his RAG system by preprocessing the documents to extract structured data from them using an LLM.
- GraphRAG vs. Traditional RAG: Higher Accuracy & Insight with LLM↓ by Sara Bacha (17 February 2025) ► This presentation of GraphRAG is too high-level: it gives no clue on how to implement it.
- Build an AI-powered multimodal RAG system with Docling and Granite by BJ Hargrave and Erika Russi (26 February 2025) ► Yet another RAG example, this one extracts text, tables, and images from a PDF file.
- RAG vs. CAG: Solving Knowledge Gaps in AI Models↑ by Martin Keen (17 March 2025) ► A basic and good comparison of Retrieval-Augmented Generation and Cache-Augmented Generation.
- What is Retrieval-Augmented Fine-Tuning (RAFT)? by Isaac Ke (9 June 2025) ► Fine tuning a model so it gets better at using only the relevant documents provided by the retrieval part and at answering that it does not know if no document is relevant.
- Improving Retrieval with ELO Scores (8 July 2025) ► ZeroEntropy explains the training process they used for their zerank-1 rerankers.
- Graph Databases: When to Use Them (And When to Run Away) by Jo Kristian Bergum and Hamel Husain (8 December 2025) ► You can perform some graph RAG without a graph database.
- What is OpenRAG? Unlocking the Future of RAG in Generative AI by David Jones-Gilardi (12 February 2026) ► A short presentation of OpenRAG: Docling + OpenSearch + Langflow.
- What is Multimodal RAG? Unlocking LLMs with Vector Databases by Martin Keen and Josh Spurgin (16 February 2026) ► Some options to implement multimodal RAG: textify everything, hybrid multimodal, and full multimodal.
- Is RAG Still Needed? Choosing the Best Approach for LLMs by Martin Keen (9 March 2026) ► A comparison of the advantages and problems of large context vs. RAG.
- Vector Search with LLMs - Computerphile by Mike Pound (11 March 2026) ► A basic introduction to text embedding and RAG.
- RAG's Evolution: From Simple Retrieval to Agentic AI↓ by Sam Anthony (5 May 2026) ► The evolution of searching presented for dummies: from text matching, to RAG, to agents.
- NotebookLM
  - Google's RAG Experiment - NotebookLM by Sam Witteveen (28 May 2024) ► The title says it all. Google’s demo is impressive, using voice for querying and answering.
  - How to create AI Podcasts with NotebookLM Tutorial by Abdul Majed Raja (17 September 2024) ► A presentation of an impressive Google demo (usable from Illuminate and NotebookLM): you give a paper as input and it generates a two-person podcast.
  - NotebookLM’s automatically generated podcasts are surprisingly effective by Simon Willison (29 September 2024) ► People are playing with NotebookLM-generated podcasts, sometimes at a meta-level.
  - New in NotebookLM: Customizing your Audio Overviews by Simon Willison (17 October 2024) ► Simon Willison is playing with the fact that NotebookLM users can now provide guidelines for the podcast to generate: as usual he picks up the pelican example and asks the AI-hosts to behave as if they were pelicans.
  - Google's UNREAL AI Gets an UPGRADE... by Wes Roth (19 October 2024) ► The "poop fart" podcast and how Wes Roth added video to it using HeyGen. He also quickly describes the new NotebookLM features.
Web scraping
- Web Scraping AI AGENT, that absolutely works 😍 by Abdul Majed Raja (9 May 2024) ► A presentation of ScrapeGraphAI, a Python library to scrape a website and to interrogate an LLM on the scraped data.
- “Wait, this Agent can Scrape ANYTHING?!” - Build universal web scraping agent by Jason Zhou (16 May 2024) ► Scraping the Web with FireCrawl or AgentQL, and an LLM.
- How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai by Hai Nghiem (17 May 2024) ► Some Web scraping tools: Beautiful Soup, Jina AI, Firecrawl, and Scrapegraph-ai.
- How Stack Overflow fends off scraping bots — Josh Zhang, a staff site reliability engineer at Stack Overflow, tells Ryan and Eira how the Stack Exchange network defends against scraping bots. They also cover the emergence of human botnets, why DDoS attacks have spiked in the last couple of years, and the constant balancing act of protecting sites from attack without inhibiting legitimate users. by Josh Zhang, Ryan Donovan, and Eira May (30 July 2024) ► The subtitle says it all.
- Agentically scrape the web with Firecrawl & LangGraph (LangChain) by Hai Nghiem (25 October 2024) ► The title says it all.
- NuExtract 1.5 by Simon Willison (16 November 2024) ► NuExtract models extract structured data from unstructured text.
Tool calling
- AI Agents' Secret Sauce by Sam Witteveen (7 October 2024) ► Some basic but good advice on how to implement tools.
- What is Tool Calling? Connecting LLMs to Your Data by Roy Derks (13 January 2025) ► The basic presentation of tool calling is classic. But the description of "embedded tool calling" is not detailed enough to understand how that can work.
- Agent Skills: Code Beats Markdown (Here's Why) by Sam Witteveen (27 March 2026) ► Instead of being simple data transferrers, scraping tools should handle as much as possible: parse and generate condensed structured data, limit the number of navigated pages, and cache the data.
Docling
- Docling by Simon Willison (3 November 2024) ► Some brief feedback on experimenting with Docling.
- Building a Basic RAG System with Docling: A Comprehensive Guide↓ by Shashanka B R (24 December 2024) ► This presentation on doing RAG on a PDF file is rather bad, but the code (in the GitHub repo) is fine.
- How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites) by Dave Ebbelaar (13 February 2025) ► A presentation of Docling, a library for parsing documents.
- What Is Docling? Transforming Unstructured Data for RAG and AI by Cedric Clyburn (4 August 2025) ► A presentation of Docling for managers.
- Unlock Better RAG & AI Agents with Docling by Cedric Clyburn and Ming Zhao (8 January 2026) ► A presentation of Docling, which can now be used as an MCP server.
LiteParse
- LiteParse - The Local Document Parser by Sam Witteveen (26 March 2026) ► A presentation of LiteParse, a library to extract text or screenshots from PDF files, Office docs, and images.
Frameworks
- LLM Toolkit: Validation is all you need by Jeff Schomay (20 May 2024) ► Building a tool that, from an English question, performs a database request and generates an answer, using Instructor and Fructose.
- LangChain
  - LangChain101: Question A 300 Page Book (w/ OpenAI + Pinecone) by Greg Kamradt (27 February 2023) ► A small demo using LangChain, OpenAI, and Pinecone.
  - Workaround OpenAI's Token Limit With Chain Types by Greg Kamradt (1 March 2023) ► Some solutions to summarise or extract answers from too long documents.
  - The LangChain Cookbook - Beginner Guide To 7 Essential Concepts by Greg Kamradt (29 March 2023) ► Some short examples of the LangChain features.
  - ↪The LangChain Cookbook Part 2 - Beginner Guide To 9 Use Cases by Greg Kamradt (2 May 2023) ► The continuation of the previous video.
  - LangChain: Run Language Models Locally - Hugging Face Models by "Prompt Engineering" (25 April 2023) ► A demo of executing a model on Hugging Face and locally.
  - 5 Levels Of LLM Summarizing: Novice to Expert by Greg Kamradt (4 May 2023) ► More LangChain examples.
  - Scrape any website with OpenAI Functions & LangChain by "LLM School" (2 August 2023) ► The title says it all.
  - RAG avec LangChain : construire un chatbot documentaire (exemple OCTO) by Florian Bastin and Nicolas Cavallo (17 October 2023) ► A detailed example demonstrating how to extract data from Confluence, embed the chunks, create a chain to find and format the answer, and evaluate the result.
  - Content Extraction using Large Language Models & JavaScript↓ by Amanda Winkles (9 January 2025) ► An example of using LangChain with Granite to extract data from a PDF and Mistral Large to format it into a Markdown table. But the JSON format is unspecified, she is using some flaky heuristic to clean up Granite’s answer, an LLM is overkill to convert JSON into a Markdown table…
  - LangChain RAG: Optimizing AI Models for Accurate Responses by Erika Russi (13 February 2025) ► A simple RAG system using LangChain and Granite 3.0 8B Instruct.
  - LangChain Reaches 1.0 - Whats new? by Sam Witteveen (26 October 2025) ► An overview of LangChain’s current products while it is raising more money.
- LangChain4j
  - Java Meets AI: A Hands On Guide to Building LLM Powered Applications with LangChain4j by Lize Raes (5 October 2023) ► An overview of LangChain4j.
  - Experiments with Langchain4j or Java way to LLM-powered applications by Iryna Hvozdyk (6 February 2024) ► A good overview of LangChain4j features, mostly for people who do not know the typical AI use cases.
  - The Definitive Guide to Tool Support in LangChain4J by Ken Kousen (24 February 2024) ► A rather slow presentation of using tools in LangChain4j.
  - Java rencontre l'IA : Comment intégrer les LLMs dans vos applications avec LangChain4j by Lize Raes (3 May 2024) ► The same, in French and updated.
  - Evolution of Java Ecosystem for Integrating AI by Poonam Parhar (29 January 2025) ► Building a RAG chat using LangChain4j and Oracle Generative AI.
  - Agent Orchestration with LangChain4J by Lize Raes (30 November 2025) ► A short presentation of LangChain4j’s features for implementing agents.
Tools
- Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM by Simon Willison (31 July 2025) ► The title says it all.
- Ollama
  - Ollama on CPU and Private AI models! by Abdul Majed Raja (8 November 2023) ► A presentation of Ollama.
  - Ollama Web UI (ChatGPT-ish) - Local AI FTW!!! by Abdul Majed Raja (1 December 2023) ► Running Ollama Web UI in Docker.
  - Ollama's Newest Release and Model Breakdown by Matt Williams (21 September 2024) ► Ollama 0.3.11, Solar Pro Preview, Qwen 2.5, Bespoke Minicheck, Mistral Small, and Reader-LM.
  - Quick Look at Hollama↓ by Matt Williams (8 October 2024) ► The "unboxing" of Hollama, a good basic UI for Ollama. But there is little value in such a video: you can easily do the same yourself.
  - Ollama + HuggingFace - 45,000 New Models by Sam Witteveen (25 October 2024) ► Ollama can now use any GGUF recorded on Hugging Face.
  - Ollama: Llama 3.2 Vision by Simon Willison (13 November 2024) ► Very little information about Ollama supporting the vision features of Llama 3.2.
  - Open WebUI by Simon Willison (27 December 2024) ► Simon Willison discovers Open WebUI; he is pleased with how easy it is to install, and he experiments with it using Llama 3.2 3B.
  - Building a Vision App with Ollama Structured Outputs by Sam Witteveen (31 December 2024) ► A presentation of Ollama Structured Outputs and some examples using them with Llama 3.2’s vision.
  - Solved with Windsurf by Matt Williams (14 February 2025) ► Matt Williams wrote a utility in Rust, a language he barely knows, using Windsurf, to get a report on the models installed in Ollama.
  - Function calling using LLMs — Building AI Agents that interact with the external world. by Kiran Prakash (6 May 2025) ► A simple example of a script using tools, then converted to use MCP.
  - Ollama Gets a New App by Sam Witteveen (31 July 2025) ► The title says it all.
  - New Ollama UI by Matt Williams (31 July 2025) ► The same, but more informative.
  - Debugging Ollama↓ by Matt Williams (4 August 2025) ► Matt Williams experimented and discovered that OLLAMA_DEBUG’s behaviour has been changed. He should just have searched for it (PR).
  - 300 tps Just Using Ollama? by Matt Williams (6 August 2025) ► A short presentation of Ollama Turbo, which consists of models hosted in the Cloud.
  - Ollama Launch + Claude Code + GLM Flash by Sam Witteveen (25 January 2026) ► Testing Claude Code, as a mode, with GLM-4.7-Flash in Ollama: it is slow and the results are not good.
  - Friends Don't Let Friends Use Ollama — Ollama gained traction by being the first easy llama.cpp wrapper, then spent years dodging attribution, misleading users, and pivoting to cloud, all while riding VC money earned on someone else's engine. Here's the full history, and why the alternatives are better. by "Zetaphor" (15 April 2026) ► "Zetaphor" describes some problems with the Ollama project: they do not indicate that they use Llama.cpp, they add complexity, their model offer is not clear…
  - The Ollama Course of Matt Williams
    - 1. The Ollama Course: Intro to Ollama by Matt Williams (23 July 2024) ► An overview of Ollama: installation, basic usage, and downloading a model.
    - 2. Installing Ollama by Matt Williams (30 July 2024) ► How to install Ollama on Windows, Linux, and MacOS.
    - 3. How to use the Ollama.com site to Find Models by Matt Williams (6 August 2024) ► An explanation of the description of Ollama models.
    - 4. The Ollama Course - Using the CLI by Matt Williams (14 August 2024) ► A presentation of all the CLI commands.
    - 5. Comparing Quantizations of the Same Model - Ollama Course by Matt Williams (21 August 2024) ► Compare the results of the same model with different quantisations and select the one whose quality/speed trade-off best fits your needs.
    - 6. An Introduction to RAG - Part of the Free Ollama Course by Matt Williams (29 August 2024) ► A basic introduction to RAG.
    - 7. Embeddings in Depth - Part of the Ollama Course by Matt Williams (4 September 2024) ► An overview of how to compute embeddings using Ollama.
    - Let's build a RAG system - The Ollama Course🚫 by Matt Williams (11 September 2024) ► An example of a small RAG program, both in Python and JavaScript.
    - What are the different types of models - The Ollama Course by Matt Williams (19 September 2024) ► A basic presentation of the model types: text/base, chat/instruct, code, and vision.
    - Crack Ollama Environment Variables with Ease - Part of the Ollama Course by Matt Williams (26 September 2024) ► The most important environment variables and how to set them on MacOS, Linux, and Windows.
    - Upgrade Your AI Using Web Search - The Ollama Course by Matt Williams (2 October 2024) ► A simple program using SearXNG and Cheerio to perform a Web search, retrieve the found pages, scrape the text in them, and generate an answer with Llama 3.2 1B.
    - Taming AI Hallucinations?🚫 by Matt Williams (9 October 2024) ► Matt Williams describes some basic facts about hallucination.
    - Unlock AI Mastery with Pro Tips on Prompting! by Matt Williams (16 October 2024) ► Some basics on prompt writing.
    - Master Ollama's File Layout in Minutes! by Matt Williams (23 October 2024) ► A description of how Ollama stores the models using several files, similarly to what Docker does.
    - Don’t Embed Wrong! by Matt Williams (31 October 2024) ► Matt Williams speaks about using prefixes for RAG with Ollama, but there is no explanation of how they work; he just says that they improve results.
    - AI Model Context Decoded by Matt Williams (6 November 2024) ► How to change the context size and some warnings about using a large context size.
    - AI Vision Models Take a Peek Again! by Matt Williams (8 November 2024) ► Using Llama 3.2’s vision in Ollama 0.4.0.
    - Let's Update Ollama Everywhere by Matt Williams (13 November 2024) ► Explaining something very basic: upgrading Ollama on Mac, Windows, Linux, and Docker.
    - Cracking the Enigma of Ollama Templates by Matt Williams (20 November 2024) ► An introduction to model templates.
    - Find Your Perfect Ollama Build by Matt Williams (22 November 2024) ► How to build Ollama, the main branch or a PR.
    - Simplify Ollama Cleanup Like a Pro by Matt Williams (27 November 2024) ► A presentation of Gollama to clean up Ollama data and how to uninstall Ollama.
    - The Path To Better Custom Models by Matt Williams (6 December 2024) ► An introduction to Ollama model files.
    - The Truth About Ollama's Structured Outputs by Matt Williams (11 December 2024) ► A presentation of structured outputs and a comparison with JSON mode.
    - Optimize Your AI - Quantization Explained↓ by Matt Williams (28 December 2024) ► This description of model and context quantisation is unclear, mostly because there is no technical explanation.
    - MSTY Makes Ollama Better by Matt Williams (28 February 2025) ► A presentation of MSTY, a UI for Ollama.
- Docker Model Runner
  - Docker Model Runner: Running AI Models Locally Made Simple — Docker Model Runner: run AI models locally with zero setup. Pull from Docker Hub, chat via CLI or API. OpenAI-compatible. Beta. by Suleiman Dibirov (1 July 2025) ► Docker is also joining the AI craze, offering an alternative to Ollama: Docker Model Runner.
- LangFlow
  - What is Langflow? Build AI Workflows with Python, Gen AI, & MCP Tools by David Jones-Gilardi (12 January 2026) ► A short presentation of LangFlow.
- llm
  - Language models on the command-line w/ Simon Willison by Simon Willison and Hugo Bowne-Anderson (13 June 2024) ► Simon Willison presents his llm CLI tools.
  - ↪Language models on the command-line by Simon Willison (17 June 2024) ► An overview of the video.
  - Using LLMs on the command line by Mark Needham (26 October 2024) ► A short presentation of llm.
  - Ask questions of SQLite databases and CSV/JSON files in your terminal by Simon Willison (25 November 2024) ► Simon Willison adds to sqlite-utils the possibility to ask questions in natural language and have an LLM generate the SQL query.
  - How I use LLMs – neat tricks with Simon’s `llm` tool — Earlier this year I co-authored a report about the direct environmental impact of AI, which might give the impression I’m massively anti-AI, because it talks about the signficant social and environmental of using it. I’m not. I’m (still, slowly) working through the content of the Climate Change AI Summer School, and I use it a fair amount in my job. This post shows some examples I use. by Chris Adams (30 December 2024) ► Some positive feedback and some examples of usage of llm.
  - LLM 0.22, the annotated release notes by Simon Willison (17 February 2025) ► The title says it all.
  - Structured data extraction from unstructured content using LLM schemas by Simon Willison (28 February 2025) ► Simon Willison added support for JSON schemas to llm.
  - llm-openrouter 0.4 by Simon Willison (10 March 2025) ► Simon Willison improved the support of OpenRouter.
  - Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25) by Simon Willison (5 May 2025) ► The release notes of llm 0.25 with some details about a new llm-video-frames plugin to extract frames from a video and send them to the model.
  - LLM 0.26a0 adds support for tools! by Simon Willison (14 May 2025) ► Simon Willison prototypes tool integration in llm.
  - How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation by Simon Willison (24 May 2025) ► Sean Heelan found a use-after-free bug using OpenAI’s o3 via llm.
  - LLM 0.27, the annotated release notes: GPT-5 and improved tool calling by Simon Willison (11 August 2025) ► GPT-5 is now supported and tools can now be configured in templates.
  - LLM 0.32a0 is a major backwards-compatible refactor by Simon Willison (29 April 2026) ► Simon Willison is rewriting the internals of llm to better handle multimodality; his design is classic and seems to be a thin layer on top of the current APIs.
  - Using LLM in the shebang line of a script by Simon Willison (11 May 2026) ► The title says it all.
Agents
- 5 Problems Getting LLM Agents into Production by Sam Witteveen (4 June 2024) ► Some advice on using agents.
- Evals for AI Agents, the right way!!! by Abdul Majed Raja (12 August 2024) ► The usual bad presentation of a paper ("TOOLSANDBOX: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities") evaluating the efficiency of using LLMs as agents.
- Agent-S : Unleash The Power Of GUI Computer Use Agents ! by Sam Witteveen (21 October 2024) ► A high-level presentation of "Agent S: An Open Agentic Framework that Uses Computers Like a Human", a framework to use applications as a human would.
- Microsoft Launches 10 NEW AI Agents by Sam Witteveen (24 November 2024) ► Microsoft is moving aggressively on AI and is integrating 10 agents in Dynamics 365.
- Building effective agents (19 December 2024) ► A good and simple overview of some workflow and agent architectures.
- ↪Building effective agents by Simon Willison (20 December 2024) ► Some extracts of the previous article.
- ↪How to Build Effective AI Agents (without the hype) by Dave Ebbelaar (20 January 2025) ► This video is just Anthropic’s article.
- Trace & Evaluate your Agent with Arize Phoenix by Sri Chavali, John Gilhuly, and Aymeric Roucher (28 February 2025) ► A presentation of Arize Phoenix, a platform to trace and evaluate smolagents; the evaluation uses LLM-as-a-judge.
- 5 Types of AI Agents: Autonomous Functions & Real-World Applications by Martin Keen (28 April 2025) ► Martin Keen proposes these categories: simple reflex, model-based reflex, goal-based, utility-based, and learning.
- Must Haves For Agents in Production by Sam Witteveen (15 April 2026) ► An advertisement for TrueFoundry: Model Control, Prompts, Guardrails, Budget Limiting, Tools, Monitoring and Tracing, and Evals.
- LangGraph
  - AgentWrite with LangGraph by Sam Witteveen (6 September 2024) ► Sam Witteveen describes how he set up a short LangGraph example to write long articles, similarly to LongWriter.
  - Building a LangGraph ReAct Mini Agent by Sam Witteveen (17 September 2024) ► A description of a simple pattern in LangGraph: ReAct function calling.
- ChatDev
  - Build AI agent workforce - Multi agent framework with MetaGPT & chatDev by Jason Zhou (8 September 2023) ► A presentation of ChatDev.
- CrewAI
  - CrewAI August Update: Planning Steps, Training, and Advanced Features Explained by Sam Witteveen (20 August 2024) ► Sam Witteveen presents some new CrewAI features, but there is no explanation on how training is taken into account, on how test scores are computed…
  - SDS 918: Multi-Agent Systems with CrewAI (⧉) by Jon Krohn (29 August 2025) ► A short and naive presentation of CrewAI.
- AutoGen
  - Autogen - Microsoft's best AI Agent framework that is controllable? by Jason Zhou (3 October 2023) ► A presentation of AutoGen.
  - Microsoft's Magentic One: This FREE AI AGENT can CONTROL BROWSER, DO CODING & MORE! by "AICodeKing" (10 November 2024) ► A presentation and a small test of Magentic-One, a multi-agent system from Microsoft able to surf the Web, read local files, write code, and pilot a terminal to execute that code.
  - Multi-Agent AI EXPLAINED: How Magentic-One Works by Sam Witteveen (13 November 2024) ► A better presentation of Magentic-One.
- Swarm
  - Introducing Swarm with Code Examples: OpenAI's Groundbreaking Agent Framework by Sam Witteveen (14 October 2024) ► Some simple examples using the Swarm framework and some feedback about it.
- PydanticAI
  - PydanticAI - The NEW Agent Builder on the Block by Sam Witteveen (4 December 2024) ► Yet another framework. PydanticAI is simple and pythonic.
  - PydanticAI - Building a Research Agent by Sam Witteveen (6 December 2024) ► Using PydanticAI to create an agent for Web search.
- smolagents
  - smolagents - HuggingFace's NEW Agent Framework by Sam Witteveen (6 January 2025) ► Hugging Face has created yet another agent framework. Sam Witteveen does his usual presentation and experimentation with it.
  - ↪How to make Muilt-Agent Apps with smolagents by Sam Witteveen (8 January 2025) ► More experimentation with smolagents, in particular with multi-agent configurations.
Desktop agents
- UI-TARS AI Agent: This IS THE BEST AI Agent EVER & BEATS Claude's Computer Use! by "AICodeKing" (23 January 2025) ► A simplistic demo of UI-TARS, an agent that can pilot application UIs.
Browser agents
- Browser Use Agent: This FULLY FREE AI Agent CAN CONTROL BROWSERS & DO ANYTHING! (Beats Anthropic!) by "AICodeKing" (18 November 2024) ► A presentation of Browser Use, a Python framework to create agents able to drive a Browser.
- Deepseek Operator (+Free APIs) : This 100% FREE AI Agent Beats OpenAI's Operator FOR FREE! by "AICodeKing" (24 January 2025) ► A demo of Browser Use WebUI, a UI for a Browser agent.
- Qwen-2.5 Operator: This is The BEST LOCAL AI Operator Agent THAT YOU CAN USE NOW! by "AICodeKing" (30 January 2025) ► Using Browser Use with Qwen2.5-VL.
- Gemini Browser Use by Sam Witteveen (14 February 2025) ► Some simple experimentation with Browser Use and Gemini 2.0.
OpenAI Agents SDK
- How to Build an Agent with the OpenAI Agents SDK by Sam Witteveen (17 March 2025) ► A classic Sam Witteveen presentation.
OpenClaw
- The wild rise of OpenClaw... by Jeff Delaney (30 January 2026) ► A presentation of OpenClaw.
- Moltbook is the most interesting place on the internet right now by Simon Willison (30 January 2026) ► The world is crazy about OpenClaw: there is even a chat site where OpenClaw assistants can communicate with each other.
- The Moltbook Situation by Michael B. Paulson (31 January 2026) ► The same.
- Running OpenClaw in Docker by Simon Willison (1 February 2026) ► Simon Willison explains how he installed OpenClaw using Docker Compose.
- A Social Network for A.I. Bots Only. No Humans Allowed. by Simon Willison (2 February 2026) ► Simon Willison has been interviewed about Moltbook.
- Clawdbot to Moltbot to OpenClaw: The 72 Hours That Broke Everything (The Full Breakdown) by Nate B. Jones (2 February 2026) ► Nate B. Jones’ analysis of the current OpenClaw buzz: it is a glimpse at the future, but too dangerous to be used if you are not a technical guru and a daredevil.
- The Moltbook Experiment Failed by Michael B. Paulson (3 February 2026) ► Moltbook was quickly hacked.
- OpenClaw Agents Are Hiring Each Other. Transferring Crypto. Building Societies. This Is Real. by Nate B. Jones (3 February 2026) ► It seems that Nate B. Jones was not aware that Moltbook was a hacker paradise when he recorded this analysis claiming that Moltbook is the proof that agents can self-organise…
- The rise of Moltbook suggests viral AI prompts may be the next big security threat — We don’t need self-replicating AI models to have problems, just self-replicating prompts. by Benj Edwards (3 February 2026) ► Moltbook could be a preview of prompt worms that will replicate themselves across the Internet.
- Andrej Karpathy talks about "Claws" by Simon Willison (21 February 2026) ► Andrej Karpathy has tweeted about "claws", all the OpenClaw variants.
- Openclaw deletes entire inbox by Michael B. Paulson (25 February 2026) ► Summer Yue used OpenClaw to clean up her inbox…
- NVIDIA NemoCLAW!! - GTC 2026 by Sam Witteveen (17 March 2026) ► NVIDIA announced NemoClaw, an environment around OpenClaw to secure it.
- How to Avoid Runaway API Costs in OpenClaw by Brian Gershon (22 March 2026) ► Brian Gershon provides some advice to reduce the token bill when using OpenClaw.
- I finally found a use case for OpenClaw…↓ by Jeff Delaney (23 April 2026) ► An advertisement for Hostinger’s OpenClaw hosting.
- Warelay -> OpenClaw by Simon Willison (16 May 2026) ► The evolution of OpenClaw’s name.
- AI Agent writes hit piece
  - An AI Agent Published a Hit Piece on Me by Scott Shambaugh (12 February 2026) ► An OpenClaw agent (or maybe a human) is playing the fool with Matplotlib’s maintainer.
  - An AI Agent Published a Hit Piece on Me by Simon Willison (12 February 2026) ► A summary of the previous blog article.
  - An AI Agent Published a Hit Piece on Me – More Things Have Happened by Scott Shambaugh (13 February 2026) ► Scott Shambaugh gives an update: Ars Technica screwed up an article with fake citations, the bot is still active, Brandolini’s law, and the end of reputation.
  - Sorry all this is my fault by Benj Edwards (15 February 2026) ► Benj Edwards explains how he ended up providing fake citations to his co-author.
  - The obnoxious GitHub OpenClaw AI bot is … a crypto bro by David Gerard (16 February 2026) ► David Gerard presents the hypothesis that the hit piece was written by a crypto bro to scam more people by creating a crypto token having the same name as the bot.
  - An AI Agent Published a Hit Piece on Me – Forensics and More Fallout by Scott Shambaugh (17 February 2026) ► The story continues, but there is little information here: statistics on the bot’s activity and a post written by the bot.
  - Rathbun’s Operator (17 February 2026) ► The bot’s operator tells their version of the story.
  - An AI Agent Published a Hit Piece on Me – The Operator Came Forward by Scott Shambaugh (19 February 2026) ► Scott Shambaugh describes his hypotheses and the probability he allocates to each one.
  - AI Agent writes hit piece by Michael B. Paulson (20 February 2026) ► Even Michael B. Paulson is speaking about this story.
MCP
- What is MCP? Integrate AI Agents with Databases & APIs by Roy Derks (19 February 2025) ► A high-level description of MCP.
- microsoft/playwright-mcp by Simon Willison (25 March 2025) ► Microsoft released an MCP server wrapping Playwright.
- MCP Tools vs Official MCP Inspector: Choosing the Right Tool for Model Context Protocol Development — Discover the key differences between the official Official MCP Inspector and MCP Tools. Learn when to use each tool and how MCP Tools offers advanced capabilities for proxy, mock servers, and CLI workflows. by Fatih Kadir Akın (29 March 2025) ► The subtitle says it all.
- Building an MCP server in 2 minutes.... by "2MinutesPy" (13 April 2025) ► A simplistic example of implementing an MCP server in Python.
- Model Context Protocol (MCP) : connecter vos LLMs à vos données et outils by Teilo Millet, Gireg Roussel, and Ismael Debbagh (18 April 2025) ► A long high-level presentation of MCP.
- MCP Crash Course: What Python Developers Need to Know by Dave Ebbelaar (19 April 2025) ► This presentation of MCP is rather slow and not so clear.
- Tiny Agents: an MCP-powered agent in 50 lines of code by Julien Chaumond (25 April 2025) ► A simple JavaScript example of using MCP.
- ↪Tiny Agents in Python: a MCP-powered agent in ~70 lines of code by Célina Hanouti, Julien Chaumond, Lucain Pouget, and Shaun Smith (23 May 2025) ► The same in Python.
- MCP is not REST API by Han Lee (17 May 2025) ► Han Lee explains that having an MCP Server that simply maps a REST API is a bad idea: REST is about providing a low-level CRUD API to resources, whereas agent tools should be high-level actions.
- Make AI Agents Fetch Real-time Data using THIS Powerful MCP Server⇊ by "2MinutesPy" (26 May 2025) ► This is not a presentation of MCP, but an advertisement for Bright Data.
- MCP Demo Part 1 | Ep. 21 Bits and Booze by Scott Chacon and José Esteban Vega Carrillo (26 June 2025) ► A presentation of MCP and a demo of building a very simple MCP server (a single tool performing a git commit).
- ↪MCP Demo Part 2 | Ep. 22 Bits and Booze by Scott Chacon and José Esteban Vega Carrillo (11 July 2025) ► The continuation of the previous video. Scott Chacon and José Esteban Vega Carrillo had difficulty getting it to run successfully.
- Building the Hugging Face MCP Server by Shaun Smith, Julien Chaumond, Eliott Coyac, and Abubakar Abid (10 July 2025) ► Some information about how Hugging Face implemented their MCP server.
- MCP Is Not Your REST API: 5 Principles by Tony Lewis (12 August 2025) ► Some good advice on implementing an MCP Server.
- too many model context protocol servers and LLM allocations on the dance floor by Simon Willison (22 August 2025) ► People tend to overuse MCP servers, while some other solutions use much fewer tokens in the context.
- [Video Response] What Cloudflare's code mode misses about MCP and tool calling by Yannic Kilcher (19 October 2025) ► Yannic Kilcher comments on the claim that it is more efficient for the LLM to call tools via API calls rather than via MCP: with MCP, it is possible to handle data that is not properly structured.
- How to Automate Anything with Python Inside Claude Desktop (Using MCP) by Dave Ebbelaar (31 October 2025) ► Implementing a very small and simple MCP Server using Python and uv.
- Code execution with MCP: Building more efficient agents — Direct tool calls consume context for each definition and result. Agents scale better by writing code to call tools instead. Here's how it works with MCP. by Adam Jones and Conor Kelly (4 November 2025) ► The authors suggest using MCP servers as code APIs rather than through direct tool calls, and explain the advantages of doing so: less context space used, the data remains private (it is not seen by the LLM), persistence…
- MCP vs RAG: Two Very Different Ways to Gain Context — At first glance RAG and MCP seem similar. In practice, they solve very different problems and lead to very different system designs. by PJ Hagerty (14 January 2026) ► I do not understand this comparison between MCP and RAG: they do not have the same purpose. MCP is a protocol handling tools (and other things), RAG is about using semantic search to complete the context. You could use a tool that performs semantic search, I would still call this RAG.
- Let's learn about MCP Apps by Burke Holland (2 February 2026) ► A presentation of MCP Apps: an extension of MCP allowing an MCP server to provide a UI that the MCP client will display.
- Creating a Wikipedia MCP Server in Java in a Few Prompts with Skills by Guillaume Laforge (2 April 2026) ► Guillaume Laforge explains how he used Gemini CLI to create an MCP Server to search Wikipedia.
- MCP UI: Extending the frontier — Liad Yosef and Ido Salomon, MCP Apps by Liad Yosef and Ido Salomon (6 May 2026) ► A presentation of MCP-UI and its current evolution, MCP Apps.
Hugging Face
- Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ by Lucain Pouget, Célina Hanouti, and Julien Chaumond (25 July 2025) ► Hugging Face has revamped its CLI.
- Sentence Transformers is joining Hugging Face! by Tom Aarsen (22 October 2025) ► Hugging Face is taking over the ownership of Sentence Transformers.
- Building the Open Agent Ecosystem Together: Introducing OpenEnv by Joseph Spisak, Davide Testuggine, Zach Wentz, Pierre Andrews, Sanyam Bhutani, Hamid Shojanazeri, Pankit Thapar, Emre Guven, Lewis Tunstall, and Vaibhav Srivastav (23 October 2025) ► Hugging Face creates a hub where people can propose and get environments to use for training or running agents.
- Building for an Open Future - our new partnership with Google Cloud by Jeff Boudier and Simon Pagezy (13 November 2025) ► The announcement of some improvements for using Hugging Face’s models in Vertex AI, deploying models in Google from Hugging Face, running models on TPU…
- We Got Claude to Fine-Tune an Open Source LLM by Ben Burtenshaw and Shaun Smith (4 December 2025) ► Hugging Face has created a skill.md file to instruct an LLM how to drive the fine-tuning of a model in their environment.
- ggml.ai joins Hugging Face to ensure the long-term progress of Local AI by Simon Willison (20 February 2026) ► Simon Willison considers that ggml.ai joining Hugging Face is a good thing for Llama.cpp’s future.
- Inference Providers
  - Welcome to Inference Providers on the Hub 🔥 by Burkay Gur, Zeke Sikelianos, Anton McGonnell, Hassan El Mghari, Simon Brandeis, Bertrand Chevrier, and Julien Chaumond (28 January 2025) ► Hugging Face is creating a hub for inference providers.
  - Welcome Fireworks.ai on the Hub 🎆 by Teo Feliu, Shaunak Godbole, and Julien Chaumond (14 February 2025) ► Fireworks.ai is now a supported inference provider on Hugging Face Hub.
  - Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 by Julien Chaumond, Bertrand Chevrier, Vaibhav Srivastav, Simon Brandeis, Albert Abdulmanov, Viktor Hu, and Connor Chevli (18 February 2025) ► The title says it all.
  - Cohere on Hugging Face Inference Providers 🔥 by Vaibhav Srivastav, Ben Burtenshaw, Merve Noyan, Célina Hanouti, Alejandro Rodriguez, Julien Chaumond, and Simon Brandeis (16 April 2025) ► The same with Cohere.
  - Featherless AI on Hugging Face Inference Providers 🔥 by Wesley George, Poh Nean, Eugene Cheah, Célina Hanouti, Lucain Pouget, and Simon Brandeis (12 June 2025) ► The same with Featherless AI.
  - Groq on Hugging Face Inference Providers 🔥 by Ben Ankiel, Hatice Ozen, Célina Hanouti, Lucain Pouget, and Simon Brandeis (16 June 2025) ► The same with Groq.
  - Public AI on Hugging Face Inference Providers 🔥 by Joseph Low, Joshua Tan, Célina Hanouti, Julien Chaumond, Simon Brandeis, and Lucain Pouget (17 September 2025) ► The same with Public AI.
  - Scaleway on Hugging Face Inference Providers 🔥 by Guillaume Noale, Fred Bardolle, Guillaume Calmettes, Constance Morales, Célina Hanouti, Julien Chaumond, Simon Brandeis, and Lucain Pouget (19 September 2025) ► … Scaleway.
  - OVHcloud on Hugging Face Inference Providers 🔥 by Gilles Closset, Fabien Ric, and Elias Tourneux (24 November 2025) ► OVHcloud is, of course, the next one.
Together AI
- Why AI needs its own cloud — Together AI CEO on the next era of infrastructure by Vipul Ved Prakash and Prem Prakash (2 March 2026) ► A short non-informative interview with Vipul Ved Prakash, the CEO of Together AI.
OpenShell
- OpenShell Agents by Sam Witteveen (21 May 2026) ► A presentation of OpenShell.