Providers
Libraries
- LangChain: doc, Python API, source
- GPT4All: doc, code
- LangChain4j: API doc, release notes, source
- Spring AI
- Griptape: doc
- LangFlow
- CUGA
MCP
- Model Context Protocol
- MCP Inspector
- mcptools
- GitHub MCP Registry
- MCP Servers on GraphQL
AGENTS.md
Skills
Mess
AssemblyAI: YouTube📡
: YouTube📡↓
: YouTube📡↓
: YouTube📡
: YouTube📡↓
Articles and videos
- LLaMA & Alpaca: “ChatGPT” On Your Local Computer 🤯 | Tutorial by (18 March 2023) ► A short explanation on how to use Dalai and LLaMA.
- Hugging Face + Langchain in 5 mins | Access 200k+ FREE AI models for your AI apps by (11 June 2023) ► A small effective demo of using Hugging Face and LangChain.
- $0 Embeddings (OpenAI vs. free & open source)↑ by (25 June 2023) ► A demo of two ways to compute embeddings: online with Hugging Face and locally in the browser.
- "Next Level Prompts?" - 10 mins into advanced prompting by (29 August 2023) ► Some tools/sites helping to write prompts: Guidance, FlowGPT, gpt-prompt-engineer, PromptsRoyale.
- A developer’s guide to open source LLMs and generative AI — Open source generative AI projects are a great way to build new AI-powered features and apps. by (5 October 2023) ► Some information on open-source LLMs and a short list of four of them.
- 🤬 How the #@%$! Do You Use an LLM in a SaaS Platform?↓ by (6 October 2023) ► describes his first steps to build learntail.com, using OpenAI and LangChain to generate quizzes.
- Pydantic is all you need: Jason Liu by (9 October 2023) ► presents his Instructor library to structure prompting and extraction (for OpenAI).
- How I Fine-Tuned An AI Clone - Can You Tell The Difference?↓ by (2 November 2023) ► A lengthy but too fast video just to end up with using HeyGen to create a deep fake video.
- LLM: Trust, but Verify — Understand the challenges of developing, testing, and monitoring non-deterministic software; this is a new and significant challenge for observability. by (3 November 2023) ► The author describes the problem of model drift and proposes a mechanism to detect it.
- Wanna RAG? These are your best LLMs!!! by (16 November 2023) ► A presentation of Galileo’s Hallucination Index.
- No, You DON'T NEED OpenAI Function Calling!!!! by (17 November 2023) ► A quick and dirty presentation of Gorilla OpenFunctions.
- Training Your Own AI Model Is Not As Hard As You (Probably) Think by (22 November 2023) ► Using several steps to generate code from a Figma design: I wonder if what is presented here really works on other cases than this demo.
- llamafile is the new best way to run an LLM on your own computer by (29 November 2023) ► A presentation of llamafile: a single file containing the model and its executable which can run on several OSes.
- Detect Texts from Documents (even SCANNED)!!! by (14 January 2024) ► A presentation of Surya: a tool to identify text lines and compute their bounding boxes.
- Exploring ColBERT with RAGatouille by (27 January 2024) ► Some experimentation with ColBERT, a fast retrieval model.
- Everything WRONG with LLM Benchmarks (ft. MMLU)!!! by (10 February 2024) ► Presenting a paper ("When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards") which analyses how models’ scores are sensitive to the way benchmarks are structured.
- Engineering Practices for LLM Application Development by and (13 February 2024) ► Lessons learned from the building of a PoC of a concierge using an LLM.
- La recherche sous stéroïdes - une histoire de sémantique↓ by and (3 May 2024) ► Some feedback about implementing semantic search on an e-commerce site. This could have been much shorter.
- Poorman's ChatGPT-4o Works!! 🤣 by (15 May 2024) ► A short presentation of KingNish/OpenGPT-4o a Hugging Face space supporting several modalities by using open models.
- The 4 Big Changes in LLMs by (1 July 2024) ► advises considering four things: models are getting smarter, they are getting faster, they are getting cheaper, and context windows are getting larger.
- RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing by , , , , , , , and (1 July 2024) ► The authors propose a router that selects to run a query either toward an expensive powerful model or toward a cheaper smaller model, in order to reduce cost while sacrificing little quality.
- ↪What is an LLM Router? by (3 July 2024) ► Nothing more than the previous announcement.
- InternLM - A Strong Agentic Model? by (5 July 2024) ► A basic presentation of internlm/internlm2_5-7b-chat, a model specialised for JSON and function calling.
- Prompt Poet - Character AI's Prompting Framework by (2 August 2024) ► A presentation of Prompt Poet, a Python framework to manage prompts.
- Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 1) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this first part of a blog series, we'll explore the fundamental principles of LLM caching, delve into the various caching architectures and implementations that can be employed by (7 August 2024) ► Some cache architectures for LLM, classic and RAG. The cache key can be exact or semantic.
- ↪Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 2) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this second part of a blog series, we'll explore LLM caching implementations. by (7 August 2024) ► How to implement the previous architecture on AWS using LangChain or not.
- How streaming LLM APIs work by (21 September 2024) ► Some experimentation with SSE on GPT-4o Mini, Sonnet 3, and Gemini Pro using curl, Python’s HTTPX, and JavaScript’s fetch().
- Is Spring AI Strong Enough for AI? — Explore Spring's capabilities within the AI domain, its potential integration with AI libraries, and its ability to effectively manage AI workflows. by (27 September 2024) ► This article compares very different things: Spring, TensorFlow Serving, Kubernetes, MLflow, and Python. Additionally, it only states some obvious facts.
- Explore a New C# Library for AI by (11 October 2024) ► Very little information about Microsoft.Extensions.AI, some new .NET packages to integrate AI.
- Run a prompt to generate and execute jq programs using llm-jq by (27 October 2024) ► A new llm plugin to generate and execute jq commands.
- How Google is helping developers get better answers from AI — Today’s guest is Logan Kilpatrick, a senior product manager at Google, who tells Ben about his journey from software engineering to machine learning to product management, all with an emphasis on reducing developer friction. They talk through the challenges of non-determinism in AI models and how Google is addressing these issues with a new feature: Grounding with Google Search. Plus, what working at the Apple Store taught Logan about product management. by and (5 November 2024) ► There is no real information in this interview of Logan Kilpatrick, a product manager for Google AI Studio.
- Model Compression: Improving Efficiency of Deep Learning Models — Model compression is a key component of real-time deployment of deep learning models. This article explores different approaches to make models more efficient. by (6 November 2024) ► A high-level and clear description of model pruning, quantisation, and knowledge distillation.
- ChainForge by (8 November 2024) ► A little information about ChainForge, a tool to evaluate prompts.
- Introducing the Model Context Protocol by (25 November 2024) ► Anthropic proposes a protocol to connect an LLM to tools and data sources.
- Anthropic's New Agent Protocol! by (27 November 2024) ► A presentation of Model Context Protocol and some experimentation with it.
- 17 Python Libraries Every AI Engineer Should Know by (12 December 2024) ► The title says it all.
- Integrating AI With Spring Boot: A Beginner’s Guide — In this guide, you will learn how to integrate AI into your Spring Boot app using Spring AI and simplify your AI setup with familiar Spring abstractions. by (27 January 2025) ► An introduction to Spring AI.
- files-to-prompt 0.5 by (14 February 2025) ► describes his files-to-prompt tool, used to send some files and a prompt to an LLM.
- Emerging Patterns in Building GenAI Products by and (25 February 2025) ► A good overview of the common methods for integrating generative AI (mostly LLMs).
- Open Deep Research (16 April 2025) ► Together AI explains how they build their open-source deep research.
- Learn the Hugging Face Kernel Hub in 5 Minutes by , , , , , , and (12 June 2025) ► Hugging Face now hosts optimised kernels that can be easily downloaded and used in our own models.
- Building a SNAP LLM eval: part 1 by (19 June 2025) ► A description of the need to evaluate models and the first step of this evaluation: having a domain expert experimenting with the models to get a feeling of their strengths and weaknesses.
- ↪Building a SNAP LLM eval: Part 2 - testing and automation by (19 June 2025) ► How to automate (using promptfoo) the evaluation of the knowledge of the facts.
- ↪Building a SNAP LLM eval: part 3 - testing nuanced capabilities by (23 April 2025) ► How to automate the evaluation of nuanced capabilities.
- ↪Exploring Promptfoo via Dave Guarino’s SNAP evals by (24 April 2025) ► Some information extracted from the previous articles.
- How Long Contexts Fail — Managing Your Context is the Key to Successful Agents by (22 June 2025) ► Throwing everything in a very long context is not the simple solution we may believe it is, there are many problems with these long contexts.
- ↪How to Fix Your Context — Mitigating & Avoiding Context Failures by (26 June 2025) ► Some advice to better manage the context.
- How to deploy LLMs in 1 click...↓ by (15 July 2025) ► This is simply an advertisement for Novita, a company renting cloud GPUs!
- LangExtract - Google's New Library for NLP Tasks↓ by (4 August 2025) ► A description of LangExtract, an NLP library from Google. is botching the video even more than usual.
- Introducing AI Sheets: a tool to work with datasets using open AI models! by , , , , , and (8 August 2025) ► Hugging Face proposes a new tool to evaluate models/prompts on a dataset that can be generated or imported.
- SDS 917: 8 Steps to Becoming an AI Engineer, with Kirill Eremenko (⧉) by and (26 August 2025) ► describes the 8-week training course his company, SuperDataScience, is selling: an overview of prompting, RAG, and agents, both for the PoC phase and the production stage.
- Build a Local LLM App in Python with Just 2 Lines of Code by (8 October 2025) ► presents his chuk-llm library.
- Prompt Engineering for LLMs, PDL, & LangChain in Action by (10 November 2025) ► A short introduction to LangChain and PDL.
- AI & Text to SQL: How LLMs & Schema Power Data Analytics by (13 December 2025) ► An introduction to writing SQL queries using an LLM.
- How to Use Agentic AI: LLMs, AI Agents & Prompt Engineering in Action↓ by (27 December 2025) ► This description of replacing a single prompt with a workflow of four prompts is unclear.
- 2026/01/13 - ParisJug Academy - Spring AI with Docker model runner and Debugging by (13 January 2026) ► A demo of Spring AI and Docker Model Runner.
- Open Responses - The NEW Standard API for Open Models by (20 January 2026) ► OpenAI proposes a standard payload format for chat completion APIs.
- Beating Cowork with Open Source Cowork by (21 January 2026) ► A presentation of Eigent and its predecessor CAMEL-AI.
- Running Pydantic’s Monty Rust sandboxed Python subset in WebAssembly by (6 February 2026) ► Pydantic created a Rust implementation of a subset of Python. converted it into a WASM file and into a Wheel file runnable in Pyodide.
- What Is Agentic Storage? Solving AI’s Limits with LLMs & MCP by (5 March 2026) ► Using MCP to allow AI agents to store data and how to secure the operations on that storage.
- 7 new open source AI tools you need right now… by (12 March 2026) ► Agency Agents, PromptFoo, MiroFish, Impeccable, OpenViking, Heretic, NanoChat.
- What Are Hierarchical AI Agents? Solving Context & Task Challenges by (12 March 2026) ► A basic presentation of agent hierarchies, their advantages, and their problems.
- What Is Llama.cpp? The LLM Inference Engine for Local AI by (16 March 2026) ► A presentation of Llama.cpp.
- Building Single-User vs Multi-User Agents: What Actually Changes by (24 March 2026) ► This video is not really about single-user vs. multi-user, but about the fact that a quick n’ dirty personal tool is not the same as a professional multi-tenant one.
- An ADK Java agent powered by Gemma 4 by (2 April 2026) ► Three ways to call a Gemma 4 model: using ADK for AI Studio, ADK and the LangChain4j bridge for vLLM, and the same for Ollama.
- LLM Wiki by (4 April 2026) ► describes a "pattern for building personal knowledge bases using LLMs"; the comments are about everyone building their own over-ambitious knowledge repository.
- The 7 Skills You Need to Build AI Agents by (14 April 2026) ► System Design, Tool and Contract Design, Retrieval Engineering, Reliability Engineering, Security and Safety, Evaluation and Observability, Product Thinking. is just listing subjects with little detail. I do not see the value of such a list.
- How Claude's Design Agents Work by (1 May 2026) ► found six design patterns in Claude Design: context grounding, structured memory, multimodal interaction with the user, self evaluation, generation of multiple versions, and handoff using a common format.
- Fine-tuning
- Fine Tune a model with MLX for Ollama by (30 August 2024) ► How to fine-tune a model with MLX and use it in Ollama.
- ↪Is MLX the best Fine Tuning Framework? by (18 January 2025) ► A detailed introduction to fine-tuning with MLX. This is an expanded version of the previous video.
- Fine-tuning Large Language Models by , , , and (16 January 2025) ► The basics of LLM and fine-tuning, a demo of Together’s LoRA fine-tuning API, some experiments done by Together, and some pieces of advice.
- Fast Fine Tuning with Unsloth by (24 January 2025) ► A presentation of Unsloth which optimises fine-tuning on Nvidia GPUs.
- Axolotl is a AI FineTuning Magician↓ by (31 January 2025) ► This presentation of Axolotl is too verbose and it is not very good because does not master the subject.
- RAG
- ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings) by (7 February 2023) ► describes in detail how he implemented a chat to answer questions on Supabase: tokenising the doc, finding the paragraph closest to the question, and generating the answer.
- Build RAG Application Using a LLM Running on Local Computer with GPT4All and Langchain — Privacy-preserving LLM without GPU↑ by (10 March 2024) ► A clear explanation with working code of how to scrape an Internet doc, to chunk it, to store it in Chroma, and to use GPT4All to generate the answer.
- Ne mettez pas les projets RAG en production trop vite ! by (3 June 2024) ► lists some examples of problems that will occur with an overly simplistic RAG implementation. But this simply means that you do not design a demo and a scalable application the same way; the second is much more complex.
- ↪Rendre résilient un projet RAG by (17 June 2024) ► suggested many changes to LangChain in order to make it more resilient, e.g. to properly support transactions.
- Breaking up is hard to do: Chunking in RAG applications — A look at some of the current thinking around chunking data for retrieval-augmented generation (RAG) systems. by (6 June 2024) ► A high-level presentation of some chunking methods and how to evaluate them.
- Supercharging RAG with Generative Feedback Loops from Weaviate by (17 June 2024) ► A presentation of Generative Feedback Loops, which is just about storing LLM-generated text in a vector database so it can be retrieved quickly rather than regenerated by the LLM.
- Building search-based RAG using Claude, Datasette and Val Town by (21 June 2024) ► The debrief of a live session of implementing a small RAG in Val Town.
- Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained) by (26 June 2024) ► A critique of "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools".
- Gemma 2 - Local RAG with Ollama and LangChain by (28 June 2024) ► A simple RAG implementation.
- Practical tips for retrieval-augmented generation (RAG) — Retrieval-augmented generation (RAG) is one of the best (and easiest) ways to specialize an LLM over your own data, but successfully applying RAG in practice involves more than just stitching together pretrained models. by (15 August 2024) ► Some high-level advice on how to implement RAG.
- Knowledge Graphs: The Secret Weapon for Superior RAG Applications — Integrating knowledge graphs in RAG applications enhances recommendation accuracy and context-awareness, providing structured, interconnected data.🚫 by , , and (19 August 2024) ► This article is only about the data retrieval. The data needs to be structured, so it can be stored as a semantic graph.
- RAG vs. Fine Tuning by (9 September 2024) ► The basics of RAG vs. fine-tuning, and a description of combining both.
- Introducing Contextual Retrieval↑ (19 September 2024) ► Anthropic experimented with RAG by adding context to chunks, using embeddings and BM25, and reranking.
- ↪Contextual RAG is stupidly brilliant!↓ by (23 September 2024) ► This presentation of Anthropic’s analysis on how to improve RAG is poorly done.
- Multimodal Document RAG with Llama 3.2 Vision and ColQwen2 by (8 October 2024) ► A presentation of ColPali design: using a vision language model (PaliGemma or Qwen-2) to transform image patches into vectors, finding the patch vectors nearest to the user query, and providing the corresponding full images and user query to a vision LLM (Llama 3.2 vision).
- Why Your RAG System Is Broken, and How to Fix It with Jason Liu (⧉) by and (11 November 2024) ► Some advice on RAG implementation: doing fast and simple evals (e.g. looking at the length, using regexp…), using them very frequently, reranking…
- Build a document-based question answering system by using Docling with Granite 3.1 by , , and (18 December 2024) ► A small demo of interrogating a document using Granite, Docling, LangChain, and FAISS.
- 2 Methods For Improving Retrieval in RAG by (19 December 2024) ► This video seems to be a real usage of RAG, not the usual YouTuber doing the usual demo. The guy improved his RAG system by preprocessing the documents to extract structured data from them using an LLM.
- GraphRAG vs. Traditional RAG: Higher Accuracy & Insight with LLM↓ by (17 February 2025) ► This presentation of GraphRAG is too high-level; it gives no clue about how to implement it.
- Build an AI-powered multimodal RAG system with Docling and Granite by and (26 February 2025) ► Yet another RAG example, this one extracts text, tables, and images from a PDF file.
- RAG vs. CAG: Solving Knowledge Gaps in AI Models↑ by (17 March 2025) ► A basic and good comparison of Retrieval-Augmented Generation and Cache-Augmented Generation.
- What is Retrieval-Augmented Fine-Tuning (RAFT)? by (9 June 2025) ► Fine tuning a model so it gets better at using only the relevant documents provided by the retrieval part and at answering that it does not know if no document is relevant.
- Improving Retrieval with ELO Scores (8 July 2025) ► ZeroEntropy explains the training process they used for their zerank-1 rerankers.
- Graph Databases: When to Use Them (And When to Run Away) by and (8 December 2025) ► You can perform some graph RAG without a graph database.
- What is OpenRAG? Unlocking the Future of RAG in Generative AI by (12 February 2026) ► A short presentation of OpenRAG: Docling + OpenSearch + Langflow.
- What is Multimodal RAG? Unlocking LLMs with Vector Databases by and (16 February 2026) ► Some options to implement multimodal RAG: textify everything, hybrid multimodal, and full multimodal.
- Is RAG Still Needed? Choosing the Best Approach for LLMs by (9 March 2026) ► A comparison of the advantages and problems of large context vs. RAG.
- Vector Search with LLMs - Computerphile by (11 March 2026) ► A basic introduction to text embedding and RAG.
- NotebookLM
- Google's RAG Experiment - NotebookLM by (28 May 2024) ► The title says it all. Google’s demo is impressive, using voice for querying and answering.
- How to create AI Podcasts with NotebookLM Tutorial by (17 September 2024) ► A presentation of an impressive Google demo (usable from Illuminate and NotebookLM): you give a paper as input, and it generates a two-person podcast.
- NotebookLM’s automatically generated podcasts are surprisingly effective by (29 September 2024) ► People are playing with NotebookLM-generated podcasts, sometimes at a meta-level.
- New in NotebookLM: Customizing your Audio Overviews by (17 October 2024) ► is playing with the fact that NotebookLM users can now provide guidelines for the podcast to generate: as usual he picks up the pelican example and asks the AI-hosts to behave as if they were pelicans.
- Google's UNREAL AI Gets an UPGRADE... by (19 October 2024) ► The "poop fart" podcast and how added video on it using HeyGen. He also quickly describes the new NotebookLM features.
- Web scraping
- Web Scraping AI AGENT, that absolutely works 😍 by (9 May 2024) ► A presentation of ScrapeGraphAI, a Python library to scrape a website and to interrogate an LLM on the scraped data.
- “Wait, this Agent can Scrape ANYTHING?!” - Build universal web scraping agent by (16 May 2024) ► Scraping the Web with FireCrawl or AgentQL, and an LLM.
- How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai by (17 May 2024) ► Some Web scraping tools: Beautiful Soup, Jina AI, Firecrawl, and Scrapegraph-ai.
- How Stack Overflow fends off scraping bots — Josh Zhang, a staff site reliability engineer at Stack Overflow, tells Ryan and Eira how the Stack Exchange network defends against scraping bots. They also cover the emergence of human botnets, why DDoS attacks have spiked in the last couple of years, and the constant balancing act of protecting sites from attack without inhibiting legitimate users. by , , and (30 July 2024) ► The subtitle says it all.
- Agentically scrape the web with Firecrawl & LangGraph (LangChain) by (25 October 2024) ► The title says it all.
- NuExtract 1.5 by (16 November 2024) ► NuExtract models extract structured data from unstructured text.
- Tool calling
- AI Agents' Secret Sauce by (7 October 2024) ► Some basic but good advice on how to implement tools.
- What is Tool Calling? Connecting LLMs to Your Data by (13 January 2025) ► The basic presentation of tool calling is classic. But the description of "embedded tool calling" is not detailed enough to understand how that can work.
- Agent Skills: Code Beats Markdown (Here's Why) by (27 March 2026) ► Instead of being simple data transferrers, scraping tools should handle as much as possible: parse and generate condensed structured data, limit the number of navigated pages, and cache the data.
- Docling
- Docling by (3 November 2024) ► A short feedback on experimenting with docling.
- Building a Basic RAG System with Docling: A Comprehensive Guide↓ by (24 December 2024) ► This presentation on doing RAG on a PDF file is rather bad, but the code (in the GitHub repo) is fine.
- How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites) by (13 February 2025) ► A presentation of Docling, a library for parsing documents.
- What Is Docling? Transforming Unstructured Data for RAG and AI by (4 August 2025) ► A presentation for managers of Docling.
- Unlock Better RAG & AI Agents with Docling by and (8 January 2026) ► A presentation of Docling, which can now be used as an MCP server.
- LiteParse
- LiteParse - The Local Document Parser by (26 March 2026) ► A presentation of LiteParse, a library to extract text or screenshots from PDF files, Office docs, and images.
- Frameworks
- LLM Toolkit: Validation is all you need by (20 May 2024) ► Building a tool that, from an English question, performs a database request and generates an answer, using Instructor and Fructose.
- LangChain
- LangChain101: Question A 300 Page Book (w/ OpenAI + Pinecone) by (27 February 2023) ► A small demo using LangChain, OpenAI, and Pinecone.
- Workaround OpenAI's Token Limit With Chain Types by (1 March 2023) ► Some solutions to summarise or extract answers from documents that are too long.
- The LangChain Cookbook - Beginner Guide To 7 Essential Concepts by (29 March 2023) ► Some short examples of the LangChain features.
- ↪The LangChain Cookbook Part 2 - Beginner Guide To 9 Use Cases by (2 May 2023) ► The continuation of the previous video.
- LangChain: Run Language Models Locally - Hugging Face Models by (25 April 2023) ► A demo of executing a model on Hugging Face and locally.
- 5 Levels Of LLM Summarizing: Novice to Expert by (4 May 2023) ► More LangChain examples.
- Scrape any website with OpenAI Functions & LangChain by (2 August 2023) ► The title says it all.
- Construire son RAG (Retrieval Augmented Generation) grâce à langchain: L’exemple de l’Helpdesk d’OCTO by and (17 October 2023) ► A detailed example demonstrating how to extract data from Confluence, embed the chunks, create a chain to find and format the answer, and evaluate the result.
- Content Extraction using Large Language Models & JavaScript↓ by (9 January 2025) ► An example of using LangChain with Granite to extract data from a PDF and Mistral Large to format it into a Markdown table. But the JSON format is unspecified, she is using some flaky heuristic to clean up Granite’s answer, and an LLM is overkill to convert JSON into a Markdown table…
- LangChain RAG: Optimizing AI Models for Accurate Responses by (13 February 2025) ► A simple RAG system using LangChain and Granite 3.0 8B Instruct.
- LangChain Reaches 1.0 - Whats new? by (26 October 2025) ► An overview of LangChain’s current products while it is raising more money.
- LangChain4j
- Java Meets AI: A Hands On Guide to Building LLM Powered Applications with LangChain4j By Lize Raes (5 October 2023) ► An overview of LangChain4j.
- Experiments with Langchain4j or Java way to LLM-powered applications by (6 February 2024) ► A good overview of LangChain4j features; it is mostly for people who do not know the typical AI use cases.
- The Definitive Guide to Tool Support in LangChain4J by (24 February 2024) ► A rather slow presentation of using tools in LangChain4j.
- Java rencontre l'IA : Comment intégrer les LLMs dans vos applications avec LangChain4j by (3 May 2024) ► The same, in French and updated.
- Evolution of Java Ecosystem for Integrating AI by (29 January 2025) ► Building a RAG chat using LangChain4j and Oracle Generative AI.
- Agent Orchestration with LangChain4J by (30 November 2025) ► A short presentation of LangChain4j’s features for implementing agents.
- Tools
- Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM by (31 July 2025) ► The title says it all.
- Ollama
- Ollama on CPU and Private AI models! by (8 November 2023) ► A presentation of Ollama.
- Ollama Web UI (ChatGPT-ish) - Local AI FTW!!! by (1 December 2023) ► Running Ollama Web UI in Docker.
- Ollama's Newest Release and Model Breakdown by (21 September 2024) ► Ollama 0.3.11, Solar Pro Preview, Qwen 2.5, Bespoke Minicheck, Mistral Small, and Reader-LM.
- Quick Look at Hollama↓ by (8 October 2024) ► The "unboxing" of Hollama, a good basic UI for Ollama. But there is little value in such a video, you can easily do the same yourself.
- Ollama + HuggingFace - 45,000 New Models by (25 October 2024) ► Ollama can now use any GGUF recorded on Hugging Face.
- Ollama: Llama 3.2 Vision by (13 November 2024) ► Some very little information about Ollama supporting the vision features of Llama 3.2.
- Open WebUI by (27 December 2024) ► discovers Open WebUI: he is satisfied with how easy it is to install, and he experiments with it using Llama 3.2 3B.
- Building a Vision App with Ollama Structured Outputs by (31 December 2024) ► A presentation of Ollama Structured Outputs and some examples using them with Llama 3.2’s vision.
- Solved with Windsurf by (14 February 2025) ► wrote a utility in Rust, a language he barely knows, using Windsurf, to get a report on the models installed in Ollama.
- Function calling using LLMs — Building AI Agents that interact with the external world. by (6 May 2025) ► A simple example of a script using tools and, then, converted to using MCP.
- Ollama Gets a New App by (31 July 2025) ► The title says it all.
- New Ollama UI by (31 July 2025) ► The same, but more informative.
- Debugging Ollama↓ by (4 August 2025) ► experimented and discovered that OLLAMA_DEBUG’s behaviour has been changed. He should just have searched for it (PR).
- 300 tps Just Using Ollama? by (6 August 2025) ► A short presentation of Ollama Turbo, which consists of models hosted in the Cloud.
- Ollama Launch + Claude Code + GLM Flash by (25 January 2026) ► Testing Claude Code, as a mode, with GLM-4.7-Flash in Ollama: it is slow and results are not good.
- The Ollama Course of
- 1. The Ollama Course: Intro to Ollama by (23 July 2024) ► An overview of Ollama: installation, basic usage, and downloading a model.
- 2. Installing Ollama by (30 July 2024) ► How to install Ollama on Windows, Linux, and MacOS.
- 3. How to use the Ollama.com site to Find Models by (6 August 2024) ► An explanation of the description of Ollama models.
- 4. The Ollama Course - Using the CLI by (14 August 2024) ► A presentation of all the CLI commands.
- 5. Comparing Quantizations of the Same Model - Ollama Course by (21 August 2024) ► Compare the results of the same model with different quantisations and select the one that has the quality / speed that is the best for your needs.
- 6. An Introduction to RAG - Part of the Free Ollama Course by (29 August 2024) ► A basic introduction to RAG.
- 7. Embeddings in Depth - Part of the Ollama Course by (4 September 2024) ► An overview of how to perform embedding using Ollama.
- Let's build a RAG system - The Ollama Course🚫 by (11 September 2024) ► An example of a small RAG program, both in Python and JavaScript.
- What are the different types of models - The Ollama Course by (19 September 2024) ► A basic presentation of the model types: text/base, chat/instruct, code, and vision.
- Crack Ollama Environment Variables with Ease - Part of the Ollama Course by (26 September 2024) ► The most important environment variables and how to set them on MacOS, Linux, and Windows.
- Upgrade Your AI Using Web Search - The Ollama Course by (2 October 2024) ► A simple program using SearXNG and Cheerio to perform a Web search, retrieve the found pages, scrape the text in them, and generate an answer with Llama 3.2 1B.
- Taming AI Hallucinations?🚫 by (9 October 2024) ► describes some basic facts about hallucination.
- Unlock AI Mastery with Pro Tips on Prompting! by (16 October 2024) ► Some basics on prompt writing.
- Master Ollama's File Layout in Minutes! by (23 October 2024) ► A description of how Ollama records the models using several files, similarly to what Docker does.
- Don’t Embed Wrong! by (31 October 2024) ► speaks about using prefixes for RAG with Ollama, but there is no explanation of how they work, he just says that they improve results.
- AI Model Context Decoded by (6 November 2024) ► How to change the context size and some warnings about using a large context size.
- AI Vision Models Take a Peek Again! by (8 November 2024) ► Using Llama 3.2’s vision in Ollama 0.4.0.
- Let's Update Ollama Everywhere by (13 November 2024) ► Explaining something very basic: upgrading Ollama on Mac, Windows, Linux, and Docker.
- Cracking the Enigma of Ollama Templates by (20 November 2024) ► An introduction to model templates.
- Find Your Perfect Ollama Build by (22 November 2024) ► How to build Ollama, the main branch or a PR.
- Simplify Ollama Cleanup Like a Pro by (27 November 2024) ► A presentation of Gollama to clean up Ollama data and how to uninstall Ollama.
- The Path To Better Custom Models by (6 December 2024) ► An introduction to Ollama model files.
- The Truth About Ollama's Structured Outputs by (11 December 2024) ► A presentation of structured outputs and a comparison with JSON mode.
- Optimize Your AI - Quantization Explained↓ by (28 December 2024) ► This description of model and context quantisation is unclear, mostly because there is no technical explanation.
- MSTY Makes Ollama Better by (28 February 2025) ► A presentation of MSTY, a UI for Ollama.
- Docker Model Runner
- Docker Model Runner: Running AI Models Locally Made Simple — Docker Model Runner: run AI models locally with zero setup. Pull from Docker Hub, chat via CLI or API. OpenAI-compatible. Beta. by (1 July 2025) ► Docker is also joining the AI craze, offering an alternative to Ollama: Docker Model Runner.
- LangFlow
- What is Langflow? Build AI Workflows with Python, Gen AI, & MCP Tools by (12 January 2026) ► A short presentation of LangFlow.
- llm
- Language models on the command-line w/ Simon Willison by and (13 June 2024) ► presents his llm CLI tools.
- ↪Language models on the command-line by (17 June 2024) ► An overview of the video.
- Using LLMs on the command line by (26 October 2024) ► A short presentation of llm.
- Ask questions of SQLite databases and CSV/JSON files in your terminal by (25 November 2024) ► adds to sqlite-utils the possibility to ask questions in natural language and have an LLM generate the SQL query.
- How I use LLMs – neat tricks with Simon’s `llm` tool — Earlier this year I co-authored a report about the direct environmental impact of AI, which might give the impression I’m massively anti-AI, because it talks about the signficant social and environmental of using it. I’m not. I’m (still, slowly) working through the content of the Climate Change AI Summer School, and I use it a fair amount in my job. This post shows some examples I use. by (30 December 2024) ► Some positive feedback and some examples of usage of llm.
- LLM 0.22, the annotated release notes by (17 February 2025) ► The title says it all.
- Structured data extraction from unstructured content using LLM schemas by (28 February 2025) ► added support of JSON schemas to llm.
- llm-openrouter 0.4 by (10 March 2025) ► improved the support of OpenRouter.
- Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25) by (5 May 2025) ► The release notes of llm 0.25 with some details about a new llm-video-frames plugin to extract frames from a video and send them to the model.
- LLM 0.26a0 adds support for tools! by (14 May 2025) ► prototypes tool integration in llm.
- How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation by (24 May 2025) ► found a use-after-free bug using GPT-o3 via llm.
- LLM 0.27, the annotated release notes: GPT-5 and improved tool calling by (11 August 2025) ► GPT-5 is now supported and tools can now be configured in templates.
- LLM 0.32a0 is a major backwards-compatible refactor by (29 April 2026) ► is rewriting the internals of llm to better handle multimodality; his design is classic and seems to be a thin layer on top of the current APIs.
- Agents
- 5 Problems Getting LLM Agents into Production by (4 June 2024) ► Some advice on using agents.
- Evals for AI Agents, the right way!!! by (12 August 2024) ► The usual bad presentation of a paper ("TOOLSANDBOX: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities") evaluating the efficiency of using LLMs as agents.
- Agent-S : Unleash The Power Of GUI Computer Use Agents ! by (21 October 2024) ► A high-level presentation of "Agent S: An Open Agentic Framework that Uses Computers Like a Human", a framework to use applications as a human would.
- Microsoft Launches 10 NEW AI Agents by (24 November 2024) ► Microsoft is moving aggressively on AI and is integrating 10 agents in Dynamics 365.
- Building effective agents (19 December 2024) ► A good and simple overview of some workflow and agent architectures.
- ↪Building effective agents by (20 December 2024) ► Some extracts of the previous article.
- ↪How to Build Effective AI Agents (without the hype) by (20 January 2025) ► This video is just Anthropic’s article.
- Trace & Evaluate your Agent with Arize Phoenix by , , and (28 February 2025) ► A presentation of Arize Phoenix, a platform to trace and evaluate smolagents; the evaluation uses LLM-as-a-judge.
- 5 Types of AI Agents: Autonomous Functions & Real-World Applications by (28 April 2025) ► proposes these categories: simple reflex, model-based reflex, goal-based, utility-based, and learning.
- Must Haves For Agents in Production by (15 April 2026) ► An advertisement for TrueFoundry: Model Control, Prompts, Guardrails, Budget Limiting, Tools, Monitoring and Tracing, and Evals.
- LangGraph
- AgentWrite with LangGraph by (6 September 2024) ► describes how he set up a short LangGraph example to write long articles, similarly to LongWriter.
- Building a LangGraph ReAct Mini Agent by (17 September 2024) ► A description of a simple pattern in LangGraph: ReAct function calling.
- ChatDev
- Build AI agent workforce - Multi agent framework with MetaGPT & chatDev by (8 September 2023) ► A presentation of ChatDev.
- CrewAI
- CrewAI August Update: Planning Steps, Training, and Advanced Features Explained by (20 August 2024) ► presents some new CrewAI features, but there is no explanation on how training is taken into account, on how test scores are computed…
- SDS 918: Multi-Agent Systems with CrewAI (⧉) by (29 August 2025) ► A short and naive presentation of CrewAI.
- Autogen
- Autogen - Microsoft's best AI Agent framework that is controllable? by (3 October 2023) ► A presentation of AutoGen.
- Microsoft's Magentic One: This FREE AI AGENT can CONTROL BROWSER, DO CODING & MORE! by (10 November 2024) ► A presentation and a few small tests of Magentic-One, a multi-agent system from Microsoft able to browse the Web, read local files, write code, and drive a terminal to execute that code.
- Multi-Agent AI EXPLAINED: How Magentic-One Works by (13 November 2024) ► A better presentation of Magentic-One.
- Swarm
- Introducing Swarm with Code Examples: OpenAI's Groundbreaking Agent Framework by (14 October 2024) ► Some simple examples using the Swarm framework and some feedback about it.
- PydanticAI
- PydanticAI - The NEW Agent Builder on the Block by (4 December 2024) ► Yet another framework. PydanticAI is simple and pythonic.
- PydanticAI - Building a Research Agent by (6 December 2024) ► Using PydanticAI to create an agent for Web search.
- smolagents
- smolagents - HuggingFace's NEW Agent Framework by (6 January 2025) ► Hugging Face has created yet another agent framework. does his usual presentation and experimentation with it.
- ↪How to make Multi-Agent Apps with smolagents by (8 January 2025) ► More experimentation with smolagents, in particular with multi-agent configurations.
- Desktop agents
- UI-TARS AI Agent: This IS THE BEST AI Agent EVER & BEATS Claude's Computer Use! by (23 January 2025) ► A simplistic demo of UI-TARS, an agent that can drive application UIs.
- Browser agents
- Browser Use Agent: This FULLY FREE AI Agent CAN CONTROL BROWSERS & DO ANYTHING! (Beats Anthropic!) by (18 November 2024) ► A presentation of Browser Use, a Python framework to create agents able to drive a browser.
- Deepseek Operator (+Free APIs) : This 100% FREE AI Agent Beats OpenAI's Operator FOR FREE! by (24 January 2025) ► A demo of Browser Use WebUI, a UI for a browser agent.
- Qwen-2.5 Operator: This is The BEST LOCAL AI Operator Agent THAT YOU CAN USE NOW! by (30 January 2025) ► Using Browser Use with Qwen2.5-VL.
- Gemini Browser Use by (14 February 2025) ► Some simple experimentation with Browser Use and Gemini 2.0.
- OpenAI Agent SDK
- How to Build an Agent with the OpenAI Agents SDK by (17 March 2025) ► A classic ’s presentation.
- OpenClaw
- The wild rise of OpenClaw... by (30 January 2026) ► A presentation of OpenClaw.
- Moltbook is the most interesting place on the internet right now by (30 January 2026) ► The world is crazy about OpenClaw; there is even a chat site where OpenClaw assistants can communicate with each other.
- The Moltbook Situation by (31 January 2026) ► The same.
- Running OpenClaw in Docker by (1 February 2026) ► explains how he installed OpenClaw using Docker Compose.
- A Social Network for A.I. Bots Only. No Humans Allowed. by (2 February 2026) ► has been interviewed about Moltbook.
- Clawdbot to Moltbot to OpenClaw: The 72 Hours That Broke Everything (The Full Breakdown) by (2 February 2026) ► ’ analysis of the current OpenClaw buzz: it is a glimpse of the future, but too dangerous to be used if you are not a technical guru and a daredevil.
- The Moltbook Experiment Failed by (3 February 2026) ► Moltbook was quickly hacked.
- OpenClaw Agents Are Hiring Each Other. Transferring Crypto. Building Societies. This Is Real. by (3 February 2026) ► It seems that was not aware that Moltbook was a hacker paradise when he recorded this analysis claiming that Moltbook is the proof that agents can self-organise…
- The rise of Moltbook suggests viral AI prompts may be the next big security threat — We don’t need self-replicating AI models to have problems, just self-replicating prompts. by (3 February 2026) ► Moltbook could be a preview of prompt worms that will replicate themselves across the Internet.
- Andrej Karpathy talks about "Claws" by (21 February 2026) ► has tweeted about "claws", all the OpenClaw variants.
- Openclaw deletes entire inbox by (25 February 2026) ► used OpenClaw to clean up her inbox…
- NVIDIA NemoCLAW!! - GTC 2026 by (17 March 2026) ► NVIDIA announced NemoClaw, an environment around OpenClaw to secure it.
- How to Avoid Runaway API Costs in OpenClaw by (22 March 2026) ► provides some advice to reduce the token bill when using OpenClaw.
- I finally found a use case for OpenClaw…↓ by (23 April 2026) ► An advertisement for Hostinger’s OpenClaw hosting.
- AI Agent writes hit piece
- An AI Agent Published a Hit Piece on Me by (12 February 2026) ► An OpenClaw agent (or maybe a human) is playing the fool with Matplotlib’s maintainer.
- An AI Agent Published a Hit Piece on Me by (12 February 2026) ► A summary of the previous blog article.
- An AI Agent Published a Hit Piece on Me – More Things Have Happened by (13 February 2026) ► gives an update: Ars Technica screwed up an article with fake citations, the bot is still active, Brandolini’s law, and the end of reputation.
- Sorry all this is my fault by (15 February 2026) ► explains how he ended up providing fake citations to his co-author.
- The obnoxious GitHub OpenClaw AI bot is … a crypto bro by (16 February 2026) ► presents the hypothesis that the hit piece was written by a crypto bro to scam more people by creating a crypto token having the same name as the bot.
- An AI Agent Published a Hit Piece on Me – Forensics and More Fallout by (17 February 2026) ► The story continues, but there is little information here: statistics on the bot’s activity and a post written by the bot.
- Rathbun’s Operator (17 February 2026) ► The bot’s operator tells their version of the story.
- An AI Agent Published a Hit Piece on Me – The Operator Came Forward by (19 February 2026) ► describes his hypotheses and the probability he allocates to each one.
- AI Agent writes hit piece by (20 February 2026) ► Even is speaking about this story.
- MCP
- What is MCP? Integrate AI Agents with Databases & APIs by (19 February 2025) ► A high-level description of MCP.
- microsoft/playwright-mcp by (25 March 2025) ► Microsoft released an MCP server wrapping Playwright.
- MCP Tools vs Official MCP Inspector: Choosing the Right Tool for Model Context Protocol Development — Discover the key differences between the official Official MCP Inspector and MCP Tools. Learn when to use each tool and how MCP Tools offers advanced capabilities for proxy, mock servers, and CLI workflows. by (29 March 2025) ► The subtitle says it all.
- Building an MCP server in 2 minutes.... by (13 April 2025) ► A simplistic example of implementing an MCP server in Python.
- Model Context Protocol (MCP) : connecter vos LLMs à vos données et outils by , , and (18 April 2025) ► A long high-level presentation of MCP.
- MCP Crash Course: What Python Developers Need to Know by (19 April 2025) ► This presentation of MCP is rather slow and not so clear.
- Tiny Agents: an MCP-powered agent in 50 lines of code by (25 April 2025) ► A simple JavaScript example of using MCP.
- ↪Tiny Agents in Python: a MCP-powered agent in ~70 lines of code by , , , and (23 May 2025) ► The same in Python.
- MCP is not REST API by (17 May 2025) ► explains that having an MCP server that simply maps a REST API is a bad idea: REST is about providing a low-level CRUD API to resources, whereas agent tools should be high-level actions.
- Make AI Agents Fetch Real-time Data using THIS Powerful MCP Server⇊ by (26 May 2025) ► This is not a presentation of MCP, but an advertisement for Bright Data.
- MCP Demo Part 1 | Ep. 21 Bits and Booze by and (26 June 2025) ► A presentation of MCP and a demo of building a very simple MCP server (a single tool performing a git commit).
- ↪MCP Demo Part 2 | Ep. 22 Bits and Booze by and (11 July 2025) ► The continuation of the previous video. and had difficulty getting it to run successfully.
- Building the Hugging Face MCP Server by , , , and (10 July 2025) ► Some information about how Hugging Face implemented their MCP server.
- MCP Is Not Your REST API: 5 Principles by (12 August 2025) ► Some good advice on implementing an MCP server.
- too many model context protocol servers and LLM allocations on the dance floor by (22 August 2025) ► People tend to overuse MCP servers, while some other solutions use far fewer tokens in the context.
- [Video Response] What Cloudflare's code mode misses about MCP and tool calling by (19 October 2025) ► comments on the claim that it is more efficient for the LLM to call tools via API calls rather than through MCP: with MCP, it is possible to handle data that is not properly structured.
- How to Automate Anything with Python Inside Claude Desktop (Using MCP) by (31 October 2025) ► Implementing a very small and simple MCP server using Python and uv.
- Code execution with MCP: Building more efficient agents — Direct tool calls consume context for each definition and result. Agents scale better by writing code to call tools instead. Here's how it works with MCP. by and (4 November 2025) ► The authors suggest using MCP servers as code APIs rather than direct tool calls, and explain the advantages of doing so: less context space is used, the data remains private (it is not seen by the LLM), persistence…
- MCP vs RAG: Two Very Different Ways to Gain Context — At first glance RAG and MCP seem similar. In practice, they solve very different problems and lead to very different system designs. by (14 January 2026) ► I do not understand this comparison between MCP and RAG: they do not have the same purpose; MCP is a protocol handling tools (and other things), RAG is about using semantic search to complete the context. You could use a tool that performs semantic search, I would still call this RAG.
- Let's learn about MCP Apps by (2 February 2026) ► A presentation of MCP Apps: an extension of MCP allowing an MCP server to provide a UI that the MCP client will display.
- Creating a Wikipedia MCP Server in Java in a Few Prompts with Skills by (2 April 2026) ► explains how he used Gemini CLI to create an MCP server to search Wikipedia.
- MCP UI: Extending the frontier — Liad Yosef and Ido Salomon, MCP Apps by and (6 May 2026) ► A presentation of MCP-UI and its current evolution, MCP Apps.
- Hugging Face
- Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ by , , and (25 July 2025) ► Hugging Face has revamped its CLI.
- Sentence Transformers is joining Hugging Face! by (22 October 2025) ► Hugging Face is taking over the ownership of Sentence Transformers.
- Building the Open Agent Ecosystem Together: Introducing OpenEnv by , , , , , , , , , and (23 October 2025) ► Hugging Face creates a hub where people can propose and get environments to use for training or running agents.
- Building for an Open Future - our new partnership with Google Cloud by and (13 November 2025) ► The announcement of some improvements for using Hugging Face’s models in Vertex AI, deploying models in Google from Hugging Face, running models on TPU…
- We Got Claude to Fine-Tune an Open Source LLM by and (4 December 2025) ► Hugging Face has created a skill.md file to instruct an LLM how to drive the fine-tuning of a model in their environment.
- ggml.ai joins Hugging Face to ensure the long-term progress of Local AI by (20 February 2026) ► considers that ggml.ai joining Hugging Face is a good thing for Llama.cpp’s future.
- Inference Providers
- Welcome to Inference Providers on the Hub 🔥 by , , , , , , and (28 January 2025) ► Hugging Face is creating a hub for inference providers.
- Welcome Fireworks.ai on the Hub 🎆 by , , and (14 February 2025) ► Fireworks.ai is now a supported inference provider on Hugging Face Hub.
- Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 by , , , , , , and (18 February 2025) ► The title says it all.
- Cohere on Hugging Face Inference Providers 🔥 by , , , , , , and (16 April 2025) ► The same with Cohere.
- Featherless AI on Hugging Face Inference Providers 🔥 by , , , , , and (12 June 2025) ► The same with Featherless AI.
- Groq on Hugging Face Inference Providers 🔥 by , , , , and (16 June 2025) ► The same with Groq.
- Public AI on Hugging Face Inference Providers 🔥 by , , , , , and (17 September 2025) ► The same with Public AI.
- Scaleway on Hugging Face Inference Providers 🔥 by , , , , , , , and (19 September 2025) ► … Scaleway.
- OVHcloud on Hugging Face Inference Providers 🔥 by , , and (24 November 2025) ► OVH is of course the next one.
- Together AI
- Why AI needs its own cloud — Together AI CEO on the next era of infrastructure by and (2 March 2026) ► A short, uninformative interview with , the CEO of Together AI.