Securing AI Agents from Doing Bad Things
Show notes for AI Explained Part 31 — sandboxing, permission scoping, instruction hierarchy, and the metrics that tell you whether your agent is safe to ship.
The AI Explained series: short, focused episodes on individual AI building blocks — transformers, attention, tokenization, memory, tool use, multi-agent systems, and more.
11 posts below, newest first.
Show notes for AI Explained Part 31 — sandboxing, permission scoping, instruction hierarchy, and the metrics that tell you whether your agent is safe to ship.
Feeling overwhelmed by the fear of AI making huge mistakes? In this video, we break it down into simple pieces.
Feeling overwhelmed by high AI API costs and latency? In this video, we break it down into simple pieces.
Feeling overwhelmed by the different layers of AI memory? In this video, we break it down into simple pieces.
Feeling overwhelmed by memory and state tracking? In this video, we break it down into simple pieces.
Feeling overwhelmed by the idea of communicating AIs? In this video, we break it down into simple pieces.
Feeling overwhelmed by APIs and AI tool integration? In this video, we break it down into simple pieces.
Feeling overwhelmed by the hype around "AI Agents"? This video is your ultimate guide to finally understanding AI Agents and Agentic RAG, even if you're completely new…
Feeling overwhelmed by Transformer architecture diagrams? In this video, we break it down into simple pieces.
Ever wonder why AI makes simple math or spelling mistakes? This video demystifies tokenization, breaking down how AI cuts up English words into puzzle pieces.
Unlock the secrets of AI in this masterclass! Part 1 simplifies how AI works, especially Large Language Models like ChatGPT, without any code.
Subjects that frequently appear alongside #ai-explained.
How LLMs actually work — tokenization, embeddings, RAG, fine-tuning, agents — explained for engineers who ship production code, not papers.
Large language models — how they think, why they fail, what RAG fixes, and how to evaluate them. The fundamentals every engineer building on top of an LLM should internalise.
How autonomous AI agents reason, plan, use tools, and stay aligned with your intent — the ReAct loop, agentic RAG, and multi-agent orchestration.
Practical software security for engineers — secrets handling, threat modelling, least privilege, prompt injection, sandboxing, and AI-specific attack surfaces.