How Stripe built “minions”—AI coding agents that ship 1,300 PRs weekly from Slack reactions | Steve Kaliski (Stripe engineer)
Steve Kaliski is a software engineer at Stripe who has spent the past six and a half years building developer tools and payment infrastructure. He’s part of the team that created “minions”—Stripe’s internal AI coding agents, which now ship approximately 1,300 pull requests per week with minimal human intervention beyond code review. In this episode, Steve demonstrates how Stripe engineers activate development work from Slack and leverage cloud-based development environments for parallel agent workflows, and demos machine-to-machine payments where AI agents transact autonomously with third-party services.

What you’ll learn:
- How Stripe’s “minions” write 1,300 pull requests per week with minimal human intervention
- Why a good developer experience for humans creates better outcomes for AI agents
- The critical role of cloud development environments in unlocking AI-powered engineering velocity
- The machine payment protocol that lets AI agents spend money to accomplish tasks
- The code review strategy for handling thousands of agent-written PRs
- Why non-engineers at Stripe are starting to use minions to ship code
- The future of software businesses built primarily for agent consumers

—

Brought to you by:
- Optimizely—Your AI agent orchestration platform for marketing and digital teams
- Rippling—Stop wasting time on admin tasks, build your startup faster

—

In this episode, we cover:
(00:00) Introduction to Steve
(02:39) Stripe’s minions and their effect on Stripe as a whole
(04:42) Why activation energy matters more than execution
(05:44) What is a minion? The technical architecture
(06:52) Demo: Activating a minion from Slack with an emoji
(09:04) Why good developer experience benefits both humans and agents
(11:22) Walking through the agent loop and system prompts
(13:42) Why Stripe chose Goose as their agent harness
(16:00) The role of Stripe’s developer productivity team
(17:15) Why cloud environments unlock multi-threaded AI engineering
(21:14) One-shot prompting: from Slack to shipped PR
(22:04) How Stripe handles code review for 1,300 AI-written PRs weekly
(23:44) Non-engineers using minions across the company
(24:53) Demo: Planning a birthday party with Claude and machine payments
(32:15) Quick recap
(35:08) The future of ephemeral, API-first businesses for agents
(36:36) Lightning round and final thoughts

—

Detailed workflow walkthroughs from this episode:
• How Stripe's AI 'Minions' Ship 1,300 PRs Weekly from a Slack Emoji: https://www.chatprd.ai/how-i-ai/stripes-ai-minions-ship-1300-prs-weekly-from-a-slack-emoji
• How to Build an Autonomous AI Agent That Pays for Services to Complete Tasks: https://www.chatprd.ai/how-i-ai/workflows/how-to-build-an-autonomous-ai-agent-that-pays-for-services-to-complete-tasks
• How to Automate Code Generation from a Slack Message into a Pull Request: https://www.chatprd.ai/how-i-ai/workflows/how-to-automate-code-generation-from-a-slack-message-into-a-pull-request

—

Tools referenced:
• Goose (AI agent harness): https://github.com/block/goose
• Claude Code: https://claude.ai/code
• Cursor: https://cursor.sh/
• VS Code: https://code.visualstudio.com/
• Slack: https://slack.com/
• Browserbase: https://browserbase.com/
• Parallel AI: https://www.parallel.ai/
• PostalForm: https://postalform.com/
• Stripe Climate: https://stripe.com/climate

—

Other references:
• Stripe machine payments: https://docs.stripe.com/payments/machine
• Blue-Green Deployment: https://martinfowler.com/bliki/BlueGreenDeployment.html
• Git worktrees: https://git-scm.com/docs/git-worktree

—

Where to find Steve Kaliski:
Twitter: https://twitter.com/stevekaliski
LinkedIn: https://www.linkedin.com/in/steve-kaliski-079a7710/

—

Where to find Claire Vo:
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
X: https://x.com/clairevo

—

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [redacted email].
- Published: Mar 25, 2026
- Uploaded: Jun 12, 2026
- File type: Podcast
- Source: podcasters.spotify.com
Full transcript
AI-generated transcript with timestamped sections.
[00:00] At Stripe, we're landing about 1,300 PRs that have no human assistance besides review per week. A lot of where our work begins is it could be in a Google Doc as we're planning a new feature, or maybe a Jira ticket comes in, or we're talking about something in Slack. I can click an emoji, and then the minion will sort of attempt to one-shot resolving that prompt using all the tools that are available at Stripe. When you're in larger organizations, there's so much friction that can come between a good idea and getting it into the world. Not only can I have one of these, [00:30] in isolated environments, making isolated changes all at the same time. How are you getting all this code review done? Whether the text has been written by Steve or the text has been written by Steve's robot, you still want that CI environment that's providing confidence that the code that's being changed is safe and that as it rolls out, we're having blue-green deployments so you can roll back too. All that is super critical, independent of the nature of the authoring of it. No matter how juiced these laptops are, you get three or four worktrees in and it starts to sound like an airplane taking off. [01:00] And so I do think on this multi-threading agentic engineering work, cloud environments and virtual environments are so important to unlock velocity. [01:12] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive here on a mission to help you build better with these new tools. [01:21] Today we have Steve Kaliski, a software engineer at Stripe, and he's going to show us how the Stripe team deploys a bunch of minions [01:28] to do their engineering work. We'll also watch an agent spend a little bit over $5 to plan a birthday party all in Claude Code.
[01:37] Let's get to it. This episode is brought to you by Optimizely. Most marketing teams aren't short on ideas, but what they are short on is time. And that's exactly what Optimizely Opal gives you back. With AI agents that handle real marketing workflows, you know, like creating content and checking compliance, generating experiment variations, personalizing user experiences, [02:07] platform for marketing and digital teams, plugging seamlessly into the tools you already use, handling the boring busy work, and keeping everything on brand. That leaves marketers with more time to do your actual job. See what Opal can automate for your team by signing up for a free enterprise agentic AI workshop with Optimizely. [02:29] Find out more at optimizely.com/how-eye-ai. Attend live and you'll get a free pair of Ray-Ban Meta AI glasses. [02:40] Steve, I'm so excited to have you on How I AI because I saw the Stripe Minions on [02:48] the timeline and, one, [02:50] exceptional branding. Don't sue us. And two, I just love the idea that [02:56] you and your colleagues on the team at Stripe have created [03:00] not just one agent, [03:01] but minions all across the company that can help with development work. And I'm so excited for you to show us
[03:08] how that helps you in your day to day here. So welcome to How I AI. [03:12] Thank you for having me. [03:13] So tell me, [03:15] what has been the effect that minions have had on you personally at Stripe and on the Stripe team as a whole? [03:21] Sure, so for me personally, [03:24] I think, sort of anecdotally, I don't remember the last time I started work in the text editor. [03:29] Right. So I do end up there often. [03:31] But what I found is that [03:33] a lot of where our work begins is, [03:35] it could be in a Google Doc as we're planning a new feature, or maybe a Jira ticket comes in, or we're talking about something in Slack. [03:42] And those are sort of like the more natural entry points to starting work, right? [03:46] And then you end up in a text editor when it's time to actually do the work or make the final tweak. [03:50] And it just felt very natural. [03:52] And I think in particular, the sort of activation energy of starting work feels a lot lower. [03:58] Right, so if, you know, [03:59] you're in a Slack thread and maybe there's [04:02] a piece of user feedback and it's something simple, like, you know, we have to update the docs, or maybe it's something more consequential and we just want to build a prototype, I can click an emoji and, like, the work begins. [04:12] And often the work finishes too. We at Stripe are landing about 1,300 [04:18] PRs that have no human assistance besides review per week. [04:23] But at the minimum, the activation energy of, like, [04:26] starting to write code, [04:27] seeing tests pass, maybe a test fails, [04:30] occurs without me even, you know, participating. And then I can jump in and I can tweak and I can kind of have that momentum, sort of. [04:38] It's sort of like generative momentum
[04:40] that I can hop in halfway through. [04:42] What I think is magical about this, and I won't call Stripe a big company, but you do have a decent amount of employees and [04:49] a very, very large business, is [04:52] I love that concept of activation energy going lower, because when you're in larger organizations, there's so much friction [04:59] that can come between a good idea [05:02] and getting it into the world. And it's not malintent, right? Nobody's like, oh, man, I really want to slow this process down. Yeah. It's either, you know, functional: I don't have access to a technical area of expertise to actually get from here to there. [05:16] It's operational: I don't know how to organize people and communicate effectively to get the next step done. Or it's just kind of like people get siloed in their day to day and don't think of new ways to get work done. And one of the things that has been so revelatory about AI for me personally is, like, all that just kind of goes to zero, [05:34] because coordination costs can go down, execution costs can go down, communication costs can go down. [05:40] You just get closer to the work, which I think is the fun part we all really care about. [05:45] So, [05:46] show me how you actually activate a minion. And, you know, we skipped over this a little bit: what a minion is. [05:53] The quick spiel of a minion. [05:55] When I, as an engineer, sort of [05:57] pre-AI time, want to make a modification to Stripe, well, Stripe is a huge code base with tons of services. It can't run on my computer alone. So Stripe already has a long history of investing in great developer tooling.
[06:11] Having hosted development environments that I can spin up, that have all the code already there and services running. [06:17] And I can SSH in and make modifications. [06:20] And we have a ton of great CI tooling around that. So that's the context. We have all that. [06:26] The idea with the minion is that I can provision one of those environments, seed it with a prompt, [06:32] and then the minion will sort of, [06:34] you know, attempt to one-shot [06:36] resolving that prompt [06:39] using all the tools that are available at Stripe. [06:41] Right? [06:42] All of our internal documentation, our internal CI, [06:45] our test data, so on and so forth. [06:48] And it will loop through that in an attempt to solve that problem. [06:52] Let's go ahead and jump in and see what sort of a prototypical experience might look like. [06:57] So I'm in a Slack channel, it's called Steve Kaliski Robots, [07:01] dash Claire. [07:02] I actually have a Steve Kaliski Robots channel that has 76 humans in it. It started off as just me and my robots, and now there's some sort of audience observing. [07:14] But let's imagine that maybe I'm thinking of a new feature idea, or I want to improve documentation that we have. So we have a launch coming up soon. [07:22] I want to sort of embellish the documentation. [07:26] I have this cool idea for docs.stripe.com/payments/machine. This is our new [07:33] machine-to-machine payment work, which we'll look at later in our call. [07:37] And I want to, you know, [07:39] make sure the [07:41] landing
[07:42] page really sticks and gives a good code example of how to get started quickly. [07:50] So maybe someone posted a message like that, or it came in through a ticket, or whatever the origin may be. [07:56] All I have to do now is, you know, add a reaction, which is [08:01] "create minion, pay server." This is a particular repository within Stripe. [08:05] We get the "one sec, cooking" from the dev box [08:09] agent, and then we get a reply in here saying your minion for pay server, that's the repository, and [08:15] for a new branch that's created, "landing page code example," has been created. [08:19] And it's going to kick off our doc service [08:22] so I can eventually preview it. Now, I'm going to click Follow Along. So right now what it's doing is it's provisioning that development environment I was talking about earlier. All right, so this part isn't new. It is excellent, but it's not new. And basically it's going to spin up an instance in the cloud. [08:38] It's going to apply all the configuration that's required for [08:41] both me and the agent [08:44] to do coding within Stripe. [08:46] So this will just take a few seconds. It's going to check out that repository with a new branch. [08:51] Configure the local database. [08:54] Apply my git config. [08:56] It's going to set up a VS Code server so I could connect to it just through the web [09:01] or locally. [09:03] So, some extensions. [09:05] So what's really great about minions is, obviously, there's the agent loop that's making the code modifications, but it's built on top of a ton of incredible work that our
developer productivity has done around just making it easy to get [09:18] like a perfectly operating Stripe development environment [09:22] for coding, which means that, [09:23] you know, [09:24] not only can I have one of these, but I could have, you know, [09:27] many, many of these running in parallel in isolated environments, [09:31] making isolated changes all at the same time. [09:34] That little one-click emoji, I could have done that with a few messages at the same time, [09:38] which is really great. [09:40] Yeah, one thing I want to call out here is we had my friend Zach from LaunchDarkly on, and one thing he said was, look, what's good for the developer is good for the agent. So there's this virtuous loop of, if you have or do invest in developer experience for your human engineers, [09:57] your agents will benefit off of that. And in turn, if you invest in [10:01] developer experience or agent experience for your agent engineers, [10:05] your development team benefits from that. And so I always tell people, you know, engineering team, you've always asked, like, [10:12] can we just give a little bit more time on the roadmap to DX? Like, pretty please. Can we invest here? And I think if you attach it to an AI initiative, that's like the secret way to get some of that [10:22] good stuff done. [10:24] Yeah, I mean, imagine you're... [10:26] Some code bases are small, but Stripe's is huge. Imagine you show up day one and there's no documentation. [10:33] And there's no tools. And they say, good luck. Like, [10:36] anyone would have trouble, and even if you threw the agent at it, it's [10:39] very likely that the context window would be blown by the whole code base. Just scanning through to understand
[10:46] all the intricacies would [10:47] be, like, impossible or extremely expensive. So [10:50] if there's a very blessed path for [10:53] 90% of the common activities in being an engineer at Stripe, [10:58] that makes the propensity that the agent succeeds really high too. So imagine we wanted to make an API change, which we do, [11:06] you know, hundreds or thousands of times a year. [11:09] We have really good documentation on how to add a new field or a new method or a new resource that the minion would read [11:15] and would execute against, and then the propensity it would one-shot is very high. [11:19] Good docs for developers are equally important for the agent, to your point. [11:24] We've now transitioned from [11:26] booting up the development environment [11:28] to now we're in the first agent run. [11:31] We have that prompt that I posted in Slack here. [11:34] And now what it's going to do is boot up an instance of Goose, that's basically the harness that's going to run through all this. [11:40] We did have an episode with the Block team about Goose, this open source agent harness [11:47] that got set up. And I want to call out one thing for folks that are not watching and are listening, which is I love your system prompt. So sophisticated. [11:55] It says [11:56] "implement this task completely," colon, and then just whatever you put in. No mistakes. No mistakes. You forgot, no mistakes. But, you know, I think people really think they have to over-architect their initial prompt. And I think if you have a great harness, it can go a long way to extracting out [12:13] a successful outcome from a pretty loose prompt.
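To make the one-shot flow described above concrete, here is a minimal Python sketch of the kind of loop an agent harness runs: send the prompt, let the model pick a tool, feed the result back, and repeat until the model reports it is done. This is not Goose's or Stripe's code. The `run_agent` function, the tool names, the file path, and the scripted stand-in for the model are all hypothetical; only the prompt prefix is taken from the episode.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Prompt prefix quoted in the episode; everything else here is hypothetical.
SYSTEM_PREFIX = "Implement this task completely: "


@dataclass
class Step:
    """One model decision: call a tool, or finish with a summary."""
    tool: Optional[str] = None
    argument: str = ""
    final_answer: Optional[str] = None


@dataclass
class AgentRun:
    prompt: str
    transcript: List[str] = field(default_factory=list)


def run_agent(
    task: str,
    model: Callable[[AgentRun], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> AgentRun:
    """Loop: ask the model for a step, run the tool, record the result."""
    run = AgentRun(prompt=SYSTEM_PREFIX + task)
    for _ in range(max_steps):
        step = model(run)
        if step.final_answer is not None:
            run.transcript.append(f"done: {step.final_answer}")
            return run
        tool = tools.get(step.tool or "")
        if tool is None:
            # Record the failure so the model can react on the next turn.
            run.transcript.append(f"error: unknown tool {step.tool!r}")
            continue
        run.transcript.append(f"{step.tool}({step.argument}) -> {tool(step.argument)}")
    run.transcript.append("stopped: step budget exhausted")
    return run


def scripted_model(run: AgentRun) -> Step:
    """Stand-in for a real model: replays a fixed plan, one step per turn."""
    script = [
        Step(tool="search_code", argument="payments/machine landing page"),
        Step(tool="edit_file", argument="docs/payments/machine.md"),
        Step(tool="run_tests", argument="docs"),
        Step(final_answer="opened pull request for review"),
    ]
    return script[len(run.transcript)]


# Hypothetical tools; a real harness would wire these to code search, CI, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_code": lambda query: "1 matching file",
    "edit_file": lambda path: "edited",
    "run_tests": lambda target: "passed",
}
```

Calling `run_agent("add a code example", scripted_model, TOOLS)` walks the scripted plan and ends with a "done" entry in the transcript. The `max_steps` cap is one simple way to keep a loose prompt from looping forever.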
[12:17] Totally. [12:19] A lot of this is an experiment in some way, right? As new models come out and we build new tools, [12:24] there is this sort of dynamic nature to it. And we've built a lot of interesting bots that help write the prompt, right? So maybe first it will do the task of searching through the code base or looking at other pull requests or Google Docs or whatever it may be. [12:37] I think now at Stripe, most things that could have an MCP server have an MCP server. [12:42] So we're able to interact with a lot of the internal data we have. [12:45] And then it can make a prompt that I could then paste in here or that gets assigned [12:50] to the agents. So that's sort of part of why I wanted that public channel we were looking at. It's like, [12:56] Yeah. [12:57] We're going to see that we don't pair program anymore, but we pair prompt. [13:01] And that activity could be with other engineers or other data sources or other agents too, [13:06] to figure out if we can, [13:08] you know, properly explain to the agent, you know, how to do it correctly. In any case, you know, what it's doing now is it's taking the link I gave it, which is to public documentation. [13:18] It's going to search through the code base and use some of our code searching tools to locate where that change [13:25] in particular should go. [13:27] It's going to execute a whole sequence of tools. And over time, as it figures out where in the code base it should work [13:33] and what the modification should be, it'll ultimately commit those and make those available in a pull request that me and my fellow colleagues can review. [13:42] Yeah, I have a couple questions on this because we've seen [13:46] a few examples of folks building their own cloud agents, and I'm curious, you know, why Goose, you know, versus doing something on your own or
[13:59] doing sort of a more commercial solution. I'm curious if there was an internal discussion, or did this happen organically because it worked for one engineer? [14:07] Curious how you kind of seeded the idea of minions on top of your development tools. [14:12] Yeah, sure. So we also make Claude Code and Cursor [14:16] and tools like that widely available to engineers at Stripe. [14:19] I think our general [14:20] sentiment is, like, [14:22] we want to accelerate development so we can build new features for our users. [14:27] And there are going to be new models coming out, new tools, and we want to be able to proliferate those as much as we can. [14:33] In the particular case of minions, [14:35] it's very... [14:37] I don't want to say very specific, but it's very specific to, like, [14:40] the Stripe developer experience and the Stripe developer environment. [14:44] And we had been experimenting with Goose early on. [14:47] And I think in this particular case, we'd forked it to make some modifications as well. And really what we were looking at is, like, sort of a base harness [14:54] and loop [14:55] to apply all of our own tools and software to. [14:58] So we spent a lot of time on, like, [15:01] making good tools available [15:03] and making sure that [15:04] the sort of routes that the minions go through, [15:07] you know, work closely with the most common Stripe developer workflows. So it's, [15:12] you know, sort of like commercial versus custom things. Like, [15:16] there are things that are very specific about [15:18] Stripe's code base and being a developer at Stripe and the way we build things, [15:21] and it was just sort of easier for us to build and deploy that. [15:25] But the commercial solutions are great and we use those extensively. [15:29] Even later in this demo, I can sort of show, like, you know, for example, I can pop into
VS Code Web, [15:35] where I could manually edit some of the code that's going on here as well. But I can also boot up Claude and I can have sort of the typical Claude experience with [15:43] all the internal Stripe MCP tools available as well. [15:48] So, you know, there's no singular tool to rule them all, but I think the, like, overall end-to-end development story at Stripe is built on minions. So you can see I'm [15:57] in that dev box and in Claude now. Yeah. Cool. I have one other question and then an observation I want to make sure that the listeners don't miss. So my first question is, [16:06] you know, Stripe is a very well-resourced, I would say, engineering organization. So I'm presuming you have a team dedicated to [16:14] working on not just [16:16] your dev tools, but minions as well, and [16:19] managing that as an internal product itself? Has that team been sort of [16:23] built [16:24] as a standalone team that's focused specifically on internal developer experience? Is that how it works? [16:29] Yeah, we've had a developer productivity team for as long as I can remember. I think I'm at about six and a half years now. And that team's focused on all the tools that I engage with and making them more useful. Right. So that's all the way from [16:43] how we interact with [16:45] Git and version management, to our text editors and our configurations there, [16:50] to our development environment and how that whole story pieces together. [16:55] And [16:57] just as I, as a product engineer at Stripe, care deeply about our external users and them being successful at Stripe, that team cares equivalently about engineers at Stripe being successful and being able to build things quickly.
[17:10] And I think that's been even more accelerated by AI in the last couple of years. [17:15] And then one other observation I want to make, because I think you glossed over it a little bit at the beginning, but it's so important for folks that really want to go ham on coding with AI. Sure. Which is, look, [17:25] all of us engineers have a MacBook Pro that weighs 8 million pounds. [17:30] It can do some damage. Mine, for anybody who wants to know, its nickname is Big Boy. So whenever I need my kids to get my coding laptop, I say, can you bring me Big Boy? [17:39] Because I call it San Francisco rucking when I carry two of them in my backpack. But you know, no matter how juiced these laptops are, you get like three or four worktrees in, all running, and, like, it starts to sound like an airplane taking off. It's no good. [17:56] And so I do think on this sort of multi-threading agentic engineering work, [18:02] cloud environments and virtual environments are so [18:05] important to unlock velocity. [18:08] And that's one place where I haven't seen enough large engineering teams invest in [18:15] those environments to really unleash the power of [18:19] either AI-assisted coding for their software engineers or agents in general. So if there are any CTOs, VPs of engineering listening, [18:27] if you were to invest in something to really unlock growth in the next year, [18:31] getting that situation locked up would be [18:34] really good, because again, [18:37] I hear so many people be like, oh, I can Claude Code everything. I can Codex anything. I can spin up all these worktrees. I'm fine.
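The point about running many agents at once, each in its own environment, can be illustrated with a small Python sketch. It assumes a hypothetical `run_minion` coroutine and simulates the remote work with `asyncio.sleep`; nothing here reflects Stripe's actual provisioning tooling. The idea is only that when each task lives in a separate remote environment, the local machine just waits on all of them concurrently.

```python
import asyncio
from typing import List


async def run_minion(task: str, seconds: float) -> str:
    """Stand-in for provisioning one remote environment and running an agent in it."""
    # Hypothetical: a real system would call a cloud API here. We only wait,
    # which is the point: the laptop is idle while the remote work happens.
    await asyncio.sleep(seconds)
    return f"{task}: pull request ready"


async def run_all(tasks: List[str], seconds: float = 0.1) -> List[str]:
    """Start every task at once and collect results in the original order."""
    return await asyncio.gather(*(run_minion(task, seconds) for task in tasks))
```

For example, `asyncio.run(run_all(["update docs", "fix flaky test", "add API field"]))` finishes in roughly the time of one task rather than three, because the waits overlap.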
[18:43] And I'm like, are you running all these local? Like, what are you doing? [18:47] And so that's one thing I just want people to not miss: the limitations of [18:52] your actual machine on how multi-threaded you could be, especially in a complex code base like Stripe's. Totally. And, you know, [18:59] I have Slack on my phone, right? [19:01] I can even kick off one of these minions on the way to work, right, as I'm sort of going through Slack on the subway, and then, you know, [19:09] by the time I'm there, I can jump in halfway through. [19:13] Maybe, like, the hyperbolic thing here is, like, imagine if all engineers at a company didn't have Git, and we all had to, like, coordinate working on [19:22] the one code base together. That would be crazy. And the equivalency here is, like, imagine if my agents are bounded by just what's available and can work on my computer. [19:33] Yeah. [19:35] The 10x thing to do is, [19:38] you know, be able to have 10 of them run in parallel, but also not be contingent on my, like... [19:43] It's like everyone's buying a Mac Mini, right? So it doesn't fall asleep. [19:47] Right. It's like there's a whole business around just the computer not falling asleep. I legitimately, first of all, I have like four Mac minis upstairs, and one of them [19:56] is just basically a laptop that doesn't close. Like, I use it as a laptop that does not shut. [20:03] And it's really unlocked my velocity. So, okay, thank you for going on this side quest about virtual environments and local hosts and all those things. I'm a founder, so I know most people don't start companies because they love running payroll or managing compliance. But somewhere between hiring your first employee and raising your next round, you end up in the weeds with HR, IT, and all that other stuff. That's what Rippling was built to solve.
[20:33] finance in one system from day one. The Rippling startup stack replaces disconnected tools that don't sync with a fully connected platform. Over 15,000 startups, including Cursor, Clay, and Sierra, trust Rippling to scale fast without adding additional ops and HR headcount, so founders like you can keep building. Right now, venture-backed startups can get six months of Rippling startup stack for free. Head to rippling.com slash howiai and sign up today. [21:03] That's R-I-P-P-L-I-N-G dot com slash howiai to sign up for six months free today. Focus on what you're building and leave the rest to Rippling. [21:15] Okay, so you are now running this. [21:20] You said one-shot at the beginning. [21:22] Really, you're taking one prompt, [21:25] and it's not that a single reply gets you what you want, but it goes into the harness. It goes through its own loop, [21:32] hits the tools it needs, and ultimately you as the end user get one response back, which is, here's the successful implementation. [21:39] Exactly, right. So we can already see that it's identifying the relevant files. [21:43] It's keeping track of its own to-dos. That's something that we've codified in it to focus on. It's making changes, [21:50] preparing the commit, and so on and so forth. And ultimately, sort of like taking it out of the oven, we'll see a response at the end of just, like, it finished. You know, you can go ahead and look at the pull request and [22:02] the sort of normal human review part continues. Let's talk about that really quickly. You said 1,300
code- or agent-initiated PRs per week, something like that. [22:14] And then humans are involved in code review. How are you getting all this code review done? [22:18] Well, you can make the argument that [22:20] if I'm spending less time actively writing code, I can recenter my time on reviewing the code that's being written or working with users and so on and so forth. So I think that's a big part of it. [22:32] I think the other side of it, it comes back to that CI environment. [22:36] Right, so having really good test coverage, [22:39] having synthetics that run to simulate end-to-end [22:42] interactions with your product, [22:44] those all help inspire confidence in the code you're reviewing, right? So absent those, like, [22:49] it'd be really difficult to look at code, especially in a huge code base, [22:53] and have high confidence that it works. [22:55] So, again, whether the text has been written by Steve or the text has been written by Steve's robot, you still want that CI environment that's [23:03] providing confidence that the code that's being changed is safe and that, [23:08] as it rolls out, you're having sort of blue-green deployments, so you can roll back too. Like, [23:13] all that is super critical, independent of the nature of the authoring of it. [23:19] I do believe, like, [23:21] if coding becomes easier, and coding historically has been the bottleneck in product development, it's just going to shift to other areas, right? [23:29] If coding in effect becomes free, [23:32] the review is going to be really challenging, right? Or getting enough ideas in the first place [23:36] could be a big problem. Or distributing them, right? So I think [23:39] the attention,
[23:41] it's just going to move around to other areas. [23:45] Great. And then one other question before we go on to your next workflow, which I am so excited about. Spoiler alert. [23:51] It is: [23:52] are more than engineers using minions? Are you seeing product managers, designers come in? How is this going across the company and across functions? Yeah, I think [24:01] part of why I like the Slack example is, the entire company is in Slack. [24:05] Right. [24:07] And, you know, to that point of activation energy, [24:10] even if you had the text editor on your computer and I gave you the docs and whatever it may be, to someone who's not an engineer, it can be really challenging or intimidating or whatever it may be. [24:22] And, [24:23] you know, whether you just want, like, a proof of concept, or you're going to make a docs change, [24:27] or whatever it may be, like, [24:30] you can probably write out in plain text the thing you want to occur, right? [24:34] You might be writing the product brief, or you might be [24:36] giving design feedback. You're in effect just writing a prompt at some point. [24:41] So, being able to just click an emoji or tag the robot to spin off the minion, we're starting to see more non-engineer usage there. [24:51] Amazing. Okay, so [24:53] let's go to our next workflow, which I am... [24:57] Yes. As somebody with a stack of Mac minis downstairs, I am excited about. [25:02] So, [25:03] at Stripe, we're thinking about AI in a few ways, right? So the demo we just showed is how we're thinking about using AI internally to accelerate our product development and engineering.
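The emoji-triggered flow from the first demo can be sketched in Python as a small, pure function that turns a Slack `reaction_added` event into a request for a minion. The event fields used (`type`, `user`, `reaction`, `item.type`, `item.channel`, `item.ts`) follow Slack's public Events API. Everything else is an assumption for illustration: the emoji name, the `MinionRequest` type, and the emoji-to-repository mapping are hypothetical and are not Stripe's implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from an emoji name to the repository it targets.
REACTION_TO_REPO = {"create-minion-pay-server": "pay-server"}


@dataclass(frozen=True)
class MinionRequest:
    repo: str
    channel: str
    message_ts: str  # identifies the Slack message that holds the prompt
    requested_by: str


def minion_request_from_event(event: dict) -> Optional[MinionRequest]:
    """Return a request if the event is a known minion emoji on a message."""
    if event.get("type") != "reaction_added":
        return None
    repo = REACTION_TO_REPO.get(event.get("reaction", ""))
    item = event.get("item", {})
    if repo is None or item.get("type") != "message":
        return None
    # The prompt text itself is not in the event; a real handler would need
    # a separate lookup of the message using the channel and timestamp.
    return MinionRequest(
        repo=repo,
        channel=item["channel"],
        message_ts=item["ts"],
        requested_by=event["user"],
    )
```

Keeping the mapping in data rather than code means adding a new repository is a one-line change, and any other emoji is ignored by returning `None`.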
[25:14] The second way is thinking about how we're supporting [25:16] all these businesses that are, [25:19] you know, leveraging AI in their own products, [25:22] and how we can support their business models. And that's with things like usage-based billing. And we just announced our beta of our LLM token billing product. But there's a third side, which is, like, [25:34] this sort of idea of agents as economic actors, or agents that can spend money as part of their attempt to solve a prompt. [25:43] And, [25:44] before we jump into the demo, the thing I'll illustrate is, like, [25:47] you know, [25:48] often you give a prompt to Claude or some other agent, and it will use its own [25:53] model to generate text in response, right? Or maybe it will do a web search or call an MCP tool or whatever it may be to gather information or to effect change as part of that response. [26:04] And of course, there's the shopping cases, but [26:08] we imagine a future where, like, [26:10] third-party services [26:11] are going to want to sell into these kinds of experiences, and that those interactions will cost money. [26:17] So we have to equip [26:19] our agents with the capacity to spend, [26:21] so that they can not only consume tokens, but so that they can also pay services [26:26] as part of achieving the prompt. So I'm going to give an example. Jen, who's a product manager I work with, [26:33] is awesome. I think her birthday is coming up soon. If not, for the demo, it's her birthday party. And we're going to ask Claude to help plan it. [26:40] And along the way, it's going to interact with a bunch of different real third-party services that are really going to accept money
[26:46] over a payment protocol we're calling the machine payment protocol, which we've [26:50] co-designed with Tempo. [26:51] And we'll see some real transactions along the way. So I have a sort of pre-baked prompt we'll paste in just to [27:00] skip that part. [27:02] And I will go ahead and give it-- [27:06] So, [27:07] I told it to research [27:09] Jen Lee, who's my product manager, figure out what would be a good idea for her birthday, [27:14] find a place to have the birthday, [27:16] send invites to the birthday, and then, you know, we burned all these tokens along the way, so we should probably donate to Stripe Climate at the end [27:24] to make up for all the energy consumption. [27:27] So, [27:28] right now, we're still getting the environment set up, just setting up our ability to pay [27:33] with Tempo. The first thing we're gonna do, [27:35] we can see right here, is that we've actually [27:37] paid Browserbase [27:39] to create a new browser session. [27:41] So I didn't sign up for Browserbase beforehand. I'm just paying for this one session. [27:46] It's going to do that. I gave it her website somewhere up here. So it's going to go ahead and spin up that environment. You can see right now it's writing some Playwright code locally, which will connect [27:57] to that Browserbase session. [28:00] It got to her website, right? [28:02] Jen likes, I think she bakes and she cooks. [28:06] So it actually found out by running that browser session that she's a matcha-obsessed baker working on a cookbook. [28:12] We're going to go ahead and turn off that browser session. [28:15] We can see the net cost is just a fraction of a cent.
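The pay-per-session pattern in this part of the demo, where the agent pays for a single Browserbase session instead of signing up for an account, can be sketched roughly as a wallet with a spending cap that records each payment. This is a minimal illustration, not the machine payment protocol itself: `AgentWallet`, `pay`, the budget, and the amounts are all hypothetical, and no real payment API is called.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class AgentWallet:
    """Hypothetical spending wallet for an agent (not a Stripe API)."""

    budget: Decimal  # maximum the agent may spend on this task, in dollars
    receipt: list[tuple[str, Decimal]] = field(default_factory=list)

    @property
    def spent(self) -> Decimal:
        # Total of every payment recorded so far.
        return sum((amount for _, amount in self.receipt), Decimal("0"))

    def pay(self, service: str, amount: Decimal) -> None:
        """Record a one-off payment, refusing anything over budget."""
        if amount <= 0:
            raise ValueError("payment amount must be positive")
        if self.spent + amount > self.budget:
            raise RuntimeError(f"paying {service} would exceed the budget")
        self.receipt.append((service, amount))


# Illustrative amounts only; the demo says just "a fraction of a cent".
wallet = AgentWallet(budget=Decimal("10.00"))
wallet.pay("browser session", Decimal("0.004"))
print(wallet.spent)
```

Keeping the cap check inside `pay` means the agent cannot record a payment that pushes it past its budget, and the `receipt` list doubles as the itemized record shown at the end of the demo.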
[28:18] And again, we really paid that business just now. [28:22] The next thing it's going to do is, using its knowledge of Jen and her interest in matcha, [28:27] it's going to search online using Parallel AI [28:31] to find relevant venues in New York where we could host this party, something that matches her matcha interest. [28:39] I'm going to just do, again, a side quest, a callback to our episode with Andrew and Nabeel, [28:45] who [28:46] used AI to set up a tabletop gaming [28:50] business they were building in the East Bay. [28:53] And my friend texted me and she said, this is the most San Francisco thing I've ever seen, which is two dudes that need AI to help plan their game night. And I was looking up at your original prompt and I was like, this is such an engineer's prompt for how to plan a birthday party. It's like source env and then insert Jen's name. You know you're doing something wrong if I have to load environment variables to celebrate someone's birthday. [29:21] Exactly. It's just so funny. [29:23] Yeah, so it found this matcha cafe in New York on Bowery that it thinks is a perfect fit for a matcha interest, which is great. [29:31] Now we should, [29:33] you know, send an invite in the mail. You know, we're taking it offline. [29:37] So now we're interacting with this service called PostalForm. [29:40] PostalForm will take a PDF [29:43] and actually send it in the mail. [29:46] So again, right now what we're doing is, the LLM is writing code locally
[29:50] to generate a PDF image of the invite. So there's this sort of interesting balance of, [29:55] what can the LLM do itself, right, with its own tools and my local machine, versus what it [30:00] needs a third-party service for. Obviously, the robot can't send mail. And I think if the robot could send mail, [30:06] that would be kind of concerning. [30:09] So, you know, it's trying to fix a couple things with the PDF. [30:13] I'm sure the invite looks-- [30:15] It'll be very interesting to see what the event looks like. It looks machine generated? [30:18] It will look, yeah, it's just a bunch of binary. No one is going to come to the party. [30:22] How do you, I mean, I know this is a little bit of a demo you're giving us here, but [30:28] I think so many of these, [30:31] even consumer- [30:33] you know, facing products, like I've never heard of PostalForm. It sounds amazing, [30:37] where it sounds like a very, [30:39] you know, individual user problem of like, how do I get mail out the door? [30:43] So many of them are going to be interacting with [30:46] agents and, like, the API as the interface. And you and I were talking about that a little bit before the show. [30:53] And you were saying you were getting user feedback recently that sort of spoke to [30:57] that. Yeah, we've been talking to, you know, I think maybe including PostalForm, we've been talking to a lot of users as we've been integrating this machine payment stuff. And, you know, it's very normal to ask for feedback. [31:09] And typically they go, oh, I'll get back to you and write up some notes. [31:12] And I would get these, like, [31:13] in 30 seconds, I'd get two pages back. [31:16] And [31:17] the engineer over there had used Claude or Codex to
[31:21] read the Stripe docs and implement the feature, [31:23] and then figured, since they hadn't really written it themselves, that they'd ask Claude or Codex to send feedback back to me. [31:29] And, like, [31:30] it happened once. I thought, okay, that's funny. And it happened like four or five times that week. [31:36] And it was just extremely jarring. And it added this sort of physicality [31:41] to who the new user is here, right? That, like, we'd have to hear from the agent directly. All right, we're just gonna check in quickly. [31:48] We sent it in the mail. [31:50] And then, you know, we burned some tokens along the way, so we actually made a [31:55] dollar-and-sixty-five-cent donation, [31:58] or a contribution, to Stripe Climate to erase 4.4 kilograms of carbon [32:02] based off of our 70k token usage. [32:05] And you can kind of see here an [32:07] agent receipt of the services it interacted with and the cost of each. [32:13] At some point, I'm going to get an invite to a party in the mail. [32:16] I want to just recap this for folks that are not watching. So we started [32:20] with a prompt in Claude Code that said, "Plan my friend Jen a birthday party. This is what we know about her." It was preceded, there was some movie magic here, by, "Here are some tools I know can take agent payments [32:34] that might be useful in the pursuit of this." [32:37] And instead of a human having to go into those tools, log in, [32:43] drop a credit card, buy a plan, [32:45] there was a machine-to-machine transaction that happened that gave [32:49] micro access to the tool for the capacity the agent needed
[32:54] to do the job at hand, and we saw it use Browserbase and Parallel and PostalForm. [32:58] And it issued those payments [33:02] programmatically, accessed just what it needed, [33:05] did a little offset Stripe Climate purchase, [33:09] and then got your party planned. And [33:12] what I like about this, what's really interesting about this particular example, [33:17] is it makes very clear [33:19] the economics of [33:21] doing something agentically. I like this little, you know, we got a little Stripe Climate shout-out here. [33:27] But it also just calls out, like, this actually does cost you [33:31] in tokens whether or not your agent [33:33] is doing [33:35] outside transactions. So we're already operating in an economic framework, right? [33:40] Yeah, I think I'm on a [33:41] Stripe plan here, but in general, people have a subscription relationship to [33:47] these providers, and that costs money, and we get a certain number of tokens. [33:51] And any prompt I give, even though I'm not [33:55] seeing the penny count move by, [33:57] has an ultimate dollar cost to it. [34:00] Right? [34:01] And, you know, maybe in the typical coding example, you know, consuming [34:05] tens of thousands, hundreds of millions of tokens, [34:08] we've sort of justified the value of that, right, because the code has [34:12] business value and the size of monetary value. [34:15] But [34:16] the sort of [34:18] token and the currency that backs it, like, [34:21] they feel closer than ever. [34:23] And,
[34:24] you know, whether I'm spending a penny [34:27] or a dollar on a third-party service, or I'm spending [34:30] tens or hundreds of thousands of tokens with an LLM, [34:34] we're sort of doing a similar activity, right? Which is that we need intelligence or we need data or we need operations or we need a service [34:41] to execute on that prompt and achieve some outcome. [34:46] And I think, [34:48] even just this view feels very provocative and it feels early, but I think it's going to feel very natural over time to see the token and the dollar side by side. [34:57] And for me, it's like, [34:58] I planned a birthday party for-- [35:00] I don't know if it's any good, but I planned a birthday party for $5.47. [35:05] That doesn't seem [35:07] too bad. Again, we're doing this episode in the year of our Claude 2026. Like, we're going to show the terminal example. [35:14] And most people watching this, and again, How I AI is for everybody, super technical and not, they're going to look at this and be like, OK, but yeah, I'm not going to plan my birthday party in the terminal. [35:25] But let's just pull that thread six months in the future or 12 months in the future. There's going to be a bunch of builders out there that are going to wrap this in a much more consumer-friendly [35:35] user experience, and then you're going to be able to build such interesting products that can [35:40] interact and transact in just a much more human way, which, [35:45] again, can just solve problems in a different mindset. [35:48] Yeah, I think it would be really interesting to [35:51] build a business [35:52] where your primary consumer
[35:55] sort of wants an ephemeral interaction with you, [35:58] and it doesn't necessarily require you having a dashboard or an admin panel [36:02] or [36:04] a landing page or all the other typical things that are really useful when a human or a business is interacting with you. And instead, you could focus on, like, [36:13] just a [36:15] hyper-useful single API, [36:17] and monetize that directly and make your audience primarily agents. I think a lot of really interesting [36:25] businesses can emerge out of that opportunity. [36:28] I completely agree. And then we're going to have agents identify what those businesses are, build them, transact with other agent customers. [36:35] Agents all the way down. Well, Steve, this was [36:38] awesome. Just to recap for folks, we saw minions and how to [36:42] kick off development work from Slack and the benefits of investing in developer experience. Again, [36:49] VPs of engineering, just, like, carve off a DevEx team [36:53] and give it some love, and product managers, get out of the way. You'll get more product at the end of the day if you just [36:58] give some time and effort towards developer experience. [37:02] And then we got to see these machine-to-machine payments, which I think by the time the episode is live, we should be able to [37:08] maybe talk about or see. So fingers crossed. This will be live by the time our episode goes live. And we showed you how to plan a, I guess got to zoom in, a matcha cheesecake [37:20] birthday party in New York City. Jen Lee's matcha party, April 19th, apparently. All things matcha. I guess I didn't pick the date. So the robot has decided that will be a good birthday. Saturday, April 19th, 3 to 6 p.m. Sounds perfect. We planned a birthday party for $6, carbon neutral.
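The figures quoted in the demo can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in the episode ($1.65 to Stripe Climate for 4.4 kg of carbon, roughly 70k tokens, $5.47 in total); the variable names are made up for illustration, and the derived rates are back-of-the-envelope values implied by those figures, not published pricing.

```python
from decimal import Decimal

# Figures as stated in the episode.
climate_donation = Decimal("1.65")  # dollars contributed to Stripe Climate
carbon_kg = Decimal("4.4")          # kilograms of carbon the donation covers
tokens_used = 70_000                # "70k" tokens, an approximate figure
total_cost = Decimal("5.47")        # total shown on the agent receipt

# Everything on the receipt that was not the climate contribution.
other_services = total_cost - climate_donation

# Rates implied by the stated figures (assumption: simple linear scaling).
dollars_per_kg = climate_donation / carbon_kg
grams_per_1k_tokens = carbon_kg * 1000 / (Decimal(tokens_used) / 1000)

print(f"Spent on other services: ${other_services}")
print(f"Implied offset price: ${dollars_per_kg} per kg")
print(f"Implied footprint: {grams_per_1k_tokens:.1f} g per 1k tokens")
```

That leaves $3.82 for the other services combined, and the "$6" in the recap is the same $5.47 total rounded up.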
[37:36] Steve, this is awesome. Before I send you off, a couple lightning round questions. One, [37:41] you know, we showed kind of a contrived personal use case, but [37:45] what are your personal workflows for AI? [37:48] The thing I've been really interested in is the sort of disposability of software. [37:53] And [37:54] I have a four-month-old now and an almost two-and-a-half-year-old now. And the two-and-a-half-year-old keeps grabbing my phone to try to change music. [38:01] So I've toyed around with music apps that are extremely controlled, to just six songs. I have no idea how to build iOS apps, but the robot does. So I've been toying around with [38:11] little engagements like that. [38:12] And then I use, [38:14] you know, all the AI apps sort of in the normal way, I guess, in addition. [38:18] Yeah, well, if folks want to create an app like that, we just did an episode with Jesse Jenea, who built a [38:24] minimalist YouTube for kids where her kids can only watch the videos that she pre-approves. And you can only swipe back and forth. No other buttons. It's very, very streamlined. [38:37] So very similar to your music example. Okay, and then my last question, which [38:42] Not a sneak preview of a little up on this Claude example, but... [38:47] When AI is not listening, [38:49] you know, when your minion does not one-shot, [38:52] what is your prompting strategy? And you're a parent. So, like, do you gentle parent your AI? Are you like, I know you can do it? [38:59] Or do you bribe it? Do you offer it 15 cents, carbon neutral? Like, what are you doing?
[39:06] This sounds crazy, but I have [39:09] made a concerted effort [39:11] to always be polite. [39:13] And [39:14] I don't-- [39:17] I like sci-fi. I like alien stuff. There's this sort of, like, [39:22] who knows if that's gonna happen or not? [39:24] But, like, I definitely don't want to be caught being rude. Even though, I think I've read some stuff of, like, you know, being more intense or being rude can result in better results. It's like, I don't want to, like-- [39:34] I'd rather have to do a little bit of extra work than have it on the record that I was mean. [39:38] Because you never know. [39:39] You never know. But the more serious answer is, [39:44] one, [39:46] asking it to explain or justify itself [39:50] has helped quite a bit. [39:52] And then I think in other cases, [39:56] where I know the right direction to go, [39:59] I will start going in the right direction, [40:02] and then I will ask it to look at [40:04] the git status, to look at the diff, [40:07] or look at other breadcrumbs that I've left [40:10] as the directional thing to help guide it. [40:13] And then of course, like, [40:14] if I'm doing a thing that's not recurring, but that I'm going to do again, I try to keep that in some skill or prompt or otherwise that I can inject back in later. [40:23] Got it. So you're doing the dad-teaching-his-kid-to-ride-a-bike move, where your hand's on the back of it and then you let it go. You're like, here is what I want. It didn't really hit me until you said that, but there's something really weird about raising kids at the exact same time that
[40:38] The robot emerges. It hadn't really clicked with me yet. So I don't know what's informing what, but they are happening at the same time. Yeah, I said something like, it's really interesting to be raising kids and literally writing, like, soul.md files into my agents. [40:54] I guess that's a virtuous cycle of skills. Well, Steve, this has been awesome. Where can we find you and how can we be helpful? [41:01] You can learn more about the work we're doing at Stripe at stripe.dev, which is our blog. So you can learn all about some interesting things we're building. The demo I just showed you, you can learn more about at docs.stripe.com/payments/machine. [41:16] And I guess I'll plug my Twitter, which is just at Steve Kaliski. Amazing. So those three. Well, thanks for joining How I AI. This was awesome. [41:24] Awesome. Thank you so much for having me. [41:35] You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.
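The "breadcrumbs" strategy Steve describes in the lightning round (start the change yourself, then point the agent at the git status and the diff, and ask it to explain itself) can be sketched as a small helper that folds that state into a prompt. This is an illustration under assumptions, not how minions work internally: `collect_git_breadcrumbs` and `build_guidance_prompt` are hypothetical names, and the only external commands used are the standard `git status --short` and `git diff`.

```python
import subprocess


def collect_git_breadcrumbs(repo_dir: str = ".") -> tuple[str, str]:
    """Return (status, diff) for the working tree in repo_dir."""

    def run(*args: str) -> str:
        result = subprocess.run(
            ["git", *args], cwd=repo_dir, capture_output=True, text=True, check=True
        )
        return result.stdout

    return run("status", "--short"), run("diff")


def build_guidance_prompt(instruction: str, status: str, diff: str) -> str:
    """Combine the human's partial work with an instruction for the agent."""
    return "\n\n".join([
        instruction.strip(),
        "I have already started this change. Current git status:",
        status.strip() or "(clean working tree)",
        "Diff of my work so far:",
        diff.strip() or "(no unstaged changes)",
        "Continue in the same direction and explain your reasoning.",
    ])


# Inside a real repository you would pass in collect_git_breadcrumbs();
# plain strings are used here so the example runs anywhere.
prompt = build_guidance_prompt("Finish the refactor I started.", " M app.py", "")
print(prompt)
```

Separating the git calls from the prompt assembly keeps the second function easy to reuse with any other "breadcrumb" source, such as notes or a saved skill file.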