Aponte Inga, Claude Code 2.0, and Prism Streaming

TJ (00:00)
Hey everyone, welcome back to the Slightly Caffeinated Podcast. I'm TJ Miller.

Chris Gmyr (00:04)
and I'm Chris Gmyr

TJ (00:06)
So Chris, what is new in your world?

Chris Gmyr (00:08)
Well, that's been a busy couple weeks. We couldn't record last week, which was fine, but yeah, just busy with work stuff, doing some scout activities, selling a bunch of popcorn on the weekends and family online and stuff like that. And then getting into this week, our little one caught a cold or something like that at the Y daycare. So she has been passing that around. So if I sound a little...

TJ (00:29)
Mm.

Chris Gmyr (00:35)
stuffed up or whatever. She gave it to me. I did not wake up in a great spot, but we are hanging in there. But yeah, there's a bunch of things that we can talk about for projects at work that we can circle back to. Yeah, generally everything's been pretty good and just lots of stuff. But yeah, how about you? What's new in your world?

TJ (00:37)
Ha ha.

Man, starting to feel back to some stability after, like, just struggling with wavy mental health for the last little bit. Yeah, I just was not feeling up to recording last week, and finally played some catch-up with my newsletter and sent out a newsletter with like a little bit of an update and just kind of like some of my struggles, where things are at, and kind of my game plan moving forward. So since that I've been like,

back in on Prism. Still haven't caught up on issues and PRs yet, but I made a ton of progress on the streaming output refactor, got another feature that someone really needed built and deployed. So it was really nice to like finally tag a release after not releasing for a little while and get some stuff out there. So feeling a bit better, feeling

a bit more stable, so I'll absolutely take that. And we had like a crazy fun weekend. Went and saw Deltron 3030, went to an apple orchard, and then we were supposed to go to another concert Sunday night. It's like Saturday, we went and saw Deltron. Sunday morning, we went to the orchard, which was great. I had far too many donuts. Yeah, they were just...

Chris Gmyr (02:09)
You gotta take advantage.

TJ (02:12)
They were just so good. I couldn't stop eating them. We were like walking around with this bag of donuts, because the cider mill we went to also has like a big fun land with like a petting zoo and hay bales to climb, just tons of stuff. It's just like a huge cider mill complex that they've got going on. But we went and got donuts first. And then so like we're walking around the fun land with these donuts and I just can't help but like keep reaching down and grabbing another donut.

Chris Gmyr (02:29)
Cool.

Well that time of the season you gotta take advantage when it's there.

TJ (02:41)
Yeah,

especially when they're like fresh and they're warm and they're gooey and I just I could not help myself

Chris Gmyr (02:45)
Yep.

TJ (02:46)
So that was super fun. And then, yeah, Sunday night, we were supposed to go to another concert, but we got there, found out it was canceled. Like, had missed the cancellation email like months ago. So we pivoted, and we were like by one of our favorite arcades. So we went out and hit up the arcade for a few hours and just had an absolute riot. So, yeah, turned that like.

Chris Gmyr (03:07)
Thanks

TJ (03:10)
turned something that would have been like a real big bum-out into like a really nice night, and much needed, and it was good.

Chris Gmyr (03:15)
Yeah, awesome. That seems like a good pivot and can't go wrong with a good arcade either.

TJ (03:20)
Yeah, no way. Like that's something we've been doing a little bit more frequently lately. It's just a riot, dude. My wife and I are both like super hooked on like coin pusher machines. So I don't know if you've like played those much, but it seems like they're gaining in popularity because like all the arcades are like adding more of them. But

I've always been a big sucker for the coin pusher games, just like instant dopamine. It's like a timing thing, and it is like kind of up my alley. So the arcades that we frequent have like both added new ones. Like one of them, instead of a coin pusher, it's like a marble pusher. Very, very fun. And racked up like a ton of tickets, I think.

We walked out of there with like 30,000 tickets, like decided to just hold on to them for another time.

Chris Gmyr (04:07)
Yeah,

especially a lot of them just have them on like the the cards so you just like come back the next time and just keep it going like you don't have to empty them or keep your boatload of tickets for the next time so it makes it a lot easier if you keep those cards.

TJ (04:15)
Yeah.

Yep, yeah, like save up for stuff, you know? And the one that we went to is like, it's really cool. I'd say about a third of it is like claw machines. And then the other two thirds are like various like arcade games and stuff. So they had a, dude, I'm such a sucker for it. They had one with a reptar in it from like Rugrats and we were

Chris Gmyr (04:40)
like a good claw game.

Mm-hmm.

TJ (04:49)
hell-bent on getting one. So like we'd go play some games for a while. We'd like come back, try it a few times. Like if it didn't get a good grab or anything, like we'd go play some other games and come back to it. Eventually my wife won one and she's like, do you want what? Do you want something? I'm like, I got to know. Like we can try this claw machine. So she sits down and grab like first pull, like grabs it, gets it. ⁓ my God, just you're on a roll now. So walked out with a couple stuffed animals too. So.

Chris Gmyr (04:51)
you

That's awesome.

Yeah, love the claw games. And there's a couple around here that have ticket claw games. They have a roll of, they're not real tickets, but they get scanned as they drop into the bottom there. So you'll grab a 100 pack or 5,000 or something like that. So one of the last times that we went,

TJ (05:17)
pretty happy with that.

Yeah

Chris Gmyr (05:41)
spent a whole lot of time on that and got like a whole bunch of like 5,000 rolls and a bunch of like 500 and 250s and just banked a bunch of tickets, which was fun.

TJ (05:48)
Nice.

Yeah, that's sick, dude.

That's awesome. I want to like, it makes me want to like build a coin pusher or something this winter so I can have something to like play here, you know? Or like a Skee-Ball, I would kill for a Skee-Ball machine at the house. I'd play all the time.

Chris Gmyr (06:02)
There you go. Yeah.

Yes,

that'd be awesome. Love Skee-Ball too. Are you a go straight down the lane? Are you like a bounce off the side? Like how do you usually do this?

TJ (06:09)
Yeah, big Skee-Ball fan.

I normally

play it safe and just go for like the middle row. If things are feeling like good, I'll sometimes go for like the corner jackpots, but normally I'm just like, yeah, straight up, like run straight up the middle and just kind of hit that center line of stuff and, you know, bank that way.

Chris Gmyr (06:36)
I usually bank it off the ramp. I aim for the bottom of the ramp so it just kind of projects itself and usually gets pretty straight once you're hitting the holes up there for the different points or scores. It tends to be pretty good depending on the quality of the Skee-Ball machine. Some of them work a little bit better than others.

TJ (06:55)
Yeah, for sure.

Yeah,

there's also like a cornhole one now with like rotating holes that light up. You got to like throw the bean bag through the right holes. That one's super fun too.

Chris Gmyr (07:07)
Mm-hmm.

Yeah. Nice. Well, very cool. Want to shift from arcades to some coffee?

TJ (07:11)
Yeah, man.

Yeah, let's talk about some coffee.

Chris Gmyr (07:17)
Sweet, yeah, I got a couple updates this week. So last week we started and finished a new bag from Purity. So this Aponte Inga Medium. And it's another one from that like Sacred Cups brand or sub-brand. And this one is honey processed. So I know we talked about honey processed a couple episodes ago.

TJ (07:35)
Yeah.

Yeah.

Chris Gmyr (07:42)
It was

my first bag of honey process that we've gotten, or at least that I remember getting. And this one is really cool because it's seemingly like super rare. It's a single origin and it's basically made on this Inga reserve. And the tribe who owns the land or occupies the land takes care of the trees and the beans and the whole thing.

So it's really cool, really cool story. I'll put a link to the beans in the show notes, but yeah, it was pretty tasty. And I would definitely get another bag of it. I think they have another bag of it coming in this week. So looking forward to having some more of that. Not sure if it's like a year round thing, but yeah, would recommend.

TJ (08:26)
Yeah, this looks really good. I bet this is delicious. Just like reading through the page. Yeah, this looks great. Right up my alley, man. I'm going to order a bag once we're done here.

Chris Gmyr (08:37)
Yeah, totally. And then I went to a coffee shop with a coworker because we do some random coffee and co-working at different coffee shops around. We try and do it once a month, but the last couple months have been kind of busy and couldn't get our schedules together. But we went on this last Tuesday, and I wish I took a picture of it, but I totally forgot until I headed out. I'm like, man, I should have saved this for the show.

TJ (08:57)
Hehehe

Chris Gmyr (09:01)
But they have like specialty pour-overs, where the specialty is the different beans and where they come from, stuff like that. And the one that I got, I don't know if it was from like Guatemala or like Honduras or something like that, but it was like a mixed, like, processing. It was like honey and something else, and it was so good. It was like a

TJ (09:27)
wow.

Chris Gmyr (09:29)
eight or nine dollar pour-over, but it was so good. And I don't know if they like don't have very many people pick those up, because they're kind of like a little specialty board, like off to the side, and the barista was just like, yeah, let me know how it is. So I don't know if they have like that many people actually get it or if they drink it all that often. But yeah, I'm like, this is fantastic. I would recommend this to anyone coming in looking for

TJ (09:31)
Yeah.

Chris Gmyr (09:56)
like regular drip or any sort of like pour over. Let's like, yeah, get that one. So.

TJ (10:00)
That's sick,

man. I've been trying to get out of the house a little bit, like dealing with like my depression and stuff. And my wife being in school, like at class, like four days a week. It gets lonely around the house, man. So I've been getting out and going to the coffee shop and working up there a little bit. So I've been on a big, big nitro kick and that's been like pretty delicious.

Chris Gmyr (10:22)
Yeah,

TJ (10:23)
filling out my stamp card, gonna have a

Chris Gmyr (10:23)
love the nitros.

TJ (10:25)
free drink soon, but it's been really nice just like getting out of the house, heading up to the coffee shop, hanging out there. It's like right next door to like our favorite bakery too. So, you know, go in there, you know, grab a pastry, grab some coffee, sit down and work for a bit. And that's been really nice. Like changing it up, getting some fresh coffee and getting out of the house.

Chris Gmyr (10:47)
Yeah, yeah. I always say that I want to do that more, but I never end up pulling the trigger for whatever reason, either like it's a busy morning, because usually like my afternoons are just like crazy with meetings and would not be beneficial to be in a coffee shop. So I was like, well, I either got to get out first thing in the morning and be back by like lunchtime or so. But I don't know, just for whatever reason, don't get out and do it. Plus it's like, you know, a much different setup

on the laptop compared to widescreen and all your tools and gadgets and stuff like that. So it's a little bit different working experience, but I do like it when I get out. just, I don't know, gotta force myself to do it.

TJ (11:23)
Sure.

Yeah. Yeah. It's nice that, um, my meeting schedule is actually like opposite because of time zones and everything. Um, most of our meetings are in the morning. My afternoons are like typically pretty wide open. So that works out well to like be home and get through like whatever meetings I've got. And then, you know, go chill with the coffee shop for like an hour or two before I got to be back and like pick up the kiddo from school. So yeah, that's been nice. So.

Chris Gmyr (11:39)
Mm-hmm.

Very cool.

TJ (12:00)
Moving on from coffee, you want to talk about some of your upcoming projects? Like I know you've been working on some cool stuff. You just recently shipped a project and, so yeah, I'd love to hear about like what you kind of got coming up in the pipeline.

Chris Gmyr (12:13)
Yeah, definitely. So yeah, we're still slowly rolling out production traffic to the new service, which has been going well. Just a lot of tweaks and changes of, hey, let's revamp the logging and some monitors and metrics and tweak these things a little bit more. But nothing major as far as that goes. So that's been great. And then the two other projects, which have been pretty cool, which I

finished up one of them already was discovery into one of another service that we have in our ecosystem that has or it's supposed to have just like minimal core data for like some of our entities. But it's been getting like overrun with like new features or like other teams bringing in more data, more domains, more functionality into that. And

It's a service that our team is supposed to own in the future. I didn't want to take ownership of a bunch of stuff that shouldn't be there, because we're supposed to own some of this core data. So the discovery project was figuring out what kind of domains and data buckets that we have, what's the functionality for some of these.

highly used features, what's the traffic like for the API endpoints. So I had worked a bunch with Claude Code to come up with a plan for, one, grabbing the data domains and buckets and grabbing data from Datadog and AWS and a couple other things. So I explained what I wanted to do in like a pretty

robust plan, and then it filled in the gaps of like, here's how you do this, or here's like the API call to make, or here's like the AWS CLI command to run. So it came up with like a whole bunch of commands that I could run. It downloaded the data in CSVs or JSON, and then it took all that data and like aggregated it into like nice markdown files, tables, stats, things like that. Some mermaid diagrams.

TJ (14:10)
Woo, sick.

Chris Gmyr (14:12)
like all the things. And then same thing with like the functionality side and API calls, traffic analysis, things like that. Again, it gave me a bunch of like API or like CLI commands to run. It didn't run any of this itself, but gave me just the copy-and-pasteable commands to run in a different like CLI terminal, of like, run this based on the Datadog API and it'll kick out this either JSON or CSV,

TJ (14:30)
Yeah.

Chris Gmyr (14:38)
dump it into this directory and then I was able to direct it to analyze all these files and seven day traffic and 30 day traffic and hotspots with some of the APIs or data and then eventually mash all this data together and come up with a more succinct summary of the project and then.

Like I went through and reviewed it like a bunch of days and tweaks and changes and stuff like that. And then was able to present like a plan for what could be a potential for breaking some of these features or domain buckets into additional services. What do we keep in the current service? What do we need to maybe optimize? So for this specific service, we're using Postgres and a lot of the data.

is not structured very well, and we also use a lot of UUIDs, which default to UUIDv4, and we're using a lot of these in primary keys, which is not as performant for primary keys and the B-tree and finding and reorganizing indexes, because they're all totally random. It'd be a lot better if we were using UUIDv7 or something similar that has like that timestamp that you could easily

TJ (15:49)
Yeah.

Chris Gmyr (15:49)
sort by.
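
A minimal sketch of the UUID point above, written in Python rather than anything from the service being discussed. It builds a UUIDv7-style value by hand, following the RFC 9562 layout (a 48-bit millisecond timestamp in the top bits, then version, variant, and random bits), so that IDs created later sort after IDs created earlier. The function name `uuid7_sketch` is hypothetical, and this is an illustration of the idea, not a production generator.

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    # Timestamp occupies the top 48 bits, random data the lower 80 bits.
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the version nibble (bits 76-79) with 7.
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    # Overwrite the variant bits (bits 62-63) with binary 10.
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)


# IDs minted at a later timestamp compare greater, so inserts land near the
# "end" of a B-tree index. uuid.uuid4() values have no such ordering.
earlier = uuid7_sketch(1_000)
later = uuid7_sketch(2_000)
print(earlier < later)
```

Because the timestamp sits in the most significant bits, both the integer form and the hex string form sort by creation time at millisecond granularity. Two IDs created in the same millisecond are ordered only by their random bits.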

So it also was able to come up with a bunch of performance things that if we did take ownership of the service in some of these domains that like these are things that we can implement in the future too. So yeah, it was just a super cool process. And I originally spec'd out like this taking probably like three ish weeks, two or three weeks to do all the discovery.

And I was able to do this on and off for about seven work days. So a little over a week to do this, not full time. It was kind of like part time in addition to all the other things that I usually have to do. But it was so helpful to be able to jump into this, run some more analysis, group it, put my own words and spin on it, figure out what I wanted to present, tweak around the domain.

buckets or different features around a little bit more. And I think it came out as like a really good product and proposal for this project. So yeah, I was super impressed with like how it came out and it made it a lot easier compared to like me going through like the API docs or like AWS docs to like try and figure this out. Because a lot of the stuff in there, you can't really export or export very easily. So that's where like the API calls came in.

TJ (16:54)
Yeah.

Chris Gmyr (17:07)
Very handy.

TJ (17:09)
Yeah, no, that's huge. I think that's like such a smart use for LLMs and like, I guess it's just a great use case. And I love the way you approached it, like getting it to help you, yeah, like export all the data together and then be able to like collate that. Well, that's sick, man. That's awesome. I think that that's such a huge time savings, like even just trying to like

figure out and like sift through all the different ways to like export and extract that data and like figure out like how to get to it and then like figure out the API calls and all of that. Just I think what a massive time saving like huge time saving and just that alone. Not even considering like being able to like group and organize everything and like kind of like put some.

logic behind that to just like here's here's different ways to approach it. I think that's great. It's huge.

Chris Gmyr (18:01)
Yeah, yeah, so it was really fun to work on and like I said, I'm super happy with how it came out and again the time savings was massive so then I could go work on other things and next project. So the other project that I was tasked with a couple weeks ago is a side project for, or side interest from, someone else in the engineering leadership team, of

TJ (18:13)
Yeah.

Chris Gmyr (18:27)
how do we build in AI and AI workflows into the engineering team? Not to replace anyone, but there's a lot of repetitive things that we do, some things that are not as optimized. A lot of people are running different tools. We have GitHub Copilot available to us, but a lot of people don't use it or don't know how to use it past just asking random questions or using it as autocomplete. Other people are using

tools like Claude Code or ChatGPT, Codex, a couple other editors, things like that. So everyone's all over the place and hardly anyone is having decent enough success with it because everyone is using different tools and there's not a lot of knowledge or best practices or anything like that. So what I've been looking at is what can we build into our engineering system and tool chain and proposing other ideas

TJ (19:08)
Yeah.

Chris Gmyr (19:21)
for that. So because we already pay for GitHub Copilot, are there things that we could do essentially for free within that ecosystem? But what are the downsides of that? What are some other tools that could be better for these other processes and things like that? So I've been doing a bunch of research with that, tinkering with the services that we own on my side to do PRs or push out little bug fixes or

whatever the case is, and building up a little bit more context for myself and also ideas for tooling, you know, moving forward. So I created a proposal to basically say like, hey, we're going to do X, Y, and Z in Copilot. But I would really suggest like our team getting access to Claude and Claude Code for teams. Like there's an enterprise plan. So

trying that out for a few months, so like three months. And here's the plan for that three month pilot of what we're going to do, not only in Copilot, but also build out these tools in Claude Code and also look at what other things that we can build to support the AI system, like custom MCP servers, maybe load documentation, maybe

have a different one for loading in all of our OpenAPI specs for all the services that we have. So you can ask it, hey, I want data about, I don't know, a user, and I want the data X. And it would go out and find that and bring that in and be like, well, you don't have an SDK or a client in this app for calling this other service. Would you like to build one and then implement this API call?

TJ (20:41)
Mm-hmm.

Chris Gmyr (21:05)
So there's lots of things that we could do to support like our AI ecosystem in the future. We just need a little bit of a push and momentum to actually get there, codify and toolify things, and then get more people onboarded to that and give a lot more knowledge and best practices to them and give them the tools that they can like initially run with. So.

Within all that, I've not only done a bunch of research into a lot of those things, but also started tinkering with some projects on my end of setting up more custom commands that we could eventually commit to the repo. I've tinkered with a lot of Claude's subagents, which I haven't really used before. I wanted to see if you've used them in any of your projects, either for work or maybe with Prism.

Subagents in Claude have been fantastic. And that's like another, I think, key piece to all this: have like an agent for having like a DynamoDB pro and having like a TypeScript pro and like an architect or code review agent and a whole bunch of stuff that is like living within our tech stack that then you could

use within Claude, these custom commands, like your own one-off commands and prompts. These can also be kind of synced to GitHub Copilot's agents feature. So you can use these in the chat if people prefer that, or even in the GitHub ecosystem on the web, which is really pretty cool. So I've been tinkering with a bunch of that stuff, which has been awesome.

TJ (22:43)
That's awesome, dude. Like, I haven't played with agents a whole lot.

You know, that's actually something I wanted to play with tomorrow. I have a hack Friday for Geocodio. And I think that's kind of what I wanted to do is look into splitting our Claude file up a little bit into a few different agents and just kind of playing with it to see what we can get out of it. So a couple of weeks ago, I committed a Claude Markdown file

for us to kind of like standardize that use. Like we all use Claude Code. So standardized that, and I got Boost installed in our monorepo of like our two Laravel apps, got that figured out, like figured out how to do a monorepo of that. And that's been like working out super well. But yeah, I think there's room for us to be like utilizing agents for different things. So.

I'm going to try to see what I can come up with workflow-wise and just kind of do some experimenting with it tomorrow.

Chris Gmyr (23:40)
Yeah, and it seems like the more specific you are with the agents or the subagents, the better that they're gonna perform, especially with their own context. And what's nice is that as you build the agents, it basically comes with its own Claude file because the agents are just like markdown files with some metadata attached to it, essentially. So you can say, hey, I want a Laravel

TJ (23:46)
Yeah.

Yeah.

Chris Gmyr (24:04)
testing agent or like a PHPUnit testing agent and you can dump all of your testing stuff in that agent and say like within this workflow, do like TDD first with this agent and then come back with the implementation with your Laravel agent and then that will do the work and then pass it back to the test agent to make sure the TDD cycle continues until you're done implementing this feature. So it's really cool to see

how all this stuff happens and then you can run agents in parallel with certain things and the output of one goes into the other. So the TDD agent can pass all the changed context and files to the Laravel agent and then the Laravel agent knows what it needs to implement and where the tests are to then run the tests and around and around you go. So yeah, it's just super cool. And then you can have agents for doing git operations. So when you're done with this,

commit as you go using this agent and then make like a PR branch description like when you're done with it. So it's like super cool. And there's like so many agent examples out there that I'm like sifting through and seeing if we can like repurpose them for our own repos and stuff like that. So I would highly suggest like checking out some of the example repos. If you search for

like Claude Code agents or like awesome agents and a few of those, then there's a bunch of different repos on GitHub to look up. I can also post some that I've been looking at in the show notes as well.
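
To make "markdown files with some metadata attached" concrete, here is a small hedged sketch in Python that renders one such agent file as a string. The front matter keys `name` and `description`, the agent name `phpunit-tdd`, and the idea of saving the result under a project-level agents directory are assumptions for illustration, not details given in the conversation; check the Claude Code documentation for the fields it actually reads.

```python
def render_agent_file(name: str, description: str, instructions: str) -> str:
    """Return a Markdown agent definition: a front matter block, then the prompt body."""
    front_matter = "\n".join([
        "---",
        f"name: {name}",                # hypothetical metadata key
        f"description: {description}",  # hypothetical metadata key
        "---",
    ])
    return f"{front_matter}\n\n{instructions.strip()}\n"


# A hypothetical testing agent in the spirit of the TDD workflow described above.
agent_markdown = render_agent_file(
    name="phpunit-tdd",
    description="Writes failing PHPUnit tests first, then hands off for implementation.",
    instructions=(
        "You write PHPUnit tests before any implementation exists.\n"
        "Report which tests you added and which files they cover."
    ),
)
print(agent_markdown)
```

Keeping each agent as its own small file is what lets its instructions stay narrow, which matches the point that more specific agents tend to perform better.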

TJ (25:34)
Yeah, do that, man. I just kind of pulled up the Awesome Claude Code repo. I'm going to take a look through there and see what they've got as far as different slash commands and agents and stuff. Cool, dude. Yeah, I'm definitely going to, whatever you got to jumpstart me on that, I'll take it because that's going to be my fun experiment tomorrow.

Chris Gmyr (25:57)
Yeah, totally. I'll add a bunch in the show notes and go from there. But yeah, I never really worked with agents before, just like the main context. But it makes a lot of sense that you have these specialized mini-contexts and agents, and they keep things separate from the main context window. So with utilizing agents over the last week, week and a half or so,

I've had, it seems like, more space in the main context because it sends back and forth like minimal data to the main context. So I'm able to like last longer in that session without having to like compact or do anything else like that. So it's been really nice. And it's also cool to see like all the agents have like labels with like different colors. So it's cool like.

TJ (26:30)
Yeah.

Chris Gmyr (26:44)
when you have a big workflow, all the different colors and agents and sending back stuff. There's a status indicator now. So it'll blink a gray almost. Then it'll turn to solid green when it's done. And then it's waiting for other agents to get done. And then it might pass things back and forth to a previous agent that was done to recheck something. And I don't know. It's just super cool to have all these things running in the background.

TJ (27:10)
That's sick. Yeah. That was part of why I haven't played with them much, is just trying to figure out a strategy of like what types of agents make sense. Do I leave the like main agent kind of as the like planning and coordinating agent? So like, you start Claude, that main context is just going to be about like planning and coordinating.

And then everything else is an agent. Like your Laravel developer is an agent, you've got like a testing agent, like maybe a front end agent.

I think that's maybe the strategy that I'm going to go after before I do a little bit more research, but that's probably what I'm going spend a little bit of time today doing is just like poking around, doing a little research, kind of gear up for some experimenting tomorrow.

Chris Gmyr (27:57)
Yep, yep, definitely. Yeah, and any of the supporting tools that you have for the application. So everyone knows that you've been working with ClickHouse and Kafka and some of those. Those would be great specific agents to have. It's like, how do you do this in Kafka or check our architecture for this? Or like, hey, is this the most efficient way to do this with ClickHouse or something?

Setting up all those mini subagents would be a great potential as well.

TJ (28:26)
Yeah, break out into some domain-specific stuff. Yeah. That's cool.

Chris Gmyr (28:29)
Yep, yep, totally.

Yeah, so I'll keep you and everyone else updated how things go. But yeah, I'd love to roll this out to the rest of the team and just get a lot of these things standardized. So basically, everyone is getting a similar result across the board and doing less one-off prompting. If you want to do something, let's try and make a command for it.

TJ (28:49)
Yeah.

Chris Gmyr (28:53)
have it pass through some arguments of specifically what you want to do with this command or something like that. But let's try and codify things as much as possible and make it really easy for people to get good, consistent results instead of just yoloing it into a main prompt or a different tool or something like that.

TJ (29:10)
Yeah, for sure, man. And like my Claude file right now is ridiculous. So splitting that up into agents, I think, makes a ton of sense.

Chris Gmyr (29:18)
Yeah, yeah, totally.

TJ (29:20)
Cool, dude. I mean, continuing talking about Claude, there has been... There's like a couple things, right? There have been, over the last couple months, like a lot of complaints about

Claude's quality and like a lot of questioning, like what's going on behind the scenes. And I think there's like a lot of that right now with the latest, like Claude Code 2.0 and Sonnet 4.5 release with like usage limits. And there's like a whole bunch of stuff going around right now along those lines, but kind of thinking about like the last couple of months and people experiencing like quality issues. Anthropic actually identified like

a few issues that they were experiencing that kind of like caused some issues between like August and September of like basically they had like three issues going on in parallel that made it really hard for them to debug. And it looks like

Chris Gmyr (30:07)
Mm-hmm.

TJ (30:15)
A lot of it came down to routing. They had some issues routing to different models, or routing to new models that weren't quite ready yet, like the 1 million context Sonnet. And they had sticky routing too. So if you got accidentally routed to the wrong server,

you're going to be getting like completions from it, and because they have sticky routing, now the rest of your completions are going to be going to that server. And if it's like the 1 million context, you now have like this massive context window, and the more context you have, the more risk you have for quality loss. So people were running into like issues there. They had some like token generation settings that were misconfigured, and just like,

not enough on the evaluating side of things for making sure that they're not introducing bugs and they needed to add more infrastructure for preventing things like this from happening again in the future. So I don't know, did you have any major takeaways from that? I feel like I was definitely impacted by this at times. But.

It didn't scare me enough to like run to like Codex or OpenCode or something like that. I stuck it through with Claude Code and I've been experiencing none of those issues now with like Claude Code 2.0 and Sonnet 4.5.
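
The sticky routing effect described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Anthropic's routing code: the class name `StickyRouter`, the pool names, and the random first pick are all hypothetical. It only shows why one unlucky first assignment keeps affecting every later request in the same session.

```python
import random


class StickyRouter:
    """Toy router: a session keeps whichever pool it was first assigned to."""

    def __init__(self, pools, rng=None):
        self.pools = list(pools)
        self.rng = rng or random.Random()
        self.assignments = {}  # session id -> pool name

    def route(self, session_id):
        # The pool is chosen only on a session's first request; every later
        # request reuses that choice, even if the first pick was a mistake.
        if session_id not in self.assignments:
            self.assignments[session_id] = self.rng.choice(self.pools)
        return self.assignments[session_id]


router = StickyRouter(["standard-context", "long-context"], rng=random.Random(0))
first = router.route("session-a")
repeats = [router.route("session-a") for _ in range(5)]
print(first, all(pool == first for pool in repeats))
```

Stickiness is usually a reasonable design choice, since it keeps a conversation on a server that already holds its state. The flip side, as discussed here, is that a misroute is no longer a one-off.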

Chris Gmyr (31:39)
Yeah, yeah, totally. Similar to you, and I think we mentioned this before, one day it'd be super solid, and the next day you'd come in and it's like, what are we doing here? And then the next day it'd be fine, or maybe not as bad, and then the following week it'd be fine. So yeah, I had little bits and tweaks that were, you know, not great. But when I saw the postmortem come out, it's like, oh, okay, there's a bunch of different issues. And it took a long time

TJ (31:48)
Yeah.

Chris Gmyr (32:07)
because of the security and privacy that they have built in behind the scenes. A lot of that is not really visible to them until they have a specific case to dig into. But yeah, it seems like they made a whole bunch of improvements to monitoring on their end and to the debugging that they can do in the future. And it seems like a week or two after this postmortem, it has

been significantly better. And then earlier this week, good segue to Claude Code 2.0 and Sonnet 4.5. That has been a major upgrade, and everything's been super smooth with that so far. Yeah, how do you feel about the upgrades?

TJ (32:50)
Yeah, I'm loving it, man. So with the big streaming output refactor for Prism, I was trying to get Claude to do some heavy lifting. I did a bunch of work on getting the new streaming output events created. I manually moved over the Anthropic provider. Then I got Claude to move four other providers over, or something like that. And then I got to the last three or four providers and it was just choking.

It could not figure out how to do the implementation and write the tests. I do not know what was going on, but it could just not get the work done after three or four attempts. I eventually gave up and decided that I was going to have to manually rewrite everything. But I just got super depressed, had to take a break from Prism, and came back to it now with Claude Code 2.0 and Sonnet 4.5, and

it just smashed through it, no problem. It took two tries. On the first try, it couldn't get one of the tests to pass, so it just commented out the test and left me a TODO. And I was like, no, no, go back. You must fix the tests. And it went back and fixed the tests and everything. And it smashed through the rest of the providers. I'm really happy with the Claude Code 2.0 updates.

Chris Gmyr (33:47)
Yeah.

TJ (34:07)
Sonnet 4.5 has just been a beast as far as models go. The only downside with it, and I'm not necessarily seeing the effects of it myself, is that Twitter and Reddit are in a bit of an uproar because people are hitting their Claude weekly usage limits within hours. Someone was like, yeah, five hours in and I've hit my weekly usage limits for Claude Code.

Chris Gmyr (34:31)
interesting.

TJ (34:32)
Yeah. I don't know what the deal is, if it's just legitimately using that much more usage or if there's a bug with their usage calculation. I don't know what's going on, but there is a lot of uproar about it. I haven't run into it, and I've been using it pretty heavily the last few days. But I don't know. There are definitely enough people upset about it that there's something going on.

Chris Gmyr (34:59)
Yeah, I haven't run into it either, but I haven't been super heavy into it this week. Just have different things going on. But yeah, definitely something to keep in mind for the future and see what's up with that. There are some other great features in Claude Code that were pushed out, like better agent handling, which is great because, like I said, I've been trying to introduce agents. So it handles them

better even without specific prompting. It's always better if you specifically prompt for agents, or use them within a workflow or a custom command, but it's able to pick that up a lot better moving forward, which is pretty cool. And then checkpoints: it'll pause as it's working through things and wait for your input, or you can rewind them as well.

So you can say, well, we went off the rails here, let's go back and rewind to a checkpoint that was three iterations ago. And it'll just automatically roll that back. That's a lot more visible to you and a lot easier to control, instead of coming back and trying to grab a timestamp or a diff from earlier and trying to figure that out. So I haven't used that yet, but I can definitely see that as you let Claude and

agents run for a lot longer, those checkpoints will be very helpful if you want to adjust anything after the fact.

TJ (36:25)
Yeah, I've used the checkpoints a little bit, working on Prism the last two days. Definitely works really well. Yeah, some of the checkpoint stuff was pretty interesting. It stopped me when I had like 7% context left before compacting. And it was like, well, what do you want to do? Do you want to finish the work? Do you want to do something different? Because we're going to

run out of tokens soon. I thought that was really cool, that it stopped its work instead of just running out of tokens, auto-compacting, and trying to move forward. It was like, well, we're going to compact soon. What steps do you want to take? And I was like, hey, just output a detailed status of what we're doing. Create a plan.md file, put in there what we're doing, what's done, what's left to do, and be really detailed about it. So

it did. And then I just @-mentioned the plan file and was like, yeah, pick up where we left off. And it just continued on from there. But I thought that was really cool that it checkpointed itself before auto-compacting and prompted me with, well, what do you want to do next? Because we're going to compact soon. And it seemed like it had an awareness that, yeah, compacting is bad.

I mean, it is. You get a huge quality loss after compacting because you lose so much context.

Chris Gmyr (37:38)
Yeah.

Yep, definitely. And I've heard on other podcasts and resources that older models and older versions of Claude Code seem to be almost protective about getting to the end of the context window, and not as thorough. So I'm not sure if you've run across that as it's running down to like

10% or even 5%, where it's not making the best decisions or code changes or anything like that. So it's cool that they've made the adjustments in Claude Code 2.0, and probably even built into the model, of like, hey, which way do you want to go? Do you want to try and squeeze this in with as many tokens or as much context as we have left? Or do you want to compact and basically restart after that?

TJ (38:16)
yeah.

Chris Gmyr (38:37)
So yeah, it's just really interesting how all these features are evolving and the capabilities of AI in the tool set.

TJ (38:45)
Yep. Oh yeah. I've been very impressed so far. So I'm committed to keep using it. I think this is really good stuff. Very happy with the release. Really interested to see what shakes out of the usage complaints. I think I saw yesterday

Chris Gmyr (38:54)
Same.

Mm-hmm.

TJ (39:11)
that there's been enough commotion about it that they're looking into the situation. So it seems like it's on Anthropic's radar at least.

Chris Gmyr (39:20)
That's good. Yeah, and I don't know if they made a big to-do about it, but you can now see all of your usage, if you have the Max plan, in your main Claude AI chat settings, which didn't used to be available. If you were using Claude Code, it just wouldn't show anything in there. So that's a nice little upgrade too. So hopefully they figure out whatever's going on with the usage limits. That might be a different calculation.

TJ (39:47)
Yeah, I think you can run /usage. I think /usage displays your current usage inside of Claude Code.

And then you can also view it from their console.

Chris Gmyr (40:01)
Yep. Yeah, it looks like they updated that too, because it used to only work on the API plan. It didn't used to work if you were logged in to Claude Code on the Max plan. But I haven't checked it in a while either. That was probably a month or two ago. So yeah, it seems like they're just making a lot of nice updates to things that were kind of hidden away before.

TJ (40:24)
Yeah. Yep. I just went and checked the usage, but I just went and checked it on the API and I haven't been using the API too much recently.

Chris Gmyr (40:31)
I just have the API connected to Raycast now. Not that I use the AI in Raycast that much, but it's helpful here and there.

TJ (40:38)
Yeah.

Yep. Well, cool, man. Do you want to wrap? We'll just do Prism updates and then we can probably wrap up after that.

Chris Gmyr (40:47)
Yeah, sounds good. What do you got for us?

TJ (40:48)
Sick, man. Streaming output. Man, I think I'm done. I think the code is functionally complete. What I want to add in there is probably two hooks. One is an onEvent hook, where you can define a callback and we will call your closure or your callable and

pass in events as they happen. So instead of yield, well, we will continue to yield as a generator if you use asStream. But if you're using one of the response types, like asBroadcast, asDataStreamResponse, or asEventStreamResponse, you kind of need to be able to hook into there. And so I think I might just do an onEvent closure,

and we'll pass each event into the closure as it happens, so you can log that or do something with that. And then I think I might add an onStreamEnd hook so that when the stream is completed, we pass the completed messages into that, so you could persist those or something. So we'll see how that shakes out.

I might merge the PR first and then go back and add the hooks, because this PR is already getting out of control. I think so far it's touched 137 files, with 8,300 additions and 2,100 deletions. It's a lot right now. So what I might do is get this merged and then add the hooks in.

Chris Gmyr (42:03)
Yeah.

TJ (42:22)
But we'll see. That's just the next piece. I don't know how essential it is to have the hooks. I think it'd be pretty important, though, because I think people are going to need and want to persist messages. So at the very least, I think I need to ship with an onEvent hook. The onStreamEnd hook is a little bit more complex because

Chris Gmyr (42:22)
Yeah, I think that's a good call.

TJ (42:43)
Basically, as the chunks of messages come through, I have to buffer all of that and store it somewhere to then create the final messages to send through the onStreamEnd hook. So

that's probably going to get complex, but we'll see. The onEvent hook is really easy because I just pass the events through to the closure. And I think that could enable message management.
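
The two hooks described here can be sketched in a few lines. This is a minimal illustration in Python rather than Prism's PHP, and it is not Prism's actual API: StreamEvent, its kind values, stream_with_hooks, on_event, and on_stream_end are all hypothetical names. It shows why the per-event hook is a simple pass-through while the end-of-stream hook needs a buffer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class StreamEvent:
    kind: str  # hypothetical event kinds: "text_delta" or "stream_end"
    text: str = ""


def stream_with_hooks(
    events: Iterable[StreamEvent],
    on_event: Optional[Callable[[StreamEvent], None]] = None,
    on_stream_end: Optional[Callable[[str], None]] = None,
) -> Iterator[StreamEvent]:
    """Yield events unchanged while feeding the optional hooks."""
    buffer: list[str] = []  # text chunks collected for the end-of-stream hook
    for event in events:
        # Per-event hook: the event is simply handed to the callback.
        if on_event is not None:
            on_event(event)
        # End-of-stream hook needs the whole message, so chunks are buffered.
        if event.kind == "text_delta":
            buffer.append(event.text)
        yield event
    # Reached only once the source is exhausted and fully consumed.
    if on_stream_end is not None:
        on_stream_end("".join(buffer))


seen: list[StreamEvent] = []
completed: list[str] = []
source = [
    StreamEvent("text_delta", "Hel"),
    StreamEvent("text_delta", "lo"),
    StreamEvent("stream_end"),
]
yielded = list(stream_with_hooks(source, seen.append, completed.append))
print(completed)
```

One design consequence of this sketch: because the end hook runs after the loop, a consumer that stops iterating early never triggers it, so a real implementation would have to decide how to handle abandoned streams.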

Chris Gmyr (43:07)
Mm-hmm.

TJ (43:12)
I don't know. We'll see. That's a decision I still have to make, though. But I'm really stoked that all the providers have been converted. I've manually tested all the providers and everything. So I know we're fairly functionally complete, finally. So I'm hoping that this weekend I can pull it over the finish line and hopefully get it merged either Sunday or Monday. Yeah.

Yeah, it's really nice to see the light at the end of the tunnel here. The docs are all written. We're in really good shape. It's really just, I've got some failing tests in CI. A fun one: fail in CI, pass locally. So I've got to figure that one out. And then figure out what to do about these hooks. And then I think there's one cleanup refactor that I want to do as well.

Chris Gmyr (43:39)
Yeah, that'll be awesome.

Yeah, those are really fun.

TJ (44:02)
But we're so close.

Chris Gmyr (44:04)
Yeah, so close. And then get it out into some people's hands so they can start tinkering with it and give you some feedback. And then, you know, that much closer to v1.

TJ (44:15)
Yeah, I'm really behind. I've got like 32 open issues, 19 open PRs.

Gross. I'm like way behind, but.

Chris Gmyr (44:24)
That's all right.

TJ (44:25)
We'll get there.

Chris Gmyr (44:26)
Yep, a little bit day by day. Yeah, let's hear about it.

TJ (44:27)
Sick, man. On that note, do we want to wrap up?

All right. Thank you all so much for listening to the Slightly Caffeinated podcast. Show notes, including links to all the things we've mentioned and our social channels, are down below, as well as available at slightlycaffeinated.fm. Thank you all so much for listening. We'll catch you next week.

Creators and Guests

Chris Gmyr
Host
Chris Gmyr
Husband, dad, & grilling aficionado. Loves Laravel & coffee. Staff Engineer @ Rula | TrianglePHP Co-Organizer
TJ Miller
Host
TJ Miller
Dreamer ⋅ ADHD advocate ⋅ Laravel astronaut ⋅ Building Prism ⋅ Principal at Geocodio ⋅ Thoughts are mine!