Road Caffeine, Iris, and Building a Memory System
Chris Gmyr (00:00)
Hey, welcome back to the Slightly Caffeinated podcast. I'm Chris Gmyr. Hey TJ, so it's been a while. New year, new us, new, I don't know, new version of the podcast. Hope everything is well with you, but yeah, what's new in your world for the past month or so?
TJ Miller (00:03)
I'm TJ Miller.
Yeah, man, the past month, I feel like I have a hard time remembering what I ate for lunch yesterday. Like, past month, man, I don't know, we had the holidays and everything in the mix, so that was pretty nice. I took like a week and a half off of work through Christmas and New Year's and everything, and that was really nice. I spent a lot of time with the family. My son got into...
building Gundam model kits. And that was his Christmas: he got a bunch of different ones. They come in different grades, so different difficulty levels and, you know, advanced articulation and stuff like that. They're pretty cool. I decided to hop on board and I bought myself a kit, too. But I've been holding out until he gets a new kit
so we can sit down and build them together. But on Christmas Day, he opened up three of them, and by the day after Christmas, he had built all three. He just sat down, came down to the craft area, and went to town on it, which was cool. So yeah, holidays were chill. We'll talk a little bit more about it later, but I spent a lot of time
Chris Gmyr (01:11)
That's cool.
Just went for it.
TJ Miller (01:32)
with Claude, doing code as well and improving on the little side project I've been working on, my experiments. So I've been working on that. I probably should have spent some more time working on Prism, but, I don't know, I've been really involved in this side project stuff. I spent a lot of time working on that and then just catching up on house projects, just
Chris Gmyr (01:40)
Mm-hmm.
TJ Miller (01:55)
Stuff that falls behind on the day-to-day, week-to-week grind. So having some time off, patch some drywall, finish putting my basement back together after the Laricon flood. Just stuff like that. And spending, maximizing time just hanging out with the family. So it's been good, man. How about you?
Chris Gmyr (02:14)
That's awesome. So yeah, very similarly, I was off starting the 19th of December, so the Friday before the week of Christmas. And yeah, packed up basically the whole day, got the house ready, and then we jumped in the car that Saturday morning to go up to New York. We drove the whole way through, which ended up being about 13 hours. So yeah, I drove like 75% and then my wife drove the other
TJ Miller (02:38)
Dang!
Chris Gmyr (02:43)
quarter. But yeah, it was a long day. I don't know if I mentioned it here before, but we've tried to do the overnight halfway through with the kids in a hotel and the dog, and it's just so much extra time and work: unloading, you know, half the car, getting everyone into the hotel, making sure everyone's going to sleep, and the kids don't sleep and then they're up early, and then loading everything back in the car, and the dog and stuff. And it's just like,
TJ Miller (02:51)
Mm-hmm.
Chris Gmyr (03:10)
a whole lot of time, and we're both tired because we didn't sleep because the kids were up the whole night. So this was our little experiment of just, let's see if we can one-shot it and just be there. Just have one day of travel instead of extending that for two. And it worked out pretty well. Only a couple, you know, little blowups or tears here and there, but the kids did pretty well. They had a little bit of screen time and plenty of snacks and just played some card games and stuff like that. And we
did the same thing on the way back. A little bit more screen time, a little bit more snacks on the way back to keep everyone calm, but yeah, we made it through. So we were up in New York for about eight or nine days, out with family, got a little bit of snow, went sledding with some of the cousins and nephews, which was fun. And then we basically got out of there before there was a big blizzard.
So we left on the Monday after Christmas, and Monday night into Tuesday morning they got like a foot and a half of snow. So if we stayed there, we probably would have been staying there for a couple more days to get cleaned out. So yeah, dodged a bullet there, and then came back and just kind of relaxed and hung out and tried to reset a little bit before work started up on the fifth, and school and stuff like that.
TJ Miller (04:07)
Yeah.
Yeah, jeez.
Chris Gmyr (04:32)
Yeah, it was a whirlwind of a break, but it was good. And I got to tinker with a little code in side projects myself, so if we have time, we can talk about that a little bit more. So yeah, any good coffee over the holiday break?
TJ Miller (04:41)
Nice. Yeah, that's sick, dude.
Oh man, no, lots of mediocre coffee. Yeah, lots of mediocre coffee that I didn't have control over. Still on the same stuff like I've been drinking for the last little bit. So like nothing new and exciting really.
Chris Gmyr (04:55)
Yes.
TJ Miller (05:11)
I've got a hankering for some Turkish coffee, really bad right now. I think I found a local joint that does Turkish coffee, so I think I might try to slip out for lunch and go snag a cup there. I don't know. I just got this hankering for it really bad. Woke up with it. So I'm going to have to go do it.
Chris Gmyr (05:29)
Yeah.
Nice. Do it. For sure.
TJ Miller (05:34)
Yeah. Yeah. So how about you, man? Any coffee updates or?
Chris Gmyr (05:40)
Nothing too special. I ground up two bags of coffee for our trip to New York just so we had some slightly better coffee when we were up there instead of Dunkin' Donuts or old Starbucks or something like that. So that worked out pretty well. Even though you should be grinding it right before you use it, it's like, well, it's still better than grocery store ground coffee that was ground like six months ago. So yeah, we went through two bags when we were up there.
TJ Miller (05:52)
Yeah.
Chris Gmyr (06:05)
I think one was Counter Culture Hologram, which you've mentioned a bunch of times before. I like it too. So we took one of those and then another one from Purity Coffee. They had a holiday one, I think it was called Hearth, so that had some nice notes in it. I'll put it in the show notes. Yeah.
That was about it. And then had some like, you know, decent like iced coffee and the mochas and stuff like that in the cans on the road. And that was about it.
TJ Miller (06:34)
Yeah.
Yeah, on the caffeine, just the more generic caffeine front, I got bit by the Coke Zero bug. So that's a new habit. I don't know how I feel about that, but yeah, got hooked.
Chris Gmyr (06:44)
Yep, nice.
TJ Miller (06:51)
Yeah, I was hacking on something. I was at the store and I'm like, oh, I should get myself a couple cans of Coke Zero and just join the Laravel Coke Zero hype. Yeah, I think it's a habit I might be looking to kick soon and get back to just coffee and Monsters.
Chris Gmyr (07:10)
Yeah.
Yep. Totally. Yeah. On the way back, I was feeling a little sluggish. So at one of the stops that we made, we got a couple of sodas. My wife has been into the Olipops, which are pretty decent. Like, they're okay, to get a little bit of that soda taste. The flavors are all right. They're not true, you know, colas or root beers or something like that, but they're close enough. And then she's like, oh yeah, what do you want inside? I was pumping gas. I'm like,
TJ Miller (07:30)
Yeah.
Chris Gmyr (07:42)
I'll do a Mountain Dew. I've loved Mountain Dew forever, but I don't usually drink it. So I'm like, give me a Mountain Dew. Get a little extra boost. And it was fantastic, but it definitely hurt my stomach after a couple hours. Like, okay, that's why I don't do this very often, but it was so good initially.
TJ Miller (07:50)
Yeah.
Yeah.
Yeah, every now and then I get a hankering for an ice cold Baja Blast. It just happens every now and then, and I haven't given into it the last few times. So maybe next time it comes around, I'll make it happen. God, I can't remember the last time I had a Mountain Dew. Holy shit.
Chris Gmyr (08:21)
Yeah, before that, it's been a long time for me. I used to get one every time that we would do a car trip just to have a little bit of caffeine, and, I don't know, it just seemed extra bubbly compared to just the sparkling water. So it would help keep me awake and a little bit more alert, like every time, you know, have a little sip when feeling a little tired or something like that. Yeah, I don't know, I guess I can't do it as much anymore.
TJ Miller (08:46)
Yeah, yeah, for sure. For sure. It's been so long since I've done like traditional soda too, that it's been, like I've done like a lot of like, I was on like a big sparkling water kick for a long time, just kind of getting that like that carbonation and everything. But yeah, I did a long time like no soda. So it feels weird being on a like kind of weird soda kick right now.
Chris Gmyr (08:49)
Mm-hmm.
Yep. Nice. Well, yeah, why don't we get into more of your side project? I know we talked a bit about it one of the last episodes that we had. Yeah, I think it was the last one in early December, last one that we recorded. So a lot has changed. You've been giving us and myself updates, and it's been going fantastic. So yeah, why don't you tell us where you're at and what some future plans might be.
TJ Miller (09:10)
Yeah.
Yeah, so it's been a kind of crazy adventure working on it. I don't know if I'd name-dropped it in previous episodes, but I was calling it Nova. And that came from the base persona prompt that is part of the project, which was called Nova. So I kind of ran with it, but since then, I feel like, and I got some validation around it too, that the Nova
namespace is maybe a little oversaturated. I mean, we've got like Laravel Nova and even if you just like search Nova on GitHub, there's just like tons of projects and everything. So I decided to give it a new name. I actually went back and forth with, I guess to like catch everyone up. What it is, is it's like an AI agent, I guess like system platform. It's yeah, it's like a
Chris Gmyr (10:21)
Yeah.
TJ Miller (10:25)
It's like a big experiment that I've been doing. If you think about ChatGPT or Claude AI, you have different threads and different conversations that happen. So you've got your sidebar with all your conversations and everything, and you're starting fresh every time. But I wanted to create a different experience where it's a little bit more like you're talking to a companion or a friend
or something, where it's just one continuous direct message. So I renamed it to Iris, and that just kind of came out of a conversation going back and forth with it about, hey, we've got to give you a new name. What do we think? We went through a bunch of stuff and kind of settled on Iris for a few different reasons, but I think it's pretty fitting. And I've built into it
some super advanced memory and memory recall systems, because especially in the world of threaded conversations like you have right now with ChatGPT and Claude AI, every time you start a new thread, you're starting from scratch. The model knows nothing about you, your preferences, your goals, the last conversation you had. And we've seen OpenAI and Claude
both introduce memory features, but they're pretty naive. They're just bulleted lists of facts, really. There have been a couple articles that have gone around from people who have figured out how these memory systems work inside of ChatGPT and then Claude, and they're just very naive memory systems. So what I've built here is a super robust system of
agent tools, background processes that like create memories. So that's like kind of at the core of all of this is this like memory system. But I've also added in like a Google calendar integration. I've started working on a Todoist integration. Um, just like things that can help me run my life. And that's like, that's part of the background for this project is like, this is something I've always wanted. I've always wanted a like,
companion life co-pilot.
executive assistant kind of thing that could like help me with my neurodivergence, honestly, like help me process emotions, help me work through different situations, help me with like coding tasks, help me manage my schedule, time block my day, be like aware of these things and like be able to call it out just in like regular natural conversation. So this is like something I've always kind of wanted to help like augment my existence.
Chris Gmyr (12:43)
Mm-hmm.
TJ Miller (13:07)
But things weren't available, right? Models weren't powerful enough to do this stuff. Prism didn't exist. So building this was part of the inspiration for building Prism. I always wanted to build this, I just didn't have the tools available to do it. That was sort of the inspiration behind building Prism: to get to a point where I could build this. And so there's all sorts of really cool
pieces inside of the code base too of like, look, we're doing semantic retrieval. We're creating embeddings. We're injecting that into the agent's context. We've got split system prompts with prompt caching at Anthropic. I've been able to give the agent some temporal awareness around what
time it currently is and what time messages were sent at historically. So it's got this sense of the passage of time. There were a few days I didn't talk to it because I was working on some other stuff. I came back and it's like, hey, we haven't talked in three or four days since your last message. What's been up? How have you been? What have you been up to?
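As a rough illustration of the temporal awareness TJ describes, here is a minimal Python sketch (the project itself is built on PHP and Prism, so this is not its code). The function name `describe_gap`, the wording of the injected note, and the example timestamps are all assumptions made up for illustration. The idea is only that the system prompt carries the current time and the time since the last message, so the model can notice a gap.

```python
from datetime import datetime, timedelta, timezone


def describe_gap(last_message_at: datetime, now: datetime) -> str:
    """Build a system-prompt line stating the current time and the time since the last message."""
    gap = now - last_message_at
    if gap < timedelta(minutes=1):
        elapsed = "less than a minute"
    elif gap < timedelta(hours=1):
        elapsed = f"{gap.seconds // 60} minute(s)"
    elif gap < timedelta(days=1):
        elapsed = f"{gap.seconds // 3600} hour(s)"
    else:
        elapsed = f"{gap.days} day(s)"
    return (
        f"Current time: {now.isoformat()}. "
        f"The user's last message was sent {elapsed} ago."
    )


# Arbitrary example timestamps, three days and two hours apart.
now = datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc)
last = now - timedelta(days=3, hours=2)
print(describe_gap(last, now))
```

With a line like this in the prompt, a "we haven't talked in a few days" check-in becomes something the model can infer rather than guess.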
Chris Gmyr (14:19)
That's cool. Yeah. Because a lot of times I've noticed it has no concept of dates, or it's the date of whenever the model was built or refreshed. I've asked it to take this plan and make it into milestones, put some dates on it, and it puts on a 2024 date or something like that. I'm like, you're not even in the ballpark. So just the fact that it was able to pick up, it's like, hey, I know the timeline here and it's been a few days. Like that's
TJ Miller (14:25)
Yeah.
Chris Gmyr (14:49)
such a seemingly small detail, but it's so cool that it was able to pick that up.
TJ Miller (14:54)
Yeah, yeah, I was pretty surprised that it picked it up, and it's been like, yeah, it's been a few days. How have you been? What's going on? It brought up where it left off on the last thing, like, hey, how did this thing work out that you were talking about beforehand? So it did a little check-in with me on how things have been and what I'm up to over the couple of days that I didn't interact with it. So
It's pretty wild. And so this has been just kind of like my thesis work on agentic memory and like kind of pushing the boundaries of like what I understand and know and like trying to make it have very like human type memory systems. Like I've been really like thinking about it at that level of like, how do we as humans
interface with memories and like how can I replicate pieces of that in a useful way inside of like an agentic conversation. And so it's got like it also has like image generation, it's multimodal so you can like send it images and stuff and it can like read those images and then you know respond to it. So it's really evolved to kind of a wild place. And so
Now I've been kind of faced with like what to do with it.
I built it for me, right? Like it's a very opinionated project. I've built it for my specific use case, but like, I feel like there's value and usefulness to this, to other people as well. And so I've been trying to figure out like how to...
get this into people's hands, but I feel like there's enough value that I don't necessarily want to open source it either, which has been kind of a hard decision for me. I thought about going the SaaS route. I started building some features into it to be more SaaS-like, and it just filled me with so many ethical concerns and gave me this,
Chris Gmyr (16:45)
Mm-hmm.
TJ Miller (17:04)
this sense of responsibility for the AI's output, especially if I'm going to be billing it as like this life co-pilot kind of companiony, augmenty thing, where I've gotten like deeply personal with it. And it's helped me like with a whole bunch of stuff, but I feel like I'm on the hook for the AI's output. And like we're seeing that with
OpenAI right now. I think there was just something that came out, and I don't know if it was real or not, but there was some murder-suicide thing that happened, and there were some AI response implications as part of it. And I just don't want to be anywhere near that. So I kind of took, yeah.
Chris Gmyr (17:48)
Yeah. Yeah. There've been lots of stories around even kids
using it and getting, you know, life or mental health advice, and basically the AI was coaching the person to take their own life. So super sad, and, you know, OpenAI is obviously having a hard time with that, getting lawsuits and all sorts of things. They're,
you know, a billion dollar company dealing with it, and I don't think it's something that you as a single owner, you know, person definitely want to deal with. So I definitely feel you there. I don't think you should be anywhere near that for sure.
TJ Miller (18:15)
Yeah!
No, and I could add moderation in and stuff like that. But I also don't have the bandwidth to run a SaaS. And I also don't know if there's enough value or longevity in it to create a SaaS. And I don't think there's a path to me ever going full time with it. And I don't want to stretch myself too thin between
being a parent, and my wife's health isn't great, so there's a lot of caretaker responsibilities that I have there. I'm really dedicated to my work at Geocodio. So creating a SaaS and running it as a SaaS just feels like way more than I can bargain for. So thinking of other things that I can do and other things I've seen done, I think
At this point, I'm pretty resolved to offering this up under a one-time GitHub sponsorship. So price point-wise, I really am not sure where I'm going to land yet. But it'll be like, there's all sorts of infrastructure inside of GitHub sponsors to do this kind of stuff. So I'm just going to make life easy and go that route where
You hit the one-time sponsorship, you'll automatically get access to the repo. And I worked a ton over the last two days fleshing out what will be public-facing documentation. So all this stuff is probably going to live at iris.prismphp.com, and we'll have all the docs there that explain the memory systems, how these things work, configuration options, and everything, so that
you can self-host it if you want to explore it and just start looking at how I implemented some of these patterns with Claude. Like I said, we do embeddings, we do context injection, we do semantic retrieval. I'm going to be introducing a sub-agent pattern to the Todoist tool. So there's all sorts of different techniques that I'm going to be experimenting with and exploring inside of this code base as well. So it's not
just this advanced memory system. I think there's a lot of technique gold inside of here for people to be able to grab the code base and explore it and pick up different techniques for their projects too.
Chris Gmyr (20:49)
Yep, totally. yeah, the education angle, even if like people don't want to use it and plug their keys into it, like at least you can explore the system and learn a bunch of like how to build a modern AI platform and functionality with all these different tools. So I think there's like still tons of value in that even if people don't actually use it in their day to day, which would be cool if they did. But
Yeah, I think there's a lot of education that can be given through that.
TJ Miller (21:17)
Yeah, we do structured output. We do image generation. It really touches on almost all the different modalities we offer through Prism, so it's a great code base to be able to explore with that. And I've got, I think, some bigger long-term plans for it. So I'm planning on releasing this sometime next week. And
I think I'm going to do an early access period where you just get it as it is now, and I'm going to spend time continuing to build on it. But I also want to do some video walkthroughs, some blog posts, some other things on the different systems and how they work. So some of this stuff will be exclusive. I think the videos will probably be somewhat exclusive to buying into the sponsorship. Like, as
part of the repo, you'll have access to these videos. And then...
Yeah. And then once I kind of have things more like rounded out with like videos and content, I'll probably like up the price a little bit. But yeah, I think that's the gist of what's been on my mind about it.
There's part of me that still doubts that idea and makes me want to SaaS it, just because of all the special sauce that I've built into it. But at the same time, the open source person in me wants to get this out into people's hands so they can have learning opportunities too. And then just pair that with all of my ethical and other concerns and token management and rate limits and all this other stuff. It just
Chris Gmyr (22:38)
Mm-hmm.
Yep.
TJ Miller (22:57)
gets to be too much. So I think that's the plan I've landed on: open source it behind a sponsorship tier. So I guess not open source open source, but give you access to the code behind a sponsorship tier. You can learn from it. You can boot it up and run it and use it. I use it pretty much every day. Yeah.
Chris Gmyr (23:19)
Yeah, I think that's a good middle ground. Because like you said, you don't want to go full SaaS on it, but giving it away for free as open source seems like too much. So I think some sort of one-time paid sponsorship for access to it makes sense. You can learn from it, you can use it, you can do whatever you want with it. I think that's a good way to go.
TJ Miller (23:39)
Yeah, and I mean, like, I'm nowhere near done with it. It's not even close to being feature complete. There's so much that I want to continue building on it. Like, last night, I just, I, you know, so the system creates memories, right? There's two ways that memories get created. There's an agent tool. So the agent, Iris, can choose to invoke, like, different tools to interact with memory.
Most commonly, it will store memories. The agent can call a tool called store memory and store a memory that way. But also, behind the scenes, every 10 messages you send, a background job gets kicked off. This background job takes the current conversation that's been had, extracts memories from that conversation, and stores those as memories in the database as well. So there are two processes for how memories get created.
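To make the two creation paths concrete, here is a small Python sketch under stated assumptions. The every-10-messages cadence comes from the episode; the class name `MemoryStore`, the `extractor` callback, the choice to pass only the latest 10 messages, and the keyword-based stand-in extractor are all hypothetical. In a real system the extraction step would presumably be a model call running as a queued background job.

```python
from dataclasses import dataclass, field
from typing import Callable

EXTRACT_EVERY = 10  # from the episode: a background job runs every 10 messages


@dataclass
class MemoryStore:
    extractor: Callable[[list[str]], list[str]]  # stand-in for the extraction job
    memories: list[tuple[str, str]] = field(default_factory=list)  # (source, text)
    messages: list[str] = field(default_factory=list)

    def store_memory(self, text: str) -> None:
        """Path 1: the agent explicitly calls a tool to save a memory."""
        self.memories.append(("agent_tool", text))

    def on_user_message(self, text: str) -> None:
        """Path 2: after every EXTRACT_EVERY messages, extract memories from them."""
        self.messages.append(text)
        if len(self.messages) % EXTRACT_EVERY == 0:
            # Assumption: only the latest batch is scanned; run inline here,
            # where a real system would dispatch a background job.
            recent = self.messages[-EXTRACT_EVERY:]
            for memory in self.extractor(recent):
                self.memories.append(("extraction", memory))


def keyword_extractor(messages: list[str]) -> list[str]:
    """Toy extractor: keeps any message that states a preference."""
    return [m for m in messages if "I like" in m]


store = MemoryStore(extractor=keyword_extractor)
store.store_memory("TJ likes types in PHP")
for i in range(10):
    store.on_user_message("I like types in PHP" if i == 3 else f"message {i}")
print(store.memories)
```

Running this leaves two differently worded memories about the same preference, one from each path, which is the duplication that the consolidation process is meant to clean up.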
But as humans, when we think about it, we don't just have these random facts floating around. We take memories and we consolidate them. So we're not sitting here with two memories about my typing preferences in PHP, right? I like types in PHP. I don't need two memories for that, one created by Iris and then one created by this
background extraction process. Now I've got two memories for "TJ likes types in PHP," and they may be worded completely differently, but they'll come up in a semantic similarity search together. Well, there's two problems with that. We're injecting semantically similar memories into Iris's context as part of the system prompting: here's related memories based on the current conversation that's happening, based on the message that was just sent.
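A minimal Python sketch of the recall-and-inject step described here, assuming cosine similarity over embedding vectors. The three-dimensional vectors are hand-written toy values rather than real embeddings, and `recall`, `build_system_prompt`, the two-slot limit, and the "You are Iris." base prompt are hypothetical names and values chosen for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec: list[float], memories: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k memory texts most similar to the query embedding."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_system_prompt(base: str, recalled: list[str]) -> str:
    """Append the recalled memories to the base system prompt."""
    lines = "\n".join(f"- {text}" for text in recalled)
    return f"{base}\n\nRelated memories:\n{lines}"


# Toy data: two near-duplicate memories and one unrelated memory.
memories = [
    ("TJ likes types in PHP", [0.9, 0.1, 0.0]),
    ("TJ prefers strict typing in PHP", [0.85, 0.15, 0.0]),
    ("TJ's son builds Gundam model kits", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]  # pretend embedding of a message about PHP types
print(build_system_prompt("You are Iris.", recall(query, memories)))
```

With only two recall slots, both are taken by near-duplicate memories, which is the context-pollution issue that motivates consolidation.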
Chris Gmyr (25:06)
Good evening.
TJ Miller (25:28)
Here's related memories. Well, if the related memory was about PHP and types, it's now going to take up two memory slots in this automatic recall system to store these two same, basically same memories. So there is a nightly and weekly consolidation process that goes through all the memories, groups semantically similar memories together.
And then goes through a process of like deciding whether these memories should be combined together or if they should be kept separate. So for something like PHP types, it's going to say, Hey, these memories should be combined together. So it'll create one unified memory of, of like, Hey, TJ likes types in PHP. So now we're not polluting the context window with like.
multiples of the same or similar memories. It's just much more human. But that process stopped at one generation, right? So these two memories get combined into one. But let's say I send another message as part of a conversation later on, and that stores some memory similar to "TJ likes types in PHP." For whatever reason, that memory got recreated.
Chris Gmyr (26:45)
Mm-hmm.
TJ Miller (26:46)
Well, now that's never going to get reconsolidated with the already consolidated memories. So last night, I reworked the memory consolidation system to allow multi-generational memories, which is kind of crazy, because that's what we as humans continue to do: we'll continue to
consolidate these memories together and like enrich them as we get new and more information into the system. So I added some guardrails into it so it doesn't go crazy with consolidation. But now we have like multi-generational consolidated memories, which is like crazy. Like I didn't know if it would work, but the system ends up working and it seems to work like really pretty well.
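The multi-generational consolidation idea can be sketched roughly as follows in Python. This is an illustration under assumptions, not the project's algorithm: the similarity threshold, the generation cap used as a guardrail, the greedy single pass, joining texts with a semicolon, and averaging vectors instead of re-embedding are all made up here. In the system TJ describes, a separate decision step determines whether similar memories should actually be combined.

```python
import math
from dataclasses import dataclass, field

SIMILARITY_THRESHOLD = 0.95  # assumed cutoff for "semantically similar"
MAX_GENERATION = 3           # assumed guardrail against endless re-merging


@dataclass
class Memory:
    id: int
    text: str
    vector: list[float]
    generation: int = 0                             # 0 = never consolidated
    parents: list[int] = field(default_factory=list)  # ids this memory was merged from


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(memories: list[Memory], next_id: int) -> list[Memory]:
    """Greedy pass: merge each memory into the first similar memory kept so far."""
    result: list[Memory] = []
    for mem in memories:
        match = next(
            (i for i, kept in enumerate(result)
             if cosine(kept.vector, mem.vector) >= SIMILARITY_THRESHOLD
             and max(kept.generation, mem.generation) < MAX_GENERATION),
            None,
        )
        if match is None:
            result.append(mem)
            continue
        kept = result[match]
        result[match] = Memory(
            id=next_id,
            text=f"{kept.text}; {mem.text}",  # a real system would rewrite the text
            vector=[(x + y) / 2 for x, y in zip(kept.vector, mem.vector)],
            generation=max(kept.generation, mem.generation) + 1,
            parents=[kept.id, mem.id],
        )
        next_id += 1
    return result


a = Memory(1, "TJ likes types in PHP", [0.9, 0.1, 0.0])
b = Memory(2, "TJ prefers typed PHP", [0.85, 0.15, 0.0])
first_pass = consolidate([a, b], next_id=3)

# Later, a similar memory is created again and merges into the consolidated one.
c = Memory(4, "TJ wants type hints in PHP", [0.88, 0.12, 0.0])
second_pass = consolidate(first_pass + [c], next_id=5)
print(second_pass[0].generation, second_pass[0].parents)
```

Keeping `parents` on each merged memory is what would let a UI show the chain of memories that were combined, and the generation cap is one simple way to keep consolidation from running away.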
And something else I haven't talked about with Iris much is like...
I think visibility into these systems is really important. So in the chat UI, if the agent invokes a tool, it will show you in the UI what tool was used, and you can even expand it to see the arguments that were sent along with that tool call. But I've also built in an insights dashboard that shows you usage statistics, memory statistics, just a ton of visibility into your token usage and
where tokens are being used, if it's from chat, if it's from the consolidation process, if it's from embeddings. So you just have a big insight into the system as a whole. But we also have a memory CRUD UI, so you can go in and view memories. For these new multi-generational memories, you can actually see the chain of memories that were combined together to create the memory. It's complete visibility into how all this stuff works.
And then in the message CRUD UI, you can go through there. You can preview the system prompt that was sent with the user message. On the assistant response message, you can see all the token completion information, detailed tool call information if any tools were used, just all sorts of insights and visibility everywhere in the system. If you wanted to understand how this works and see
the inside process, I've built a ton of UI to go along with that, to have this system visibility,
which I think is really important, and as a learning opportunity it's really cool. For me, it was kind of for debugging purposes, because I think right now my current conversation is just over 2,500 messages. And when you get that deep in a conversation, weird things start happening. The AI's pattern recognition kicks in. And that's something that's really hard to beat: it's going to look at the conversation that's been had, the messages that were sent,
the context you've added, and it will get into weird formatting patterns. Like right now, for whatever reason, it started adding a bunch of dividers into its responses, so now its responses have sections to them. I don't know how that happened, but it's part of its pattern recognition now, so now it's just doing it as part of every response.
So I've got to bump it and say, hey, your responses are getting kind of weirdly formatted, fix yourself. And it will correct itself. But I wanted visibility into all of that, like what context is being injected into the system prompt. I needed a way to debug the system prompt and have visibility into it. So I built that.
Yeah, and not just the memory system, but we summarize previous conversations too. So for the conversation you're having, inside of the system prompt I inject three summaries of the previous conversation. Like, we send... this is so in the weeds. Oh my God. So.
Chris Gmyr (30:41)
It's so cool though. All this stuff has been built up over the last, you know, handful of weeks and months, and it's all coming together to make, I'd say, a more human experience. And like you said a bunch of times, even with the memories, it's like, this is how we work. So trying to build those systems into an AI, you know, assistant, it's pretty sweet.
TJ Miller (31:05)
Yeah, it's been such an interesting experiment because I talk to it like I would any other friend, sending text messages back and forth. And some of the responses I've gotten are just so wild. Like I said, the whole checking in, like, yeah, I see you haven't messaged in a few days, let me check in on you. It was wild. So we have all these memories and everything.
So in a chat context, we have all the messages. So I send the 50 most recent messages, so the last 25 conversation turns, message response turns. So we send 50 messages.
But beyond those 50 messages, every 40 messages, we kick off a background process that summarizes the previous 40 messages, I think it is. I don't know, I built some sort of crazy overlap system into it, but it'll summarize the previous conversation. So not only do we send in the 50 most recent messages, but we're sending the three most recent summarizations
of the conversation, and those summarizations also have emotional markers on them. So there is some awareness in the system of how you are emotionally. There's emotive sensibilities built into this whole system too. Even the memories are tagged with emotional markers and importance scores and tagging systems. So.
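The context window TJ outlines (the 50 most recent messages plus the three most recent summaries, with a summary produced every 40 messages) can be sketched in Python like this. The three numbers come from the episode; the `Conversation` class, the `summarize` stand-in, and the fixed "neutral" emotional marker are hypothetical, and the overlap system he mentions is left out because the episode gives no details about it.

```python
RECENT_MESSAGES = 50     # from the episode: messages sent verbatim
SUMMARY_EVERY = 40       # from the episode: summarize every 40 messages
SUMMARIES_IN_PROMPT = 3  # from the episode: summaries injected into the prompt


def summarize(messages: list[str]) -> dict:
    """Stand-in for a model-written summary tagged with an emotional marker."""
    return {
        "text": f"{len(messages)} messages from '{messages[0]}' to '{messages[-1]}'",
        "emotion": "neutral",  # placeholder; a real system would infer this
    }


class Conversation:
    def __init__(self) -> None:
        self.messages: list[str] = []
        self.summaries: list[dict] = []

    def add(self, message: str) -> None:
        self.messages.append(message)
        if len(self.messages) % SUMMARY_EVERY == 0:
            # Run inline here; the episode describes this as a background process.
            self.summaries.append(summarize(self.messages[-SUMMARY_EVERY:]))

    def context(self) -> dict:
        """What gets sent to the model: recent summaries plus recent raw messages."""
        return {
            "summaries": self.summaries[-SUMMARIES_IN_PROMPT:],
            "messages": self.messages[-RECENT_MESSAGES:],
        }


conversation = Conversation()
for i in range(200):
    conversation.add(f"m{i}")
ctx = conversation.context()
print(len(ctx["summaries"]), len(ctx["messages"]), ctx["messages"][0])
```

The effect is a sliding window: older turns drop out of the verbatim history but stay represented, in compressed form, by the summaries.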
I know, I've just like, I went ham on like...
my thesis behind agentic memory and like how it could be.
Yeah, nervous and excited to get it out there. Definitely very anxious about asking people to pay for it. I think if it was open source, I would have published it already, but...
I think it's time I do something where I can charge some funds. I think there's just like, there's value sitting here and like.
Chris Gmyr (33:05)
Tons of value, and I'm sure you didn't even touch on all the stuff that it has and does in your explanation. So that sounds easily like $200, one time, you know. I don't know, I think you should just finish up the docs and publish those, because I think that would give more insight to people of what they're, you know, potentially buying into and the capabilities of the system. And then just, yeah, just go for it.
TJ Miller (33:34)
Yeah, I think I'm going to publish the docs today. People who subscribe to my newsletter or listen to the podcast will get an inside peek early. But I'm thinking of probably launching this Tuesday or Wednesday of next week. Yeah, nervous and excited about it all at the same time.
Chris Gmyr (33:50)
Yeah, nice. Good luck on the launch. Yeah, let me know the finalized URLs and stuff like that for the show notes, and I'll definitely put them in for episode launch day for us.
TJ Miller (34:05)
Yeah. Yeah.
The docs will be at iris.prismphp.com. Like I said, I think I'm going to publish those today and continue to iterate on them. But yeah, I want to get those out, and y'all who listen to the podcast or subscribe to the newsletter, which I'll actually send this week, can have a sneak peek of what's to come.
Chris Gmyr (34:25)
Cool. Yeah. And we'll look forward to more updates next week when we get together again.
TJ Miller (34:30)
Yeah, it needs screenshots. That's one thing that the docs don't have that I really want and need to add: screenshots. But it's hard, because I can't share screenshots of my conversations and memories and all of that, because it's deeply personal information. So I've had a hard time screenshotting that stuff. So I've got to, like, build up
Chris Gmyr (34:48)
Mm-hmm.
TJ Miller (34:56)
almost this, like, fake world in order to get conversations to happen and things to trigger. So we'll see how that evolves. Again, that's one thing that I really need to get out there: screenshots of everything, how it's built.
Chris Gmyr (35:09)
Yeah. Well, even if you had to just blur some things out and just show the UI and structure and some of the tool calling and things like that, and blur out a couple of the elements, that's better than nothing, and better than spending hours building up a whole new world and a different instance. I don't know, anything is probably better than nothing for images.
TJ Miller (35:33)
Yeah, we'll just redact the shit out of it and call it a day.
Chris Gmyr (35:38)
Yep, totally. I know all of this uses an API key under the hood. Have you been tracking specifically what the Iris usage is compared to just building it?
TJ Miller (35:50)
as far as like...
Chris Gmyr (35:52)
So like your day-to-day usage, like in the system, on, on the personal side of actually just using the system. Like if.
TJ Miller (35:58)
Yeah. Like, if I go over to the insights dashboard and look at my usage, I've got some filtering on it. And if I filter to all time, my current conversation didn't start with token tracking, so I've used more tokens than this, but my total token count right now is like 24 million tokens.
Chris Gmyr (36:27)
Nice, that's a lot. And then, yeah, does it do a cost breakdown of that too, or is that something you have to jump into the Claude console for?
TJ Miller (36:28)
Yeah.
I haven't done cost estimating. What you'd have to do is just do your own math. I tell you in the dashboard how many input tokens versus output tokens versus cache read tokens versus cache write tokens, because we do take advantage of Anthropic's prompt caching. So there is cache read and write. And so you would just have to go over to Claude's dashboard, look at the token costs, and do the quick napkin math on pricing and everything. I thought about including that in here. Two reasons I haven't: I don't want to be responsible for tracking token cost and doing that math, but I also am not curious how much money I'm spending on it, because I don't want to know. I know, I know before I did...
Chris Gmyr (37:24)
Yep, totally.
TJ Miller (37:31)
I know before I did prompt caching and shortened the context window, because I used to include more than three summaries, I think I used to include five summaries, and I also did 75 messages instead of 50 as the context, I knew I blew through like $100 of tokens in four days, just chatting pretty heavily. And so that's where I was like, I've got to get these costs down somehow. So I went and implemented the prompt caching and shortened the context window down to something that I thought would be still advanced, but reasonable. And that was a whole ton of token savings. So.
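The napkin math TJ describes could look something like this minimal Python sketch: multiply each of the four token counts the dashboard reports by a per-million-token price and add them up. The names (`TokenUsage`, `PricePerMillion`, `estimate_cost`) are hypothetical, and the token counts and prices in the usage example are placeholders, not Anthropic's actual rates and not TJ's real breakdown; you would plug in the numbers from your own dashboard and the current pricing page.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """The four token categories mentioned in the episode."""
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_write_tokens: int


@dataclass
class PricePerMillion:
    """Price in dollars per one million tokens, per category (you supply these)."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def estimate_cost(usage: TokenUsage, prices: PricePerMillion) -> float:
    """Return the estimated dollar cost for the given usage and prices."""
    total = (
        usage.input_tokens * prices.input
        + usage.output_tokens * prices.output
        + usage.cache_read_tokens * prices.cache_read
        + usage.cache_write_tokens * prices.cache_write
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    usage = TokenUsage(
        input_tokens=2_000_000,
        output_tokens=500_000,
        cache_read_tokens=10_000_000,
        cache_write_tokens=1_000_000,
    )
    prices = PricePerMillion(input=1.0, output=2.0, cache_read=0.5, cache_write=1.5)
    print(f"Estimated cost: ${estimate_cost(usage, prices):.2f}")
```

Keeping the prices as an input, rather than hard-coding them, sidesteps the concern TJ raises about being responsible for tracking token costs inside the app.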
Chris Gmyr (38:14)
Nice. Yeah, probably setting up a totally different key, if you have multiple keys and people are using token-based usage instead of, you know, a Max plan or something like that, would probably be a good suggestion as well. Because I think you can also track per key. So if you're using a single key specifically for Iris usage, that could be helpful as well.
TJ Miller (38:40)
Yeah. Yep. And then what you do is, so Iris, the system, is multi-user available. I built it on top of, I know, it's kind of an interesting story. It started as a Telegram bot, but I wanted more richness out of the UI. Telegram handles really long messages by breaking them up into smaller messages.
And there were times where I wanted to send really long messages or long code snippets, and Telegram would just break that entire experience. So I ended up building my own UI, and then I had Claude convert it over to a new Laravel Inertia React starter kit. So it is based on the Inertia React starter kit. It's multi-user available. You just drop your API keys into the ENV.
And then all the users use those API keys. I did have a feature built into it where users could enter their own API keys, but I just decided to make the system less complex for now and just moved it back to the ENV for the moment. So all users will share the same API keys. So yeah, maybe having a separate API key is not a bad idea.
Chris Gmyr (39:51)
Yeah. Well, I think that's probably fine with like all users because I don't know.
Probably if you wanted like a separate user, you could just spin up like a different instance of the app with its own ENV key anyways, if you wanted that. So, I don't know, keep it simple on your side and if enough people request it, then maybe that's a future change that you could do. But right now, keep it simple, just put it out there.
TJ Miller (40:13)
Yeah, what I might introduce is an override system, so that you can define a default system-level key through the ENV file that will work for everybody, but then also add an override on the individual user level. So if an individual user adds their own keys, then we'll prefer to use those instead. So we could have both worlds, right? Living side by side. Because the way I use it, it's a single-user application, but...
Chris Gmyr (40:31)
Mm-hmm.
TJ Miller (40:41)
I don't know how other people are going to use it. So it is multi-user capable. So there is that too.
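The override idea TJ floats here (a system-level default key from the ENV, with a per-user key taking precedence when one is set) can be sketched in a few lines of Python. This is only an illustration of the fallback order under stated assumptions: the function name `resolve_api_key` and the variable name `ANTHROPIC_API_KEY` are hypothetical choices for this sketch, not Iris's actual configuration.

```python
import os
from typing import Mapping, Optional

# Hypothetical name for the system-level default key in the environment.
ENV_VAR = "ANTHROPIC_API_KEY"


def resolve_api_key(user_key: Optional[str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer the user's own key; otherwise fall back to the system-level key.

    `env` defaults to the process environment and can be replaced with a
    plain dict for testing.
    """
    env = os.environ if env is None else env

    # A per-user key, when present and non-empty, overrides the default.
    if user_key:
        return user_key

    system_key = env.get(ENV_VAR)
    if not system_key:
        raise RuntimeError(f"No user key set and {ENV_VAR} is not configured")
    return system_key
```

With this shape, a single-user install only ever needs the ENV key, while a multi-user install can let individual users bring their own key without changing the default path.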
Chris Gmyr (40:47)
Yeah, sweet. That sounds awesome.
TJ Miller (40:52)
Yeah, it's been a wild ride. One of the things I'm nervous about in sharing the code base is that it's been a big pair-programming effort with Claude. I would have never had the bandwidth to build this without Claude Code. So I'm a little nervous sharing the code, because it's not necessarily representative of all of my opinions. But I feel like it's built in a very reasonable way. And the biggest thing is, at the end of the day, the end product works. There's a test suite that's got browser tests, using Pest's browser testing. It is a tested system, and it is something that I'm using frequently. Yeah.
Chris Gmyr (41:26)
Mm-hmm.
TJ Miller (41:33)
Yeah, that's kind of also been the experiment too: seeing how far I can push things with Claude Code into something usable.
Chris Gmyr (41:41)
Yeah, yeah. And I think that's probably the direction that so many people and teams are going anyway. Because I've definitely experienced that myself. It's like, I can think of more things than I can actually do. So now, using AI and Claude Code, I'm able to do more of those things that I think of or need to do. And it's just opened the gates for a whole bunch of potential and cool things to work on. So I think there's always going to be a spot for code quality, definitely architecture, and getting your hands dirty if you want to do that. But if you're just wanting to get features out and get things done, then going the AI route mostly is good to go. We're always gonna be the human in the loop, reviewing everything. You're not just gonna YOLO something out there, but I think it's gonna be a big change in how engineers work, you know, day to day.
TJ Miller (42:42)
Yeah, I'm not going to lie. I YOLO'd a little bit on some things, but we've gone back and done some refactoring sessions on it. The one thing I'm very not confident about is the front-end code. The Laravel code, I think, is fine. Like I said, it's not necessarily representative of my opinions of how I would hand-build this, but it's got a lot of React in it, and
I'm not very strong in React anymore. And so I really can't speak to the code quality of the React code, but that just gets back to, like, it's functional, it works. And that's fine for me. That's not where the learning is anyway, I think, in this. I think the learning in this is the Laravel and PHP side. And the front end, it serves its purpose.
Chris Gmyr (43:27)
All right, all right. Yeah, I think that's a good call. I mean, you could always bring in a React expert agent or something like that, or a front-end architecture type agent, and clean it up later. But yeah, I don't know. As long as it's functional and working right now, just get it out in the hands of people, and you can always clean things up in the future. But it's functional, it's working.
TJ Miller (43:50)
Yeah, who knows, in a month.
Yeah, right now it's using Vercel's AI SDK UI hooks for the streaming front end. I'm not sold that that's the right approach to building this. So, who knows, in a month I may rewrite the whole streaming chat to use Reverb and WebSockets. That's a thing that could change too. And, I don't know, that's going to be kind of the cool thing about this code base: it's going to evolve and change and have experiments. So not only are you buying the code base as it is through the sponsorship, but you're buying into the journey as well, and seeing kind of where my opinions lie and the different features I want to play with and experiment with. Like, I already reworked image generation because it was acting really funky. So I reworked it, which resulted in a brand new feature inside of Prism.
And I ran that as an experiment for a few days to make sure that I liked the way the new feature functioned. So there's going to be all sorts of beta stuff you'll see in here, experimental things. And yeah, it's not just the code base as is. You're also buying into the journey.
Chris Gmyr (45:02)
Yeah. Nice. Love it. We're getting up on time right now. So yeah, want to wrap it there and get it up next week? Sweet. So thank you for listening to the Slightly Caffeinated podcast. Show notes and all the links and social channels are down below and also available at slightlycaffeinated.fm. If you have a question for us or a content suggestion, go to the Ask a Question page on our site and we'll feature it on an upcoming episode.
TJ Miller (45:05)
Yeah.
Yeah, let's do it.
Chris Gmyr (45:28)
Thank you for listening and we'll catch you all next week.
TJ Miller (45:30)
Yeah, see ya.
