Lower Costs, AI Flows, and Bigger Pictures

The Weekly Variable

A Wave release may finally be in the near future.

But where does Wave flow after that?

Plus moderation and some evolution.

Topics for this week:

  • App Runner vs Lambda

  • Moderation

  • AI Workflow

  • The Bigger Picture

  • AlphaEvolve

App Runner vs Lambda

After getting unstuck last week with AWS, I had nearly all the Docker images I needed up and running in App Runner and everything seemed to be working as expected.

Feeling confident on Monday, I happened to check the Billing dashboard and realized I had forgotten something.

All these services run constantly regardless of activity, so the dashboard already showed an estimated $12.40 in operating costs for less than half the month, much more than I was expecting.

At some point, I think Gemini Pro convinced me I could operate on App Runner cheaper than Lambda if I was scaling the services up and down, but it turns out there’s not a great way to do that in App Runner.

One service would have to stay live and listen for calls to the other services so it could try to bring them up and down as they are needed, but that doesn’t seem like the best solution.

Back to Lambda it is.

Lambda is designed to only run when called, and should be able to handle a decent amount of traffic for a while.

Once Wave passes a few thousand active users, I’ll consider switching back to the full App Runner services, but for now I’m in the process of converting all these services to operate in Lambda instead.
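The conversion mostly means wrapping each service's logic in a Lambda handler instead of a long-running web server. A minimal sketch of what that looks like, assuming API Gateway or a Function URL delivers the request (the `process_clip` logic is a hypothetical placeholder, not Wave's actual code):

```python
import json

def process_clip(clip_id: str) -> dict:
    # Placeholder for the real service logic that used to live
    # behind an always-on App Runner container.
    return {"clip_id": clip_id, "status": "processed"}

def handler(event, context):
    # API Gateway / Function URLs deliver the HTTP body as a string.
    body = json.loads(event.get("body") or "{}")
    clip_id = body.get("clip_id")
    if not clip_id:
        return {"statusCode": 400,
                "body": json.dumps({"error": "clip_id required"})}
    result = process_clip(clip_id)
    return {"statusCode": 200, "body": json.dumps(result)}
```

Because the handler only runs per invocation, you pay per request instead of per hour of uptime, which is the whole point of the switch.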

Scalability vs cost vs tech debt are tricky things to balance.

I want something that works now, that can also work for 10s of 1000s of users, but also isn’t too expensive (maybe even free).

Fortunately Lambda should be able to handle that for now, and I’ll just have to keep an eye on usage spikes to determine when it’s time to go full scale.

Right now I’ve got 3 of the 4 services working in Lambda, and hopefully will have the rest wrapped up this weekend.

Then it’s on to finally preparing for an App Store submission…

Moderation

One of the last services I built for Wave was a simple moderation service.

OpenAI generously offers a free omni-moderation model that can flag text and images for inappropriate content.

OpenAI omni-moderation result

It may not cover everything we'd want to keep an eye on in the app, but it's a solid foundation for a preliminary check on anything users submit, making sure there's nothing obviously questionable.

And you can't beat the price of free.
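Calling it is straightforward with the OpenAI SDK. Here's a minimal sketch for text moderation, assuming an `OPENAI_API_KEY` environment variable; the `flagged_categories` helper is my own naming, not part of the SDK:

```python
import os

def flagged_categories(result: dict) -> list[str]:
    # Pull out the category names the model flagged from one result entry.
    cats = result.get("categories", {})
    return sorted(name for name, hit in cats.items() if hit)

def moderate_text(text: str) -> list[str]:
    # Imported lazily so the pure helper above works without the SDK installed.
    from openai import OpenAI
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    resp = client.moderations.create(model="omni-moderation-latest", input=text)
    return flagged_categories(resp.results[0].model_dump())
```

An empty list back means nothing was flagged; anything else gets held for review.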

Even though the model says “omni” it does not moderate video.

Luckily I now know how to get ffmpeg running on AWS so there’s an easy workaround.

Since omni-moderation accepts images, and the plan for Wave is short video of no more than 15 seconds, I can use ffmpeg to capture a frame at least once per second and submit those images for moderation instead.

It won't catch everything, though, since anything that happens briefly between those one-second frame intervals will be missed, but again, it's a good free foundation that catches the majority of cases.
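The frame-sampling step is a one-liner for ffmpeg's `fps` filter. A sketch of the workaround, with paths and filenames as illustrative placeholders (ffmpeg needs to be on the PATH, which is the AWS setup from last week):

```python
import subprocess
from pathlib import Path

def build_ffmpeg_cmd(video: str, out_dir: str, fps: int = 1) -> list[str]:
    # -vf fps=N emits N frames per second; %03d numbers the output images.
    return [
        "ffmpeg", "-i", video,
        "-vf", f"fps={fps}",
        str(Path(out_dir) / "frame_%03d.jpg"),
    ]

def extract_frames(video: str, out_dir: str, fps: int = 1) -> list[Path]:
    # Run ffmpeg and return the sampled frames, ready for moderation.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(video, out_dir, fps), check=True)
    return sorted(Path(out_dir).glob("frame_*.jpg"))
```

For a 15-second clip at one frame per second, that's at most 16 or so images per upload, well within reason for a free moderation pass.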

All without needing someone to constantly keep an eye on what people are posting.

AI Workflow

No major upsets in the world of AI models this week, as far as my workflow is concerned anyway.

One tweak I've made recently, though, is setting Cursor to only use GPT-4.1 for now.

Claude is solid, but it’s a little verbose sometimes, and more frustratingly, will randomly change a few minor lines here and there, usually CSS styling, that weren’t part of the request.

It got to the point that I just expected it and had to go review all the changes so I could reject the random 2 or 3 lines that I didn’t need changed.

I had seen on X that GPT-4.1 has been performing well at coding, since it was trained more heavily on code, so I figured I'd give it a shot.

So far it seems really solid.

Ridiculously enough, most of the time I ask Gemini Pro or GPT-o3 for major code changes, then copy part if not all of the answer and paste it into the GPT-4.1 Cursor chat to have it apply the changes for me.

This is also where 4.1 has an advantage over Claude, because 4.1 is trained to handle step-by-step instructions much better and can take the change instructions from Gemini and apply them more accurately.

Claude undoubtedly will skip over an instruction or do something close but not follow it exactly.

This is a lot to put into words, and I will definitely have a video to showcase this workflow soon, but the key takeaway is that in the last few weeks I've whittled my workflow down to primarily 4 models:

  • Gemini Pro 2.5

  • GPT-o3

  • GPT-4.1

  • Perplexity Pro (a retrained version of Deepseek I think…)

Claude has been sidelined for now but we’ll see.

Never know what model will be on top next week.

The Bigger Picture

It’s usually pretty easy to convince me to watch (or listen to) a YouTube video with Sam Altman, which I’m sure YouTube has figured out by now.

In this particular video, Sam was speaking at Sequoia Capital's AI Ascent event and mentioned something that gave me an idea of where to take Wave, or possibly how to look at other development projects in the future.

He was detailing how OpenAI got started and how they were trying to figure out what product to offer as a business.

Sam noticed from his time at Y Combinator that APIs tend to end up having significant upside and so did making things incredibly easy to use.

Given those two things, a natural product emerged for them: make an AI model easy to use through an API.

Hearing that idea suddenly changed my perspective on where to take Wave.

Having an easy-to-use app is a great way to get users, but either way, at a large enough scale there has to be a platform behind it to support it.

Which reinforces the idea that the app is just what it is, a client.

The real app is what the mobile app connects to, the platform that runs in the cloud in the background.

And opening up that platform for other businesses to adopt in addition to users could be a solid strategy for growth.

As software eats the world, technology inevitably becomes a core part of any business and if technology is at the heart, it’s pretty hard to avoid using APIs at some point.

Following Sam’s logic then, having an easy-to-use API, ready to be consumed by other businesses, could be a real advantage in any market, or offer a considerable upside as he mentioned.

So maybe the ultimate goal isn’t Wave the app, but something like Nightlife OS.

First priority is getting the app launched, but doesn’t hurt to keep the bigger picture in mind for the future.

Also here’s Altman’s talk at AI Ascent:

AlphaEvolve

And for an even bigger picture, Google announced their AlphaEvolve project this week.

They already have AlphaFold, which has predicted structures for essentially every known protein, and AlphaEvolve is here to tackle algorithms.

The big number from the release is that it helped recover 0.7% of the compute resources used across Google's services, which I'm sure is a massive amount.

It also found a new algorithm for a certain kind of matrix multiplication, improving on a result that had stood since 1969.

In addition, it seems to be improving itself: AlphaEvolve has helped optimize the training of Gemini, which is itself part of how AlphaEvolve operates.

Super interesting stuff.

It's not publicly available. I filled out a six-page form to see if they'd let me have access, but I would be genuinely shocked if I got it.

In the meantime, here’s a 4 minute recap from Fireship and a much longer recap from Wes Roth depending on your level of interest:

And that’s it for this week! Lower costs, faster flows, and bigger pictures.

Those are the links that stuck with me throughout the week and a glimpse into what I personally worked on.

If you want to start a newsletter like this on beehiiv and support me in the process, here’s my referral link: https://www.beehiiv.com/?via=jay-peters. Otherwise, let me know what you think at @jaypetersdotdev or email [email protected], I’d love to hear your feedback. Thanks for reading!