Vibing Too Hard, Debugging and Maintenance

The Weekly Variable

May be back to old-fashioned learning after this week.

And old-fashioned debugging.

Plus old-fashioned maintenance.

With some new stuff too.

Vibing Too Hard

Decided a last-minute refactor of Wave was a good idea this week, but unfortunately I didn’t make the cut to get a new version approved in time for the App Store this weekend.

I was hoping to squash some remaining stability bugs in the version that’s live, but it’s been trickier than expected.

I might be back to the point of pushing AI’s limits, or pushing my limits of understanding it.

Gemini has been doing its best to identify the issue, but React Native race conditions are subtle and potentially very complex.

I had Gemini try a few times to fix a bug in the logout process of the app, but I wasn’t having any luck.
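
For flavor, here’s a purely hypothetical sketch (not the actual Wave code) of the shape these logout races tend to take: an async auth check resolves after the screen that kicked it off is already gone.

// Hypothetical sketch of a logout race condition, not the real Wave code.
// An async auth check resolves after sign-out and touches a screen that's gone.
import { useEffect, useRef } from "react";

export function useAuthWatcher(
  getSession: () => Promise<string | null>, // stand-in for whatever auth SDK call is in play
  onSignedOut: () => void
) {
  const mounted = useRef(true);

  useEffect(() => {
    mounted.current = true;

    getSession().then((session) => {
      // Without this guard the callback can resolve mid-logout, after the
      // screen unmounted, and trigger navigation or state updates that
      // either crash or leave the app half signed out.
      if (!mounted.current) return;
      if (!session) onSignedOut();
    });

    return () => {
      mounted.current = false;
    };
  }, [getSession, onSignedOut]);
}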

Gemini confidently told me “this is the definitive fix” or “this is the final, shippable version” multiple times, only for that version to still have the same issue or create a new one.

After a few overly optimistic assurances, I figured I’d give other AIs a shot, namely o3 and Grok 4.

o3 is interesting because it goes back and forth, saying “yes this looks good, and here’s 8 other things you could do and probably should do, but this overall looks good,” like it’s trying to be reassuring while also outlining that there’s still potentially a lot left.

Grok 4 had some solid interpretations as well, but it’s much more straightforward with just a numbered list of things to fix and examples of how to fix them.

Finally I gave o3-pro a shot too, and it gave the best suggestions overall.

Not only did it seem to understand the issue more thoroughly, but I’m pretty sure it spun up its own code to replicate the situation and do its own testing, which is kind of amazing.

The higher tiers, Gemini’s Deep Think and Grok 4 Heavy, probably perform more at that level, but I didn’t quite feel the need to invest another $550 to upgrade those accounts to find out.

My final conclusion after trying o3-pro’s impressive analysis was that a human may still be needed.

Granted, I haven’t given o3-pro full code context, just a few files, since it has a much smaller context window than Gemini.

It may be able to analyze the entire app structure and get a better understanding, resulting in a better answer.

But for now, I think I’m going to have to dig a little deeper into understanding all of my generated code better than I thought I understood it.

May have been vibing a little too hard…

System.out

Testing on iOS simulator is clearly not cutting it.

Wave always performs great on my laptop.

As soon as it gets to my iPhone, the big bugs show up.

The problem right now is the development pipeline.

I’ve been using EAS to build, which is great and really convenient since I don’t have to maintain the build process, but it makes the build lifecycle a little too slow, and it’s also getting a little expensive at $2 a build.

Using EAS to test on my own iPhone right now looks like this:

  • run command to send the build off to EAS

  • wait about 7 minutes for the build to complete

  • run command to submit that build to Apple

  • wait another 3-5 minutes for Apple to accept the build

  • refresh the TestFlight screen a few times for that build to show up for testing

  • add my account to that new build for testing

  • wait a few minutes for TestFlight to update on my phone with the new version of the app

  • update to the new version of the app

  • open the app to discover the same bug is still there

Testing on iOS Simulator looks like this:

  • update code

  • press cmd+S to save changes

  • watch app flash for a few seconds while it reloads

  • see no bugs on iOS Simulator

So the build pipeline is a huge issue.

The other issue is debugging.

A core part of my coding approach is that there’s almost nothing System.out debugging can’t solve.

In the Java days, I used this everywhere:

System.out.println("### here 1");

System.out.println("### here 2");

System.out.println("### here 3...");

With enough console logs, you can figure out what’s going on.
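
These days the React Native version of the same move is just console.log:

console.log("### here 1");

console.log("### here 2");

console.log("### here 3...");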

The problem with app development is that it’s not nearly as straightforward.

Plus I’ve found my remote logging system isn’t exactly working as expected, so that’s an issue too.

I need to know what’s happening behind the scenes on the iPhone to see what the issue truly is.
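
Something along these lines might be the starting point, as a minimal sketch: the endpoint URL is a placeholder and the payload shape is an assumption, not what Wave actually sends.

// Remote "### here" logger sketch; LOG_ENDPOINT is a placeholder URL.
const LOG_ENDPOINT = "https://example.com/logs";

export function remoteLog(message: string): void {
  const line = `### ${new Date().toISOString()} ${message}`;
  console.log(line); // still visible locally in Metro or Xcode logs

  // Fire and forget so a slow or failed upload never blocks the UI.
  fetch(LOG_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ line }),
  }).catch(() => {
    // Swallow errors: a broken logger shouldn't crash the app it's debugging.
  });
}

Sprinkle remoteLog("here 1") through the logout flow, tail the server logs, and the picture should get a lot clearer.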

So that will be part of the focus this weekend.

Either log better or build better.

Looking forward to creating an overly complicated solution to this problem, hopefully involving a lot of “### here”s!

Filling the Pipeline

The content pipeline has begun.

Streaming is back.

I now have about 6 hours of streamed video from this week, ready to start processing for infinite content potential.

With that I probably have enough posts and videos to last for months in all honesty.

But I have not transcribed it all yet which is a bit of a blocker.

It’s been a minute since I used the stream-to-transcript service I was using last year, so when I tried it on these newer streams, Whisper wasn’t cooperating.

I was already troubleshooting enough problems with Wave this week so I deprioritized the video transcribing for another time.

But I might use that system as a nice break from app development this weekend.

All of this in an effort to appease the mighty algorithms and “build the brand”.

Which hasn’t been bad!

Up to 65 members in the community so far, and YouTube is slowly creeping up to 800 subscribers, with a few comments trickling in during the week for both.

I’ll gladly take that progress for posting one YouTube video a week still.

This was supplemented with 4 live streams, which was 5x my normal YouTube output, but live streams are handled much differently than YouTube videos, so I probably won’t see much growth from live streams for a while.

The real value is turning the live streams into proper YouTube videos.

That’s the goal anyway.

The foundation is there.

Hopefully next week I’ll be talking about having documents full of text pulled straight from live streams.

Video Starter Kit

The one YouTube video I did upload this week was for the n8n video starter kit I previewed last week.

Inspired by n8n’s AI Starter Kit, I wanted something more foundational to what I’ve been doing with this whole “video manipulation” obsession.

It installs self-hosted n8n, Whisper, and ffmpeg, and sets up a simple service that n8n can connect to for word- or segment-based timestamped transcripts, with ffmpeg ready to execute a custom command.
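
To give a flavor of what that unlocks, here’s a hypothetical sketch (the segment field names are made up, not necessarily what the kit’s service returns) of how an n8n Code node could turn a timestamped transcript into a ready-to-run ffmpeg trim command:

// Hypothetical example: find the transcript segment containing a keyword
// and build an ffmpeg command to cut that clip. Field names are assumptions.
interface TranscriptSegment {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

export function clipCommand(
  segments: TranscriptSegment[],
  keyword: string,
  input: string,
  output: string
): string | null {
  const hit = segments.find((s) => s.text.toLowerCase().includes(keyword.toLowerCase()));
  if (!hit) return null;
  // -c copy avoids re-encoding; cuts land on keyframes, which is fine for rough clips.
  return `ffmpeg -ss ${hit.start} -to ${hit.end} -i "${input}" -c copy "${output}"`;
}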

All the stuff I’ve been making videos about for a while now, wrapped in a neater, ready-to-run package with Docker.

I thought it’d be useful, and I’m sure I’ll be building more into it, but figured it’s better to get a video up and follow the “build in public” approach.

Let me know what you think if you end up checking it out!

Maintaining Production

As much as I’ve been questioning vibe coding lately, I’m glad to see it probably wasn’t the cause of the Tea app hack that’s been the hot topic this week.

From my experience, AI has been diligently reminding me about Row Level Security and catches obvious issues like that.

Not exposing private information like driver’s licenses to the public is pretty baked into app security basics, so AI would be well trained on that concept.

I’d have to agree that this sounds more like a careless accident than a vibe-coding error.

Or potentially a migration issue, since it sounds like they stopped using that storage in February 2024 but never moved everything to a more secure bucket.

Even that is not an easy fix.

Flipping from public to private would more than likely break the app, so a complete migration plan spanning a long time would probably be involved.
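
Just to sketch why (assuming a Supabase-style storage API like the one I’m used to, with placeholder project values and a hypothetical bucket, not whatever Tea actually runs on): every place the app builds a public URL has to switch to requesting short-lived signed URLs instead, and any public URLs cached in shipped app versions go dead.

// Sketch only: placeholder project URL/key, hypothetical bucket and path.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient("https://YOUR-PROJECT.supabase.co", "YOUR-ANON-KEY");

export async function getImageUrl(path: string): Promise<string | null> {
  // Old approach, only works while the bucket is public:
  // return supabase.storage.from("uploads").getPublicUrl(path).data.publicUrl;

  // Private-bucket approach: every caller switches to short-lived signed URLs.
  const { data, error } = await supabase.storage
    .from("uploads")
    .createSignedUrl(path, 60 * 60); // valid for one hour
  if (error || !data) return null;
  return data.signedUrl;
}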

Another great point I saw from Gergely pointed out how planning for downtime to handle that situation is probably not the way to go:

A good rule of thumb I forget to live by for estimating timelines in tech is:

  • take the estimate, double it, and add half

That will get you a closer estimate but it’ll still probably be wrong.
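
For a 6-hour estimate, that rule gives something like 6 × 2 + 3 = 15 hours.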

Not surprising to see 6 hours turn into 32 hours.

There are too many cascading effects to make a change and expect to know all the repercussions.

It makes way more sense to think about the app as a live service, and plan for that transition period where not everyone will be using the same version of the app.

This way you’re building redundancies and planning for edge cases and can update in phases as opposed to flipping the switch and changing everything in one session.

Despite the React Native growth opportunities I’ve recently been faced with, I’m pretty comfortable with the production support lifecycle.

It’s stressful at times for sure, but can also be fun when handled well.

Best of luck to all the devs out there managing production crises like these.

And that’s it for this week! Maybe too much vibing, plus debugging, streaming, and maintaining. It’s certainly never boring.

Those are the links that stuck with me throughout the week and a glimpse into what I personally worked on.

If you want to start a newsletter like this on beehiiv and support me in the process, here’s my referral link: https://www.beehiiv.com/?via=jay-peters. Otherwise, let me know what you think at @jaypetersdotdev or email [email protected], I’d love to hear your feedback. Thanks for reading!