Sora, Pro Mode, and Willow
The Weekly Variable
Too many things to cover this week, with Shipmas in full swing, access to Sora and o1, and Google coming through with a big announcement in quantum computing.
Here’s the short list:
Another Finally!
The internet predictions were correct!
Monday, OpenAI released Sora to the public.
But… sign-ups crashed and I wasn’t able to create an account until this morning.
So unfortunately I haven’t had a ton of time to generate anything fun, but I did lose at least 20 minutes just scrolling through what others have generated.
Some truly wild ideas out there, and it’s unbelievable how you can submit a prompt and, within a few minutes, potentially have a realistic-looking video pop out.
Like an assembly line of videos…
As previously mentioned, I’m sure this Sora release was motivated by the growing number of video generation models available to the public now, including the Hunyuan model surprise last week.
I’m sure the model is nowhere near what OpenAI would like it to be in terms of output quality, but it’s still in an incredible place.
I’ve already seen a few videos in the Featured section where it was hard to tell they weren’t real.
If I had seen them on social media, I would have believed it, at least at first glance.
Looking forward to spending more time generating my own attempts to see what it can do.
Insane to think you can create realistic-looking video in a few seconds for $20 a month.
Pro Mode
If you’re up for spending a little more than $20, say $200 a month, you can get the ChatGPT Pro membership which gives you access to o1 Pro Mode.
I was debating signing up last week after it was announced, and I waited all of one day before giving in.
One tweet from @McKayWrigley and I was pretty much convinced on the spot.
OpenAI o1 pro is *significantly* better than I anticipated.
This is the 1st time a model’s come out and been so good that it kind of shocked me.
I screenshotted Coinbase and had 4 popular models write code to clone it in 1 shot.
Guess which was o1 pro.
— Mckay Wrigley (@mckaywrigley)
5:44 PM • Dec 6, 2024
When I first looked at this post, I thought the image in the top left was a screenshot of Coinbase.
The other 3 images looked much more like what I expected from AI generated dashboards.
It wasn’t until I scrolled down the post thread that I saw the actual screenshot of what he prompted:
Seeing that, I involuntarily said “wow” out loud to myself.
The fact that o1 was able to create something that close to the original from one screenshot and a single-line prompt is an amazing leap forward.
I’ve tried to generate dashboards before, but the results always turn out much closer to the other three images: basic and acceptable.
o1 is next level in terms of code generation.
Getting closer to that senior level engineer with each release.
After signing up, I even had a chance to test it out myself.
I was doing a little app development and ran into a persistent issue where a default image kept showing during the splash screen when you first open the app.
I replaced all the images, cleaned and rebuilt the whole app and all its data multiple times, and spent at least two hours going back and forth with Sonnet 3.5, Perplexity, and o1-mini before giving o1 pro a shot.
Admittedly, the first time around it gave me the same recommendations as the other models, but when I told it that I had tried everything else and was still stuck with the same issue, it spit out much deeper answers than anything else, suggesting I delete cached files stored in a separate Xcode folder on my Mac, outside the scope of the project I was working on.
Deleting those files fixed the issue first try.
I was really impressed; I had been going in circles with the other AIs, and o1 pro was able to reason its way to a much more thorough answer.
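For anyone hitting something similar, here’s a minimal Python sketch of clearing Xcode’s DerivedData cache, which is the usual spot for stale build artifacts that live outside the project folder. This is an assumption about which cache was at fault in my case, not a guaranteed fix:

```python
# Minimal sketch: clear Xcode's cached build products that live outside the project.
# Assumption: the stale files are in DerivedData, Xcode's usual cache location;
# the exact folder o1 pro pointed at could differ on your machine.
import shutil
from pathlib import Path

derived_data = Path.home() / "Library" / "Developer" / "Xcode" / "DerivedData"

if derived_data.exists():
    for entry in derived_data.iterdir():
        # Each subfolder holds one project's cached build output.
        print(f"Removing {entry}")
        shutil.rmtree(entry, ignore_errors=True)
else:
    print("No DerivedData folder found; nothing to clean.")
```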
I’m looking forward to really pushing o1 pro to see what it can do, and thankfully with the Pro membership, I should get unlimited prompting so I can prompt my money’s worth quickly.
Slow Flow
I’m sure this will change soon, but the only downside to o1 pro mode is that it’s locked down to just the ChatGPT interface.
It’s not API accessible which is a bummer.
Once o1 pro is in Cursor, the code will fly.
McKay Wrigley, once again, has been pushing the bounds of o1 pro, and posted a video outlining his workaround to this problem, but it’s not exactly convenient.
He uses a separate app to tokenize his code repo so that it can all be broken down and added to a prompt for o1, saving the effort of copying and pasting files over and over.
Then he uses a specific prompt to format the o1 output into an XML template so he can use another custom app to extract the changes recommended by o1 and apply them back to the codebase.
It’s really pretty impressive, and a great solution to an immediate problem.
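Not his actual tools, but a minimal Python sketch of the shape of that flow; the packed-prompt format and the `<file path="...">` XML schema here are placeholder assumptions of mine:

```python
# Rough sketch of the "o1 pro without an API" workflow described above.
# Assumptions: the prompt-packing format and the XML response schema are made up
# for illustration; they are not McKay Wrigley's actual tooling.
from pathlib import Path
import xml.etree.ElementTree as ET

def pack_repo(repo_dir: str, extensions=(".swift", ".py", ".ts")) -> str:
    """Concatenate source files into one big prompt body for pasting into ChatGPT."""
    chunks = []
    for path in sorted(Path(repo_dir).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            chunks.append(f"=== {path} ===\n{path.read_text()}")
    return "\n\n".join(chunks)

def apply_response(xml_text: str) -> None:
    """Parse a response like <changes><file path="a.py">new code</file></changes>
    and write each file back to disk."""
    root = ET.fromstring(xml_text)
    for file_node in root.findall("file"):
        target = Path(file_node.get("path"))
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(file_node.text or "")
        print(f"Updated {target}")

if __name__ == "__main__":
    prompt = pack_repo(".")  # paste this into the o1 pro chat, along with your instructions
    print(f"Prompt is roughly {len(prompt) // 4} tokens (crude 4-chars-per-token estimate)")
```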
Hopefully o1 pro mode will be API accessible soon so it can plug straight into Cursor, making this whole flow much easier.
Or maybe with the next update, it will just do all the coding for you and there will no longer be a need for IDEs or programmers.
I’ll wait for the API update first though 🤞
HeyGen In Action
HeyGen is another AI video tool I’ve seen floating around and referenced a few times.
I haven’t committed to trying it yet, but it piqued my interest this week.
While watching a Devin Nash live stream, I saw someone from his community post a video of themselves that was completely AI-generated.
I haven’t used HeyGen myself, but I was guessing the process was really involved to train a video model on yourself, assuming HeyGen would push that you use their pre-trained actors for videos, which I’ve seen a few times now.
But this guy said he recorded a two-minute video and was able to create a realistic-looking and sounding video of himself right after.
Then he was able to have himself speak in Chinese instead.
Really incredible.
I can already see a use for this in creating customized intro videos for business agreements, like automatically generating an individualized video greeting for a new client.
At $100 per month for the service, it really seems reasonable, but all these AI subscriptions are starting to add up…
Devin’s reaction isn’t too far off from my own in the video below:
Willow
I haven’t given up on my aspirations for the next big thing:
Quantum Computing
I just recognize that I may have a few years before the technology really breaks into real world applications, but it looks like it might have gotten a lot closer this week.
Google announced their newest Quantum Computing chip, Willow.
Right now the biggest problem for quantum computing is scaling.
It’s incredibly difficult to keep entangled qubits from interacting with anything in the real world, whether it’s the air conditioner turning on, or someone sneezing in another room.
Because of this, the error rate for quantum processing is often unusably high.
The more qubits used for processing, the more chances of them not working as intended.
But Google says this chip reliably stabilizes qubits and actually reduces errors as you add more qubits for processing.
This could be a huge breakthrough for quantum computing.
Given this new stability, they said they were able to run a standard benchmark test that would normally have taken a top-performing supercomputer 10^25, or 10,000,000,000,000,000,000,000,000, years to process.
Willow processed the answer in less than 5 minutes.
It can do this because quantum computers process in a superposition of 0 and 1: effectively 0, 1, and everything in between, all at the same time.
Traditional computers are stuck with either 0 or 1 but not both or anything in between.
Today’s processors consist of billions of transistors so that’s billions of 0 or 1 in parallel, but that still reaches limits pretty quickly depending on the complexity of the problem.
While a traditional computer might scale at something like O(n^3) on a complicated enough problem, a quantum computer can scale at O(n): for n = 10, that’s 1,000 steps for one and only 10 for the other. That kind of gap is what leads to unfathomably different processing times like 10^25 years versus 5 minutes.
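To make that gap concrete, here’s a quick back-of-the-envelope comparison of hypothetical step counts; this is just arithmetic on toy numbers, not a real quantum benchmark:

```python
# Toy illustration of how fast O(n^3) pulls away from O(n).
# Hypothetical step counts only; not a measurement of any actual hardware.
for n in (10, 100, 1_000, 10_000):
    classical_steps = n ** 3   # a classical algorithm that scales cubically
    quantum_steps = n          # a hypothetical linear-scaling quantum approach
    print(f"n={n:>6}: classical {classical_steps:>16,} steps vs quantum {quantum_steps:>6,} steps")
```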
It’s a super exciting announcement, and I’ll be anxious to see more develop from it.
I may have to switch from AI to Quantum sooner than expected.
And that’s it for this week! AI is full steam ahead, and quantum computing is trying to enter the mix. Always too much to cover.
Those are the links that stuck with me throughout the week and a glimpse into what I personally worked on.
If you want to start a newsletter like this on beehiiv and support me in the process, here’s my referral link: https://www.beehiiv.com/?via=jay-peters. Otherwise, let me know what you think at @jaypetersdotdev or email [email protected], I’d love to hear your feedback. Thanks for reading!