ZD 24.41: Bringing AI Home for the Holidays
OpenAI and Google are having a generous holiday season.
In this issue: Swapping mainstream and fringe, Pentagon theses, your AI Assistant Kit, and the Christmas AI Quarterly.
The Distilled Spirit
A word from our host: Happy Holidays! I really appreciate everyone taking the time to read this week in and week out.
📢 As an experiment I have added an AI-generated voiceover for part of the article. It only covers the AI Quarterly; the Distilled Spirit did not quite sound right. If you like it, hit the ♥. If you hate it or have anything to share, please let me know in the comments!
Thanks for sharing a bit of your time with me! Merry Christmas! Now to continue with our regularly scheduled program.
😎 Mainstream-Fringe Role Swap
According to my NotebookLM reviewer, the sea change in the role of media has been a recurring theme in this publication. Streamers are driving culture; the old media is just following their lead. Will the new wave of AI tools empower creators to keep inverting this relationship?
🚀 Pentagon Reformation Conversation Starter
Palantir CTO Shyam Sankar proposes a series of reforms to the defense procurement process and industry to prepare us for conflict with China, now a peer adversary. He sees the current defense market as a rotting monopsony, one that has led to a failure of our processes and the capture of our agencies. The piece is clearly self-serving coming from a defense startup, but the concept is solid and goes beyond enlightened self-interest into an honest assessment of our decaying defense industry.
🤖 Your AI Assistant Kit
Two venerable publications in the AI space team up to give you a curated list of a dozen AI tools to enhance your day. It pairs well with OpenAI’s 12 Days of Shipmas. More on that below the break.
AI Quarterly: All I Wanted For Christmas was More Foundation Model Options
It has been a breathtaking few weeks in the AI world. OpenAI ran its 12 Days of Shipmas. Google announced a quantum computing breakthrough, humanoid robots, and a new thinking model. In this edition of the Zeitgeist Distilled AI Quarterly we are going to explore these announcements and then conclude with a short guide on what it all means and which tool you should be using for what. Happy holidays!
Ranking the 12 Days of Shipmas
Over the course of 2024 OpenAI demonstrated a lot of very compelling features but, until very recently, had launched very few of them, especially headliners like Apple integration and advanced vision. OpenAI changed their tune and hosted a dozen holiday-themed YouTube Live sessions where they shipped a dozen features. Here is my ranking of those announcements based on their utility to me over the next six to twelve months. Your mileage may vary.
Search (Day 8) looks amazing. Search had been in beta, and I had been trying to like it for a while without quite getting there. This version fixes a lot of the flaws. Most importantly, it now seems to get the smaller stuff right — my “what soccer matches are on today” question works well with the current version, where a few weeks back it would struggle and grab random matches from the month. I wonder if OpenAI will be able to keep this clean and good while still monetizing it. On the other hand, I think Google should be a bit worried about this one: it is a better mousetrap, especially given how useless Google results have been getting of late.
Advanced Voice and Video (Day 6) are finally here! What do they mean by advanced? Share your screen or your camera output for real-time, live interaction. ChatGPT can now see and hear. This is really fun for your holiday party tricks, and probably very useful in a lot of real-world contexts. It is the kind of thing you need to try: turn on advanced voice mode and hit the camera button today, and you will be amazed.
Since inception, ChatGPT has had no provision for any sort of in-app organization. That has finally changed. OpenAI has created Projects (Day 7) to solve this. A project can act as a pretty dumb folder, just containing chats for easy reference, or you can load it with files and instructions, not unlike a custom GPT. At this time there does not appear to be a way to share a project, which limits its utility but is hopefully something easily addressed in future updates.
I love ChatGPT Canvas. It is a key writing tool for me, allowing me to integrate AI into my work with ease. The updates to ChatGPT Canvas (Day 4) are therefore quite exciting. If you have been using Canvas there is not a lot of material change, which is good. Let’s not break a great thing. But the integration with more of the tools and the free-tier rollout is nice. I also love the custom GPT integration; that was a bit of a stumbling block for me, as I employ a few of them in writing and getting them to work with Canvas was a challenge.
I am on the fence about OpenAI o3 and o3-mini (Day 12). I’m putting it higher up the list than I should because lots of folks seem to think it really is a big step towards AGI. It is also almost here — if OpenAI is to be believed it is landing in January. Usage will be very expensive, purportedly running in the hundreds of dollars a question in compute costs. We will see, but this one is largely for thee and not for me.
Developer Day (Day 9) had some important additions. WebRTC support is pretty nifty, and probably necessary for advanced voice and video, so why not open it up? OpenAI o1 API support is going to let lots of apps leverage the deeper thinking model to great effect. It is not something many will use directly, but those who do will be very effective force multipliers.
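For the curious, here is a minimal sketch of what that o1 API access looks like using OpenAI’s official Python SDK. It assumes the openai package (v1.x) is installed and an API key is set in your environment; the prompt is purely illustrative, not a recipe.

```python
# Minimal sketch: calling the o1 reasoning model through OpenAI's Python SDK.
# Assumes the `openai` package (v1.x) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1",  # the reasoning model exposed via the API
    messages=[
        {
            "role": "user",
            "content": "Plan a step-by-step migration of a billing service to a new database.",
        }
    ],
)

print(response.choices[0].message.content)
```

Swapping in o1-mini is the obvious cost-versus-depth knob for most apps that do not need the deepest reasoning on every call.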
Apple Intelligence Integration (Day 5) has been long promised. This was not particularly important to me — I have not really found a use case for Siri or Apple Intelligence, and I am not that invested in the Apple ecosystem myself. At the end of the day this is really only as good as search already is on the platform. The demo looked great, but the real world is messy. And we have not yet seen monetization, which is the downfall of many great user experiences.
On Day 11, OpenAI announced Desktop App Integration. It enables a handful of apps on your Mac to work with ChatGPT’s desktop app directly. It is a deeper integration than just a screen scrape; apps can expose data to the tool. It is still very limited by platform and app, but Windows support is promised soon. Notion and Apple Notes look pretty handy at the moment, and this gets really interesting once there is two-way Canvas integration with browsers, Word, and Excel.
OpenAI kicked off the week by announcing the $200-per-month ChatGPT Pro (Day 1). The main selling points at this level are a new deeper-thinking version of the OpenAI o1 model and much better Sora access. Rumor has it this version can handle a few PhD-level questions, but for mere mortals who don’t need those sorts of questions answered it is a pass. On the video side, $200 a month buys a lot of Runway credits. Was this just a stunt to prepare the world for o3 pricing?
Reinforcement Fine Tuning (Day 2) — this is probably important in the right circles, but I am not quite in those circles to truly appreciate it. I suspect it will make many things that use OpenAI as a back end better over time.
Sora (Day 3) got a lot of headlines because, hey, video is cool after all. It just is not that interesting. Months ago, when it was first announced, video was very cutting edge. Tools like Runway and Pika have long since passed Sora’s fairly limited public feature set, and Veo and others in the field are making advances. To OpenAI’s credit, they have addressed one of my big complaints: they have opened up Sora to ChatGPT Teams subscribers. But it is still hardly exciting, except in a “now a lot of folks can touch not-so-great-nor-useful AI-generated video” sort of way.
I appreciate the concept. Accessibility is a wonderful thing, and this is a really cool thing to come out of lab day. For most folks, though, 1-800-ChatGPT (Day 10) isn’t really useful. It could be a fun party trick, or pretty cool in an older vehicle. The interesting takeaway is that they believe enough in the product to think it can power a pure-voice phone experience. And it was probably the most interesting and fun video shoot they did all week. But it is a demo day stunt, not needle moving . . . unless the WhatsApp bit takes off.
Overall, there were a few big advances buried in here, but mostly this was really good, orderly progress, well presented and dressed up in ugly holiday sweaters.
Google Goes Big and Hits
It has been said that “Google's 12 days of shipping are more exciting and they skip days.” Whoever said it is spot on — Google dropped the mic. Here are the highlights of what they released:
Gemini 2.0 is a massive update to the already not-so-shabby Gemini 1.5. Overall it feels very, very on par with ChatGPT 4o. Gemini 2.0 Advanced, their new thinking model, is equivalent to o1. The arms race is heating up.
Google is multimodal. Veo 2 and Imagen 3 are state-of-the-art video and image generation tools. Imagen 3 feels a lot, lot better than DALL-E 3 for some tasks. I have not yet seen Veo 2 — the waitlist is so long they are not adding to it at this time. It is, however, getting rave reviews from technologists who have seen it.
One amazing feature Google released was a Gemini 1.5-based Deep Research model. It handles the “give it a well-thought-out question and get a good, well-researched answer” scenario that is fairly common in some workflows. I have been really impressed by this one.
NotebookLM is an amazing tool. It has moved out of the lab: it can now be bought on your Google Enterprise account and will be included in your Google One AI Premium subscription if not. They also updated the interface and added the ability to interact with the audio feed. It will be interesting to see what direction this takes now that the product team has moved on. Hopefully Google will not ruin it.
Google unveiled Willow, a new quantum chip that cracked a key challenge in quantum error correction, bringing the next wave of computing much closer to reality. AI is a compute-bound task at the moment, and quantum will radically change the economics of compute.
Google DeepMind partnered with Apptronik to build humanoid robots using Gemini 2.0. We are living in the future.
For a deep dive see The Zvi’s excellent writeup on the updated platform.
What Should I Spend my $20 on in December 2024?
We have a few great tools to choose from, and it is hard to go horribly wrong if you stick to the major players: ChatGPT, Claude, Perplexity, or Gemini. Any of the paid versions of those tools is safe and functional.
Unfortunately, most of us cannot quite write off $80 or more worth of AI services a month, so you need to choose. If you need to pick just one, the correct general choice for most people is ChatGPT Plus. That SKU has all the bells and whistles on the platform short of the specialized pro-level tools. OpenAI still moves a bit faster in terms of getting features into users’ hands. Canvas is a wonderful thing that does not quite have an equivalent elsewhere. Perplexity has some better search features, but ChatGPT search is pretty good and getting better. Claude has a bigger context window, but it cannot actually draw if you need a picture.
The one case where I might think differently is if I were deeply embedded in the Android or Google ecosystem. In that case, Gemini really starts to make sense. The recent updates have caught it up to ChatGPT for the most part, and if you are on Android devices, where Google’s integration really shines, one might even tilt towards Gemini over GPT. The models themselves are good, and they have some features that OpenAI has not yet deployed, particularly the Deep Research tool, which does an amazing job if you give it the right task. Google’s AI plan also gets you a premium version of NotebookLM, a tool OpenAI does not quite match one for one. The platform is much more worth considering today than it was a few months ago. I am going to keep an eye on this one going forward.
AI at the End of ‘24
As a consumer, it is a wonderful time. At the end of 2024 we are at a point in AI where it is hard to go wrong. We all have access to very, very good multimodal AI tools for nothing or next to nothing. The free versions of ChatGPT, Claude, Gemini, and Perplexity are now very useful out of the box. The barrier to entry to mechanized and automated intelligence is shrinking by the day. As Ethan Mollick puts it, smart AIs are now everywhere.
The next wave looks even more powerful. OpenAI o1 is slow and pricey; o3 is hideously expensive. Intelligence was once the bottleneck; now you will have to spend time figuring out the right sort of intelligence to deploy for a given task rather than wondering whether AI can do it at all. If you are not yet thinking about how to automate your intelligence processes, now is probably a good time to start.
What is the thing that excites you most about AI in 2025? What is the thing that scares you the most? Let us know in the comments.
The Look
Happy Holidays! Enjoy the EDM light show in support of a great cause. Please consider giving a little back this season!
Did you enjoy reading this post? Hit the ♥ button above or below because it helps more people discover great Substacks like this one and it helps train your algorithm to get you more posts you like. Please share here or in your networks to help us grow!