Thoughts on Tech News of Note - 04-03-2026
- Google's TurboQuant Algorithm
- Microsoft's New Models
- Reflections on Fifty Years of Apple
Google's TurboQuant Algorithm
For a semi-normal person, trying to understand, much less explain, how Google's new algorithm works isn't as easy as I had hoped. The best way to deal with it is, of course, to oversimplify it, which I shall now do unironically. You can skip the next few paragraphs if you just want to get to the point of it all.
One of the challenges of using large language models (LLMs) is the amount of memory required to process the data needed to respond to prompts. Longer and more detailed prompts generally elicit better and more complex responses from LLMs, so as users have leaned into this, artificial intelligence (AI) companies have expanded the context window for entering prompts. The full prompt has to be held in memory while it is processed, so as more data is entered, more memory and computational power are needed, and eventually things slow down. There is even a name for this issue: the "key-value (KV) cache bottleneck". For a large model, a context window in the low hundreds of thousands of tokens can translate into tens of gigabytes of cached data.
Google went after this bottleneck by finding a novel way to compress the cached data so that it takes an average of 6X less memory while performance increases by up to 8X. Google did this by employing two mathematical frameworks, PolarQuant and Quantized Johnson-Lindenstrauss (QJL), which is why "quant" appears in the name of Google's algorithm. Prompt data is typically stored in memory as vectors, and the standard way to shrink vectors is quantization: compressing the values from decimals to integers. Data is lost in that process, and over time this can lead to hallucinations or incoherent responses. PolarQuant instead converts the vectors into polar coordinates, which use a radius-and-angle system rather than x/y/z Cartesian coordinates. Through the magic of math, when you randomly rotate this data, the distribution of the angles becomes predictable and can be mapped onto a fixed circular grid. This also eliminates the need to store normalization constants, which are usually needed to keep mixed-format prompt data reliable. Some errors still remain after this step, so QJL is used to record just the sign, a +1 or -1, of each error value, which is enough for the model to still decide which words in the prompt are most relevant.
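To make the bottleneck concrete, here's a quick back-of-envelope calculation. The model shape below (80 layers, 8 KV heads, 128-dimensional heads, float16 values) is my own illustrative assumption for a large open model, not a figure from Google's announcement.

```python
# Rough KV cache size for a large model; all shapes here are
# illustrative assumptions, not numbers from Google's paper.
layers, kv_heads, head_dim = 80, 8, 128
bytes_per_value = 2                                             # float16
per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # keys + values
context = 128_000                                               # tokens in the window
total_gb = per_token * context / 1e9
print(f"{per_token / 1024:.0f} KB per token -> {total_gb:.0f} GB at {context:,} tokens")
```

At those assumed shapes, a full window costs about 42 GB of cache before any compression, which is why a 6X reduction is a big deal.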
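And for the code-curious, here is a toy sketch of the two ideas as I understand them: after a random rotation, store each pair of coordinates as a radius plus an angle snapped to a fixed circular grid, then keep only the sign of whatever error remains. Every function name, shape, and bit width here is my own invention for illustration; this is emphatically not Google's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(d):
    """Random orthogonal matrix via QR of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return q

def polar_quantize(vectors, rotation, angle_bits=4):
    """Rotate, pair up coordinates, and store each pair as a radius
    plus an angle index on a fixed circular grid."""
    rotated = vectors @ rotation              # (n, d); d assumed even
    x, y = rotated[:, 0::2], rotated[:, 1::2]
    radius = np.hypot(x, y).astype(np.float16)
    angle = np.arctan2(y, x)                  # in [-pi, pi)
    levels = 2 ** angle_bits
    grid = np.round((angle + np.pi) / (2 * np.pi) * levels) % levels
    return radius, grid.astype(np.uint8)

def dequantize(radius, grid, angle_bits=4):
    """Reconstruct an approximation of the rotated vectors."""
    levels = 2 ** angle_bits
    angle = grid.astype(np.float32) / levels * 2 * np.pi - np.pi
    x = radius.astype(np.float32) * np.cos(angle)
    y = radius.astype(np.float32) * np.sin(angle)
    out = np.empty((radius.shape[0], radius.shape[1] * 2), np.float32)
    out[:, 0::2], out[:, 1::2] = x, y
    return out

# Sign-only residuals in the spirit of QJL: store just the +1/-1
# of each reconstruction error, never its magnitude.
d = 8
keys = rng.normal(size=(4, d)).astype(np.float32)
rot = random_rotation(d)
radius, grid = polar_quantize(keys, rot)
approx = dequantize(radius, grid)
residual_signs = np.sign(keys @ rot - approx).astype(np.int8)
```

The storage win comes from the payload: a low-precision radius and a few bits of angle per coordinate pair, plus one sign bit per coordinate, instead of full-precision floats everywhere, and no separate normalization constants to carry around.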
Yeah, that was a lot, I know. Now you probably have images of spiraling circles in your head. Fortunately, understanding the mathematical details of how all of this works is less important than understanding the resulting impact: local processing can now take less memory and finish faster. And because Google released this under an open framework, the approach can be used by other companies, and developers are already putting it to work. The performance improvement is perhaps most keenly needed on modest hardware, which describes most consumer-grade computing products. The ability to run AI models locally on phones, tablets, and laptops means normal people will be able to get more work done on their existing machines without having to offload as much processing to the cloud. Developers who have begun using the technology report positive results: significant reductions in KV cache size with no loss of accuracy. This means users can now theoretically carry on longer and more complex conversations with models without running into as many issues with data loss and computational errors.
Although TurboQuant won't do anything to minimize the memory needed to train LLMs, its impact on the inference side means the memory crunch for end users could be relieved somewhat. Google's announcement immediately caused a dip in the stock prices of memory manufacturers due to the theoretical reduction in demand for the high-bandwidth memory (HBM) that is in short supply right now. It remains to be seen whether this one innovation can truly end the memory crunch, but anything that reduces demand for HBM is probably a good thing. Companies running LLMs may also be able to reduce their spend on graphics processing units (GPUs), which could further ease HBM demand. Still, demand on the inference side remains high: as AI tools become more useful and popular, more people will use them, and they will want to use them for increasingly complex tasks. We need breakthroughs on the training side as well to reduce the need for all those power-hungry GPUs running in racks in data centers. But this change could perhaps buy us some time for someone to get us to that next step.
Microsoft's New Models
Microsoft recently renegotiated their contract with OpenAI so that they can pursue their own internally developed AI models, and this week they introduced three to the world. They are all branded "MAI", which stands for Microsoft AI. MAI Transcribe is a speech-to-text model that Microsoft says outperforms competing models at transcription tasks in up to 25 languages. MAI Voice is a text-to-speech model that can generate 60 seconds of audio in only one second and can build voice-matching models from only a few seconds of sampled audio. And MAI Image 2 is an image-generation model designed to produce photorealistic images with improved text rendering; Microsoft says it is incorporating Image 2 directly into Bing and PowerPoint. Microsoft also says its models are more efficient and were trained on less data than competing models from Anthropic and OpenAI, and it is pricing usage to be attractive to users who might otherwise consider those competitors.
The real story here is not so much the models. They do look to be competitive, and Microsoft will surely build them into more of its own products so that more people are exposed to them and hopefully become sticky customers willing to pay for additional services and features. What is more interesting is Microsoft's not-so-slow pivot away from OpenAI. Microsoft retains a license to use OpenAI's products for several more years, but this isn't just a hedge against a potentially shaky future (i.e., demise) for OpenAI, as many are prophesying. It shows an intent to become a player in this space and to start taking market share from competitors. Mustafa Suleyman, Microsoft's AI CEO, has stated that the goal is to develop state-of-the-art AI capabilities by 2027, and he leads the group at Microsoft tasked with creating superintelligent tools that will move the company toward artificial general intelligence (AGI). You don't establish a group inside Microsoft with those goals without the express intent and desire to become completely independent from an AI perspective. Microsoft is aiming to compete not just with OpenAI but with all of the major AI companies, and because they, like Google, have a big, profitable business behind them, they are perhaps better positioned than most to make it happen.
Reflections on Fifty Years of Apple
Outside of iPhones and iPads, I have managed to live my entire life without ever owning an Apple computer. My early years were forged on the Commodore 64 at home and various terrible Tandy machines at school. I did have some exposure to Apple II machines at school; I have fond memories of playing Oregon Trail and frequently dying of dysentery. After the Commodore 64, our household moved to IBM clones and never looked back. I learned how to tweak autoexec.bat so my games had as much memory as possible. I adored text adventure games from Infocom because, as an avid reader, I found them fun to play, and they didn't require as many floppies. As games added graphics, I also grew to love titles from Sierra and Activision (which acquired Infocom after people stopped playing text-only games and Infocom's forays into business software flopped). My first programming trysts with BASIC happened on an IBM clone.
I knew Apple computers existed; I was a faithful reader of Computer Shopper and PC Magazine, so I knew what was out there. I thought the PowerBook was a beautiful laptop. But when I was old enough to buy my first laptop, I went with a custom build running Windows, because it was the golden age of design-it-yourself computers and I loved the idea of getting exactly what I wanted. I wouldn't have any real experience with an Apple product again until the iPhone debuted. And even then, I wasn't initially sold on it. I was a PDA and stylus devotee; I'd used the Sony Clié and HP iPAQs, and lusted after the Sharp Zaurus. I'd even imported a pen-enabled Samsung Windows CE machine with a keyboard cover. When the iPhone came out, I was using Windows Mobile and Nokia devices, and even my Windows laptops were pen-enabled.
I remember well when I finally decided to try an iPhone - because I was a programmer by day and thought maybe I could build an app or two - and went to the Apple Store to buy an iPhone 3G. The salesman noticed I had a Nokia N95 and told me the phone I had was better than the iPhone; I told him I was aware. I used that iPhone until the 3GS came out, which I also bought. After that, I gave up on the iPhone completely and haven't owned one since. But I would go on to buy an iPod Touch so I could keep an eye on what was happening on the other side of the tech world, because I liked knowing where things were going.
When the iPad came out, I jumped in and bought one nearly every year until the original iPad Pro, which I bought but eventually sold, disappointed in its frailties as a pen-enabled tablet. I didn't buy anything else with an Apple logo on it until 2022, when I bought a 1TB 12.9" M1 iPad Pro, thinking that with Stage Manager I might finally be able to use it for work and maybe some music production. I was mostly wrong about that; there were still so many weird quirks in iPadOS and so many limitations in the Safari browser. But I still have that iPad to this day. I even bought the Magic Keyboard for it, which is where it lives full-time.
Somewhere along the way, I realized I really wasn't an Apple person. In some ways, I do fit the mold: I'm a musician and songwriter who uses digital audio workstation software, music transcription software, and sheet-music-reading software. I'm a firm believer in the benefits of a good stylus and handwritten notes. I'm also a typical early adopter who loves to play with new gadgets and values having experiences across as many platforms and systems as possible (never ask me about my crazy years of triple-booting operating systems on my personal laptop). I like my hardware to be aesthetically pleasing and well built, and I value buying from a company I can trust to stand behind its products. In many ways, that's almost a textbook description of the perfect Apple customer. And I probably would have been happy with a MacBook had I ever allowed myself to get past the initial stress of relearning how to use a computer. But I could never get into the swing of using iPhones and iPads, despite understanding very well how to use them. They were beautiful machines, and the apps are undoubtedly better and prettier than what is available even now for Android. But I always felt constrained. In the early days of the iPhone, I jailbroke mine and ran whatever I wanted, and that was acceptable. But when I wanted to use it for work email, I couldn't run it jailbroken anymore. That's when I moved to Android and never looked back. I found that even though things weren't as pretty, I could do pretty much anything I wanted. Apple has gradually opened up over the years, but some things remain frustratingly locked down. I hate that seemingly only the European Union can get them to open things up.
Yet I'm so very glad Apple has been here making beautiful products that mostly "just work". They have pushed every phone, tablet, and PC manufacturer to be better. They have encouraged developers to make apps that don't just function but look good and perform well. Apple popularized the idea of making music and art on mobile devices. They forced the music industry to bend to their will in many ways and are arguably single-handedly responsible for the explosion of the Bluetooth audio market (even if some of us are still a little bummed at how that has turned out). It's because of their efforts with cameras that almost no one who isn't a photographer carries a dedicated camera anymore, yet people are taking more photos than ever before. Even the popularity of podcasts can be attributed in part to Apple making them easy to access. We owe Apple a lot.
I want to see Apple re-embrace some of the moxie and intrigue of those early years and keep pushing the envelope. Lately, Apple has mainly seemed to be coasting, but I am hopeful that its future entrance into the foldable-device and touchscreen-laptop markets will propel all of us into the next wave of mobile devices - and inspire new generations to imagine, build, and reshape the digital and tangible worlds they want to see.