01
Microsoft Cancels Internal Claude Code Licenses After Burning a Year of AI Budget in Months
industry windowscentral.com
- Affects engineers across Experiences & Devices, the unit that builds Windows, Microsoft 365, Teams, Outlook and Surface; they must migrate to GitHub Copilot CLI by June 30.
- Cause is unit economics, not quality: token-based billing consumed the full annual AI budget in months because engineers used Claude Code constantly.
- Hit HN at #2 with 418 points and 398 comments; commenters treat it as proof that enterprise coding-AI pricing doesn't work at current token rates, not that Copilot won on merit.
- Echoed by Uber's CTO, who disclosed the same pattern: the full 2026 AI budget gone in four months.

02
Gemini Spark Goes Live for Every US Google AI Ultra Subscriber
tools 9to5google.com
- Quiet GA rollout May 29, ten days after the I/O 2026 unveil: Google's first always-on personal agent, running on Antigravity in the cloud whether your devices are on or off.
- Connects Gmail, Calendar, Docs and Sheets via MCP; you can schedule recurring tasks like 'audit my card for forgotten subscriptions' or 'summarize the school inbox daily'.
- Beta in name only: Spark is the headline feature of the $100/month Ultra tier, and now everyone with that plan in the US has it.
- Payments authorization, the obvious next step, is deliberately not live yet.

03
Microsoft Plans to Unveil Homegrown Coding Model at Build Tomorrow
models reuters.com
- Reuters/The Information: Microsoft will announce an in-house coding model, plus reasoning, transcription, speech and image models, at Build 2026 in SF on June 2–3.
- First major model push from Mustafa Suleyman's Microsoft AI team since the April OpenAI deal renegotiation freed it to train frontier-tier models.
- Aimed squarely at boosting GitHub Copilot, which has visibly lost ground to Claude Code with developers, including, until last week, Microsoft's own.
- Microsoft shares rose ~3% on the report; keynote streams free at 9:30am PT June 2.

04
CNN Sues Perplexity Over 17,000 Scraped Articles, Photos and Videos
industry cnn.com
- Filed in SDNY May 28: first AI copyright suit from any US TV network, alleging Perplexity outputs are 'identical or substantially similar' to CNN originals.
- Seeks unspecified damages and an injunction blocking further use of CNN content.
- Perplexity's response in full: 'You can't copyright facts.'
- Joins active suits from the New York Times, Reddit and Dow Jones; Perplexity is now the most-sued AI search product.

05
Viral HN Essay: 'Is AI Causing a Repeat of Frontend's Lost Decade?'
community news.ycombinator.com
- Front-page essay drawing the parallel between the React-era erosion of frontend craft and what AI assistants are doing to general software expertise.
- Argument: AI writes plausible code fast enough that developers stop questioning whether the boilerplate, the architecture, or the abstractions are needed at all.
- Nearly 200 HN comments; the most-upvoted threads are senior engineers describing juniors who can ship features but can't debug what they shipped.
- Pairs neatly with this week's Microsoft/Uber budget-blowout stories: AI is simultaneously too expensive and too deskilling, and the industry is arguing about both at once.

06
Show HN: Tiny-vLLM, an LLM Inference Engine Built from Scratch in C++/CUDA
open-source github.com
- Jędrzej Maczan's deliberately small sibling to vLLM: a teaching codebase that loads Llama 3.2 1B from safetensors and runs a full GPU forward pass.
- Implements FlashAttention-style softmax, PagedAttention, plus both static and continuous batching.
- Ships with a free C++/CUDA course walking through every kernel, scheduler decision and piece of math.
- Aimed at developers who want to understand how serving stacks actually work, not deploy another one.
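
For readers curious what the "FlashAttention-style softmax" in the Tiny-vLLM item refers to, here is a minimal Python sketch of the general idea: computing a softmax one block at a time while keeping only a running maximum and a running sum. This is not Tiny-vLLM's code (that project is C++/CUDA); the function name `online_softmax` and the `block_size` parameter are hypothetical and chosen only for illustration.

```python
import math


def online_softmax(scores, block_size=4):
    """Softmax over `scores`, accumulated block by block.

    Illustrative only: shows the running-max / running-sum rescaling
    used by FlashAttention-style kernels, not any project's real kernel.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for start in range(0, len(scores), block_size):
        block = scores[start:start + block_size]
        new_max = max(running_max, max(block))
        # Rescale the sum accumulated so far to the new maximum,
        # then add this block's contribution.
        running_sum = running_sum * math.exp(running_max - new_max) + sum(
            math.exp(s - new_max) for s in block
        )
        running_max = new_max
    # Final normalisation uses the global max and the global sum.
    return [math.exp(s - running_max) / running_sum for s in scores]


if __name__ == "__main__":
    print(online_softmax([1.0, 2.0, 3.0, 4.0, 5.0], block_size=2))
```

The point of the rescaling step is that the full row of scores never has to be held in fast memory at once; a real kernel applies the same update per tile of the attention matrix and folds it into the weighted sum over values.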