AI DAILY / DEV
MONDAY
June 1, 2026

    Microsoft Cancels Internal Claude Code Licenses After Burning a Year of AI Budget in Months

    • Affects engineers across Experiences & Devices — the unit that builds Windows, Microsoft 365, Teams, Outlook and Surface; migration to GitHub Copilot CLI by June 30.
    • Cause is unit economics, not quality: token-based billing consumed the full annual AI budget in months because engineers used Claude Code constantly.
    • Hits HN at #2 with 418 points and 398 comments — the comments treat it as proof that enterprise coding-AI pricing doesn't work at current token rates, not that Copilot won on merit.
    • Uber's CTO disclosed the same pattern: the company's full 2026 AI budget was gone in four months.
    industry windowscentral.com

    Gemini Spark Goes Live for Every US Google AI Ultra Subscriber

    • Quiet GA rollout May 29, ten days after the I/O 2026 unveil — Google's first always-on personal agent, running on Antigravity in the cloud whether your devices are on or off.
    • Connects Gmail, Calendar, Docs and Sheets via MCP; you can schedule recurring tasks like 'audit my card for forgotten subscriptions' or 'summarize the school inbox daily'.
    • Beta in name only — Spark is the headline feature of the $100/month Ultra tier and now everyone with that plan in the US has it.
    • Payments authorization, the obvious next step, is deliberately not live yet.
    tools 9to5google.com

    Microsoft Plans to Unveil Homegrown Coding Model at Build Tomorrow

    • Reuters/The Information: Microsoft will announce an in-house coding model — plus reasoning, transcription, speech and image models — at Build 2026 in SF on June 2–3.
    • First major model push from Mustafa Suleyman's Microsoft AI team since the April OpenAI deal renegotiation freed it to train frontier-tier models.
    • Aimed squarely at boosting GitHub Copilot, which has visibly lost ground to Claude Code with developers — including, until last week, Microsoft's own.
    • Microsoft shares rose ~3% on the report; keynote streams free at 9:30am PT June 2.
    models reuters.com

    CNN Sues Perplexity Over 17,000 Scraped Articles, Photos and Videos

    • Filed in SDNY May 28: first AI copyright suit from any US TV network, alleging Perplexity outputs are 'identical or substantially similar' to CNN originals.
    • Seeks unspecified damages and an injunction blocking further use of CNN content.
    • Perplexity's response in full: 'You can't copyright facts.'
    • Joins active suits from the New York Times, Reddit and Dow Jones — Perplexity is now the most-sued AI search product.
    industry cnn.com

    Viral HN Essay: 'Is AI Causing a Repeat of Frontend's Lost Decade?'

    • Front-page essay drawing the parallel between the React-era erosion of frontend craft and what AI assistants are doing to general software expertise.
    • Argument: AI writes plausible code fast enough that developers stop questioning whether the boilerplate, the architecture, or the abstractions are needed at all.
    • Nearly 200 HN comments; the most-upvoted threads are senior engineers describing juniors who can ship features but can't debug what they shipped.
    • Pairs neatly with this week's Microsoft/Uber budget-blowout stories: AI is at once too expensive and a deskilling force, and the industry is arguing about both at once.
    community news.ycombinator.com

    Show HN: Tiny-vLLM, an LLM Inference Engine Built from Scratch in C++/CUDA

    • Jędrzej Maczan's deliberately small sibling to vLLM: a teaching codebase that loads Llama 3.2 1B from safetensors and runs a full GPU forward pass.
    • Implements FlashAttention-style softmax and PagedAttention, along with both static and continuous batching.
    • Ships with a free C++/CUDA course walking through every kernel, scheduler decision and piece of math.
    • Aimed at developers who want to understand how serving stacks actually work, not deploy another one.
    open-source github.com
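
For readers wondering what "FlashAttention-style softmax" means in practice, here is a minimal sketch of the underlying idea, often called online softmax: process the attention scores one block at a time while keeping a running maximum and a running sum, so the full row never needs a separate max pass. This is an illustration in plain Python, not code from Tiny-vLLM (which is C++/CUDA); the function names and the `block_size` parameter are assumptions made for the example.

```python
import math


def online_softmax(scores, block_size=4):
    """Softmax computed block by block with a running max and running sum.

    Illustrative only: names and block_size are hypothetical, and a real
    kernel would also accumulate the weighted values in the same loop.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for start in range(0, len(scores), block_size):
        block = scores[start:start + block_size]
        new_max = max(running_max, max(block))
        # Rescale the sum accumulated so far to the new max, then add this block.
        running_sum = running_sum * math.exp(running_max - new_max) + sum(
            math.exp(s - new_max) for s in block
        )
        running_max = new_max
    # Final normalization uses the global max and sum found during the scan.
    return [math.exp(s - running_max) / running_sum for s in scores]


def reference_softmax(scores):
    """Standard two-pass softmax, used here only to check the result."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `online_softmax([0.5, 2.0, -1.0, 3.5, 0.0], block_size=2)` should agree with `reference_softmax` on the same input up to floating-point rounding. The rescaling step is what lets a GPU kernel stream over key blocks without first materializing the whole score row.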