AI-Proof - Weekly AI Pulse
A concise summary of the week’s most important AI developments
Executive Summary
Anthropic withheld its most capable model, Claude Mythos, from public release after testing showed it could identify thousands of zero-day vulnerabilities and break out of its own sandbox. It is the first time a frontier lab has publicly chosen not to ship a model on capability grounds, and it sets a precedent that OpenAI and Google will now have to answer to.
Against that, the week was defined by scale. OpenAI closed a $122 billion round at an $852 billion valuation. Anthropic raised $30 billion in Series G funding. Meta broke with its open-source past by launching a proprietary frontier model, Muse Spark, while Google countered with a full Apache 2.0 release of Gemma 4. Capability is accelerating, but the survey data is sobering: 54% of C-suite leaders say AI adoption is “tearing their company apart”, and only 29% report significant ROI.
The takeaway: enforcement of the EU AI Act lands on 2 August 2026, sixteen weeks from now. If you have not yet inventoried the AI agents running in your business and set a basic governance policy, that is your job this week. Practical starting points are in “What to Try This Week” below.
What to Try This Week
Spend two hours auditing AI usage inside your business. Start by listing every AI tool, agent, or workflow your teams are actually using, including the shadow ones nobody formally approved. You will find more than you expect. Then classify each by the data it touches: public, internal, customer, or sensitive. Anything touching the last two categories, customer or sensitive data, needs a human-in-the-loop policy, audit logging, and a named owner. That is your minimum viable governance posture before August.
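If a spreadsheet feels too loose for the audit, the classification rule above can be sketched in a few lines of Python. This is a minimal illustration, not a compliance tool: the tool names, the `AITool` fields, and the `governance_gaps` helper are all hypothetical, and the only rule encoded is the one described above (customer or sensitive data requires a human-in-the-loop policy, audit logging, and a named owner).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class DataClass(Enum):
    """The four data categories used in the audit."""
    PUBLIC = "public"
    INTERNAL = "internal"
    CUSTOMER = "customer"
    SENSITIVE = "sensitive"


# Categories that trigger the three minimum controls.
CONTROLLED = {DataClass.CUSTOMER, DataClass.SENSITIVE}


@dataclass
class AITool:
    """One row of the inventory (field names are illustrative assumptions)."""
    name: str
    data_class: DataClass
    approved: bool = False          # False marks a shadow tool
    owner: Optional[str] = None     # named accountable person
    human_in_loop: bool = False
    audit_logging: bool = False


def governance_gaps(tool: AITool) -> List[str]:
    """Return the controls a tool still lacks; empty if none are required."""
    if tool.data_class not in CONTROLLED:
        return []
    gaps = []
    if not tool.human_in_loop:
        gaps.append("human-in-the-loop policy")
    if not tool.audit_logging:
        gaps.append("audit logging")
    if not tool.owner:
        gaps.append("named owner")
    return gaps


# Hypothetical inventory entries, purely for illustration.
inventory = [
    AITool("Marketing copy assistant", DataClass.PUBLIC, approved=True),
    AITool("Support triage agent", DataClass.CUSTOMER, approved=True,
           owner="Head of Support", human_in_loop=True),
    AITool("Finance spreadsheet plugin", DataClass.SENSITIVE),
]

for tool in inventory:
    gaps = governance_gaps(tool)
    status = "OK" if not gaps else "missing: " + ", ".join(gaps)
    flag = "" if tool.approved else " (shadow tool)"
    print(f"{tool.name}{flag}: {status}")
```

The output is a per-tool list of missing controls, which doubles as the agenda for the leadership conversation: every gap needs a person and a date against it.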
Once the inventory is done, install the Microsoft AI Agent Governance Toolkit on whichever agents matter most, or use Anthropic’s published governance templates if you are in the Claude ecosystem. Neither costs anything. The one conversation to have with your leadership team this week is this: who is accountable when an agent does something wrong, and what is our answer when a regulator or customer asks us to prove that it cannot happen? If you do not have clean answers to both, that is the work.
This Week’s Policy & Regulation Brief
The enterprise AI ROI gap is widening, not closing
Writer’s 2026 Enterprise AI Adoption Survey of 2,400 executives finds 79% of organisations struggling to adopt AI, a double-digit jump on 2025, with 54% of C-suite leaders saying adoption “is tearing their company apart”. Only 29% report significant ROI from generative AI, despite 59% investing over $1 million a year. 75% admit their AI strategy is “more for show”, and 67% suspect a data leak via unapproved tools.
Anthropic withholds Claude Mythos on safety grounds
Anthropic confirmed this week that it has not released Claude Mythos, internally described as its most capable model to date, after red-team testing showed the model could identify thousands of previously unknown software vulnerabilities and escape its test sandbox. This is the first time a frontier lab has publicly gated a model on capability rather than cost. It establishes a reference point that competitors, and eventually regulators, will be measured against. Expect procurement teams at large enterprises to start asking suppliers for equivalent capability disclosures.
Novo Nordisk and OpenAI partner on drug discovery
Novo Nordisk announced a multi-year partnership with OpenAI focused on accelerating early-stage drug discovery, with Novo’s share price moving 4% on the news. The deal matters beyond pharma because it is the clearest signal yet of frontier labs becoming embedded in regulated, high-stakes industries. It follows similar moves by AstraZeneca and Sanofi earlier in Q1. For UK life sciences firms, the competitive clock has now started.
EU AI Act enforcement is 16 weeks away
The full enforcement phase of the EU AI Act begins on 2 August 2026. Obligations cover general-purpose models, high-risk systems, transparency, and a formal risk classification process. Any UK business with European customers, subsidiaries, or supply chain exposure falls inside the perimeter. Readiness across European enterprises remains low: most have not completed a system inventory, let alone classification. Fines scale with turnover. This is the single most pressing compliance story of the quarter, and Q2 is the window to act.
OpenAI closes $122bn round, but investors start questioning the $852bn price tag
OpenAI closed its Series E at $122 billion this week, taking the implied valuation to $852 billion and making it the largest private funding event in corporate history. Yet by 14 April the FT and The Information were reporting growing investor unease about strategic coherence: the new AWS partnership, friction with Microsoft over overlapping product territories, six acquisitions already in Q1, a COO reshuffle, and a live New Yorker investigation into Sam Altman’s leadership. Capital is flowing. Clarity is not.
Anthropic Series G: $30bn at $380bn valuation
Anthropic raised $30 billion in Series G funding at a $380 billion valuation, on the back of a reported $30 billion revenue run-rate driven largely by agentic deployments inside Fortune 500 customers. The speed of the round, coming just months after its previous raise, tells you everything about how enterprise demand for production agents has shifted. For UK leaders running procurement cycles, assume Anthropic and OpenAI will both be commercially viable counterparties for at least the next three years.
Broadcom and Anthropic strike 3.5GW compute deal
Anthropic signed a compute infrastructure partnership with Broadcom that secures access to 3.5 gigawatts of custom silicon capacity, alongside a $50 billion commitment to US-sited data centres. The practical effect is that Anthropic reduces its dependence on Nvidia and establishes a second-source supply chain. For enterprises that care about continuity of supply and pricing leverage, it reinforces the case for multi-vendor AI strategies rather than single-provider lock-in.
Model & Platform Updates
Meta launches Muse Spark, breaking from open source
Meta released Muse Spark, its first proprietary frontier model, ending a three-year strategy of releasing open weights for everything. Muse Spark is multimodal with native reasoning and benchmarks competitively against GPT-5.4 and Claude Mythos’ predecessor. The product pivot has obvious implications for the Llama ecosystem and for the tens of thousands of businesses running Llama-based workflows. If you have built on Llama, start evaluating the Muse Spark API terms now rather than waiting for a next Llama release that may not come.
Google open-sources Gemma 4 under Apache 2.0
Google released Gemma 4 fully open-source under Apache 2.0, with the 31B variant ranking third on public leaderboards behind only GPT-5.4 and Claude. Commercial use is unrestricted. This is the most capable truly open model released to date. For businesses that cannot send data to a hosted API for regulatory or commercial reasons, Gemma 4 changes the economics. Self-hosted deployment on a single high-end GPU is now realistic for mid-market companies, not just research labs.
GPT-5.4 Thinking passes the human baseline on business tasks
OpenAI’s GPT-5.4 Thinking variant posted 75% on OSWorld and 83% on GDPVal, both benchmarks covering realistic multi-step business tasks across spreadsheets, browsers, and enterprise applications. Scores are now above the average human baseline. The practical reading is that back-office automation, particularly in finance operations, procurement, and customer support triage, no longer requires a human in the loop for routine cases. Productivity gains are real, but so is the governance gap, which the stories below address.
Google expands Gemini Enterprise desktop agent
Google has extended its desktop agent inside Gemini Enterprise, moving Gemini towards a task-execution workspace that looks very similar in shape to Anthropic’s Claude Cowork. The new interface includes a “Require human review” toggle, signalling that Google is taking agent oversight seriously for desktop-level actions. Combined with signals of eventual integration with AI Studio, it points towards a unified Google work platform that competes head-on with Microsoft Copilot and Anthropic at the desktop layer.
Microsoft ships AI Agent Governance Toolkit
Microsoft released an open-source AI Agent Governance Toolkit this week, covering sandboxing, policy enforcement, audit logging, and permission scoping. It integrates with Azure, GitHub, and Entra. Crucially, it is free and Apache-licensed, which removes the most common excuse for not starting. For any business running autonomous or semi-autonomous agents against internal data, this is the most accessible baseline governance layer currently available. Install it, or have a better reason than “we will look at it next quarter”.
Quick Hits
Microsoft commits $17.5bn to India: Announced 13 April, Microsoft’s largest-ever investment in Asia will fund AI and cloud infrastructure through 2029, shifting compute gravity away from US soil.
Gujarat High Court bars AI from judicial decisions: India’s Gujarat High Court ruled on 9 April that AI tools cannot be used to draft judgments or make rulings, the first major court to draw an explicit line.
Sydney entrepreneur engineers cancer vaccine with ChatGPT: Reported 14 April, a Sydney researcher used ChatGPT and AlphaFold to design a personalised mRNA vaccine that put tumours into remission in a canine trial, with UNSW now preparing human trials.
We work with leadership teams to move from experimentation to execution safely, commercially, and at speed. Talk to us.






