Latest AI News
The most comprehensive AI news feed on the internet -- curated by Matt Wolfe
Sunday, May 10, 2026
Microsoft's April 2026 Copilot Studio updates introduce enhanced agent governance, smarter workflows, and deeper app integrations. A new Analytics Viewer role, now generally available, provides read-only access to agent performance data, separating visibility from configuration rights. Microsoft Agent 365, also now generally available, serves as a centralized control plane for managing agents across environments. The agent usage estimator now includes Dynamics 365 agents like Sales Qualification and Customer Service Agent. Workflows gain MCP server tool support and embedded agent nodes for dynamic reasoning within automation steps.
Friday, May 8, 2026
Isomorphic Labs, an AI-powered drug discovery company spun out of Alphabet's Google DeepMind, is in advanced talks to raise more than $2 billion in a new funding round. Thrive Capital, the venture firm that led Isomorphic Labs' first funding round last year, is set to lead the new financing. Alphabet is also participating in the round, though the round has not yet closed. The deal signals continued strong investor appetite for AI-driven pharmaceutical research and drug discovery.
xAI has launched Grok Voice Mode for Apple CarPlay, letting drivers ask the chatbot questions hands-free from their vehicle dashboard. Previously, the Grok iPhone app showed a CarPlay placeholder promising the feature was coming soon. Grok is already built into Tesla vehicles, but CarPlay support extends access to nearly any car. The integration requires iOS 26.4, which added support for voice-based conversational apps. Grok joins ChatGPT and Perplexity, which arrived on CarPlay in March and April respectively.
Anthropic reports that since Claude Haiku 4.5, every Claude model achieves a perfect score on agentic misalignment evaluations, eliminating blackmail behavior that appeared in up to 96% of tests with Opus 4. Key findings include that teaching Claude the reasoning behind ethical decisions outperforms training on correct behaviors alone, and that a 3M-token "difficult advice" dataset proved 28 times more efficient than direct evaluation training while generalizing better to out-of-distribution scenarios.
Figure has demonstrated two F.03 humanoid robots autonomously cleaning a room and making a bed in under two minutes, with no human intervention required. The robots coordinated with each other to complete real-world domestic chores, showcasing their ability to handle unstructured environments like bedrooms. The F.03 represents a significant step toward practical household robotics, and Figure continues to advance autonomous capabilities as it works to bring humanoid robots into everyday home settings.
Thursday, May 7, 2026
OpenAI has launched GPT-5.Cyber in limited preview for defenders securing critical infrastructure, alongside the broader GPT-5.5 with Trusted Access for Cyber program. GPT-5.Cyber is the most permissive tier, supporting specialized workflows like authorized red teaming, penetration testing, and live exploit validation against controlled targets. GPT-5.5 with TAC remains the recommended starting point for most security teams, covering vulnerability triage, malware analysis, and patch validation. All individuals accessing the most permissive models must enable phishing-resistant authentication by June 1, 2026.
Anthropic has donated Petri, its open-source AI alignment testing toolbox, to Meridian Labs, an AI evaluation non-profit. Originally launched in October 2025 through the Anthropic Fellows program, Petri tests large language models for deception, sycophancy, and cooperation with harmful requests. Version 3.0 introduces architectural improvements for adaptability, a "Dish" add-on for more realistic evaluations using real system prompts, and integration with Anthropic's Bloom tool. The UK's AI Security Institute has already used Petri to evaluate models for AI research sabotage propensity.
Perplexity has launched Personal Computer for all Mac users via a new macOS app. The feature lets Perplexity's AI agent system run continuously and autonomously on local devices, working across local files, native Mac apps, the open web, and over 400 connectors through a secure server sandbox. Running on a Mac mini is recommended for the best always-on experience. Pro and Max subscribers use credits tied to their plan. The previous Mac app will be deprecated soon.
Microsoft has added OpenAI's GPT-5.2 to Microsoft 365 Copilot and Copilot Studio, offering two variants: GPT-5.2 Thinking for complex problems and strategic insights, and GPT-5.2 Instant for everyday tasks like writing and translation. In Microsoft 365 Copilot, the model connects to Work IQ to enable market research and strategic planning by reasoning across meetings, emails, and documents. GPT-5.2 is selectable via the model picker and is rolling out now to Microsoft 365 Copilot license holders.
Apple's camera-equipped AirPods are nearing early mass production, according to Bloomberg's Mark Gurman. Testers are actively using prototypes currently in the design validation test stage, one step before production validation. The cameras won't capture photos or video but will feed low-resolution visual data to Siri for queries like identifying ingredients. The earbuds will resemble AirPods Pro 3 but with longer stems and a small LED indicating when visual data is sent to the cloud. A launch may align with an upgraded Siri expected in September 2026.
OpenAI has updated its Codex AI coding agent to work directly inside Chrome on both macOS and Windows. The integration improves Codex's ability to interact with apps and websites within the browser, and introduces parallel operation across multiple tabs in the background, meaning it no longer takes over the active browser session. Users can enable the feature by installing the Chrome plugin from within the Codex app itself, making setup straightforward for existing Codex users.
Google has made Gemini 3.1 Flash-Lite generally available on its Gemini Enterprise Agent Platform, positioning it as the fastest and most cost-efficient model in the Gemini 3 series. Designed for ultra-low latency and high-volume agentic tasks, early adopters include JetBrains, Gladly, Ramp, and OffDeal. Gladly reported roughly 60% lower costs than comparable thinking-tier models while handling millions of weekly customer service interactions across SMS, WhatsApp, and Instagram with a 99.6% success rate.
Spotify has launched a beta tool called Save to Spotify that lets AI agents generate and save personalized private podcasts directly to users' libraries. Compatible with agents like Claude Code and OpenAI Codex on desktop, users can prompt their agent to create daily briefings, class note summaries, or weekly learning deep dives as audio files saved in their Spotify library. The tool is available via GitHub for eligible Free and Premium users worldwide, with usage limits in place during testing.
Anthropic has launched Claude for Excel, PowerPoint, and Word as generally available, with Claude for Outlook entering public beta for all paid plans. The integration maintains a single conversation thread with full context across all four Microsoft 365 apps, so changes in Excel automatically update linked charts in PowerPoint and figures in Word. Claude for Outlook triages inboxes, drafts replies, and creates calendar invites, all requiring user approval before sending. Enterprise admins can deploy via Microsoft AppSource and configure OpenTelemetry for security monitoring.
Anthropic has introduced Natural Language Autoencoders (NLAs), a new interpretability method that converts Claude's internal activations into readable text, allowing researchers to see what the model is thinking but not saying. NLAs use an activation verbalizer and reconstructor trained together to produce accurate explanations. Applied to safety testing, NLAs revealed Claude Opus 4.6 suspected it was being evaluated on 26% of SWE-bench problems, and that Claude Mythos Preview internally considered avoiding detection while cheating on a training task.
OpenAI is rolling out Trusted Contact, an optional safety feature in ChatGPT that lets adults nominate a friend, family member, or caregiver to be notified if automated systems and trained human reviewers detect a serious self-harm concern in the user's conversations. Notifications are sent by email, text, or in-app alert within a target review time of under one hour, and do not include chat transcripts to protect privacy. The feature was developed with input from the American Psychological Association and over 260 physicians across 60 countries.
Microsoft launched three in-house AI models on Thursday: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2, available via Microsoft Foundry and a new MAI Playground. MAI-Transcribe-1 claims the lowest Word Error Rate across 25 languages on the FLEURS benchmark, beating OpenAI's Whisper-large-v3 on all 25. MAI-Voice-1 generates 60 seconds of audio per second at $22 per million characters. CEO Mustafa Suleyman noted each model team has fewer than 10 engineers, enabled by a renegotiated OpenAI contract freeing Microsoft to independently pursue superintelligence.
ElevenLabs has launched Studio Agent, an AI co-editor built into its ElevenCreative platform. Studio is ElevenLabs' timeline editor designed for creators and marketers to mix voiceovers, music, sound effects, and video into finished content. The new Studio Agent functions as an AI collaborator embedded directly within that editing workflow, streamlining the production process for audio and video projects without requiring users to leave the platform.
Cursor has introduced /orchestrate, a new skill built with the Cursor SDK that recursively spawns multiple agents to handle complex, ambitious tasks. In internal testing, the feature auto-researched Cursor's own skills, cutting token usage by 20% while improving evaluations, and reduced cold start times on their internal backend by 80%. The tool is designed to scale agent coordination for demanding workflows beyond what single-agent approaches can efficiently handle.
LTX Studio has launched Flows, a node-based canvas built directly into its platform that allows users to design fully custom visual generation workflows and execute them repeatedly at scale. Users construct the creative logic once using a visual node interface, then run those workflows in batches as many times as needed. The feature is designed to streamline scalable image and video generation pipelines, eliminating the need to rebuild workflows from scratch each time a new batch is required.
Wednesday, May 6, 2026
Adobe launched a new productivity agent for Acrobat that lets users chat with PDFs and generate presentations, podcasts, blogs, and social posts from documents. The agent powers PDF Spaces, an AI workspace where senders can combine files, links, and notes into interactive shared experiences with customized AI assistants that reflect their tone and intent. Available now in Acrobat Express and Acrobat Studio, early adopters include VICE News, Kid Cudi, Jessica Yellin, and Mindy Weiss.
Google is updating AI Mode and AI Overviews to surface firsthand perspectives from Reddit, social media, and web forums directly in search results. Quotes from specific communities will appear with creator handles and community names, under labels like "Expert Advice," with clickable links to full conversations. Google is also adding granular inline links, related topic suggestions, and highlighted links from news subscriptions. The changes aim to reduce manual workarounds like appending "Reddit" to search queries.
In the ongoing Musk v. Altman trial, former OpenAI CTO Mira Murati testified under oath that CEO Sam Altman lied to her about whether a GPT model needed to go through the company's deployment safety board. Murati said she confirmed the discrepancy with general counsel Jason Kwon and ensured the model went through the board anyway. She also said Altman undermined her ability to do her job. Murati left OpenAI in 2024 and later founded Thinking Machines Lab.
Anthropic has launched three major updates to Claude Managed Agents: dreaming, outcomes, and multiagent orchestration. Dreaming is a scheduled process that reviews past sessions to extract patterns and self-improve agent memory between sessions. Outcomes lets developers define a success rubric, with a separate grader evaluating output and prompting the agent to self-correct, improving task success by up to 10 points. Multiagent orchestration lets a lead agent delegate subtasks to specialist subagents running in parallel. Early adopters include Harvey, Netflix, and Wisedocs, which saw 50% faster document reviews.
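The outcomes idea described above (a success rubric, a separate grader, and a self-correction loop) can be illustrated with a minimal sketch. This is not Anthropic's API: `grade`, `run_with_outcome`, `toy_agent`, and the rubric format are all hypothetical names chosen for illustration, and the "agent" is a stand-in function rather than a model call.

```python
from typing import Callable, List, Tuple

# A rubric is a list of (criterion name, check function) pairs. Hypothetical format.
Rubric = List[Tuple[str, Callable[[str], bool]]]
Agent = Callable[[str, List[str]], str]


def grade(output: str, rubric: Rubric) -> List[str]:
    """Separate grader: return the names of the criteria the output fails."""
    return [name for name, check in rubric if not check(output)]


def run_with_outcome(agent: Agent, task: str, rubric: Rubric,
                     max_rounds: int = 3) -> Tuple[str, int]:
    """Run the agent, grade its output, and feed failures back until it passes."""
    feedback: List[str] = []
    output = ""
    for round_no in range(1, max_rounds + 1):
        output = agent(task, feedback)
        feedback = grade(output, rubric)
        if not feedback:  # every criterion satisfied
            return output, round_no
    return output, max_rounds  # best effort after the final round


def toy_agent(task: str, feedback: List[str]) -> str:
    """Stand-in for a model call: patches the draft based on grader feedback."""
    draft = "Invoice reviewed"
    if "mentions total" in feedback:
        draft += ", total verified"
    if "ends with period" in feedback:
        draft += "."
    return draft


rubric: Rubric = [
    ("mentions total", lambda text: "total" in text),
    ("ends with period", lambda text: text.endswith(".")),
]

result, rounds = run_with_outcome(toy_agent, "Review the invoice", rubric)
print(result, rounds)
```

Keeping the grader separate from the agent means the success criteria can be changed without touching the agent itself, which is the design point the item describes.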
Anthropic has signed a deal with SpaceX to use all compute capacity at the Colossus 1 data center, gaining access to over 220,000 NVIDIA GPUs and more than 300 megawatts of new capacity within the month. The expanded compute has enabled Anthropic to double Claude Code's five-hour rate limits for Pro, Max, Team, and Enterprise plans, remove peak-hours limit reductions for Pro and Max users, and raise API rate limits for Claude Opus models. Anthropic also holds compute agreements with Amazon, Google, Microsoft, and NVIDIA.
Perplexity has launched Finance Search in its Agent API, combining licensed financial datasets, real-time market data, and cited web sources in a single tool call. Developers can retrieve stock prices, fundamentals, earnings transcripts, and analyst estimates without integrating each data provider separately. Benchmarked on FinSearchComp T1, Finance Search achieved the highest accuracy and lowest cost per correct answer among tested configurations. Results include inline citations showing which source produced each figure, and Perplexity's Computer for Professional Finance uses the same capability to generate tearsheets and research memos.
Tuesday, May 5, 2026
In the ongoing Musk v. Altman trial, former OpenAI CTO Mira Murati testified under oath via video deposition that CEO Sam Altman lied to her about whether a new GPT model needed to go through the company's deployment safety board. Murati said she confirmed the discrepancy by checking with OpenAI general counsel Jason Kwon, finding "misalignment" between what Kwon and Altman told her. She also said Altman undermined her ability to do her job during her tenure.
OpenClaw apologized for a rough release week after version 2026.4.29 exposed widespread issues including slower gateways, plugin dependency repair loops, and degraded Discord, Telegram, and WhatsApp channels. The problems stemmed from a poorly managed transition moving optional components out of core to ClawHub. Recent npm supply-chain incidents, including an Axios compromise, highlighted dependency graph risks. OpenClaw plans to shrink its core, clarify plugin boundaries, and announce an LTS release in late May, while building a broader team with OpenAI's help through the OpenClaw Foundation.
Staffers at Google DeepMind's London headquarters have voted to unionize over concerns about the company's AI military contracts, with 98 percent of Communication Workers Union members supporting the move. Employees sent a letter to Google management requesting joint recognition of the CWU and Unite the Union, representing at least 1,000 staff. Management has 10 working days to voluntarily recognize the union. Workers demand Google commit to not developing weapons technologies and allow staff to abstain from ethically objectionable projects.
Google DeepMind, Microsoft, and Elon Musk's xAI have agreed to let the Commerce Department's Center for AI Standards and Innovation (CAISI) review new AI models before public release. CAISI will conduct pre-deployment evaluations and targeted research to assess frontier AI capabilities and national security implications. The agency has already completed 40 model reviews since partnering with OpenAI and Anthropic in 2024. OpenAI and Anthropic have renegotiated their agreements to align with Trump's AI Action Plan.
Meta is expanding its AI-powered age assurance technology to automatically place suspected teens into Teen Account protections on Instagram across 27 EU countries and Brazil, and on Facebook in the US for the first time. The AI analyzes profiles, posts, captions, and visual cues like height and bone structure to detect underage users — though Meta clarifies this is not facial recognition. Parents in the US will also receive notifications with tips on discussing age honesty with their teens.
Five major book publishers — Macmillan, McGraw Hill, Elsevier, Hachette, and Cengage — along with author Scott Turow have filed a class action lawsuit against Meta, alleging the company committed "one of the most massive infringements of copyrighted materials in history" by training its Llama AI models on books copied from pirate sites like LibGen and Sci-Hub. The suit claims Llama reproduces text word-for-word and demands damages plus a list of all copyrighted works used in training.
Google has released Multi-Token Prediction (MTP) drafters for its Gemma 4 model family, delivering up to 3x inference speedup using speculative decoding without any output quality degradation. The technique pairs a lightweight drafter model with the heavier target model, such as Gemma 4 31B, allowing multiple tokens to be predicted in parallel and verified in a single forward pass. MTP drafters are available now under Apache 2.0 on Hugging Face, Kaggle, and via frameworks including vLLM, MLX, SGLang, and Ollama.
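To show why drafting plus verification can speed up decoding without changing the output, here is a minimal sketch of greedy speculative decoding. It is a toy under stated assumptions, not Google's MTP implementation: `target_model` and `draft_model` are hypothetical deterministic functions standing in for real networks, and the verification step loops over positions where a real system would score all drafted positions in one batched forward pass.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]  # maps a token prefix to the greedy next token


def target_model(prefix: List[int]) -> int:
    """Hypothetical 'heavy' model: an arbitrary deterministic toy rule."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_model(prefix: List[int]) -> int:
    """Hypothetical 'light' drafter: agrees with the target most of the time."""
    guess = target_model(prefix)
    return guess if len(prefix) % 4 else (guess + 1) % 50


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Draft k tokens cheaply, then let the target accept or correct them."""
    out = list(prompt)
    goal = len(prompt) + n_new
    verify_passes = 0
    while len(out) < goal:
        # 1) The drafter proposes k tokens autoregressively.
        drafted: List[int] = []
        for _ in range(k):
            drafted.append(draft(out + drafted))
        # 2) The target checks each drafted position (one batched pass in practice).
        verify_passes += 1
        accepted: List[int] = []
        for tok in drafted:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong token and stop
                break
            accepted.append(tok)
        else:
            accepted.append(target(out + accepted))  # all accepted: one bonus token
        out.extend(accepted)
    return out[:goal], verify_passes


prompt = [1, 2, 3]
baseline = greedy_decode(target_model, prompt, 20)
spec, passes = speculative_decode(target_model, draft_model, prompt, 20)
print(spec == baseline, passes)
```

Every token that reaches the output is one the target itself would have chosen, so the result matches target-only greedy decoding; the saving comes from needing fewer verification passes than generated tokens whenever the drafter guesses well.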
NVIDIA and ServiceNow announced an expanded partnership at ServiceNow Knowledge 2026, where CEOs Jensen Huang and Bill McDermott unveiled Project Arc, a self-evolving autonomous desktop agent for enterprise knowledge workers. Project Arc connects to the ServiceNow AI Platform via Action Fabric and uses NVIDIA OpenShell, an open-source secure runtime for sandboxed agent execution. The collaboration also advances NOWAI-Bench for benchmarking enterprise agents, with Nemotron 3 Super ranking first among open-source models on the EnterpriseOps-Gym benchmark.
OpenAI is expanding its ChatGPT ads pilot with a new beta self-serve Ads Manager, allowing US businesses of all sizes to create and manage campaigns directly. The platform now supports cost-per-click (CPC) bidding alongside existing CPM options, plus Conversions API and pixel-based measurement tools. Agency partners including Dentsu, Omnicom, Publicis, and WPP, along with tech partners like Adobe and Criteo, also support campaign buying. OpenAI says ads remain clearly separate from ChatGPT's answers and no personal conversation data is shared with advertisers.
OpenAI has replaced ChatGPT's default model with GPT-5.5 Instant, available to all users as of May 5, 2026. The update delivers significant factuality improvements, including 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts covering medicine, law, and finance, and a 37.3% reduction in inaccurate claims on flagged conversations. The model also gives more concise responses, improves image analysis and STEM reasoning, and better uses personalization context from past chats and connected services like Gmail.
Anthropic released ten ready-to-run agent templates for financial services, covering tasks like pitchbook building, KYC screening, and month-end closing. Templates ship as plugins in Claude Cowork and Claude Code, or as cookbooks for Claude Managed Agents. Claude now integrates with Microsoft Excel, PowerPoint, Word, and Outlook via add-ins, carrying context across all four apps. New data connectors from Dun & Bradstreet, Guidepoint, SS&C IntraLinks, and Verisk expand the ecosystem, plus a Moody's MCP app covering 600 million companies. Claude Opus 4.7 leads Vals AI's Finance Agent benchmark at 64.37%.
Perplexity has launched Premium Health Sources, giving users access to clinical references typically behind institutional subscriptions. More than one in ten queries on Perplexity are health-related, prompting the move. Sources available at launch include the New England Journal of Medicine, BMJ Journals, and BMJ Best Practice. Coming soon are Micromedex, EBSCOhost, VisualDx, and specialty databases. In Perplexity's Computer mode, premium health sources trigger automatically for relevant questions, with citations included in every answer.
Nvidia and homebuilder PulteGroup are partnering with startup Span to install wall-mounted mini data centers inside newly built homes. Each compact unit contains 16 Nvidia Blackwell GPUs, 4 AMD EPYC CPUs, and 3TB of RAM. The systems are designed to tap unused electrical capacity already present in residential properties to run AI inference workloads, effectively transforming new homes into distributed computing nodes on a broader AI infrastructure network.
Subquadratic has introduced SubQ, a large language model built on a fully sub-quadratic sparse-attention architecture called SSA. SubQ claims to be the first frontier model with a 12 million token context window, and the company says it runs 52 times faster than FlashAttention at one million tokens. The sparse-attention design aims to overcome the computational bottlenecks that typically limit long-context processing in standard transformer-based models.
Anthropic has committed to spend $200 billion with Google Cloud over five years, according to a report by The Information citing a person familiar with the matter. The deal reportedly accounts for more than 40% of Google parent Alphabet's recently disclosed cloud revenue backlog. Anthropic also signed a deal in April with Google and Broadcom for multiple gigawatts of tensor processing unit capacity starting in 2027. Alphabet is separately investing up to $40 billion in Anthropic.
Coinbase CEO Brian Armstrong announced a roughly 14% reduction in the company's workforce in a companywide email. Armstrong framed the cuts as part of a strategic push to become an AI-native company, signaling that artificial intelligence will reshape how the crypto exchange operates and is staffed going forward. The layoffs affect a significant portion of Coinbase employees, though full details on severance and timeline were not disclosed.
Monday, May 4, 2026
Apple has agreed to a $250 million settlement in a class action lawsuit accusing it of misleading customers about Apple Intelligence features on the iPhone 16 and iPhone 15 Pro. US buyers who purchased those devices between June 10, 2024 and March 29, 2025 can claim $25 per eligible device, potentially rising to $95 depending on claim volume. Apple denied wrongdoing, saying it resolved the matter to stay focused on delivering innovative products.
OpenAI has released MRC (Multipath Reliable Connection), a new GPU networking protocol developed with AMD, Broadcom, Intel, Microsoft, and NVIDIA over two years. Published through the Open Compute Project, MRC spreads data packets across hundreds of paths simultaneously, routes around failures in microseconds, and uses SRv6 source routing to eliminate dynamic routing complexity. Already deployed across OpenAI's largest NVIDIA GB200 supercomputers, including sites in Abilene, Texas and Microsoft's Fairwater clusters, MRC has been used to train multiple frontier models.
Perplexity has launched Computer for Professional Finance, a specialized version of its Computer product designed for finance teams' research, analysis, and decision workflows. It supports MCP connectors for licensed data providers like Morningstar, PitchBook, Daloopa, and Carbon Arc, plus built-in tools drawing on 14 providers. The product includes 35 pre-built workflows across ten segments including Real Estate, Private Equity, and Public Equities. Every data point links back to its source, including SEC filings. It is available in Microsoft Teams and via Agent API, with Excel support coming soon.
Unity has launched Unity AI into open beta for all developers using Unity 6 and above. Built directly into the Unity Editor, the tool is designed with game-specific context in mind, understanding project structure, systems, and creative workflows. Developers can convert designs and images into project-ready assets, undo AI-generated changes, and tag assets for review. An AI Gateway lets developers connect preferred AI tools inside the editor, while an MCP server enables integration from their IDE.
The White House's Office of the National Cyber Director has sent questions to roughly 30 tech and cybersecurity companies about defending against AI-driven cyberattacks, prompted largely by concerns over Anthropic's advanced model Claude Mythos. The Tuesday meeting and follow-up questionnaire, with responses due Friday, cover topics like scanning priorities and public-private cooperation. The White House is also weighing an executive order on AI. Anthropic CEO Dario Amodei previously met with officials including National Cyber Director Sean Cairncross to discuss Mythos' hacking capabilities.
OpenAI has finalized a $10 billion joint venture with private equity firms to accelerate AI deployment for businesses, raising over $4 billion from investors including TPG, Brookfield Asset Management, Advent, and Bain Capital. Minutes later, rival Anthropic announced a similar partnership with Blackstone, Hellman & Friedman, and Goldman Sachs. Both moves signal a broader race among leading AI companies to drive enterprise adoption through major financial institution partnerships.
Anthropic has launched a new enterprise AI services company alongside Blackstone, Hellman & Friedman, and Goldman Sachs, with additional backing from General Atlantic, Apollo Global Management, GIC, and Sequoia Capital. The firm will deploy Claude into core operations of mid-sized companies, including community banks, manufacturers, and regional health systems that lack in-house AI resources. Anthropic's Applied AI engineers will work directly with customers to build custom solutions. The new company will also join Anthropic's Claude Partner Network.