November 2025: AI updates from the past month


Several new updates make their way into the MCP specification

It has been one year since Anthropic first open-sourced the Model Context Protocol (MCP), and to celebrate the anniversary, a new version of the specification is being released.

“It’s hard to imagine that a little open-source experiment, a protocol to provide context to models, became the de-facto standard for this very scenario in less than twelve months,” the MCP Core Maintainers wrote in a blog post.

The latest release includes experimental support for task-based workflows. According to the maintainers, tasks provide a new abstraction for tracking the work an MCP server performs. They enable several new capabilities, such as active polling to check the status of ongoing work at any time and result retrieval to view the output of completed tasks. Tasks also support a number of states, including working, input_required, completed, failed, and cancelled.
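The polling workflow the maintainers describe can be sketched as a simple client loop. The state names below come from the release notes; the server interface and payload shapes are illustrative assumptions, not the specification’s actual wire format.

```python
import time

# Terminal states from the article's list of task states.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


class FakeTaskServer:
    """Stands in for an MCP server that tracks long-running tasks.
    The method names here are hypothetical, not the spec's API."""

    def __init__(self):
        # Simulated progression of a single task's state.
        self._states = ["working", "working", "completed"]
        self._calls = 0

    def get_task_status(self, task_id):
        # Each poll observes the task one step further along.
        state = self._states[min(self._calls, len(self._states) - 1)]
        self._calls += 1
        return {"taskId": task_id, "status": state}

    def get_task_result(self, task_id):
        # Result retrieval for a completed task.
        return {"taskId": task_id, "content": "done"}


def poll_until_done(server, task_id, interval=0.0):
    """Actively poll a task until it reaches a terminal state,
    then retrieve its result if it completed successfully."""
    while True:
        status = server.get_task_status(task_id)
        if status["status"] in TERMINAL_STATES:
            break
        time.sleep(interval)
    if status["status"] == "completed":
        return server.get_task_result(task_id)
    return None


result = poll_until_done(FakeTaskServer(), "task-1")
```

A real client would issue the equivalent JSON-RPC requests against an MCP server rather than calling an in-process stub like this one.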

Anthropic releases Claude Opus 4.5 with improvements to complex reasoning

Anthropic has also released the latest version of its largest Claude model, Opus. Claude Opus 4.5 is better at handling complex reasoning than previous Claude models and makes improvements across agentic tool use, computer use, novel problem solving, and more.

The company says early testers of the new model claim that it handles ambiguity better and reasons over tradeoffs without needing human intervention. “They told us that, when pointed at a complex, multi-system bug, Opus 4.5 figures out the fix. They said that tasks that were near-impossible for Sonnet 4.5 just a few weeks ago are now within reach. Overall, our testers told us that Opus 4.5 just ‘gets it,’” Anthropic wrote in a post.

This release also coincides with a new effort parameter being introduced in the Claude API, allowing developers to decide how much effort Claude should spend on a problem. According to Anthropic, Opus 4.5 uses significantly fewer tokens than its predecessors to solve problems, even at its highest effort level. For example, at a medium effort level, Opus 4.5 matches Sonnet 4.5’s score on SWE-bench Verified while using 76% fewer output tokens, and at the highest effort level it exceeds Sonnet 4.5’s performance by 4.3% while still using 48% fewer tokens.
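As a rough sketch of what this could look like in practice, the snippet below builds a Messages-style request body that carries an effort hint. The field name, its placement, the accepted values, and the model ID are all assumptions for illustration; consult the Claude API documentation for the actual schema.

```python
# Hypothetical sketch of passing an effort setting to the Claude API.
VALID_EFFORT = {"low", "medium", "high"}


def build_request(prompt, effort="medium", model="claude-opus-4-5"):
    """Build a Messages-API-style request body with an effort hint.
    The "effort" field name and the model ID are assumptions."""
    if effort not in VALID_EFFORT:
        raise ValueError(f"effort must be one of {sorted(VALID_EFFORT)}")
    return {
        "model": model,
        "max_tokens": 1024,
        "effort": effort,  # assumed field: how hard the model should work
        "messages": [{"role": "user", "content": prompt}],
    }


req = build_request("Diagnose this multi-system bug.", effort="high")
```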

Microsoft unveils Agent 365 and other agentic capabilities at Ignite

During the Microsoft Ignite conference, Microsoft announced several new products and features designed to enable the agent-powered enterprise.

“The future of work will be shaped by Frontier Firms—organizations that are human-led and agent-operated. These companies are reshaping how work gets done, empowering every employee with an AI assistant, amplifying impact with human-agent teamwork, and reinventing business processes with agents. Today at Microsoft Ignite, we introduced new capabilities in Microsoft 365 Copilot to help every customer become Frontier,” Microsoft wrote in a blog post.

It announced Agent 365, a control plane for managing agents, whether they are built in Microsoft’s ecosystem or come from third-party partners.

Google announces agentic development platform, Google Antigravity

Coinciding with its announcement of Gemini 3, Google announced the launch of a new agentic development platform, Google Antigravity.

The company sees Antigravity as an evolution of the IDE into an agent-first future, with capabilities like browser control and asynchronous interaction patterns.

“With models like Gemini 3, we have started hitting the point in agentic intelligence where models are capable of running for longer periods of time without intervention across multiple surfaces. Not yet for days at a time without intervention, but we’re getting closer to a world where we interface with agents at higher abstractions over individual prompts and tool calls. In this world, the product surface that enables communication between the agent and user should look and feel different – and Antigravity is our answer to this,” Google wrote in a blog post.

Cloudflare announces acquisition of AI platform Replicate

According to Cloudflare, by bringing Replicate—an AI platform that allows developers to deploy and run AI models—into its portfolio, it will be able to turn Cloudflare Workers into a leading platform for building and running AI applications. “Soon, developers building on Cloudflare will be able to access any AI model globally with just one line of code,” the company wrote in an announcement.

Replicate has over 50,000 production-ready AI models, which will be available in Cloudflare Workers AI. Cloudflare will also leverage Replicate’s expertise to add new capabilities to Workers AI, such as the ability to run custom models and pipelines.

Existing Replicate users will be able to keep using their APIs and workflows without interruption, and will soon be able to benefit from Cloudflare’s network.

OpenAI’s latest update delivers GPT-5.1 models and capabilities to give users more control over ChatGPT’s personality

According to the company, users will now have more control over ChatGPT’s tone and style. It added a few preset tone options earlier this year, and it is now refining those options and adding new ones. Cynical (originally Cynic) and Nerdy (originally Nerd) remain unchanged aside from their names, while other presets, including Default, Friendly (originally Listener), and Efficient (originally Robot), will be updated. Three entirely new presets are being added as well: Professional, Candid, and Quirky.

GPT-5.1 Instant is warmer and more conversational than its GPT-5 counterpart, and is also better at following instructions. “Based on early testing, it often surprises people with its playfulness while remaining clear and useful,” OpenAI wrote.

It can use adaptive reasoning to decide when it should think before responding, which results in more thorough and accurate answers while still being able to provide quick turnaround times.

The other new model, GPT-5.1 Thinking, also adapts thinking time to the question, meaning it will spend longer working through complex problems and shorter answering simple prompts.

Compared to GPT-5 Thinking, the newer model offers clearer responses with less jargon and fewer undefined terms, according to OpenAI.

Cloudsmith launches MCP Server

Cloudsmith, a provider of cloud-native artifact management, says the MCP server will allow developers to integrate its capabilities directly into their workflows.

Developers can use it to get answers about their repositories, packages, and builds, and can initiate certain actions with full audit logs to maintain visibility over interactions.

“AI is redefining how developers work, moving from manual clicks to natural language interactions. We see this shift every day with our customers. Cloudsmith’s MCP Server is a necessary bridge to this new way of working,” said Alison Sickelka, VP of Product at Cloudsmith. “By integrating directly with tools like Claude and CoPilot, we ensure engineers can manage, secure, and make decisions about their software artifacts simply by asking a question within the environment they already use. This isn’t just about convenience, it brings trusted artifact data and governance exactly where developers build, making the AI part of the secure software supply chain, not separate from it.”

Legit Security releases VibeGuard

VibeGuard is an AI agent for securing AI-generated code as it is created, as well as providing more security controls over coding agents. It links directly into a developer’s IDE to monitor agents, block attacks, and keep vulnerabilities from reaching production. Additionally, it injects security and application context into AI agents to train them to be more secure.

According to recent research by the company, 56% of security professionals cited lack of control over AI-generated code as a top concern. Meanwhile, traditional security tools rely on human workflows and reactive scanning, a model Legit Security believes breaks down when code is being generated by AI. It hopes VibeGuard brings these tools the level of security that is needed today.

Webflow launches new vibe coding capability called App Gen

The web design platform Webflow announced new updates to its platform to align it more with the vibe coding experience, allowing any user to bring their ideas to life regardless of their coding skills.

According to the company, this new capability, App Gen, enables users to move from creating websites into creating web experiences.

It builds on the launch of Webflow Cloud, a full-stack platform for hosting apps directly in Webflow that was announced earlier this year. App Gen leverages a site’s existing design system, content, and structure so that each new creation aligns with the site’s brand and can scale up using Webflow’s cloud infrastructure.

The new capability automatically applies all of a site’s typography, colors, and other layout variables to provide a consistent visual experience between the existing site and new AI-generated features. It can also reuse existing Webflow components to further ensure brand consistency, and can connect to the site’s CMS to turn structured content into data-driven interfaces that stay up to date across the site.

Microsoft announces release of .NET 10 (LTS)

Microsoft has announced the release of .NET 10, the latest Long Term Support (LTS) release of .NET that will receive support for the next three years. As such, Microsoft is encouraging development teams to migrate their production applications to this version to take advantage of that extended support window.

This release comes packed with features for developers wanting to build with AI. For example, it comes with the Microsoft Agent Framework, which can be used to build agentic systems; Microsoft.Extensions.AI and Microsoft.Extensions.VectorData, which provide abstractions for integrating AI services into applications; and support for MCP.

Syncfusion Code Studio now available

Code Studio is an AI-powered IDE that offers capabilities like autocompletion, code generation and explanations, refactoring of selected code blocks, and multistep agent automation for large-scale tasks.

Customers can use their preferred LLM to power Code Studio, and will also get access to security and governance features like SSO, role-based access controls, and usage analytics.

“Every technology leader is seeking a responsible path to scale with AI,” said Daniel Jebaraj, CEO of Syncfusion. “With Code Studio, we’re helping enterprise teams harness AI on their own terms, maintaining a balance of productivity, transparency, and control in a single environment.”

Linkerd to get MCP support

Buoyant, the company behind Linkerd, announced plans to add MCP support to the project. This will give users more visibility into their MCP traffic, including metrics on resource, tool, and prompt usage, such as failure rates, latency, and the volume of data transmitted.

Additionally, Linkerd’s zero-trust framework can be used to apply fine-grained authorization policies for MCP calls, allowing companies to restrict access to specific tools or resources based on the identity of the agent.

OpenAI starts creating new benchmarks that more accurately evaluate AI models across different languages and cultures

English is spoken by only about 20% of the world’s population, yet existing AI benchmarks for multilingual models are falling short. For example, MMMLU has become saturated to the point that top models are clustering near high scores, which OpenAI says makes it a poor indicator of real progress.

Additionally, the existing multilingual benchmarks focus on translation and multiple choice tasks and don’t necessarily accurately measure how well the model understands regional context, culture, and history, OpenAI explained.

To remedy these issues, OpenAI is building new benchmarks for different languages and regions of the world, starting with India, its second largest market. The new benchmark, IndQA, will “evaluate how well AI models understand and reason about questions that matter in Indian languages, across a wide range of cultural domains.”

There are 22 official languages in India, seven of which are spoken by at least 50 million people. IndQA includes 2,278 questions across 12 different languages and 10 cultural domains, and was created with help from 261 domain experts from the country, including journalists, linguists, scholars, artists, and industry practitioners.

SnapLogic introduces new capabilities for agents and AI governance

Agent Snap is a new execution engine that allows for observable agent execution. The company compared it to onboarding a new employee and training and observing them before giving them greater responsibility.

Additionally, its new Agent Governance framework allows teams to ensure that agents are safely deployed, monitored, and compliant, and provides visibility into data provenance and usage.

“By combining agent creation, governance, and open interoperability with enterprise-grade resiliency and AI-ready data infrastructure, SnapLogic empowers organizations to move confidently into the agentic era, connecting humans, systems, and AI into one intelligent, secure, and scalable digital workforce,” the company wrote in a post.

Sauce Labs announces new data and analytics capabilities

Sauce AI for Insights allows development teams to turn their testing data into insights on builds, devices, and test performance, down to a user-by-user basis. Its AI agent will tailor its responses based on who is asking the question, such as a developer getting root cause analysis info while a QA manager gets release-readiness insights.

Each response comes with dynamically generated charts, data tables, and links to relevant test artifacts, as well as clear attribution as to how data was gathered and processed.

“What excites me most isn’t that we built AI agents for testing—it’s that we’ve democratized quality intelligence across every level of the organization,” said Shubha Govil, chief product officer at Sauce Labs. “For the first time, everyone from executives to junior developers can now participate in quality conversations that once required specialized expertise.”

Google Cloud’s Ironwood TPUs will soon be available

The new Tensor Processing Units (TPUs) will be available in the next few weeks. They were designed specifically for handling demanding workloads like large-scale model training and high-volume, low-latency AI inference and model serving.

Ironwood TPUs can scale up to 9,216 chips in a single unit with Inter-Chip Interconnect (ICI) networking at 9.6 Tb/s.

The company also announced a preview of N4A, a new instance of its Axion virtual machines, as well as C4A, an Arm-based bare metal instance.

“Ultimately, whether you use Ironwood and Axion together or mix and match them with the other compute options available on AI Hypercomputer, this system-level approach gives you the ultimate flexibility and capability for the most demanding workloads,” the company wrote in a blog post.

DefectDojo announces security agent 

DefectDojo Sensei acts like a security consultant, and is able to answer questions about cybersecurity programs managed through DefectDojo.

Key capabilities include evolution algorithms for self-improvement, generation of tool recommendations for security issues, analysis of current tools, creation of customer-specific KPIs, and summaries of key findings.

It is currently in alpha, and is expected to become generally available by the end of the year, the company says.

Testlio expands its crowdsourced testing platform to provide human-in-the-loop testing for AI solutions

Testlio, a company that offers crowdsourced software testing, has announced a new end-to-end testing solution designed specifically for testing AI solutions.

Leveraging Testlio’s community of over 80,000 testers, this new solution provides human-in-the-loop validation for each stage of AI development.

“Trust, quality, and reliability of AI-powered applications rely on both technology and people,” said Summer Weisberg, COO and Interim CEO at Testlio. “Our managed service platform, combined with the scale and expertise of the Testlio Community, brings human intelligence and automation together so organizations can accelerate AI innovation without sacrificing quality or safety.”

Kong’s Insomnia 12 release adds capabilities to help with MCP server development

The latest release of Insomnia aims to bring MCP developers a test-iterate-debug workflow for AI development so they can quickly develop and validate their work on MCP servers.

Developers will now be able to connect directly to their MCP servers, manually invoke tools with custom parameters, inspect protocol-level and authentication messages, and see responses.
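At the protocol level, manually invoking a tool comes down to sending a JSON-RPC request. MCP uses a tools/call method for tool invocation; the tool name and arguments below are hypothetical, chosen only to illustrate the message shape a client like Insomnia lets developers inspect.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and custom parameters.
msg = make_tool_call(1, "search_repos", {"query": "insomnia", "limit": 5})
wire = json.dumps(msg)  # the message as it would appear on the wire
```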

Insomnia 12 also adds support for generating mock servers from OpenAPI spec documents, JSON samples, or a URL. “What used to require hours of manual set up, like defining endpoints or crafting realistic responses, now happens almost instantaneously with AI. Mock servers can now transform from a ‘nice to have if you have the time to set them up’ into an essential part of a developer’s workflow, allowing you to test faster without manual overhead,” Kong wrote in a blog post.

OpenAI and AWS announce $38 billion deal for compute infrastructure

AWS and OpenAI announced a new partnership that will have OpenAI’s workloads running on AWS’s infrastructure.

AWS will build compute infrastructure for OpenAI that is optimized for AI processing efficiency and performance. Specifically, the company will cluster NVIDIA GPUs (GB200s and GB300s) on Amazon EC2 UltraServers.

OpenAI will commit $38 billion to Amazon over the next several years and will immediately begin using AWS infrastructure, with full capacity expected by the end of 2026 and the ability to scale beyond that as needed.
