GPT-5, multi-cloud AI, and next-gen developer tools: key updates from OpenAI, Anthropic, GitHub, and Cursor

EPAM AI SDLC experts explore how GPT-5’s routed system, multi-cloud GPT-OSS deployments, and new developer tools in Copilot, VS Code, and Cursor are shaping the future of autonomous coding and AI-driven workflows.

This digest was prepared by:

Alex Zalesov, Systems Architect
Tasiana Hmyrak, Director of Technology Solutions

Published in AI18 August 20256 min read

OpenAI positions GPT-5 as a routed system, not a single model and the release is already landing in Copilot and Cursor.
Open models go multi-cloud: gpt-oss-120b and gpt-oss-20b are now deployable on AWS and GCP. Cerebras, a separate inference provider, can serve gpt-oss-120b at ~3 000 tokens/s — a distinctive speed advantage for autonomous agents that need low latency while keeping data local.
On the tooling side, GitHub and VS Code add agent-governance features — per-agent tool allowlists, task-plan visibility, checkpoints, system notifications, Git worktrees, and path-scoped instructions. Copilot trims its model lineup (GPT-4o removed) and previews Opus 4.1 in Ask-only mode. Cursor focuses on steerability, observability, and native terminal support, and ships a multi-IDE CLI.

GenAI Adoption Training for Test Automation Engineers

Transform your approach to automation with advanced AI skills for effective testing and code enhancement.

View offer

GPT-5: a system, not a single model

SWE-bench verified software engineering graph

OpenAI framed GPT-5 as a system rather than a monolithic model. It combines a low-latency "smart and fast" model for most queries, a "deeper reasoning" model for complex tasks, and a real-time router that dispatches between them based on conversation type, complexity, tool needs, and explicit intent.

"GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent."

Table: model progressions (previous → GPT-5)

Previous model	GPT-5 model
OpenAI o4-mini	gpt-5-thinking-mini
OpenAI o3 Pro	gpt-5-thinking-pro
OpenAI o3	gpt-5-thinking
GPT-4o-mini	gpt-5-main-mini
GPT-4o	gpt-5-main
GPT-4.1-nano	gpt-5-thinking-nano

Sources:

GPT-OSS goes multi-cloud

Comparison of API providers for gpt-oss-120B high model

OpenAI’s GPT-OSS models are now available on both AWS and GCP in addition to Azure — the first time OpenAI models can be deployed across both clouds. If your application runs on AWS, you can select models from Anthropic, OpenAI, or open-weights models. On GCP, you gain access to all major model providers. For enterprise teams, this expands deployment flexibility and reduces lock-in when standardizing AI across environments.

Cerebras is an inference service provider that builds custom ASIC-based wafer-scale systems to accelerate LLM inference. Historically, because it did not produce its own models, Cerebras primarily served open-weight models. With the GPT-OSS model family, Cerebras now runs OpenAI’s gpt-oss-120B at world‑record speeds of about 3,000 tokens/second on the Cerebras AI Inference Cloud, materially lowering end-to-end latency for agentic coding workflows.

This level of speed isn’t necessary for chat interfaces, since humans can only read about 15 tokens per second. It also doesn’t benefit collaborative agents much, as they’re limited by the time it takes for a human to approve actions. The real advantage is for autonomous coding agents that can take on a task and complete it end to end without human intervention.

GPT-OSS 20B also enables a true local coding setup. Previously, achieving strong coding quality typically required cloud inference, sending repository data to a provider and back. With GPT-OSS 20B, teams can run inference locally with good results and keep codebases local. The trade-off remains: prioritize data isolation and local processing with slightly reduced intelligence, or use frontier models for peak coding quality while accepting cloud-based inference.

Sources:

Anthropic upgrades Opus to v4.1

Anthropic has released an incremental update to the Opus model, moving from version 4.0 to 4.1. Performance on benchmarks is similar or slightly improved compared to the previous version. Pricing remains unchanged. The main benefits are increased precision and reliability, which support more complex and longer agentic workflows.

Source: https://www.anthropic.com/news/claude-opus-4-1

GitHub

1. VS Code July release

GPT-5 Public Preview

OpenAI’s latest frontier model, GPT-5, is rolling out in Copilot. Copilot Enterprise and Business administrators must opt in by enabling the new GPT-5 policy in Copilot settings.

Chat Checkpoints

Checkpoints allow you to restore different states of your chat conversations. You can revert edits and return to specific points in your chat. Selecting a checkpoint in VS Code reverts both workspace changes and chat history to that point.

Interactive Tool Picker

Previously, the agent had access to all built-in VS Code tools and those exported by enabled MCP servers. Now, you have fine-grained control over which tools a specific agent can use. A new drop-down menu with a tree structure allows you to enable or disable individual tools. This reduces context window usage by excluding irrelevant tools and increases accuracy by letting the agent choose from a shorter list.

Task Lists

The agent decomposes assigned tasks into actionable steps and executes them sequentially. You can now view these steps after the agent finishes planning, and the task list updates as the agent executes the plan.

System Notifications for User Approval

Agents may run for extended periods and sometimes require user approval to execute tools. VS Code now sends a system notification when your input is needed, allowing you to approve or decline the action and then continue with your work.

Currently, you need to switch to VS Code to approve or decline actions after receiving a notification. In future versions, you’ll be able to take action directly from the notification.

Worktrees Support in VCS

VS Code now supports git worktrees in the Source Control Repositories view. This allows you to have several agents working in parallel on a single codebase by checking out separate branches to the same working copy. You can list available worktrees and open one as needed.

Start Coding Agent from the Chat

You can now manage a coding agent session from a dedicated chat editor. This allows you to follow the progress of the coding agent, provide follow-up instructions, and view the agent's responses in the same editor.

You can now view all running agent sessions (local and remote) in a single view.

This improves integration with autonomous coding agents and increases their steerability.

Source: https://code.visualstudio.com/updates/v1_103

2. GPT-4o removed from Copilot Chat

Starting 6 Aug 2025, GPT-4o is no longer available in Copilot Chat; GPT-4.1 is now the default. Code completions still use GPT-4o. Teams relying on GPT-4o-specific behavior should retest prompts.

Source: https://github.blog/changelog/2025-08-06-deprecation-of-gpt-4o-in-copilot-chat/

3. Opus 4.1 preview in Copilot

Anthropic has released Opus 4.1, an incremental update to its largest model. It is available in VS Code for Enterprise and Pro+ plans; Business licenses are excluded.

Opus 4.1 excels at agentic tasks, but Copilot offers it only in Ask-only mode, limiting its benefits for autonomous workflows.

Source: https://github.blog/changelog/2025-08-05-anthropic-claude-opus-4-1-is-now-in-public-preview-in-github-copilot/

4. Granular path-based agent instructions

Copilot Coding Agent now supports path-scoped instructions. Place .instructions.md files under .github/instructions with YAML front matter that targets specific files or directories. This enables directory-level guidance without bloating prompts.

Source: https://github.blog/changelog/2025-07-23-github-copilot-coding-agent-now-supports-instructions-md-custom-instructions/

Cursor

1. Cursor update: steerability and control

Agent Steerability During Sessions

The agent can run for extended periods and is now most steerable during these times. You can use Option+Enter to queue a message, which the agent will process at the next suitable point, typically after a tool call. Alternatively, use Cmd+Enter to send a message immediately, interrupting the agent's current activity. This allows you to guide the process without stopping and restarting the session.

Sidebar Panel Shows All Agents

You can now view all agents—both autonomous agents running remotely and collaborative agents running locally—in a single sidebar panel.

Faster agent startup time

Background agents startup time substantially improved - from approximately 80s to 20s.

Agents Can Use Native Terminal

Agents can now use your native terminal. Previously, commands were run in the chat window, so you could not see the commands and their output in your usual terminal.

Observability: Context and Usage Limits

There are two improvements to observability. You can now see the context usage for a specific chat session, as well as track how much of your overall usage limit you have spent in the current month.

Source: https://cursor.com/changelog/1-4

2. Cursor CLI: multi-IDE support

Cursor has released a CLI version of its agent for the terminal, JetBrains IDEs, Visual Studio, and CI/CD pipelines.

The CLI offers a different model set—Opus 4.1, GPT-5, and Claude Sonnet 4.0—and integrates with your Cursor subscription to give unified access to OpenAI and Anthropic models.

The tool is in beta, lacks some features found in Codex CLI and Claude Code, and its security guardrails are still evolving, so test in a restricted environment.

Source: https://cursor.com/blog/cli

The views expressed in the articles on this site are solely those of the authors and do not necessarily reflect the opinions or views of EngX Space or its members.