Run interactive commands in Gemini CLI

Original link: https://developers.googleblog.com/en/say-hello-to-a-new-level-of-interactivity-in-gemini-cli/

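## Gemini CLI now supports interactive shell commands

Gemini CLI has been significantly enhanced with pseudo-terminal (PTY) support, which lets users run complex interactive commands *directly* in the CLI, with no need to switch to a separate terminal. Tools such as `vim`, `top`, and interactive `git rebase -i` now run seamlessly.

Previously, running these commands outside Gemini lost valuable context. Now a virtual terminal streams a live, two-way shell session to your screen, capturing text, colors, and cursor position. Input is sent directly to the process, and the terminal adapts to window resizing.

The feature is enabled by default as of v0.9.0 (upgrade with `npm install -g @google/gemini-cli@latest`). It improves the workflow by keeping everything within Gemini's context and providing a more natural command-line experience. The team encourages feedback on its GitHub repository as it continues to refine the integration.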

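## Gemini CLI: mixed reviews

The launch of interactive commands in Google's Gemini CLI sparked a lot of discussion, but the reaction was negative overall. While the feature itself, which allows shell use inside the CLI, is seen as a valuable step forward, many users report that the Gemini model itself is unreliable.

Common complaints include Gemini struggling with basic tasks such as file paths, repeatedly forgetting system instructions, and over-commenting code. Some users highlighted a frustrating problem: hitting the rate limit forces a switch to a less capable model (Flash), with no easy way to switch back.

Many feel Gemini CLI lags behind competitors such as Claude Code, even though Gemini's web interface offers better model quality. Some speculate that Google prioritizes résumé-building features over real user needs. Others feel the tool is more about keeping up appearances than delivering a truly useful product.

Despite the criticism, people still hope Gemini 3.0 will improve the agentic coding experience, and some see potential in the CLI's integration with tools such as node-pty for specific, short-lived tasks.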
Original text

We're excited to announce an enhancement to Gemini CLI that makes your workflow more powerful and familiar. We've upgraded the terminal so you can run complex, interactive commands, such as `vim` for editing, `top` for monitoring, or even an interactive `git rebase -i`, all directly within Gemini CLI. You no longer have to jump to a separate terminal or deal with an agentic CLI that "hangs" on interactive commands. Everything stays right where you are.

Keeping everything in context

This matters because everything now remains within Gemini CLI's context. Previously, you would have had to exit Gemini CLI to run interactive shell commands. More importantly, these commands were being run outside Gemini CLI's context. With pseudo-terminal (PTY) support, commands that require rich terminal capabilities, such as text editors, system monitors, or programs that rely on terminal control codes, can now all be run from within Gemini CLI and within its context.

How it works: Serializing the terminal state

Now, when you run a shell command, Gemini CLI spawns a new process within a pseudo-terminal in the background, leveraging the `node-pty` library. The PTY acts as an intermediary, providing the interface the operating system needs to recognize the session as a terminal. This allows applications and commands to run the way they were designed to.
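To make the role of the PTY concrete, here is a minimal sketch of the difference it makes to a child process. Gemini CLI itself uses `node-pty`; this example swaps in Python's standard-library `pty` module (Unix only) purely for illustration, and the helper names `run_with_pipe` and `run_with_pty` are hypothetical. The child simply reports whether it believes its output is a terminal.

```python
import os
import pty
import subprocess
import sys

# The child prints whether its stdout looks like a terminal.
CHILD = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]


def run_with_pipe() -> str:
    """Run the child with stdout connected to a plain pipe."""
    result = subprocess.run(CHILD, stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout.strip()


def run_with_pty() -> str:
    """Run the child with its standard streams attached to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            CHILD, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd
        )
    finally:
        # The parent only needs the master side of the pair.
        os.close(slave_fd)

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                # Linux reports EIO once the child has closed the slave side.
                break
            if not data:
                break
            chunks.append(data)
        proc.wait()
    finally:
        os.close(master_fd)
    return b"".join(chunks).decode().strip()


if __name__ == "__main__":
    print("pipe:", run_with_pipe())
    print("pty: ", run_with_pty())
```

Behind a pipe the child sees a non-terminal, which is why many full-screen programs refuse to start or fall back to a plain mode; behind a PTY the same child sees a terminal.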

So how does this virtual terminal running in the background show up on your screen? Think of it like a video stream. Our new serializer continuously takes snapshots of the pseudo-terminal, capturing every piece of text, every color, and even the cursor's position. These snapshots are then streamed to you, allowing you to see and interact with the terminal application in real time. It's not just a stream of text; it's a live feed.
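The snapshot idea can be sketched with a toy screen model. This is not Gemini CLI's serializer; it is an illustrative assumption of what "capturing text, color, and cursor position" could look like, and the `Screen` class, its `feed` and `snapshot` methods, and the snapshot layout are all hypothetical. It understands only plain text, carriage return, line feed, foreground colors (`ESC[30m` to `ESC[37m`, `ESC[0m`), and absolute cursor positioning (`ESC[row;colH`).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches a CSI escape sequence such as "\x1b[31m" or "\x1b[2;5H".
CSI = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")


@dataclass
class Cell:
    char: str = " "
    fg: Optional[int] = None  # ANSI foreground code (30-37), None for default


class Screen:
    """A tiny in-memory terminal grid (hypothetical, heavily simplified)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows, self.cols = rows, cols
        self.grid = [[Cell() for _ in range(cols)] for _ in range(rows)]
        self.row = self.col = 0
        self.fg: Optional[int] = None

    def feed(self, data: str) -> None:
        """Apply raw terminal output to the grid."""
        i = 0
        while i < len(data):
            match = CSI.match(data, i)
            if match:
                self._csi(match.group(1), match.group(2))
                i = match.end()
                continue
            ch = data[i]
            i += 1
            if ch == "\r":
                self.col = 0
            elif ch == "\n":
                self.row = min(self.row + 1, self.rows - 1)
            elif ch >= " " and self.col < self.cols:
                self.grid[self.row][self.col] = Cell(ch, self.fg)
                self.col += 1

    def _csi(self, params: str, final: str) -> None:
        nums = [int(p) if p else 0 for p in params.split(";")]
        if final == "m":  # colors
            for n in nums:
                if n == 0:
                    self.fg = None
                elif 30 <= n <= 37:
                    self.fg = n
        elif final == "H":  # cursor position, 1-based in the escape sequence
            row = nums[0] or 1
            col = (nums[1] if len(nums) > 1 else 0) or 1
            self.row = min(max(row, 1), self.rows) - 1
            self.col = min(max(col, 1), self.cols) - 1

    def snapshot(self) -> dict:
        """Serialize the visible state into plain, JSON-friendly data."""
        return {
            "cursor": {"row": self.row, "col": self.col},
            "lines": ["".join(c.char for c in r).rstrip() for r in self.grid],
            "colors": [
                [r, c, cell.fg]
                for r, line in enumerate(self.grid)
                for c, cell in enumerate(line)
                if cell.fg is not None
            ],
        }
```

Feeding the model the output `"hi\r\n\x1b[31mred\x1b[0m\x1b[1;1H"` on a 3x10 screen yields the lines `hi` and `red`, three red cells on the second row, and a cursor back at the top-left corner. A real serializer has to handle far more of the terminal protocol (scrolling, erasing, alternate screens, wide characters).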

Full two-way interaction

This new architecture enables two-way communication. We've added new capabilities to write input to the terminal and even resize it on the fly. When you type, your keystrokes are sent to the running process, and when you resize your window, the application inside Gemini's shell will adapt its layout, just like in a native terminal. You can focus on the terminal by pressing Ctrl+F.
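Both directions can be sketched at the operating-system level. Again, this uses Python's standard-library `pty` module on Unix as a stand-in for `node-pty`, and the names `set_window_size`, `interact`, and `CHILD_SRC` are hypothetical. Keystrokes are bytes written to the master side of the PTY, and the window size is set with the `TIOCSWINSZ` ioctl. For simplicity the sketch sets the size once before the child starts; a real integration would presumably issue the same request while the program is running.

```python
import fcntl
import os
import pty
import struct
import subprocess
import sys
import termios

# The child reports the terminal size it sees, then echoes one line of input
# back in upper case.
CHILD_SRC = (
    "import os, sys\n"
    "size = os.get_terminal_size(sys.stdout.fileno())\n"
    "print(f'size={size.columns}x{size.lines}')\n"
    "print('got=' + sys.stdin.readline().strip().upper())\n"
)


def set_window_size(fd: int, rows: int, cols: int) -> None:
    """Tell the kernel how large the pseudo-terminal is."""
    fcntl.ioctl(fd, termios.TIOCSWINSZ, struct.pack("HHHH", rows, cols, 0, 0))


def interact(rows: int, cols: int, keystrokes: bytes) -> str:
    """Run the child in a PTY of the given size and send it some input."""
    master_fd, slave_fd = pty.openpty()
    set_window_size(master_fd, rows, cols)
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SRC],
            stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
        )
    finally:
        os.close(slave_fd)  # the parent keeps only the master side

    output = b""
    try:
        # Input direction: what the user types is written to the master side.
        os.write(master_fd, keystrokes)
        # Output direction: everything the child prints is read back from it.
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                # Linux reports EIO once the child has closed the slave side.
                break
            if not chunk:
                break
            output += chunk
        proc.wait()
    finally:
        os.close(master_fd)
    return output.decode()


if __name__ == "__main__":
    print(interact(24, 100, b"hello\n"))
```

The output includes the terminal's own echo of the typed line alongside the child's responses, which is the same behavior you see in a native terminal.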

We've also improved our output handling to correctly render colorful terminal output, so you can enjoy your favorite command-line tools in all their glory.

Getting started with the interactive shell

The new interactive shell is enabled by default in Gemini CLI as of v0.9.0.

Upgrade to the latest version using the following command:

npm install -g @google/gemini-cli@latest

For more information, please refer to the official Gemini CLI documentation.

Here are a few examples of the type of commands you can now run with the interactive shell:

  • Edit code with vim, nvim or nano.
  • Manage your commits with interactive git commands.
  • Use interactive REPLs for your favorite languages.
  • Run full-screen terminal applications like htop or mc.
  • Effortlessly navigate interactive setup scripts like npm init or ng new.
  • Respond to interactive prompts for certain gcloud commands.

This is a major step for our shell integration, and we are actively working to refine input handling across all platforms. We encourage you to share your feedback on our GitHub repository if you encounter any inconsistencies.

Try it out and let us know what you think!
