Desktop Automation (Windows)
List, focus, minimize, maximize, restore, or close top-level windows.
windows_window
Gives the agent control of any visible desktop window, much as Alt-Tab and the title bar buttons do. The agent picks a window by a fragment of its title (the 'list' op helps disambiguate when several windows match), then asks to focus, minimize, maximize, restore, or close it, and you approve each step before it runs.
Implements: list (process name + PID + visible title), focus (bring to foreground via SetForegroundWindow), minimize/maximize/restore (ShowWindow), close (WM_CLOSE). User32 is invoked via PowerShell P/Invoke, so the only runtime dependency is PowerShell itself, with no native libraries to install. Titles are matched as case-insensitive substrings and the first match is acted on, so be specific or use list to disambiguate. Pass apply=true to execute; otherwise the tool returns a dry-run plan.
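The matching and dry-run rules above can be sketched in a few lines of Java. This is a minimal illustration, not the code in WindowsWindowTool: the WindowInfo record, the WindowTitleMatcher class, and the wording of the plan strings are all hypothetical names chosen for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical types for illustration only; not taken from WindowsWindowTool.
record WindowInfo(String processName, long pid, String title) {}

final class WindowTitleMatcher {

    /** Returns the first window whose title contains the fragment, ignoring case. */
    static Optional<WindowInfo> firstMatch(List<WindowInfo> windows, String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        return windows.stream()
                .filter(w -> w.title().toLowerCase(Locale.ROOT).contains(needle))
                .findFirst();
    }

    /** Returns every matching window, which is what a 'list' call lets you inspect. */
    static List<WindowInfo> allMatches(List<WindowInfo> windows, String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        return windows.stream()
                .filter(w -> w.title().toLowerCase(Locale.ROOT).contains(needle))
                .toList();
    }

    /** Describes the action; the text is only marked as executed when apply is true. */
    static String plan(String operation, WindowInfo target, boolean apply) {
        String prefix = apply ? "EXECUTE" : "DRY-RUN";
        return prefix + ": " + operation + " \"" + target.title() + "\" ("
                + target.processName() + ", PID " + target.pid() + ")";
    }

    public static void main(String[] args) {
        List<WindowInfo> open = List.of(
                new WindowInfo("EXCEL", 4120, "Budget 2024.xlsx - Excel"),
                new WindowInfo("chrome", 9001, "Team budget review - Google Chrome"));
        // Both titles contain "budget"; only the first one is acted on.
        firstMatch(open, "budget")
                .map(w -> plan("maximize", w, false))
                .ifPresent(System.out::println);
    }
}
```

In this sample both windows match "budget", so the Excel window is chosen only because it comes first. A more specific fragment such as "Budget 2024" removes the ambiguity.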
When a user asks:
Bring the Excel budget workbook to the front and maximize it.
the agent calls the tool:
windows_window(operation="focus", title="Budget", apply=true)
then
windows_window(operation="maximize", title="Budget", apply=true)
and gets back: two approval prompts, then the matching window is brought forward and maximized.
Set these before calling the tool. Values marked required must be present or the tool call will fail.
swarmai.tools.windows.enabled (required): Master switch for the Windows tool category.
swarmai.tools.windows.window.timeout (optional): Per-PowerShell-invocation timeout for window ops. Default 10s.
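To show what a per-invocation timeout could mean in practice, here is a hedged Java sketch of launching PowerShell with a bounded wait. It is an assumption about the general approach, not the tool's actual code: the class and method names are hypothetical, the use of -EncodedCommand (to sidestep quoting issues) is a choice made for this example, and the ShowWindow flags are the standard Win32 values SW_MAXIMIZE (3), SW_MINIMIZE (6), and SW_RESTORE (9), which the real tool may or may not use.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are not taken from WindowsWindowTool.
final class PowerShellInvocationSketch {

    /** Mirrors the documented default of 10s. */
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(10);

    /** Maps an operation name to a Win32 ShowWindow nCmdShow value. */
    static int showWindowFlag(String operation) {
        return switch (operation) {
            case "maximize" -> 3; // SW_MAXIMIZE
            case "minimize" -> 6; // SW_MINIMIZE
            case "restore" -> 9;  // SW_RESTORE
            default -> throw new IllegalArgumentException("Unsupported operation: " + operation);
        };
    }

    /** Builds a script that P/Invokes user32!ShowWindow for a window handle. */
    static String showWindowScript(long hwnd, int flag) {
        String member = "[DllImport(\"user32.dll\")] "
                + "public static extern bool ShowWindow(IntPtr hWnd, int nCmdShow);";
        return "Add-Type -Name U32 -Namespace Sketch -MemberDefinition '" + member + "'; "
                + "[void][Sketch.U32]::ShowWindow([IntPtr]" + hwnd + ", " + flag + ")";
    }

    /** Wraps a script as a PowerShell command line; -EncodedCommand expects Base64 of UTF-16LE. */
    static List<String> buildCommand(String script) {
        String encoded = Base64.getEncoder()
                .encodeToString(script.getBytes(StandardCharsets.UTF_16LE));
        return List.of("powershell.exe", "-NoProfile", "-NonInteractive",
                "-EncodedCommand", encoded);
    }

    /** Runs the command and returns its exit code; kills it and throws once the timeout elapses. */
    static int runWithTimeout(List<String> command, Duration timeout)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // avoid blocking on a full pipe
                .start();
        if (!process.waitFor(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
            process.destroyForcibly();
            throw new IllegalStateException(
                    "PowerShell call exceeded " + timeout.toSeconds() + "s");
        }
        return process.exitValue();
    }
}
```

On a Windows machine, a call such as runWithTimeout(buildCommand(showWindowScript(hwnd, showWindowFlag("maximize"))), DEFAULT_TIMEOUT) would attempt to maximize the window with the given handle. Obtaining that handle is left out of the sketch.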
Wire this tool into a SwarmAI crew. Use the YAML DSL for declarative workflows, or the Java builder API when you want full programmatic control.
YAML DSL
# focus-and-tile.yaml
name: focus-and-tile-crew
process: SEQUENTIAL
agents:
  - id: operator
    role: Window Operator
    goal: Find a target window and bring it forward
    tools:
      - windows_window
tasks:
  - id: focus-task
    agent: operator
    description: Find a window matching 'Budget', bring it to the foreground, and maximize it.
Java
import ai.intelliswarm.swarmai.agent.Agent;
import ai.intelliswarm.swarmai.task.Task;
import ai.intelliswarm.swarmai.swarm.Swarm;
import ai.intelliswarm.swarmai.swarm.SwarmOutput;
import ai.intelliswarm.swarmai.process.ProcessType;
import ai.intelliswarm.swarmai.tool.windows.WindowsWindowTool;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;

@Autowired ChatClient chatClient;
@Autowired WindowsWindowTool windowsWindowTool;

Agent operator = Agent.builder()
    .role("Window Operator")
    .goal("Find a target window and bring it forward")
    .chatClient(chatClient)
    .tool(windowsWindowTool)
    .build();

Task operatorTask = Task.builder()
    .description("Find a window matching 'Budget', focus it, then maximize.")
    .agent(operator)
    .build();

SwarmOutput result = Swarm.builder()
    .agent(operator)
    .task(operatorTask)
    .process(ProcessType.SEQUENTIAL)
    .build()
    .kickoff();
Real scenarios where agents put this tool to work.
Implementation lives at swarmai-tools/src/main/java/ai/intelliswarm/swarmai/tool/windows/WindowsWindowTool.java in the swarm-ai repository.