iPhone agent
An iPhone agent is an AI system that can see a phone screen through screenshots, reason about what to do next, and operate the phone through taps, swipes, typing, and app switching. TapKit supplies the real-device control layer.

Device
Real iPhone
The agent works through the same App Store apps, accounts, notifications, and permissions a person sees.
Agent loop
See, decide, act
Capture the screen, let the model choose the next step, then execute taps, swipes, typing, and app navigation.
Interfaces
MCP, API, SDK
Use TapKit from Claude, Codex, custom agents, HTTP clients, or the Python SDK.
Definition
An iPhone agent is not just a chatbot on your phone. It is an agent that can observe the phone, decide on actions, and operate the same mobile interface a human would use.
That matters because many valuable workflows exist only inside mobile apps. Public APIs may be missing, limited, or delayed. Simulators cannot reproduce every production account, notification, app install, payment prompt, or verification flow.
TapKit turns your own iPhone into an API surface for agents. The model gets visual context from screenshots and streaming, while TapKit executes the actions on the physical device.
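The division of labor described here, where the model looks at a screenshot and a control layer executes the chosen action, can be sketched as a short loop. This is only a minimal illustration: `Device`, `Action`, `run_agent`, and the `decide` callback are hypothetical names invented for the example, not TapKit's actual SDK or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass
class Action:
    """One step the model wants performed (hypothetical shape)."""
    kind: str   # e.g. "tap", "swipe", "type", "open_app", or "done"
    args: dict  # action parameters, such as coordinates or text


class Device(Protocol):
    """Hypothetical stand-in for a phone-control layer; not TapKit's real API."""

    def screenshot(self) -> bytes: ...

    def perform(self, action: Action) -> None: ...


def run_agent(
    device: Device,
    decide: Callable[[bytes, List[Action]], Action],
    max_steps: int = 20,
) -> List[Action]:
    """See, decide, act: repeat until the model says "done" or steps run out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = device.screenshot()      # see: capture the current screen
        action = decide(screen, history)  # decide: the model picks the next step
        if action.kind == "done":
            break
        device.perform(action)            # act: execute it on the device
        history.append(action)
    return history
```

The `max_steps` cap is a design choice for the sketch: it keeps a confused model from looping on a real device indefinitely. In practice `decide` would wrap a call to whichever model the agent uses.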
Capabilities
Open apps, navigate feeds, dismiss popups, type into fields, tap controls, and move across apps when the workflow needs it.
Work through push prompts, SMS codes, iCloud files, Apple Pay checks, and app-specific UI that never appears on the public web.
Watch the phone stream, inspect session history, interrupt the agent, or hand the phone back to a human when needed.
Start with MCP in an agent client, then graduate to the REST API or Python SDK for repeatable production workflows.
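The supervision idea above (inspect session history, interrupt the agent, hand the phone back to a human) can be illustrated with a small sketch. Everything here is an assumption made for the example: `Session`, `interrupt`, and `run_supervised` are hypothetical names and do not describe how TapKit implements sessions or interrupts.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Session:
    """Hypothetical session record: a step log plus a stop flag a human can set."""
    history: List[str] = field(default_factory=list)
    stop: threading.Event = field(default_factory=threading.Event)

    def interrupt(self) -> None:
        """Ask the agent to stop before its next step."""
        self.stop.set()


def run_supervised(session: Session, steps: List[Callable[[], str]]) -> str:
    """Run steps in order, checking the stop flag before each one.

    Returns "handed_off" if a human interrupted, otherwise "completed".
    """
    for step in steps:
        if session.stop.is_set():
            return "handed_off"
        # Log a description of each executed step so it can be inspected later.
        session.history.append(step())
    return "completed"
```

Checking the flag between steps, rather than mid-action, means an interrupt never leaves a tap or text entry half finished; the step already in progress completes and is logged before control passes to the human.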
Fit
Positioning
On-device assistants are useful when the phone itself owns the agent experience. TapKit is different: it lets the agent you are already building use a real iPhone as an external tool.
That means Claude, Codex, custom computer-use agents (CUA), eval harnesses, QA systems, and internal automation workers can all use the same phone-control layer without being rebuilt as an iOS app.