What Makes SFA Architecture Unique
Single-File Agents (SFAs) represent an approach to AI tool development that prioritizes simplicity, portability, and complete functionality within a single Python file. Unlike traditional multi-file applications that require complex dependencies and configuration, SFAs encapsulate everything needed for operation in one coherent unit.
Core Architectural Principles
Self-Contained Design
The entire agent operates from a single Python file, eliminating dependency management challenges and making deployment dramatically simpler. That same file, the 'SFA', accrues more tools over time and can develop its own workflows once a flow has been activated; at each phase of its ever-changing loops, the agent pauses to decide its next move.
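The phase-and-loop pattern described above can be sketched in a few lines. This is a hypothetical minimal shape, not the actual SFA code: `decide_next` stands in for the LLM's decision step, and `act` for whatever tool it invokes.

```python
# Hypothetical sketch of an SFA phase loop: the agent runs a bounded
# series of loops, deciding its next move at each step.
def run_phase(decide_next, act, max_loops=20):
    """Run one phase; `decide_next` returns an action name or None to stop."""
    history = []
    for _ in range(max_loops):
        action = decide_next(history)
        if action is None:  # the agent decides the phase is complete
            break
        history.append(act(action))
    return history
```

The `max_loops` cap mirrors the bounded phases mentioned later in this document, keeping any single phase from running away.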
Variable Configuration
SFAs use a modular JSON configuration approach that allows quick adaptation to different use cases without modifying the core codebase. The JSON is really just a normal prompt you'd send to a chat LLM, broken into structured categories, with defaults that tell the agent where to find context for its task and where to output the results of its work.
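An illustrative configuration might look like the following; the field names here are hypothetical, but they follow the pattern just described: structured prompt categories plus paths for context and output.

```python
import json

# Illustrative SFA configuration (field names are assumptions, not a spec):
# prompt categories plus paths telling the agent where to read context
# and where to write its results.
config_text = """
{
  "system_prompt": "You are a research assistant.",
  "task_prompt": "Summarize the quarterly reports.",
  "context_path": "./context/",
  "output_path": "./results/summary.md"
}
"""
config = json.loads(config_text)
```

Swapping this file out is all it takes to point the same agent at an entirely different job.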
Function-First Approach
The architecture prioritizes clear functional boundaries that make extending agent capabilities straightforward while maintaining core stability. The idea is to take any cognitive effort that isn't directly related to the task at hand and move it into a function in the code, designed in a way that implies the right behavior rather than dictating it.
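One common way to realize this, sketched here with hypothetical names, is a small tool registry: routine work lives in plain functions, and the LLM's only job is to pick which one to call.

```python
# Sketch: routine work lives in plain functions; the LLM only chooses
# which tool to invoke. All names here are illustrative.
TOOLS = {}

def tool(fn):
    """Register a function as an agent tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def count_words(text: str) -> int:
    return len(text.split())

def dispatch(name: str, *args):
    """The LLM's choice arrives as a tool name; the code does the rest."""
    return TOOLS[name](*args)
```

Adding a capability is then just writing one more decorated function, which keeps the core stable.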
LLM Integration
SFAs leverage large language models through standardized API calls, providing a consistent interface while allowing different models to be used interchangeably. The SFA can take an orchestrating approach to a task flow: looking at the big picture and delegating tasks to other agents, which are sometimes called as tools and other times run in parallel. It also has the option to work in sync, collaborating and communicating with another model of your choice in real time.
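The fan-out delegation pattern can be sketched with the standard library; the "sub-agents" below are stand-in functions, not real model calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of orchestration: a lead agent fans a task out to sub-agents
# (stand-in functions here) running in parallel, then gathers results.
def summarizer(doc):
    return doc[:10]  # placeholder for a real summarizing sub-agent

def tagger(doc):
    return sorted(set(doc.split()))  # placeholder for a tagging sub-agent

def orchestrate(doc, sub_agents):
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, doc) for name, fn in sub_agents.items()}
        return {name: f.result() for name, f in futures.items()}
```

In a real SFA each stand-in would be an API call to another model, but the orchestration shape is the same.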
Key Components
Configuration Handler
The configuration system processes input JSON to determine agent behavior, allowing runtime customization without code changes. That is to say, nothing other than a new configuration file is needed to completely change the agent's behavior: from deep research, to reviewing images and creating metadata for them, to sending updates to different Discord and Slack channels as needed.
API Connector
A standardized interface for communicating with various LLM services (OpenAI, Anthropic, etc.) that abstracts away provider-specific implementation details. In fact, tools like LiteLLM offer a single API call, or use of their SDK, to access over 100 different models. So the next time you need a 1M-token context window for a giant document you can call in Gemini 2.5 Pro, which is still free at the time of writing, or call in o3-mini (high) and get an interactive digital animation of a ball bouncing around the inside of a 4D hypercube with perfect physics.
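The dispatch idea behind such a connector can be sketched in pure Python. This is not LiteLLM's actual API, just an assumed minimal shape: model names are routed by prefix to per-provider call functions hidden behind one interface.

```python
# Minimal provider-dispatch sketch (names and prefixes are assumptions):
# each provider gets one call function; the rest of the agent only ever
# calls `complete`.
def call_openai(model, messages):
    return f"[openai:{model}] ok"  # placeholder for the real API call

def call_anthropic(model, messages):
    return f"[anthropic:{model}] ok"  # placeholder for the real API call

PROVIDERS = {"gpt": call_openai, "claude": call_anthropic}

def complete(model, messages):
    """Route a request to the right provider based on the model prefix."""
    for prefix, fn in PROVIDERS.items():
        if model.startswith(prefix):
            return fn(model, messages)
    raise ValueError(f"unknown model: {model}")
```

Swapping providers then means touching the placeholder functions, never the agent logic that calls `complete`.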
Prompt Engineering Layer
A sophisticated system for constructing and managing prompts that incorporates variables, templates, and context management. Whether the agent is opting to add another phase of up to 20 loops and stay in its current context to keep working, or writing out details for the next LLM via its phase-summary output document, every layer that takes an SFA from generic to specific is a layered, carefully crafted prompt.
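At its simplest, such a layer is a template plus runtime variables; the placeholder names below are illustrative, not the SFA's actual prompt fields.

```python
from string import Template

# Sketch of a prompt layer: a template plus runtime variables produces
# the final prompt. Placeholder names are illustrative assumptions.
PHASE_PROMPT = Template(
    "You are in phase $phase of $total. Prior phase summary:\n$summary\n"
    "Continue the task: $task"
)

def build_prompt(phase, total, summary, task):
    return PHASE_PROMPT.substitute(
        phase=phase, total=total, summary=summary, task=task
    )
```

Each loop can re-render the same template with the latest phase summary, which is how a generic template becomes a task-specific prompt.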
Data Processing Pipeline
Handles input and output formatting, ensuring consistent data flow throughout agent operations. From documents to imagery to Markdown and JSON, the SFA can process and output a wide variety of formats thanks to how it is constructed.
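The output half of such a pipeline can be sketched as one renderer that takes a single internal shape and emits the requested format; the format names here are assumptions.

```python
import json

# Sketch of the output side of an I/O pipeline: one internal dict shape,
# rendered into whichever format the config requests. Format names are
# illustrative.
def render(result: dict, fmt: str) -> str:
    if fmt == "json":
        return json.dumps(result, indent=2)
    if fmt == "markdown":
        return "\n".join(f"- **{k}**: {v}" for k, v in result.items())
    raise ValueError(f"unsupported format: {fmt}")
```

Keeping one internal shape means adding a new output format is a single extra branch, not a rewrite.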
Result Manager
Processes responses from the LLM and formats them according to project specifications. Again, we're taking cognitive load off our lead LLM wherever we can. Even activating a new workflow with the setup script produces a full README document describing the workflow, without our LLM, or a human, having to do any of the heavy lifting.
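The README-generation step can be sketched as plain templating over the workflow's configuration; the config field names are the same hypothetical ones used earlier and are assumptions, not the SFA's actual schema.

```python
# Sketch: the setup step writes a README describing the workflow from its
# config, so neither the LLM nor a human has to. Field names are
# illustrative assumptions.
def workflow_readme(config: dict) -> str:
    return (
        f"# Workflow: {config['name']}\n\n"
        f"{config['description']}\n\n"
        f"- Context: `{config['context_path']}`\n"
        f"- Output: `{config['output_path']}`\n"
    )
```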
Implementation Guidelines
When implementing the SFA architecture, follow these guidelines:
- Maintain the single-file principle even when tempted to split functionality
  - Validate the SFA by coming up with two completely different workflows
  - Run the workflows back to back; a truly modular system will handle both with ease
- Use the standard JSON configuration format for all variable aspects
  - Each configuration is nothing more than a path for context, a location for a system prompt, and another for the task prompt
  - Better yet, describe what you need done to a chat LLM and let it fill out the JSON template for you
- Implement the core functional interfaces consistently
  - Does the SFA code look a little boring? Good
  - With each addition, refactoring, or change, review the codebase in its entirety
- Keep API-specific code isolated and abstracted
  - At first it might not seem easy to keep specifics out of the main SFA
  - Give it time and contemplation and you'll soon see that the logic is a lot easier than you thought
- Document all functions and components thoroughly
  - Not just one technical document, but a few at different levels of detail, comprehension, and complexity
  - Create a rule wherever possible to end any work on the SFA by updating all documents in the project directory
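Put together, a single file following these guidelines might be shaped like the skeleton below. Every name in it is hypothetical; it only illustrates how the components described above (configuration handler, API connector, prompt layer, result manager) sit side by side in one file.

```python
"""research_agent.py - a hypothetical single-file agent skeleton."""
import json
import sys

def load_config(path):
    """Configuration handler: all variable behavior comes from JSON."""
    with open(path) as f:
        return json.load(f)

def call_llm(model, prompt):
    """API connector: placeholder standing in for a real provider call."""
    return f"[{model}] response"

def build_prompt(config):
    """Prompt engineering layer: combine the configured prompt categories."""
    return f"{config['system_prompt']}\n\n{config['task_prompt']}"

def save_result(config, text):
    """Result manager: write output where the config says to."""
    with open(config["output_path"], "w") as f:
        f.write(text)

def main(config_path):
    config = load_config(config_path)
    prompt = build_prompt(config)
    result = call_llm(config.get("model", "gpt-4o"), prompt)
    save_result(config, result)

if __name__ == "__main__":
    main(sys.argv[1])
```

Note that nothing provider- or workflow-specific lives in `main`: changing the workflow means handing the script a different JSON file, exactly as the guidelines prescribe.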