FixLoop: AI Agent Loop for Self-Correcting Code
An AI agent that writes code, runs tests, spots failures, and retries until everything passes. Supports OpenAI, Gemini, Groq, and Ollama.
What It Does
FixLoop writes code from a natural-language description, runs pytest to verify it, feeds the error output from any failing tests back to the LLM, and retries until all tests pass.
$ python main.py --challenge fibonacci
┌────────────────────────────────────────────┐
│ Challenge: fibonacci                       │
│ Trying iteration 1...                      │
│   2/3 tests passed                         │
│   Feeding errors back to LLM...            │
│ Trying iteration 2...                      │
│   3/3 tests passed!                        │
│ Saved to solutions/fibonacci.py            │
└────────────────────────────────────────────┘
Tech Stack
| Component | Technology |
|---|---|
| Language | Python 3.10+ |
| LLMs | OpenAI SDK, google-genai SDK, Groq SDK, Ollama |
| Testing | pytest |
The LLM client abstraction layer supports 4 providers behind a common generate() interface.
Architecture
main.py              # argparse CLI entry point
fixloop/
├── runner.py        # FixLoop class: tempdir → generate → test → loop
├── coder.py         # LLM prompt to write code from description
├── tester.py        # Runs pytest --tb=short, parses results
├── debugger.py      # LLM prompt with code + errors to fix bugs
├── llm.py           # LLMClient abstraction (4 providers)
├── utils.py         # Strip markdown code fences from LLM output
└── challenges.py    # Challenge dataclass + built-in challenges dict
Each module is under 100 lines, a deliberate design goal for maintainability.
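As an illustration of the tester's job (run pytest with `--tb=short`, then parse the results), here is a minimal sketch. The function names `run_pytest` and `parse_pytest_summary` are hypothetical and not taken from the project; the parser assumes pytest's usual final summary line, such as `1 failed, 2 passed in 0.12s`.

```python
import re
import subprocess
import sys

# Matches count/outcome pairs such as "2 passed", "1 failed", or "2 errors".
_COUNT_RE = re.compile(r"(\d+) (passed|failed|errors?)\b")


def run_pytest(test_path: str) -> tuple[int, str]:
    """Run pytest on one file and return (exit code, combined stdout + stderr)."""
    proc = subprocess.run(
        [sys.executable, "-m", "pytest", "--tb=short", test_path],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


def parse_pytest_summary(output: str) -> dict[str, int]:
    """Extract passed/failed/error counts from the last non-empty output line."""
    counts = {"passed": 0, "failed": 0, "error": 0}
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    if not lines:
        return counts
    for number, outcome in _COUNT_RE.findall(lines[-1]):
        # pytest prints "error" or "errors"; fold both into one key.
        key = "error" if outcome.startswith("error") else outcome
        counts[key] += int(number)
    return counts
```

Relying on the last line keeps the parser small; a sturdier version might also consult pytest's exit code, which is 0 only when every collected test passes.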
The FixLoop Algorithm
1. Create temp directory
2. Prompt LLM to write {filename}.py from the challenge description
3. Run pytest on the generated file
4. Parse pytest output for passed/failed/error counts
5. If all pass → save solution, exit 0
6. If any fail → prompt LLM (debugger) with:
   - Generated code
   - pytest error output (stdout + stderr)
7. LLM returns fixed code (may be the same as before)
8. Go to step 3, up to max_iterations
9. If exhausted → save best attempt, exit 1
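The steps above can be sketched as a single function. This is an illustrative outline, not the project's `runner.py`: `fix_loop`, `RunReport`, and the three injected callables are hypothetical names, and "best attempt" is assumed here to mean the attempt with the most passing tests.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class RunReport:
    passed: int
    failed: int   # failures and errors combined, for simplicity
    output: str   # raw pytest output to feed back to the LLM


def fix_loop(
    write_code: Callable[[], str],
    run_tests: Callable[[Path], RunReport],
    fix_code: Callable[[str, str], str],
    filename: str,
    max_iterations: int = 5,
) -> tuple[bool, str, int]:
    """Return (success, code, iterations used)."""
    with tempfile.TemporaryDirectory() as tmp:            # step 1
        target = Path(tmp) / f"{filename}.py"
        code = write_code()                               # step 2
        best_code, best_passed = code, -1
        for iteration in range(1, max_iterations + 1):
            target.write_text(code)
            report = run_tests(target)                    # steps 3-4
            if report.passed > best_passed:
                best_code, best_passed = code, report.passed
            if report.failed == 0 and report.passed > 0:  # step 5
                return True, code, iteration
            if iteration < max_iterations:                # steps 6-8
                code = fix_code(code, report.output)
    return False, best_code, max_iterations               # step 9
```

A caller would save the returned code and map the boolean to exit status 0 or 1. Passing the coder, tester, and debugger in as callables keeps the loop testable without a live LLM.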
Built-in Challenges
| Challenge | Description |
|---|---|
| fibonacci | Return nth Fibonacci number |
| prime_checker | Check if a number is prime |
| valid_parentheses | Validate balanced parentheses |
| two_sum | Find two indices summing to target |
| valid_sudoku | Validate a 9x9 Sudoku board |
Each challenge includes:
- A natural-language description (fed to the coder)
- 3+ pytest test cases (fed to the tester)
- An entry point function name
Custom challenges can be added via the Challenge dataclass.
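A custom challenge might look like the sketch below. The field names (`name`, `description`, `tests`, `entry_point`), the `CHALLENGES` dict, and the `register` helper are assumptions inferred from the three items listed above; the real `Challenge` dataclass in `challenges.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    # Hypothetical fields; check challenges.py for the actual definition.
    name: str
    description: str   # natural-language prompt fed to the coder
    tests: str         # pytest source fed to the tester
    entry_point: str   # function name the tests import


CHALLENGES: dict[str, Challenge] = {}


def register(challenge: Challenge) -> None:
    """Add a challenge to the lookup used by --challenge (assumed mechanism)."""
    CHALLENGES[challenge.name] = challenge


register(Challenge(
    name="reverse_string",
    description="Write reverse_string(s) that returns the string s reversed.",
    tests=(
        "from reverse_string import reverse_string\n\n"
        "def test_basic():\n"
        "    assert reverse_string('abc') == 'cba'\n\n"
        "def test_empty():\n"
        "    assert reverse_string('') == ''\n\n"
        "def test_palindrome():\n"
        "    assert reverse_string('level') == 'level'\n"
    ),
    entry_point="reverse_string",
))
```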
Provider Support
# OpenAI (default)
python main.py --challenge fibonacci
# Gemini
python main.py --challenge two_sum --provider gemini
# Groq
python main.py --challenge valid_parentheses --provider groq
# Ollama (local)
python main.py --challenge prime_checker --provider ollama --model llama3.1
The LLMClient abstraction wraps different SDKs behind a unified generate(system_prompt, user_prompt) interface, making provider switching transparent.
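One way to structure such an abstraction is shown below. Only the `generate(system_prompt, user_prompt)` signature comes from the description above; `EchoClient`, `make_client`, and the `_PROVIDERS` registry are hypothetical, and the example uses an offline stand-in rather than a real SDK so it runs without API keys.

```python
from abc import ABC, abstractmethod
from typing import Callable


class LLMClient(ABC):
    """Common interface every provider wrapper implements."""

    @abstractmethod
    def generate(self, system_prompt: str, user_prompt: str) -> str:
        ...


class EchoClient(LLMClient):
    """Offline stand-in for a real provider; always returns a canned reply."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def generate(self, system_prompt: str, user_prompt: str) -> str:
        return self.reply


# Real wrappers (OpenAI, Gemini, Groq, Ollama) would be registered here too.
_PROVIDERS: dict[str, Callable[..., LLMClient]] = {"echo": EchoClient}


def make_client(provider: str, **kwargs) -> LLMClient:
    """Look up a provider by name and build its client."""
    try:
        factory = _PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return factory(**kwargs)
```

Because the coder and debugger only ever call `generate`, swapping providers comes down to choosing a different registry key.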
Key Detail: Markdown Fence Stripping
LLMs love wrapping code in markdown code fences:
```python
def fibonacci(n):
...
```
The utils.py module strips these automatically with a regex, along with any surrounding explanation text from the LLM response.
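A regex-based stripper along these lines could look as follows. This is a sketch under assumptions, not the project's `utils.py`: the name `strip_code_fences` is hypothetical, and only the first fenced block in the response is kept.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fence

# Opening fence with an optional language tag, then the body up to the
# next closing fence (non-greedy, spanning newlines).
_FENCE_RE = re.compile(
    FENCE + r"[ \t]*[\w+-]*[ \t]*\n(.*?)" + FENCE,
    re.DOTALL,
)


def strip_code_fences(text: str) -> str:
    """Return the first fenced block's contents, or the text itself if unfenced."""
    match = _FENCE_RE.search(text)
    if match is None:
        return text.strip()
    return match.group(1).strip()
```

Taking only the fenced body also discards any explanation the model writes before or after the code. If no fence is found, the response is assumed to be bare code and is returned with surrounding whitespace trimmed.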
Quick Start
pip install fixloop
# Run with OpenAI
export OPENAI_API_KEY=sk-...
python main.py --challenge fibonacci
# Run with Ollama (local)
python main.py --challenge two_sum --provider ollama --model llama3.1
# Custom challenge
python main.py --challenge my_challenge --max-iterations 10
Why It's Interesting
FixLoop is a minimal, focused implementation of the "write code → test → fix → retest" loop that powers more complex AI coding agents. It strips away everything non-essential: each module is under 100 lines, the loop is explicit and inspectable, and the provider abstraction is clean. Despite its simplicity, it tackles a genuinely hard problem: LLMs rarely write perfect code on the first try, and feeding concrete test failures back to the model gives it a specific signal to correct against on the next attempt. It's a useful reference for understanding how AI coding agents work under the hood.