Simplified Phase 3: Todo-Based Autonomous Development System

Status: 📋 PLANNED
Approach: n8n + OpenHands SDK (direct integration, no SSH)
Duration: 4-5 hours

🎯 Vision Statement

Create a simple, autonomous system where:

User pushes prompt: "Create an MVP [app type]"
OpenHands analyzes and creates todos
n8n loops through todos: Execute → Test → Commit → Next
Result: Complete full-stack application built step-by-step

Goal: Prove OpenHands can build full-stack applications autonomously through structured todos

1. Architecture Overview

Current vs Simplified Approach

Aspect	Current Phase 3	Simplified Plan
Integration	SSH wrapper + CLI	Direct SDK (Python)
Workflow	11 nodes (complex)	6 nodes (minimal)
Approach	Build + Retry	Todo-based loops
Complexity	High	Low
Proof	Build tests	Full-stack app

System Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Gitea Repository                         │
│  Prompt Push → Code Changes → Build Artifacts               │
└────────────────────┬────────────────────────────────────────┘
                     │
                     │ Webhook
                     ▼
┌─────────────────────────────────────────────────────────────┐
│                    n8n Workflow (6 nodes)                    │
│  [1] Webhook → [2] Extract Data → [3] Get Next Todo        │
│         ↓                                                    │
│  [4] Execute Todo (SDK) → [5] Test → [6] Commit/Push        │
│         │                            │                      │
│         └──────────── Loop ←─────────┘                      │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│               OpenHands SDK (Python)                         │
│  • Creates todos from prompt                                │
│  • Executes todos autonomously                              │
│  • Returns structured results                               │
│  • Build/test cycle for each todo                           │
└─────────────────────────────────────────────────────────────┘

Data Flow

Git Push
  ↓
n8n Webhook Triggered
  ↓
Extract: repo_name, branch, commit_sha, prompt
  ↓
Check: Are there todos to execute?
  ├─ YES → Get next todo from prompt
  │         ↓
  │      Execute with OpenHands SDK
  │         ↓
  │      Test the changes
  │         ↓
  │      Commit & push to Gitea
  │         ↓
  │      Loop back to check for next todo
  │
  └─ NO → All todos complete! Final commit

2. Todo Management System

How Todos Are Created

Step 1: Initial Prompt Analysis

When the user pushes the initial prompt, the OpenHands SDK analyzes it and creates todos:

# Task sent to OpenHands SDK
task = """
Analyze this prompt: "Create a full-stack todo app with React + Node.js + PostgreSQL"

Create a TODO.md file with structured tasks. Each task should:
1. Be atomic (one specific feature)
2. Include build/test steps
3. Be executable in isolation
4. Have clear completion criteria

Format:
# Development Tasks

## 1. [Category] Task Name
**Description:** Brief explanation
**Files to create:** list of files
**Tests to run:** test commands
**Expected outcome:** what success looks like

## 2. Next Task...
"""

Step 2: TODO.md Structure Example

# Development Tasks for Todo App MVP

## 1. Backend API Setup
**Description:** Initialize Node.js Express API with PostgreSQL
**Files to create:**
  - package.json
  - server.js
  - database/schema.sql
**Tests to run:** npm install && npm test
**Expected outcome:** API server starts, connects to DB

## 2. Frontend React App
**Description:** Create React app with routing
**Files to create:**
  - src/App.js
  - src/components/TodoList.js
  - package.json
**Tests to run:** npm install && npm start
**Expected outcome:** React app runs, displays todo list

## 3. API Integration
**Description:** Connect frontend to backend
**Files to create:**
  - src/services/api.js
  - src/hooks/useTodos.js
**Tests to run:** npm test
**Expected outcome:** Can fetch/create todos via API

## 4. Database Integration
**Description:** Implement CRUD operations
**Files to create:**
  - server/routes/todos.js
  - server/models/Todo.js
**Tests to run:** npm run test:integration
**Expected outcome:** All CRUD operations work

## 5. Styling & UI
**Description:** Add CSS/styling for better UX
**Files to create:**
  - src/App.css
  - src/components/TodoList.css
**Tests to run:** npm run build
**Expected outcome:** App builds and looks good

## 6. Final Integration Test
**Description:** End-to-end testing
**Files to create:**
  - test/e2e/todo-app.test.js
**Tests to run:** npm run test:e2e
**Expected outcome:** All features work end-to-end
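
Before n8n can iterate over todos, the TODO.md layout above has to be turned into structured data. The following is a minimal sketch of such a parser, assuming the file follows the format exactly (numbered `##` headings, bold field labels, bulleted file lists). The function name `parse_todo_md` and the snake_case field keys are illustrative choices, not part of the existing wrapper.

```python
import re

HEADING = re.compile(r"^##\s+(\d+)\.\s+(.*)$")
FIELD = re.compile(r"^\*\*(.+?):\*\*\s*(.*)$")
BULLET = re.compile(r"^\s*-\s+(.*)$")


def parse_todo_md(text):
    """Parse the TODO.md layout shown above into a list of task dicts."""
    tasks, current, last_field = [], None, None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = {"number": int(heading.group(1)), "title": heading.group(2).strip()}
            tasks.append(current)
            last_field = None
            continue
        if current is None:
            continue  # skip the document title and anything before the first task
        field = FIELD.match(line)
        if field:
            last_field = field.group(1).strip().lower().replace(" ", "_")
            value = field.group(2).strip()
            # A label with no inline value (e.g. "Files to create") is followed by bullets
            current[last_field] = value if value else []
            continue
        bullet = BULLET.match(line)
        if bullet and last_field and isinstance(current[last_field], list):
            current[last_field].append(bullet.group(1).strip())
    return tasks


sample = """# Development Tasks for Todo App MVP

## 1. Backend API Setup
**Description:** Initialize Node.js Express API with PostgreSQL
**Files to create:**
  - package.json
  - server.js
**Tests to run:** npm install && npm test
**Expected outcome:** API server starts, connects to DB
"""

tasks = parse_todo_md(sample)
print(tasks[0]["title"], tasks[0]["files_to_create"])
```

A parser like this is only as reliable as the model's adherence to the format, so asking for JSON output (as Node 3 does later) may be the sturdier option.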

How n8n Tracks Todos

Pattern: Store todo state in n8n's staticData (persistent between runs)

// In Get Next Todo node
// $workflow exposes only id, name, and active; persistent state comes from
// $getWorkflowStaticData(), which is saved only for active (non-manual) executions
const workflow = { staticData: $getWorkflowStaticData('global') };

// Initialize todo state
workflow.staticData.todos = workflow.staticData.todos || {};

// Read the current position and the stored todo list
const currentIndex = workflow.staticData.todos.current_index || 0;
const todos = workflow.staticData.todos.list || [];

if (currentIndex < todos.length) {
  const nextTodo = todos[currentIndex];

  return {
    action: 'execute',
    todo: nextTodo,
    index: currentIndex,
    total: todos.length,
    status: 'IN_PROGRESS'
  };
} else {
  // All todos complete
  return {
    action: 'complete',
    message: 'All todos executed successfully',
    total_todos: todos.length,
    status: 'SUCCESS'
  };
}

Data Persistence Pattern:

staticData.todos.list: Array of all todos
staticData.todos.current_index: Current todo number
staticData.todos.results: Array of results from each todo
staticData.todos.completed: Boolean flag
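
The persisted fields above form a small state machine. As a language-neutral illustration, here is a Python sketch of the same logic, using a plain dict in place of n8n's static data. The function names `next_action` and `record_result` are hypothetical and exist only for this example.

```python
def next_action(state):
    """Mirror the 'Get Next Todo' decision using the persisted fields listed above."""
    todos = state.setdefault("todos", {})
    items = todos.get("list", [])
    index = todos.get("current_index", 0)
    if index < len(items):
        return {"action": "execute", "todo": items[index], "index": index, "total": len(items)}
    todos["completed"] = True
    return {"action": "complete", "total_todos": len(items)}


def record_result(state, success):
    """Store one result and advance the index, as the Execute node does."""
    todos = state["todos"]
    index = todos.get("current_index", 0)
    todos.setdefault("results", []).append({"index": index, "success": success})
    todos["current_index"] = index + 1


# Drive the loop with two todos and pretend each one succeeds
state = {"todos": {"list": ["Backend API Setup", "Frontend React App"]}}
executed = []
while (step := next_action(state))["action"] == "execute":
    executed.append(step["todo"])
    record_result(state, success=True)

print(executed, state["todos"]["completed"])
```

Because the index advances only in `record_result`, a crash between the two calls leaves the same todo pending on the next run.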

3. Iterative Development Loop

The 6-Node Workflow

[1] Git Push (Gitea Webhook)
     ↓
[2] Extract Repo Info (Code Node)
     ↓
[3] Get Next Todo (Code Node)
     ↓
[4] Execute Todo (Code Node → calls SDK)
     ↓
[5] Test Changes (Code Node)
     ↓
[6] Commit & Push (HTTP Node to Gitea)
     ↓
     └─ Loop back to [3]

Node Details

Node 1: Git Push (Webhook)

Trigger: Gitea push event
Input: Repository changes, commit data
Output: Push event payload

Node 2: Extract Repo Info (Code)

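// The Webhook node nests the request payload under `body`
const payload = $json.body ?? $json;
const message = payload.head_commit?.message ?? '';
const PROMPT_PREFIX = 'MVP Prompt:';

// Return the text after the prefix, or null when the commit is not a prompt
const extractPrompt = (msg) =>
  msg.startsWith(PROMPT_PREFIX) ? msg.slice(PROMPT_PREFIX.length).trim() : null;

return {
  repo_name: payload.repository.name,
  repo_owner: payload.repository.owner.name || payload.repository.owner.login,
  branch: payload.ref.replace('refs/heads/', ''),
  commit_sha: payload.after,
  commit_message: message,
  pusher: payload.pusher.name,

  // Check if this is an initial prompt or a build push
  is_initial_push: message.startsWith(PROMPT_PREFIX),

  // Extract prompt from commit message
  prompt: extractPrompt(message)
};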

Node 3: Get Next Todo (Code)

const repoInfo = $json;
const workflow = { staticData: $getWorkflowStaticData('global') };  // $workflow has no staticData property

// Initialize or get todos
workflow.staticData.todos = workflow.staticData.todos || {};

if (repoInfo.is_initial_push) {
  // First push - extract prompt and create todos
  const prompt = repoInfo.prompt;

  const createTodosTask = `
Analyze this MVP prompt: "${prompt}"

Create a comprehensive TODO.md file with development tasks.
Each task should be atomic and executable.

Return the TODO.md content as JSON:
{
  "tasks": [
    {
      "title": "Task name",
      "description": "Description",
      "category": "Backend|Frontend|Integration|Testing",
      "files": ["file1", "file2"],
      "commands": ["npm install", "npm test"],
      "expected_outcome": "What success looks like"
    }
  ]
}
  `;

  // Store in staticData for next node
  workflow.staticData.todos.pending = createTodosTask;
  workflow.staticData.todos.current_index = 0;
  workflow.staticData.todos.status = 'CREATING_TODOS';

  return {
    action: 'create_todos',
    prompt: prompt,
    status: workflow.staticData.todos.status
  };
} else if (workflow.staticData.todos.current_index !== undefined) {
  // Continue with existing todos
  const index = workflow.staticData.todos.current_index;
  const todos = workflow.staticData.todos.list || [];

  if (index < todos.length) {
    return {
      action: 'execute_todo',
      todo: todos[index],
      index: index,
      total: todos.length,
      status: 'IN_PROGRESS'
    };
  } else {
    return {
      action: 'complete',
      message: 'All todos completed successfully',
      status: 'SUCCESS'
    };
  }
} else {
  return {
    action: 'error',
    message: 'No todos found. Please push MVP prompt first.',
    status: 'ERROR'
  };
}
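
Node 3 asks the model to return its task list as JSON, but model output often wraps the JSON in prose, and fields can go missing. The sketch below shows one way to extract and validate the payload before storing it. The name `parse_tasks` is hypothetical, and slicing from the first `{` to the last `}` is a simplifying assumption that fails if the surrounding prose itself contains braces.

```python
import json

REQUIRED_KEYS = {"title", "description", "category", "files", "commands", "expected_outcome"}


def parse_tasks(raw_output):
    """Extract and validate the {"tasks": [...]} object from raw model output."""
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in output")
    payload = json.loads(raw_output[start:end + 1])
    tasks = payload.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        raise ValueError('"tasks" must be a non-empty list')
    for position, task in enumerate(tasks, start=1):
        missing = REQUIRED_KEYS - set(task)
        if missing:
            raise ValueError(f"task {position} is missing: {sorted(missing)}")
    return tasks


example_task = {
    "title": "Backend API Setup",
    "description": "Initialize Node.js Express API",
    "category": "Backend",
    "files": ["package.json", "server.js"],
    "commands": ["npm install", "npm test"],
    "expected_outcome": "API server starts",
}
raw = "Here is the plan:\n" + json.dumps({"tasks": [example_task]}) + "\nDone."
tasks = parse_tasks(raw)
print(tasks[0]["title"])
```

Rejecting an incomplete task list at this point is cheaper than discovering a missing `commands` field halfway through the loop.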

Node 4: Execute Todo (Code - Calls SDK)

const todoData = $json;
const repoInfo = $node["Extract Repo Info"].json;
const workflow = { staticData: $getWorkflowStaticData('global') };  // $workflow has no staticData property

// Handle different actions
if (todoData.action === 'create_todos') {
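  // Call OpenHands to create TODO.md, sending the full task built in Node 3
  // (falls back to the raw prompt if no task was stored)
  const sdkOutput = callOpenHandsSDK(workflow.staticData.todos.pending || todoData.prompt);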

  // Parse and store todos
  workflow.staticData.todos.list = sdkOutput.tasks;
  workflow.staticData.todos.current_index = 0;
  workflow.staticData.todos.status = 'READY';

  return {
    action: 'next_todo',
    message: 'Todos created, starting execution',
    total_todos: sdkOutput.tasks.length
  };
} else if (todoData.action === 'execute_todo') {
  // Execute the current todo
  const task = `
Execute this development task:

**Task:** ${todoData.todo.title}
**Description:** ${todoData.todo.description}
**Category:** ${todoData.todo.category}

**Steps:**
1. Create/modify the required files
2. Run the specified commands
3. Ensure the expected outcome is achieved
4. If tests fail, fix them
5. Commit your changes

**Files to work with:**
${todoData.todo.files.join(', ')}

**Commands to run:**
${todoData.todo.commands.join('\n')}

**Expected outcome:**
${todoData.todo.expected_outcome}

Current directory: /workspace/${repoInfo.repo_name}
  `;

  const sdkOutput = callOpenHandsSDK(task);

  // Store result
  workflow.staticData.todos.results = workflow.staticData.todos.results || [];
  workflow.staticData.todos.results.push({
    todo: todoData.todo,
    output: sdkOutput,
    success: sdkOutput.success,
    timestamp: new Date().toISOString()
  });

  // Increment index for next iteration
  workflow.staticData.todos.current_index++;

  return {
    action: 'todo_executed',
    todo: todoData.todo,
    index: todoData.index,
    success: sdkOutput.success,
    output: sdkOutput
  };
}

Node 5: Test Changes (Code)

const executeResult = $json;
// Node 4 does not forward the todo count, so read it from Node 3
const total = $node["Get Next Todo"].json.total;

if (executeResult.action === 'next_todo') {
  // TODO.md was just generated; there is no todo result to report yet
  return {
    status: 'SUCCESS',
    message: executeResult.message,
    commit_message: '📋 TODOs created from prompt',
    state: 'success',
    description: `Created ${executeResult.total_todos} todos`
  };
}

const title = executeResult.todo.title;
const progress = `Todo ${executeResult.index + 1}/${total}: ${title}`;

if (executeResult.success) {
  return {
    status: 'SUCCESS',
    message: `Todo "${title}" completed successfully`,
    commit_message: `✅ Complete: ${title}`,

    // For Gitea status
    state: 'success',
    description: progress
  };
} else {
  // Mark as failed but continue (for debugging)
  return {
    status: 'FAILED',
    message: `Todo "${title}" failed`,
    commit_message: `❌ Failed: ${title}`,

    state: 'failure',
    description: `${progress} - FAILED`,
    error: executeResult.output.error || 'Unknown error'
  };
}

Node 6: Commit & Push (HTTP Node)

const testResult = $json;
const repoInfo = $node["Extract Repo Info"].json;

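// Create commit (pseudocode; getCurrentTreeSha() is a placeholder).
// Gitea's REST API exposes .../git/commits/{sha} for reading only, so in practice
// commit by pushing from the workspace with git, or write files through the
// contents API (POST .../repos/{owner}/{repo}/contents/{filepath}).
// Verify against the API reference of the Gitea version in use.
POST https://git.oky.sh/api/v1/repos/{owner}/{repo}/git/commits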

{
  "message": testResult.commit_message,
  "tree": getCurrentTreeSha(),
  "parents": [repoInfo.commit_sha]
}

// Update status
POST https://git.oky.sh/api/v1/repos/{owner}/{repo}/statuses/{commit_sha}
{
  "state": testResult.state,
  "description": testResult.description,
  "context": "openhands/todo-build",
  "target_url": "https://n8n.oky.sh"
}

// Return to loop
const todoState = $getWorkflowStaticData('global').todos;  // $workflow has no staticData property
return {
  loop: true,
  should_continue: testResult.state === 'success' || // Continue on success
                   (testResult.state === 'failure' && // Or even on failure for debugging
                    todoState.current_index < todoState.list.length)
};
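
The status update above is an ordinary authenticated POST. As a sketch, the request could be assembled as follows. The function builds the request without sending it. The `token` parameter and the function name `build_status_request` are assumptions for illustration, and the state check covers only the two states this plan emits.

```python
import json
import urllib.request

GITEA_API = "https://git.oky.sh/api/v1"  # base URL taken from the requests above


def build_status_request(owner, repo, sha, state, description, token):
    """Build (without sending) the commit-status request used in Node 6."""
    if state not in {"success", "failure"}:  # the only two states this plan emits
        raise ValueError(f"unexpected state: {state}")
    body = {
        "state": state,
        "description": description,
        "context": "openhands/todo-build",
        "target_url": "https://n8n.oky.sh",
    }
    return urllib.request.Request(
        f"{GITEA_API}/repos/{owner}/{repo}/statuses/{sha}",
        data=json.dumps(body).encode("utf-8"),
        # Gitea accepts personal access tokens in this header form
        headers={"Content-Type": "application/json", "Authorization": f"token {token}"},
        method="POST",
    )


request = build_status_request(
    "bam", "todo-app-mvp-test", "abc123f", "success", "Todo 1/6: Backend API Setup", "EXAMPLE_TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. In n8n the same fields map onto an HTTP Request node with a stored credential.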

4. OpenHands SDK Integration

Direct SDK Call (No SSH)

Current Setup:

SDK wrapper: /home/bam/openhands-sdk-wrapper-fixed.py
Python-based
Returns JSON output
No SSH needed

How to Call from n8n:

// In Code Node (Execute Todo)
// Requires NODE_FUNCTION_ALLOW_BUILTIN=child_process in the n8n environment
function callOpenHandsSDK(task, workspace = "/home/bam") {
  const { execFileSync } = require('child_process');
  const args = ['/home/bam/openhands-sdk-wrapper-fixed.py', task, '--workspace', workspace, '--json'];

  try {
    // execFileSync passes arguments without a shell, so quotes, backticks,
    // and newlines in the task text cannot break or inject into the command
    const output = execFileSync('python3', args, { encoding: 'utf-8' });
    const result = JSON.parse(output);

    return {
      success: result.success,
      // Node 4 reads sdkOutput.tasks; this assumes the wrapper returns the parsed task list
      tasks: result.tasks || [],
      files_created: result.files_created || [],
      files_copied: result.files_copied || [],
      error: result.error || null,
      log_output: result.log_output || []
    };
  } catch (error) {
    return {
      success: false,
      error: error.message,
      tasks: [],
      files_created: [],
      files_copied: [],
      log_output: []
    };
  }
}

Alternative: HTTP Call to SDK Service

If we want to avoid subprocess, we can create a simple HTTP service:

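# sdk-server.py
import importlib.util
from flask import Flask, request, jsonify

# The wrapper's filename contains hyphens, so a plain `import` cannot load it
WRAPPER_PATH = '/home/bam/openhands-sdk-wrapper-fixed.py'
spec = importlib.util.spec_from_file_location('openhands_sdk_wrapper', WRAPPER_PATH)
wrapper = importlib.util.module_from_spec(spec)
spec.loader.exec_module(wrapper)

app = Flask(__name__)

@app.route('/execute', methods=['POST'])
def execute():
    data = request.get_json(silent=True) or {}
    task = data.get('task')
    if not task:
        return jsonify({'success': False, 'error': 'Missing "task"'}), 400
    workspace = data.get('workspace', '/home/bam')

    result = wrapper.run_openhands_task(task, workspace)
    return jsonify(result)

if __name__ == '__main__':
    # Bind to loopback only: the endpoint executes arbitrary tasks,
    # and n8n reaches it on localhost
    app.run(port=5000, host='127.0.0.1')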

Then call from n8n:

// HTTP Request node
POST http://localhost:5000/execute
Content-Type: application/json

{
  "task": "Create TODO.md from prompt: ...",
  "workspace": "/home/bam/mvp-project"
}
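
Whichever side makes the HTTP call, it helps to normalize failures into the same shape as a successful reply so downstream nodes never see an exception. The following sketch does this for the `/execute` endpoint above. The function name `call_sdk_service`, the 300-second default timeout, and the injectable `opener` parameter (used here for an offline demonstration) are assumptions for illustration.

```python
import io
import json
import urllib.error
import urllib.request

SDK_URL = "http://localhost:5000/execute"  # endpoint from the example above


def call_sdk_service(task, workspace="/home/bam", timeout=300, opener=urllib.request.urlopen):
    """POST a task to the SDK service and normalize the reply or any failure."""
    request = urllib.request.Request(
        SDK_URL,
        data=json.dumps({"task": task, "workspace": workspace}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            result = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:  # network errors, timeouts, invalid JSON
        return {"success": False, "error": str(exc), "files_created": []}
    return {
        "success": bool(result.get("success")),
        "error": result.get("error"),
        "files_created": result.get("files_created", []),
    }


# Offline demonstration with stand-in openers (no server needed)
def fake_ok(request, timeout):
    return io.BytesIO(b'{"success": true, "files_created": ["a.txt"]}')


def fake_refused(request, timeout):
    raise urllib.error.URLError("connection refused")


ok = call_sdk_service("Create a.txt", opener=fake_ok)
failed = call_sdk_service("Create a.txt", opener=fake_refused)
print(ok["success"], failed["success"])
```

With the default `opener`, the same function talks to the real service, so the stand-ins are needed only for testing.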

Why Direct SDK is Better:

✅ No SSH overhead
✅ Structured JSON output
✅ Faster execution
✅ Direct Python control
✅ Easier debugging
❌ Need to manage Python environment (but we already have this)

5. Full-Stack App Example

Example: Todo App MVP

Initial Push

Commit Message: MVP Prompt: Create a full-stack todo app with React + Node.js + PostgreSQL

What Happens

First n8n Loop:

Node 2 extracts: prompt = "Create a full-stack todo app..."
Node 3: Detects is_initial_push = true
Node 4: Calls OpenHands to create TODO.md
Node 5-6: Create TODO.md file and commit it

TODO.md Created:

# Development Tasks for Todo App MVP

## 1. Backend API Setup
**Description:** Initialize Node.js Express API
**Files:** package.json, server.js, database/schema.sql
**Commands:** npm install, npm test

## 2. Frontend React App
**Description:** Create React app
**Files:** src/App.js, src/components/TodoList.js
**Commands:** npm install, npm start

[... 4 more todos ...]

Subsequent Loops:

Loop 1:
  Get Todo #1: Backend API Setup
  OpenHands creates: package.json, server.js, schema.sql
  Test: npm install && npm test
  Commit: "✅ Complete: Backend API Setup"

Loop 2:
  Get Todo #2: Frontend React App
  OpenHands creates: src/App.js, components, package.json
  Test: npm install && npm start
  Commit: "✅ Complete: Frontend React App"

[... continues for all 6 todos ...]

Loop 7:
  Get Next Todo: None (all complete)
  Commit: "🎉 MVP Complete: Todo App - All 6 todos finished"
  Status: SUCCESS

Expected Outcomes by Loop

Loop	Todo	What OpenHands Does	Test	Commit Message
Initial	Create TODOs	Analyzes prompt, creates TODO.md	N/A	"📋 TODOs created from prompt"
1	Backend Setup	Creates package.json, server.js, DB schema	npm install && npm test	"✅ Complete: Backend API Setup"
2	Frontend App	Creates React components, routing	npm install && npm start	"✅ Complete: Frontend React App"
3	API Integration	Connects frontend to backend	npm test	"✅ Complete: API Integration"
4	Database CRUD	Implements todo operations	npm run test:integration	"✅ Complete: Database Integration"
5	Styling & UI	Adds CSS, improves UX	npm run build	"✅ Complete: Styling & UI"
6	E2E Testing	Creates end-to-end tests	npm run test:e2e	"✅ Complete: Final Integration Test"
7	Final	All todos complete	All tests pass	"🎉 MVP Complete: Todo App"

Proof of Concept: Why This Works

Evidence 1: Todo Structure

Each todo is atomic and testable
Clear expected outcomes
Build/test cycle for each

Evidence 2: OpenHands Capabilities

Can create files
Can run commands
Can fix errors
Can commit changes

Evidence 3: n8n Loop

Persists state between iterations
Tracks progress (current_index)
Automatically advances to next todo
Stops when complete

Evidence 4: Full-Stack Coverage

Backend: Node.js + Express + PostgreSQL
Frontend: React + Routing
Database: Schema + CRUD
Integration: API calls
Testing: Unit + E2E
Deployment: Build scripts

6. n8n Workflow Design

Minimal 6-Node Structure

┌──────────────┐
│  Node 1:     │
│ Git Push     │
│ (Webhook)    │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 2:     │
│ Extract      │
│ Repo Info    │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 3:     │
│ Get Next     │
│ Todo         │
└──────┬───────┘
       │
       │  action: create_todos or execute_todo
       ▼
┌──────────────┐
│  Node 4:     │
│ Execute      │
│ (calls SDK)  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 5:     │
│ Test         │
│ Changes      │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 6:     │
│ Commit &     │
│ Push         │
└──────┬───────┘
       │
       └─ Loop back to Node 3

Node Configuration Summary

Node	Type	Purpose	Complexity
1	Webhook	Receive Gitea push	Simple
2	Code	Extract repo/prompt data	Simple
3	Code	Get next todo or finish	Medium
4	Code	Call OpenHands SDK	Medium
5	Code	Test and format results	Simple
6	HTTP	Commit to Gitea	Simple

Total Complexity: Much lower than 11-node design!

Data Preservation Pattern

// Each node preserves previous data
const current = $json;  // Current node output
const previous = $node["Previous Node Name"].json;  // Preserve data

return {
  ...previous,  // ← PRESERVE ALL PREVIOUS DATA
  current_data: current
};

Critical: This ensures we don't lose:

Repository info (Node 2)
Todo list (Node 3)
Execution results (Node 4)
Test results (Node 5)

7. Proof of Concept

What We'll Prove

Primary Claim: OpenHands can build a complete full-stack application autonomously through structured todos

Supporting Evidence Required:

Analyzes Prompt Correctly
- Can extract requirements from user prompt
- Can break down into atomic todos
- Can define clear expected outcomes
Creates Executable Todos
- Each todo has specific files to create
- Each todo has tests to run
- Each todo has measurable success criteria
Executes Todos Autonomously
- Creates/modifies files correctly
- Runs build/test commands
- Fixes errors when tests fail
- Commits changes after each todo
Loops Through All Todos
- n8n tracks current todo index
- Advances to next todo after success
- Continues until all complete
- Handles failures gracefully
Produces Working Application
- Backend API functional
- Frontend UI working
- Database integrated
- Tests passing
- End-to-end flow works

Test Scenario

Scenario: Full-Stack Todo App
Prompt: "Create a full-stack todo app with React + Node.js + PostgreSQL"

Expected Outcome:

Initial Push:
  → TODOs created (6 items)
  → Commit: "📋 TODOs created from prompt"

Loop 1 (Todo #1):
  → Backend API created
  → Tests pass
  → Commit: "✅ Complete: Backend API Setup"

Loop 2 (Todo #2):
  → React app created
  → Tests pass
  → Commit: "✅ Complete: Frontend React App"

[... 4 more loops ...]

Final Loop (no todos left):
  → All features complete
  → All tests pass
  → Commit: "🎉 MVP Complete: Todo App"

Total Execution Time: ~30-45 minutes

Success Criteria

Must Have:

Initial prompt creates TODO.md with ≥5 todos
Each todo executes independently
Each todo commits changes
Loop continues until all todos complete
Final application builds successfully
At least 3 todos execute without errors
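
The measurable parts of the list above can be checked mechanically once a run finishes. This is a sketch under the assumption that `results` has the shape Node 4 stores (one entry per todo with a `success` flag). The function name `meets_success_criteria` and the check labels are hypothetical.

```python
def meets_success_criteria(todos, results, final_build_ok):
    """Evaluate the measurable 'Must Have' items and report each check."""
    checks = {
        "at_least_5_todos": len(todos) >= 5,
        "every_todo_attempted": len(results) == len(todos),
        "at_least_3_clean_todos": sum(1 for r in results if r["success"]) >= 3,
        "final_build_ok": bool(final_build_ok),
    }
    return all(checks.values()), checks


todos = ["Backend", "Frontend", "API", "Database", "Styling", "E2E"]
results = [{"success": flag} for flag in (True, True, False, True, True, True)]

passed, checks = meets_success_criteria(todos, results, final_build_ok=True)
print(passed, checks)
```

Returning the individual checks alongside the verdict makes it clear which criterion failed when the overall result is negative.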

Proof Points:

TODO.md created and committed
Backend files created (package.json, server.js, etc.)
Frontend files created (React components, etc.)
Database schema created
Tests created and passing
Final application works end-to-end

Verification Steps

After implementation:

Check TODO.md exists
```
cat /workspace/project/TODO.md
```

Check commit history

git log --oneline
# Should see: TODOs created, Complete: [todo name], Complete: [next], ...

Check file structure

tree /workspace/project
# Backend: package.json, server.js, routes/, models/
# Frontend: src/, components/, package.json
# Tests: test/
# Database: database/schema.sql

Run the application

cd /workspace/project
npm install
npm run build
npm start

Verify all tests pass

npm test
npm run test:integration
npm run test:e2e

Example Output

Gitea Commit History:

abc123f (HEAD) 🎉 MVP Complete: Todo App - All 6 todos finished
def4567 ✅ Complete: Final Integration Test
789abcd ✅ Complete: Styling & UI
012cdef ✅ Complete: Database Integration
345fgh ✅ Complete: API Integration
6789ij ✅ Complete: Frontend React App
klmno ✅ Complete: Backend API Setup
pqrst 📋 TODOs created from prompt

File Structure:

/workspace/todo-app/
├── backend/
│   ├── package.json
│   ├── server.js
│   ├── routes/
│   │   └── todos.js
│   └── models/
│       └── Todo.js
├── frontend/
│   ├── package.json
│   ├── src/
│   │   ├── App.js
│   │   ├── components/
│   │   │   └── TodoList.js
│   │   └── services/
│   │       └── api.js
├── database/
│   └── schema.sql
├── test/
│   ├── unit/
│   ├── integration/
│   └── e2e/
├── README.md
└── TODO.md

Application Test:

$ curl http://localhost:3000/api/todos
[]
$ curl -X POST http://localhost:3000/api/todos \
  -H "Content-Type: application/json" \
  -d '{"title": "Test todo"}'
{"id": 1, "title": "Test todo", "completed": false}

8. Implementation Steps

Step-by-Step Plan (8 Steps, 4-5 Hours)

Step 1: Setup Test Repository (20 min)

Create test repo in Gitea: todo-app-mvp-test
Add initial commit with prompt
Configure Gitea webhook to n8n
Verify webhook triggers correctly

Test Command:

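# The payload includes every field Node 2 reads; replace the <...> placeholders
curl -X POST https://n8n.oky.sh/webhook/todo-mvp-test \
  -H "Content-Type: application/json" \
  -d '{"ref": "refs/heads/main", "after": "<commit_sha>", "repository": {"name": "todo-app-mvp-test", "owner": {"login": "<owner>"}}, "pusher": {"name": "<owner>"}, "head_commit": {"message": "MVP Prompt: Create a full-stack todo app"}}'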

Step 2: Create n8n Workflow Skeleton (30 min)

Create new workflow: "Todo-Based MVP Builder"
Add 6 nodes (Webhook, Extract, Get Todos, Execute, Test, Commit)
Configure Webhook node
Test manual trigger works

Verification:

Workflow ID created
Webhook URL accessible
Manual trigger executes without errors

Step 3: Implement SDK Integration (45 min)

Test OpenHands SDK wrapper directly
Create SDK call function in Node 4
Handle JSON output parsing
Test with simple task: "Create a test file"

Test Code:

// In Node 4, test this:
const result = callOpenHandsSDK("Create a file named sdk-test.txt with content: Hello from SDK");
console.log(result);
// Should return: { success: true, files_created: ['sdk-test.txt'], ... }

Step 4: Implement Todo Creation (30 min)

Add prompt analysis logic
Create TODO.md generation task
Parse TODO.md and store in staticData
Test with sample prompt

Test:

const prompt = "Create a full-stack todo app";
const todoResult = callOpenHandsSDK(`Analyze prompt and create TODOs...`);
// Should parse output and create array in staticData

Step 5: Implement Todo Execution Loop (45 min)

Add current_index tracking
Implement "get next todo" logic
Add todo result storage
Test loop with 2-3 simple todos

Test Sequence:

Push 1: Initial prompt
  → TODO.md created (Commit 1)

Push 2: Execute todo #1
  → Files created (Commit 2)

Push 3: Execute todo #2
  → Files created (Commit 3)

Push 4: Execute todo #3
  → Files created (Commit 4)

Push 5: No more todos
  → Final completion (Commit 5)

Step 6: Add Test & Validation (30 min)

Add test command execution
Parse test results
Continue on failure (for debugging) or stop
Add error formatting

Test with intentional error:

// In todo task, intentionally break code
// Should detect failure and log error
// Workflow continues to next todo (for now)

Step 7: Implement Commit/Push to Gitea (30 min)

Add Gitea API calls for commits
Add commit status updates
Format commit messages per todo
Test commit flow

Expected:

git log --oneline
# Commit 1: "📋 TODOs created from prompt"
# Commit 2: "✅ Complete: Backend API Setup"
# Commit 3: "✅ Complete: Frontend React App"

Step 8: Full End-to-End Test (45 min)

Use real prompt: "Create a full-stack todo app"
Let system execute all todos
Verify final application works
Document any issues found

Complete Test:

Push initial prompt
Watch 5-8 automatic commits
Check file structure
Run application
Verify all features work

Time Breakdown

Step	Activity	Time	Cumulative
1	Setup test repository	20 min	20 min
2	Create workflow skeleton	30 min	50 min
3	SDK integration	45 min	1h 35m
4	Todo creation logic	30 min	2h 5m
5	Todo execution loop	45 min	2h 50m
6	Test & validation	30 min	3h 20m
7	Commit/push	30 min	3h 50m
8	Full E2E test	45 min	4h 35m

Estimated Total: 4-5 hours (with buffer)
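
The cumulative column can be reproduced with a running sum, which is a quick way to re-check the estimate if step durations change. A small sketch using the durations from the table:

```python
from itertools import accumulate

# (activity, minutes) pairs copied from the Time Breakdown table
steps = [
    ("Setup test repository", 20),
    ("Create workflow skeleton", 30),
    ("SDK integration", 45),
    ("Todo creation logic", 30),
    ("Todo execution loop", 45),
    ("Test & validation", 30),
    ("Commit/push", 30),
    ("Full E2E test", 45),
]

cumulative = list(accumulate(minutes for _, minutes in steps))
hours, minutes = divmod(cumulative[-1], 60)

for (activity, duration), running in zip(steps, cumulative):
    print(f"{activity:<26}{duration:>3} min  {running // 60}h {running % 60:02d}m")
print(f"Total: {hours}h {minutes}m")
```

The steps sum to 275 minutes, so the 4-5 hour estimate leaves roughly 25 minutes of buffer at the upper bound.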

Success Metrics

After Implementation:

Can push prompt and get TODO.md
Can execute ≥3 todos automatically
Each todo commits changes to Gitea
Final application exists and builds
Workflow ID stable and reusable

Proof Complete:

Commit history shows progression
File structure matches todos
Application runs successfully
Tests pass (if any)

9. Advantages Over Current Phase 3

Complexity Reduction

Aspect	Current 11-Node	Simplified 6-Node	Reduction
Nodes	11	6	45% fewer
Logic	Complex retry loops	Simple todo iteration	No retry branches
State	Multiple decision points	Linear progression	Easier to debug
Testing	3 retry scenarios	1 success scenario	Simpler tests

Better Proof of Concept

Current Phase 3:

Tests: Build → Retry on error → Max 3 retries
Focus: Error handling
Outcome: Build succeeds or fails
Proof: Can retry builds

Simplified Plan:

Tests: Full MVP creation
Focus: Autonomous development
Outcome: Complete application
Proof: Can build full-stack apps

More Practical

Real Usage:

User: "I want a React app with API"
  → System: Creates todos, executes them
  → Result: Full React app with API

User: "Build me a Django site"
  → System: Creates todos, executes them
  → Result: Full Django site

User: "Make a mobile app"
  → System: Creates todos, executes them
  → Result: Full mobile app (React Native)

This is what users actually want to do!

10. Risks & Mitigation

Risk 1: OpenHands Can't Handle Complex Tasks

Mitigation:

Start with simple 2-3 todo app
Verify each todo executes
Gradually increase complexity
Use test-first approach

Risk 2: n8n State Management Issues

Mitigation:

Use staticData correctly
Add logging at each step
Test manual workflow execution
Verify data persistence

Risk 3: Git Commit Loop Issues

Mitigation:

Add commit limits (max 20 commits)
Check for infinite loops
Add circuit breaker pattern
Monitor execution time
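
The commit limit, circuit breaker, and time monitoring can live in one small guard object that the loop consults after every commit. This is a sketch only: the 20-commit limit comes from the list above and the 45-minute budget from Risk 4, while the three-consecutive-failures threshold and the class name `LoopGuard` are assumptions.

```python
import time


class LoopGuard:
    """Stops the todo loop on too many commits, repeated failures, or elapsed time."""

    def __init__(self, max_commits=20, max_consecutive_failures=3,
                 max_seconds=45 * 60, clock=time.monotonic):
        self.max_commits = max_commits
        self.max_consecutive_failures = max_consecutive_failures
        self.max_seconds = max_seconds
        self.clock = clock
        self.started = clock()
        self.commits = 0
        self.consecutive_failures = 0

    def record(self, success):
        """Call once per commit made by the loop."""
        self.commits += 1
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

    def stop_reason(self):
        """Return why the loop must stop, or None if it may continue."""
        if self.commits >= self.max_commits:
            return "commit limit reached"
        if self.consecutive_failures >= self.max_consecutive_failures:
            return "circuit open: repeated failures"
        if self.clock() - self.started > self.max_seconds:
            return "time budget exceeded"
        return None


guard = LoopGuard(max_commits=5)
for outcome in [True, False, False, False, True]:
    guard.record(outcome)
    if guard.stop_reason():
        break

print(guard.commits, guard.stop_reason())
```

In n8n the counters would be kept in static data next to `current_index`, so the guard survives across webhook-triggered runs.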

Risk 4: Performance/Time Issues

Mitigation:

Each todo: 3-5 minutes max
All todos: 30-45 minutes total
Add timeouts per todo
Fail fast on errors
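
A per-todo timeout is straightforward to enforce when the SDK wrapper runs as a subprocess. The sketch below uses short-lived Python child processes as stand-ins for the wrapper. The function name `run_with_timeout` is hypothetical, and a real run would pass the wrapper's command line with a limit of about 300 seconds to match the 5-minute cap above.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run a command, returning a failure result instead of hanging past the limit."""
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running
        return {"success": False, "error": f"timed out after {timeout_seconds}s"}
    return {
        "success": completed.returncode == 0,
        "error": completed.stderr.strip() or None,
    }


# Stand-in commands: one finishes quickly, one sleeps past its limit
fast = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_seconds=30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(10)"], timeout_seconds=1)
print(fast, slow)
```

Returning a normal failure result lets the loop record the timeout and move on, which fits the fail-fast goal.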

11. Next Steps After Proof

Phase 4: Production Enhancement

Once proof is complete, add:

Error Recovery
- Retry failed todos
- Continue on non-critical errors
- Rollback on major failures
Parallel Execution
- Run independent todos in parallel
- Maintain order for dependencies
- Speed up development
Smart Scheduling
- Queue-based execution
- Rate limiting for API calls
- Optimize token usage
Advanced Features
- Automatic testing with Playwright
- Docker containerization
- Cloud deployment
User Interface
- Dashboard to track progress
- Manual todo editing
- Progress notifications
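
For the Parallel Execution item above, todos can be grouped into batches where every member of a batch has all of its dependencies satisfied by earlier batches. The sketch below uses the standard-library `graphlib` module (Python 3.9+). The dependency map is hypothetical, since the current TODO.md format records no dependencies.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def parallel_batches(dependencies):
    """Group todos into batches; todos in one batch do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical dependency map: each key depends on the todos in its set
deps = {
    "Backend API Setup": set(),
    "Frontend React App": set(),
    "Database Integration": {"Backend API Setup"},
    "API Integration": {"Backend API Setup", "Frontend React App"},
    "Styling & UI": {"Frontend React App"},
    "Final Integration Test": {"API Integration", "Database Integration", "Styling & UI"},
}

batches = parallel_batches(deps)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

With this map the six todos collapse into three batches, so the loop would need three rounds instead of six if each batch ran concurrently.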

12. Conclusion

Why This Simplified Plan Works

1. Simplicity

6 nodes vs 11 nodes
Linear progression vs complex branches
One concept: todo iteration

2. Proof of Concept

Tests actual user scenario: building apps
Shows OpenHands can create complex software
Demonstrates autonomous development

3. Practical Value

Solves real problem: "I want to build an app"
Can be used immediately after proof
Scalable to many application types

4. Foundation for Growth

Can add retry logic later (Phase 4)
Can add parallel execution (Phase 4)
Can add UI (Phase 4)

Expected Outcome

After implementation, we'll have:

Working system that builds full-stack apps
Proof OpenHands can execute complex development tasks
Reusable workflow for any MVP
Clear path to production features

Most Important: We'll prove that OpenHands can build complete applications autonomously through structured todos. This is the key innovation that makes this system valuable.

Appendix: Implementation Checklist

Pre-Implementation

Review this plan with stakeholder
Approve simplified approach
Allocate 4-5 hours for implementation
Prepare test repository

During Implementation

Test each step before moving to next
Document any deviations from plan
Keep commit history clean
Verify at each checkpoint

Post-Implementation

Run full E2E test
Document results
Create user guide
Plan Phase 4 enhancements

Files to Create/Modify

/home/bam/claude/mvp-factory/SIMPLIFIED_PHASE3_PLAN.md (this file)
n8n workflow: "Todo-Based MVP Builder"
Test repository: todo-app-mvp-test
TODO.md (generated by system)

Simplified Phase 3 Plan - Ready for Approval
Estimated Implementation: 4-5 hours
Expected Outcome: Autonomous full-stack app builder
