Simplified Phase 3: Todo-Based Autonomous Development System

Status: 📋 PLANNED
Approach: n8n + OpenHands SDK (direct integration, no SSH)
Duration: 4-5 hours


🎯 Vision Statement

Create a simple, autonomous system where:

  1. User pushes prompt: "Create an MVP [app type]"
  2. OpenHands analyzes and creates todos
  3. n8n loops through todos: Execute → Test → Commit → Next
  4. Result: Complete full-stack application built step-by-step

Goal: Prove OpenHands can build full-stack applications autonomously through structured todos


1. Architecture Overview

Current vs Simplified Approach

| Aspect | Current Phase 3 | Simplified Plan |
|--------|-----------------|-----------------|
| Integration | SSH wrapper + CLI | Direct SDK (Python) |
| Workflow | 11 nodes (complex) | 6 nodes (minimal) |
| Approach | Build + Retry | Todo-based loops |
| Complexity | High | Low |
| Proof | Build tests | Full-stack app |

System Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Gitea Repository                         │
│  Prompt Push → Code Changes → Build Artifacts               │
└────────────────────┬────────────────────────────────────────┘
                     │
                     │ Webhook
                     ▼
┌─────────────────────────────────────────────────────────────┐
│                    n8n Workflow (6 nodes)                    │
│  [1] Webhook → [2] Extract Data → [3] Get Next Todo        │
│         ↓                                                    │
│  [4] Execute Todo (SDK) → [5] Test → [6] Commit/Push        │
│         │                            │                      │
│         └──────────── Loop ←─────────┘                      │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│               OpenHands SDK (Python)                         │
│  • Creates todos from prompt                                │
│  • Executes todos autonomously                              │
│  • Returns structured results                               │
│  • Build/test cycle for each todo                           │
└─────────────────────────────────────────────────────────────┘

Data Flow

Git Push
  ↓
n8n Webhook Triggered
  ↓
Extract: repo_name, branch, commit_sha, prompt
  ↓
Check: Are there todos to execute?
  ├─ YES → Get next todo from prompt
  │         ↓
  │      Execute with OpenHands SDK
  │         ↓
  │      Test the changes
  │         ↓
  │      Commit & push to Gitea
  │         ↓
  │      Loop back to check for next todo
  │
  └─ NO → All todos complete! Final commit

2. Todo Management System

How Todos Are Created

Step 1: Initial Prompt Analysis

When the user pushes the initial prompt, the OpenHands SDK analyzes it and creates todos:

# Task sent to OpenHands SDK
task = """
Analyze this prompt: "Create a full-stack todo app with React + Node.js + PostgreSQL"

Create a TODO.md file with structured tasks. Each task should:
1. Be atomic (one specific feature)
2. Include build/test steps
3. Be executable in isolation
4. Have clear completion criteria

Format:
# Development Tasks

## 1. [Category] Task Name
**Description:** Brief explanation
**Files to create:** list of files
**Tests to run:** test commands
**Expected outcome:** what success looks like

## 2. Next Task...
"""

Step 2: TODO.md Structure Example

# Development Tasks for Todo App MVP

## 1. Backend API Setup
**Description:** Initialize Node.js Express API with PostgreSQL
**Files to create:**
  - package.json
  - server.js
  - database/schema.sql
**Tests to run:** npm install && npm test
**Expected outcome:** API server starts, connects to DB

## 2. Frontend React App
**Description:** Create React app with routing
**Files to create:**
  - src/App.js
  - src/components/TodoList.js
  - package.json
**Tests to run:** npm install && npm start
**Expected outcome:** React app runs, displays todo list

## 3. API Integration
**Description:** Connect frontend to backend
**Files to create:**
  - src/services/api.js
  - src/hooks/useTodos.js
**Tests to run:** npm test
**Expected outcome:** Can fetch/create todos via API

## 4. Database Integration
**Description:** Implement CRUD operations
**Files to create:**
  - server/routes/todos.js
  - server/models/Todo.js
**Tests to run:** npm run test:integration
**Expected outcome:** All CRUD operations work

## 5. Styling & UI
**Description:** Add CSS/styling for better UX
**Files to create:**
  - src/App.css
  - src/components/TodoList.css
**Tests to run:** npm run build
**Expected outcome:** App builds and looks good

## 6. Final Integration Test
**Description:** End-to-end testing
**Files to create:**
  - test/e2e/todo-app.test.js
**Tests to run:** npm run test:e2e
**Expected outcome:** All features work end-to-end
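Because the workflow needs the todos as structured data, a small parser for this layout is a natural building block. The following is a minimal Python sketch, assuming the exact heading and bold-field layout shown above; `parse_todo_md` and the derived key names (for example `files_to_create`) are illustrative and not part of the OpenHands SDK.

```python
import re

HEADING = re.compile(r"^## (\d+)\. (.+)$")          # e.g. "## 1. Backend API Setup"
FIELD = re.compile(r"^\*\*(.+?):\*\*\s*(.*)$")      # e.g. "**Tests to run:** npm test"


def parse_todo_md(text):
    """Parse TODO.md (layout shown above) into a list of task dicts."""
    tasks, current, last_field = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        heading = HEADING.match(line)
        if heading:
            current = {"number": int(heading.group(1)), "title": heading.group(2)}
            tasks.append(current)
            last_field = None
            continue
        if current is None or not line:
            continue  # skip the document title and blank lines
        field = FIELD.match(line)
        if field:
            # "Files to create" becomes the key "files_to_create" (hypothetical naming)
            last_field = field.group(1).lower().replace(" ", "_")
            current[last_field] = field.group(2)
        elif line.startswith("- ") and last_field:
            # Bullet lines turn the most recent field into a list of items
            existing = current[last_field]
            items = existing if isinstance(existing, list) else ([existing] if existing else [])
            items.append(line[2:].strip())
            current[last_field] = items
    return tasks
```

Fields written inline stay strings, while fields followed by bullet lines become lists, so callers should handle both shapes.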

How n8n Tracks Todos

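Pattern: Store todo state in n8n's workflow static data, which persists between executions. Note that n8n only saves static data for production executions of an active workflow (started by a trigger or webhook), not for manual test runs.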

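// In Get Next Todo node
// n8n exposes persistent workflow data through $getWorkflowStaticData();
// wrapping it keeps the workflow.staticData.* access pattern used in this plan.
const workflow = { staticData: $getWorkflowStaticData('global') };

// Initialize todo state
workflow.staticData.todos = workflow.staticData.todos || {};

// Read the current position and the stored todo list
const currentIndex = workflow.staticData.todos.current_index || 0;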
const todos = workflow.staticData.todos.list || [];

if (currentIndex < todos.length) {
  const nextTodo = todos[currentIndex];

  return {
    action: 'execute',
    todo: nextTodo,
    index: currentIndex,
    total: todos.length,
    status: 'IN_PROGRESS'
  };
} else {
  // All todos complete
  return {
    action: 'complete',
    message: 'All todos executed successfully',
    total_todos: todos.length,
    status: 'SUCCESS'
  };
}

Data Persistence Pattern:

  • staticData.todos.list: Array of all todos
  • staticData.todos.current_index: Current todo number
  • staticData.todos.results: Array of results from each todo
  • staticData.todos.completed: Boolean flag
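To make the state transitions easier to reason about outside n8n, the same four fields can be modelled in a few lines of Python. This is a sketch only: `TodoState`, `next_action`, and `record_result` are hypothetical names, and `items` stands in for `staticData.todos.list`.

```python
from dataclasses import dataclass, field


@dataclass
class TodoState:
    items: list = field(default_factory=list)    # staticData.todos.list
    current_index: int = 0                       # staticData.todos.current_index
    results: list = field(default_factory=list)  # staticData.todos.results
    completed: bool = False                      # staticData.todos.completed


def next_action(state):
    """Return what the 'Get Next Todo' step should do for the current state."""
    if state.current_index < len(state.items):
        return {
            "action": "execute",
            "todo": state.items[state.current_index],
            "index": state.current_index,
            "total": len(state.items),
            "status": "IN_PROGRESS",
        }
    state.completed = True
    return {"action": "complete", "total_todos": len(state.items), "status": "SUCCESS"}


def record_result(state, success, output=None):
    """Store the outcome of the current todo and advance to the next one."""
    state.results.append(
        {"index": state.current_index, "success": success, "output": output}
    )
    state.current_index += 1
```

Advancing the index in one place (`record_result`) keeps the "which todo is next" question answerable from the stored state alone.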

3. Iterative Development Loop

The 6-Node Workflow

[1] Git Push (Gitea Webhook)
     ↓
[2] Extract Repo Info (Code Node)
     ↓
[3] Get Next Todo (Code Node)
     ↓
[4] Execute Todo (Code Node → calls SDK)
     ↓
[5] Test Changes (Code Node)
     ↓
[6] Commit & Push (HTTP Node to Gitea)
     ↓
     └─ Loop back to [3]

Node Details

Node 1: Git Push (Webhook)

  • Trigger: Gitea push event
  • Input: Repository changes, commit data
  • Output: Push event payload

Node 2: Extract Repo Info (Code)

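// The Webhook node places the request body under .body
const payload = $json.body || $json;
const message = payload.head_commit.message;
const PROMPT_PREFIX = 'MVP Prompt:';

// Return the prompt text that follows the "MVP Prompt:" prefix, or null
function extractPrompt(commitMessage) {
  return commitMessage.startsWith(PROMPT_PREFIX)
    ? commitMessage.slice(PROMPT_PREFIX.length).trim()
    : null;
}

return {
  repo_name: payload.repository.name,
  repo_owner: payload.repository.owner.name || payload.repository.owner.login,
  branch: payload.ref.replace('refs/heads/', ''),
  commit_sha: payload.after,
  commit_message: message,
  pusher: payload.pusher.name,

  // Check if this is an initial prompt or a build push
  is_initial_push: message.startsWith(PROMPT_PREFIX),

  // Extract prompt from commit message
  prompt: extractPrompt(message)
};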
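The extraction logic can be checked offline before wiring up the webhook. Below is a Python sketch of the same mapping, assuming the payload fields used in Node 2 (`ref`, `after`, `repository`, `head_commit`); the function names are illustrative, and missing fields yield `None` or an empty string instead of raising.

```python
PROMPT_PREFIX = "MVP Prompt:"


def extract_prompt(message):
    """Return the text after the 'MVP Prompt:' prefix, or None for other commits."""
    if message.startswith(PROMPT_PREFIX):
        return message[len(PROMPT_PREFIX):].strip()
    return None


def extract_repo_info(payload):
    """Map a push-event payload to the fields the workflow passes between nodes."""
    repository = payload.get("repository", {})
    owner = repository.get("owner", {})
    message = (payload.get("head_commit") or {}).get("message", "")
    ref = payload.get("ref", "")
    prefix = "refs/heads/"
    prompt = extract_prompt(message)
    return {
        "repo_name": repository.get("name"),
        "repo_owner": owner.get("name") or owner.get("login"),
        "branch": ref[len(prefix):] if ref.startswith(prefix) else ref,
        "commit_sha": payload.get("after"),
        "commit_message": message,
        "is_initial_push": prompt is not None,
        "prompt": prompt,
    }
```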

Node 3: Get Next Todo (Code)

const repoInfo = $json;
// Persistent state lives in n8n's workflow static data
const workflow = { staticData: $getWorkflowStaticData('global') };

// Initialize or get todos
workflow.staticData.todos = workflow.staticData.todos || {};

if (repoInfo.is_initial_push) {
  // First push - extract prompt and create todos
  const prompt = repoInfo.prompt;

  const createTodosTask = `
Analyze this MVP prompt: "${prompt}"

Create a comprehensive TODO.md file with development tasks.
Each task should be atomic and executable.

Return the TODO.md content as JSON:
{
  "tasks": [
    {
      "title": "Task name",
      "description": "Description",
      "category": "Backend|Frontend|Integration|Testing",
      "files": ["file1", "file2"],
      "commands": ["npm install", "npm test"],
      "expected_outcome": "What success looks like"
    }
  ]
}
  `;

  // Store in staticData for next node
  workflow.staticData.todos.pending = createTodosTask;
  workflow.staticData.todos.current_index = 0;
  workflow.staticData.todos.status = 'CREATING_TODOS';

  return {
    action: 'create_todos',
    prompt: prompt,
    status: workflow.staticData.todos.status
  };
} else if (workflow.staticData.todos.current_index !== undefined) {
  // Continue with existing todos
  const index = workflow.staticData.todos.current_index;
  const todos = workflow.staticData.todos.list || [];

  if (index < todos.length) {
    return {
      action: 'execute_todo',
      todo: todos[index],
      index: index,
      total: todos.length,
      status: 'IN_PROGRESS'
    };
  } else {
    return {
      action: 'complete',
      message: 'All todos completed successfully',
      status: 'SUCCESS'
    };
  }
} else {
  return {
    action: 'error',
    message: 'No todos found. Please push MVP prompt first.',
    status: 'ERROR'
  };
}
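The task above asks the model to answer with JSON, but model output often arrives with surrounding prose, so it is worth validating before anything is stored. The following Python sketch assumes the reply contains exactly one JSON object with the `tasks` schema requested above; `parse_tasks` is a hypothetical helper, not an SDK function.

```python
import json

# Keys requested in the todo-creation task above
REQUIRED_KEYS = ("title", "description", "category", "files", "commands", "expected_outcome")


def parse_tasks(raw_output):
    """Extract and validate the task list from raw model output."""
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in SDK output")
    # json.JSONDecodeError is a ValueError subclass, so callers can catch ValueError
    data = json.loads(raw_output[start:end + 1])
    tasks = data.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        raise ValueError("'tasks' must be a non-empty list")
    for i, task in enumerate(tasks):
        missing = [key for key in REQUIRED_KEYS if key not in task]
        if missing:
            raise ValueError(f"task {i} is missing keys: {missing}")
    return tasks
```

Failing loudly here keeps a malformed reply from being stored as an empty or partial todo list, which would otherwise make the loop report success with nothing built.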

Node 4: Execute Todo (Code - Calls SDK)

const todoData = $json;
const repoInfo = $node["Extract Repo Info"].json;
// Persistent state lives in n8n's workflow static data
const workflow = { staticData: $getWorkflowStaticData('global') };

// Handle different actions
if (todoData.action === 'create_todos') {
  // Call OpenHands with the full todo-creation task prepared by Get Next Todo
  const sdkOutput = callOpenHandsSDK(workflow.staticData.todos.pending || todoData.prompt);

  // Parse and store todos (empty list if the SDK returned none)
  const tasks = sdkOutput.tasks || [];
  workflow.staticData.todos.list = tasks;
  workflow.staticData.todos.current_index = 0;
  workflow.staticData.todos.status = 'READY';

  return {
    action: 'next_todo',
    message: 'Todos created, starting execution',
    total_todos: tasks.length
  };
} else if (todoData.action === 'execute_todo') {
  // Execute the current todo
  const task = `
Execute this development task:

**Task:** ${todoData.todo.title}
**Description:** ${todoData.todo.description}
**Category:** ${todoData.todo.category}

**Steps:**
1. Create/modify the required files
2. Run the specified commands
3. Ensure the expected outcome is achieved
4. If tests fail, fix them
5. Commit your changes

**Files to work with:**
${todoData.todo.files.join(', ')}

**Commands to run:**
${todoData.todo.commands.join('\n')}

**Expected outcome:**
${todoData.todo.expected_outcome}

Current directory: /workspace/${repoInfo.repo_name}
  `;

  const sdkOutput = callOpenHandsSDK(task);

  // Store result
  workflow.staticData.todos.results = workflow.staticData.todos.results || [];
  workflow.staticData.todos.results.push({
    todo: todoData.todo,
    output: sdkOutput,
    success: sdkOutput.success,
    timestamp: new Date().toISOString()
  });

  // Increment index for next iteration
  workflow.staticData.todos.current_index++;

  return {
    action: 'todo_executed',
    todo: todoData.todo,
    index: todoData.index,
    success: sdkOutput.success,
    output: sdkOutput
  };
}

Node 5: Test Changes (Code)

const executeResult = $json;
// total is produced by Get Next Todo; Execute Todo does not forward it
const total = $node["Get Next Todo"].json.total;

if (executeResult.action === 'next_todo') {
  // The todo list was just created; there is nothing to test yet
  return {
    status: 'SUCCESS',
    message: executeResult.message,
    commit_message: '📋 TODOs created from prompt',
    state: 'success',
    description: `Created ${executeResult.total_todos} todos`
  };
}

const progress = `Todo ${executeResult.index + 1}/${total}: ${executeResult.todo.title}`;

if (executeResult.success) {
  return {
    status: 'SUCCESS',
    message: `Todo "${executeResult.todo.title}" completed successfully`,
    commit_message: `✅ Complete: ${executeResult.todo.title}`,

    // For Gitea status
    state: 'success',
    description: progress
  };
} else {
  // Mark as failed but continue (for debugging)
  return {
    status: 'FAILED',
    message: `Todo "${executeResult.todo.title}" failed`,
    commit_message: `❌ Failed: ${executeResult.todo.title}`,

    state: 'failure',
    description: `${progress} - FAILED`,
    error: executeResult.output.error || 'Unknown error'
  };
}

Node 6: Commit & Push (HTTP Node)

const testResult = $json;
const repoInfo = $node["Extract Repo Info"].json;

// Create commit
POST https://git.oky.sh/api/v1/repos/{owner}/{repo}/git/commits

{
  "message": testResult.commit_message,
  "tree": getCurrentTreeSha(),
  "parents": [repoInfo.commit_sha]
}

// Update status
POST https://git.oky.sh/api/v1/repos/{owner}/{repo}/statuses/{commit_sha}
{
  "state": testResult.state,
  "description": testResult.description,
  "context": "openhands/todo-build",
  "target_url": "https://n8n.oky.sh"
}

// Return to loop
return {
  loop: true,
  should_continue: testResult.state === 'success' || // Continue on success
                   (testResult.state === 'failure' && // Or even on failure for debugging
                    $workflow.staticData.todos.current_index < $workflow.staticData.todos.list.length)
};
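The status update is the simpler of the two calls and can be prepared and inspected without touching the network. The Python sketch below builds the request for the statuses endpoint shown above; the `Authorization: token ...` header and the `build_status_request` helper are assumptions for illustration, so the authentication scheme should be confirmed against the Gitea instance's API documentation.

```python
import json
import urllib.request


def build_status_request(owner, repo, sha, state, description, token,
                         base_url="https://git.oky.sh"):
    """Build (but do not send) a commit-status request for one todo result."""
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/statuses/{sha}"
    body = {
        "state": state,                      # "success" or "failure" from Node 5
        "description": description,
        "context": "openhands/todo-build",
        "target_url": "https://n8n.oky.sh",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"token {token}",  # assumed token-based auth
        },
        method="POST",
    )
```

Sending it is then a single `urllib.request.urlopen(request, timeout=30)` call, which keeps the payload construction testable on its own.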

4. OpenHands SDK Integration

Direct SDK Call (No SSH)

Current Setup:

  • SDK wrapper: /home/bam/openhands-sdk-wrapper-fixed.py
  • Python-based
  • Returns JSON output
  • No SSH needed

How to Call from n8n:

// In Code Node (Execute Todo)
// Note: built-in modules are only available in Code nodes when n8n is started
// with NODE_FUNCTION_ALLOW_BUILTIN=child_process.
function callOpenHandsSDK(task, workspace = "/home/bam") {
  const { execFileSync } = require('child_process');

  // Pass arguments as an array (no shell), so quotes, backticks, and newlines
  // in the task text cannot break or inject into the command line.
  const args = [
    '/home/bam/openhands-sdk-wrapper-fixed.py',
    task,
    '--workspace', workspace,
    '--json'
  ];

  try {
    const output = execFileSync('python3', args, { encoding: 'utf-8' });
    const result = JSON.parse(output);

    return {
      success: result.success,
      tasks: result.tasks || [], // assumed: present when the task asked for a todo list
      files_created: result.files_created || [],
      files_copied: result.files_copied || [],
      error: result.error || null,
      log_output: result.log_output || []
    };
  } catch (error) {
    return {
      success: false,
      error: error.message,
      tasks: [],
      files_created: [],
      files_copied: [],
      log_output: []
    };
  }
}
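The same call can be exercised from Python when testing the wrapper directly (Step 3). This sketch assumes the wrapper accepts the task as its first argument plus the `--workspace` and `--json` flags used above, and prints one JSON object to stdout; `call_wrapper` and its `timeout` default are illustrative choices.

```python
import json
import subprocess
import sys


def call_wrapper(task, workspace, wrapper_path, timeout=600):
    """Run the SDK wrapper as a subprocess and return a normalized result dict."""
    # Arguments are passed as a list, so the task text never goes through a shell
    cmd = [sys.executable, wrapper_path, task, "--workspace", workspace, "--json"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        result = json.loads(proc.stdout)
    except subprocess.TimeoutExpired:
        return {"success": False, "files_created": [], "error": f"timed out after {timeout}s"}
    except json.JSONDecodeError:
        return {"success": False, "files_created": [], "error": "wrapper did not return valid JSON"}
    return {
        "success": bool(result.get("success")),
        "files_created": result.get("files_created", []),
        "error": result.get("error"),
    }
```

Because no shell is involved, a task containing quotes, `$VARS`, or backticks reaches the wrapper unchanged.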

Alternative: HTTP Call to SDK Service

If we want to avoid subprocess, we can create a simple HTTP service:

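# sdk-server.py
import importlib.util

from flask import Flask, request, jsonify

# The wrapper's filename contains hyphens, so it cannot be imported with a plain
# "import" statement; load it from its path instead.
_spec = importlib.util.spec_from_file_location(
    "openhands_sdk_wrapper", "/home/bam/openhands-sdk-wrapper-fixed.py")
_wrapper = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_wrapper)
run_openhands_task = _wrapper.run_openhands_task

app = Flask(__name__)

@app.route('/execute', methods=['POST'])
def execute():
    data = request.get_json(silent=True) or {}
    task = data.get('task')
    if not task:
        return jsonify({'success': False, 'error': 'missing "task"'}), 400
    workspace = data.get('workspace', '/home/bam')

    result = run_openhands_task(task, workspace)
    return jsonify(result)

if __name__ == '__main__':
    # Bind to localhost only: this endpoint runs arbitrary development tasks
    app.run(port=5000, host='127.0.0.1')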

Then call from n8n:

// HTTP Request node
POST http://localhost:5000/execute
Content-Type: application/json

{
  "task": "Create TODO.md from prompt: ...",
  "workspace": "/home/bam/mvp-project"
}

Why Direct SDK is Better:

  • No SSH overhead
  • Structured JSON output
  • Faster execution
  • Direct Python control
  • Easier debugging
  • Trade-off: the Python environment must be managed (already in place here)

5. Full-Stack App Example

Example: Todo App MVP

Initial Push

Commit Message: MVP Prompt: Create a full-stack todo app with React + Node.js + PostgreSQL

What Happens

First n8n Loop:

  1. Node 2 extracts: prompt = "Create a full-stack todo app..."
  2. Node 3: Detects is_initial_push = true
  3. Node 4: Calls OpenHands to create TODO.md
  4. Node 5-6: Create TODO.md file and commit it

TODO.md Created:

# Development Tasks for Todo App MVP

## 1. Backend API Setup
**Description:** Initialize Node.js Express API
**Files:** package.json, server.js, database/schema.sql
**Commands:** npm install, npm test

## 2. Frontend React App
**Description:** Create React app
**Files:** src/App.js, src/components/TodoList.js
**Commands:** npm install, npm start

[... 4 more todos ...]

Subsequent Loops:

Loop 1:
  Get Todo #1: Backend API Setup
  OpenHands creates: package.json, server.js, schema.sql
  Test: npm install && npm test
  Commit: "✅ Complete: Backend API Setup"

Loop 2:
  Get Todo #2: Frontend React App
  OpenHands creates: src/App.js, components, package.json
  Test: npm install && npm start
  Commit: "✅ Complete: Frontend React App"

[... continues for all 6 todos ...]

Loop 7:
  Get Next Todo: None (all complete)
  Commit: "🎉 MVP Complete: Todo App - All 6 todos finished"
  Status: SUCCESS

Expected Outcomes by Loop

| Loop | Todo | What OpenHands Does | Test | Commit Message |
|------|------|---------------------|------|----------------|
| 1 | Create TODOs | Analyzes prompt, creates TODO.md | N/A | "📋 TODOs created from prompt" |
| 2 | Backend Setup | Creates package.json, server.js, DB schema | npm install && npm test | "✅ Complete: Backend API Setup" |
| 3 | Frontend App | Creates React components, routing | npm install && npm start | "✅ Complete: Frontend React App" |
| 4 | API Integration | Connects frontend to backend | npm test | "✅ Complete: API Integration" |
| 5 | Database CRUD | Implements todo operations | npm run test:integration | "✅ Complete: Database Integration" |
| 6 | Styling & UI | Adds CSS, improves UX | npm run build | "✅ Complete: Styling & UI" |
| 7 | E2E Testing | Creates end-to-end tests | npm run test:e2e | "✅ Complete: Final Integration Test" |
| 8 | Final | All todos complete | All tests pass | "🎉 MVP Complete: Todo App" |

Proof of Concept: Why This Works

Evidence 1: Todo Structure

  • Each todo is atomic and testable
  • Clear expected outcomes
  • Build/test cycle for each

Evidence 2: OpenHands Capabilities

  • Can create files
  • Can run commands
  • Can fix errors
  • Can commit changes

Evidence 3: n8n Loop

  • Persists state between iterations
  • Tracks progress (current_index)
  • Automatically advances to next todo
  • Stops when complete

Evidence 4: Full-Stack Coverage

  • Backend: Node.js + Express + PostgreSQL
  • Frontend: React + Routing
  • Database: Schema + CRUD
  • Integration: API calls
  • Testing: Unit + E2E
  • Deployment: Build scripts

6. n8n Workflow Design

Minimal 6-Node Structure

┌──────────────┐
│  Node 1:     │
│ Git Push     │
│ (Webhook)    │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 2:     │
│ Extract      │
│ Repo Info    │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Node 3:     │
│ Get Next     │
│ Todo         │
└──────┬───────┘
       │
       ├─ Create Todos ──┐
       │                 │
       │                 ▼
       │            ┌──────────────┐
       │            │  Node 4:     │
       │            │ Execute      │
       │            │ (calls SDK)  │
       │            └──────┬───────┘
       │                 │
       ▼                 ▼
┌──────────────┐    ┌──────────────┐
│  Node 5:     │    │  Node 4:     │
│ Test         │    │ Execute      │
│ Changes      │    │ (continues)  │
└──────┬───────┘    └──────────────┘
       │
       ▼
┌──────────────┐
│  Node 6:     │
│ Commit &     │
│ Push         │
└──────┬───────┘
       │
       └─ Loop back to Node 3

Node Configuration Summary

| Node | Type | Purpose | Complexity |
|------|------|---------|------------|
| 1 | Webhook | Receive Gitea push | Simple |
| 2 | Code | Extract repo/prompt data | Simple |
| 3 | Code | Get next todo or finish | Medium |
| 4 | Code | Call OpenHands SDK | Medium |
| 5 | Code | Test and format results | Simple |
| 6 | HTTP | Commit to Gitea | Simple |

Total Complexity: Much lower than 11-node design!

Data Preservation Pattern

// Each node preserves previous data
const current = $json;  // Current node output
const previous = $node["Previous Node Name"].json;  // Preserve data

return {
  ...previous,  // ← PRESERVE ALL PREVIOUS DATA
  current_data: current
};

Critical: This ensures we don't lose:

  • Repository info (Node 2)
  • Todo list (Node 3)
  • Execution results (Node 4)
  • Test results (Node 5)

7. Proof of Concept

What We'll Prove

Primary Claim: OpenHands can build a complete full-stack application autonomously through structured todos

Supporting Evidence Required:

  1. Analyzes Prompt Correctly

    • Can extract requirements from user prompt
    • Can break down into atomic todos
    • Can define clear expected outcomes
  2. Creates Executable Todos

    • Each todo has specific files to create
    • Each todo has tests to run
    • Each todo has measurable success criteria
  3. Executes Todos Autonomously

    • Creates/modifies files correctly
    • Runs build/test commands
    • Fixes errors when tests fail
    • Commits changes after each todo
  4. Loops Through All Todos

    • n8n tracks current todo index
    • Advances to next todo after success
    • Continues until all complete
    • Handles failures gracefully
  5. Produces Working Application

    • Backend API functional
    • Frontend UI working
    • Database integrated
    • Tests passing
    • End-to-end flow works

Test Scenario

Scenario: Full-Stack Todo App
Prompt: "Create a full-stack todo app with React + Node.js + PostgreSQL"

Expected Outcome:

Initial Push:
  → TODOs created (6 items)
  → Commit: "📋 TODOs created from prompt"

Loop 1 (Todo #1):
  → Backend API created
  → Tests pass
  → Commit: "✅ Complete: Backend API Setup"

Loop 2 (Todo #2):
  → React app created
  → Tests pass
  → Commit: "✅ Complete: Frontend React App"

[... 4 more loops ...]

Final Loop (no todos left):
  → All features complete
  → All tests pass
  → Commit: "🎉 MVP Complete: Todo App"

Total Execution Time: ~30-45 minutes

Success Criteria

Must Have:

  • Initial prompt creates TODO.md with ≥5 todos
  • Each todo executes independently
  • Each todo commits changes
  • Loop continues until all todos complete
  • Final application builds successfully
  • At least 3 todos execute without errors

Proof Points:

  1. TODO.md created and committed
  2. Backend files created (package.json, server.js, etc.)
  3. Frontend files created (React components, etc.)
  4. Database schema created
  5. Tests created and passing
  6. Final application works end-to-end

Verification Steps

After implementation:

  1. Check TODO.md exists

    cat /workspace/project/TODO.md
    
  2. Check commit history

    git log --oneline
    # Should see: TODOs created, Complete: [todo name], Complete: [next], ...
    
  3. Check file structure

    tree /workspace/project
    # Backend: package.json, server.js, routes/, models/
    # Frontend: src/, components/, package.json
    # Tests: test/
    # Database: database/schema.sql
    
  4. Run the application

    cd /workspace/project
    npm install
    npm run build
    npm start
    
  5. Verify all tests pass

    npm test
    npm run test:integration
    npm run test:e2e
    

Example Output

Gitea Commit History:

abc123f (HEAD) 🎉 MVP Complete: Todo App - All 6 todos finished
def4567 ✅ Complete: Final Integration Test
789abcd ✅ Complete: Styling & UI
012cdef ✅ Complete: Database Integration
345fgh ✅ Complete: API Integration
6789ij ✅ Complete: Frontend React App
klmno ✅ Complete: Backend API Setup
pqrst 📋 TODOs created from prompt

File Structure:

/workspace/todo-app/
├── backend/
│   ├── package.json
│   ├── server.js
│   ├── routes/
│   │   └── todos.js
│   └── models/
│       └── Todo.js
├── frontend/
│   ├── package.json
│   ├── src/
│   │   ├── App.js
│   │   ├── components/
│   │   │   └── TodoList.js
│   │   └── services/
│   │       └── api.js
├── database/
│   └── schema.sql
├── test/
│   ├── unit/
│   ├── integration/
│   └── e2e/
├── README.md
└── TODO.md

Application Test:

$ curl http://localhost:3000/api/todos
[]
$ curl -X POST http://localhost:3000/api/todos \
  -H "Content-Type: application/json" \
  -d '{"title": "Test todo"}'
{"id": 1, "title": "Test todo", "completed": false}

8. Implementation Steps

Step-by-Step Plan (8 Steps, 4-5 Hours)

Step 1: Setup Test Repository (20 min)

  • Create test repo in Gitea: todo-app-mvp-test
  • Add initial commit with prompt
  • Configure Gitea webhook to n8n
  • Verify webhook triggers correctly

Test Command:

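# Placeholder values; the payload includes every field read by Extract Repo Info
curl -X POST https://n8n.oky.sh/webhook/todo-mvp-test \
  -H "Content-Type: application/json" \
  -d '{
    "ref": "refs/heads/main",
    "after": "0000000000000000000000000000000000000000",
    "repository": {"name": "todo-app-mvp-test", "owner": {"login": "test-owner"}},
    "pusher": {"name": "test-user"},
    "head_commit": {"message": "MVP Prompt: Create a full-stack todo app"}
  }'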

Step 2: Create n8n Workflow Skeleton (30 min)

  • Create new workflow: "Todo-Based MVP Builder"
  • Add 6 nodes (Webhook, Extract, Get Todos, Execute, Test, Commit)
  • Configure Webhook node
  • Test manual trigger works

Verification:

  • Workflow ID created
  • Webhook URL accessible
  • Manual trigger executes without errors

Step 3: Implement SDK Integration (45 min)

  • Test OpenHands SDK wrapper directly
  • Create SDK call function in Node 4
  • Handle JSON output parsing
  • Test with simple task: "Create a test file"

Test Code:

// In Node 4, test this:
const result = callOpenHandsSDK("Create a file named sdk-test.txt with content: Hello from SDK");
console.log(result);
// Should return: { success: true, files_created: ['sdk-test.txt'], ... }

Step 4: Implement Todo Creation (30 min)

  • Add prompt analysis logic
  • Create TODO.md generation task
  • Parse TODO.md and store in staticData
  • Test with sample prompt

Test:

const prompt = "Create a full-stack todo app";
const todoResult = callOpenHandsSDK(`Analyze prompt and create TODOs...`);
// Should parse output and create array in staticData

Step 5: Implement Todo Execution Loop (45 min)

  • Add current_index tracking
  • Implement "get next todo" logic
  • Add todo result storage
  • Test loop with 2-3 simple todos

Test Sequence:

Push 1: Initial prompt
  → TODO.md created (Commit 1)

Push 2: Execute todo #1
  → Files created (Commit 2)

Push 3: Execute todo #2
  → Files created (Commit 3)

Push 4: Execute todo #3
  → Files created (Commit 4)

Push 5: No more todos
  → Final completion (Commit 5)

Step 6: Add Test & Validation (30 min)

  • Add test command execution
  • Parse test results
  • Continue on failure (for debugging) or stop
  • Add error formatting
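A point worth handling in this step is that some listed test commands (for example `npm start`) never exit on their own, so each command needs a timeout. The sketch below shows one way to run a todo's commands in order and stop at the first failure; `run_commands`, the 300-second default, and the 2000-character error tail are illustrative assumptions.

```python
import subprocess


def run_commands(commands, cwd=None, timeout=300):
    """Run shell commands in order; stop at the first failure or timeout."""
    for command in commands:
        try:
            proc = subprocess.run(
                command, shell=True, cwd=cwd,
                capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return {"success": False, "failed_command": command,
                    "error": f"timed out after {timeout}s"}
        if proc.returncode != 0:
            # Keep only the tail of the output so the result stays small
            tail = (proc.stderr or proc.stdout).strip()[-2000:]
            return {"success": False, "failed_command": command, "error": tail}
    return {"success": True, "failed_command": None, "error": None}
```

`shell=True` is used because the todo commands are written as shell strings such as `npm install && npm test`; they come from generated todos, so they should only be run inside the isolated workspace.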

Test with intentional error:

// In todo task, intentionally break code
// Should detect failure and log error
// Workflow continues to next todo (for now)

Step 7: Implement Commit/Push to Gitea (30 min)

  • Add Gitea API calls for commits
  • Add commit status updates
  • Format commit messages per todo
  • Test commit flow

Expected:

git log --oneline
# Commit 1: "📋 TODOs created from prompt"
# Commit 2: "✅ Complete: Backend API Setup"
# Commit 3: "✅ Complete: Frontend React App"

Step 8: Full End-to-End Test (45 min)

  • Use real prompt: "Create a full-stack todo app"
  • Let system execute all todos
  • Verify final application works
  • Document any issues found

Complete Test:

  1. Push initial prompt
  2. Watch 5-8 automatic commits
  3. Check file structure
  4. Run application
  5. Verify all features work

Time Breakdown

| Step | Activity | Time | Cumulative |
|------|----------|------|------------|
| 1 | Setup test repository | 20 min | 20 min |
| 2 | Create workflow skeleton | 30 min | 50 min |
| 3 | SDK integration | 45 min | 1h 35m |
| 4 | Todo creation logic | 30 min | 2h 5m |
| 5 | Todo execution loop | 45 min | 2h 50m |
| 6 | Test & validation | 30 min | 3h 20m |
| 7 | Commit/push | 30 min | 3h 50m |
| 8 | Full E2E test | 45 min | 4h 35m |

Estimated Total: 4-5 hours (with buffer)
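The cumulative column can be re-derived from the per-step estimates, which is handy if any step estimate changes. A small Python check, using only the step durations listed above:

```python
from itertools import accumulate

STEP_MINUTES = [20, 30, 45, 30, 45, 30, 30, 45]  # steps 1-8 from the breakdown above


def format_minutes(total):
    """Format minutes the way the breakdown does, e.g. 95 -> '1h 35m'."""
    hours, minutes = divmod(total, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes} min"


cumulative = [format_minutes(m) for m in accumulate(STEP_MINUTES)]
total_minutes = sum(STEP_MINUTES)  # 275 minutes, i.e. 4h 35m
```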

Success Metrics

After Implementation:

  • Can push prompt and get TODO.md
  • Can execute ≥3 todos automatically
  • Each todo commits changes to Gitea
  • Final application exists and builds
  • Workflow ID stable and reusable

Proof Complete:

  • Commit history shows progression
  • File structure matches todos
  • Application runs successfully
  • Tests pass (if any)

9. Advantages Over Current Phase 3

Complexity Reduction

| Aspect | Current 11-Node | Simplified 6-Node | Reduction |
|--------|-----------------|-------------------|-----------|
| Nodes | 11 | 6 | 45% fewer |
| Logic | Complex retry loops | Simple todo iteration | 60% simpler |
| State | Multiple decision points | Linear progression | Easier to debug |
| Testing | 3 retry scenarios | 1 success scenario | Simpler tests |

Better Proof of Concept

Current Phase 3:

  • Tests: Build → Retry on error → Max 3 retries
  • Focus: Error handling
  • Outcome: Build succeeds or fails
  • Proof: Can retry builds

Simplified Plan:

  • Tests: Full MVP creation
  • Focus: Autonomous development
  • Outcome: Complete application
  • Proof: Can build full-stack apps

More Practical

Real Usage:

User: "I want a React app with API"
  → System: Creates todos, executes them
  → Result: Full React app with API

User: "Build me a Django site"
  → System: Creates todos, executes them
  → Result: Full Django site

User: "Make a mobile app"
  → System: Creates todos, executes them
  → Result: Full mobile app (React Native)

This is what users actually want to do!


10. Risks & Mitigation

Risk 1: OpenHands Can't Handle Complex Tasks

Mitigation:

  • Start with simple 2-3 todo app
  • Verify each todo executes
  • Gradually increase complexity
  • Use test-first approach

Risk 2: n8n State Management Issues

Mitigation:

  • Use staticData correctly
  • Add logging at each step
  • Test manual workflow execution
  • Verify data persistence

Risk 3: Git Commit Loop Issues

Mitigation:

  • Add commit limits (max 20 commits)
  • Check for infinite loops
  • Add circuit breaker pattern
  • Monitor execution time
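The commit limit and circuit breaker can be combined into one small guard. The Python sketch below assumes the limit of 20 commits mentioned above; the consecutive-failure threshold of 3 and the names `CircuitBreaker` and `is_automation_commit` are illustrative. The prefix check matters if the workflow's own commits also reach the webhook, since those pushes should not be treated as new work.

```python
# Commit-message prefixes produced by the workflow itself (from the plan above)
AUTOMATION_PREFIXES = ("✅ Complete:", "❌ Failed:", "📋 TODOs created", "🎉 MVP Complete")


def is_automation_commit(message):
    """True if the commit was written by the workflow rather than by a user."""
    return message.startswith(AUTOMATION_PREFIXES)


class CircuitBreaker:
    """Stops the loop after too many commits or too many failures in a row."""

    def __init__(self, max_commits=20, max_consecutive_failures=3):
        self.max_commits = max_commits
        self.max_consecutive_failures = max_consecutive_failures  # assumed threshold
        self.commits = 0
        self.consecutive_failures = 0

    def record(self, success):
        """Register one loop iteration and its outcome."""
        self.commits += 1
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

    @property
    def open(self):
        """An open breaker means the loop must stop."""
        return (self.commits >= self.max_commits
                or self.consecutive_failures >= self.max_consecutive_failures)
```

In the n8n workflow the two counters would live in the same persistent state as `current_index`, so the guard survives across loop iterations.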

Risk 4: Performance/Time Issues

Mitigation:

  • Each todo: 3-5 minutes max
  • All todos: 30-45 minutes total
  • Add timeouts per todo
  • Fail fast on errors

11. Next Steps After Proof

Phase 4: Production Enhancement

Once proof is complete, add:

  1. Error Recovery

    • Retry failed todos
    • Continue on non-critical errors
    • Rollback on major failures
  2. Parallel Execution

    • Run independent todos in parallel
    • Maintain order for dependencies
    • Speed up development
  3. Smart Scheduling

    • Queue-based execution
    • Rate limiting for API calls
    • Optimize token usage
  4. Advanced Features

    • Automatic testing with Playwright
    • Docker containerization
    • Cloud deployment
  5. User Interface

    • Dashboard to track progress
    • Manual todo editing
    • Progress notifications
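For the Parallel Execution item above, one way to keep dependency order while running independent todos together is to group them into batches with a topological sort. This is a sketch under a stated assumption: the current TODO.md format has no dependency field, so the `dependencies` mapping (todo title to the titles it depends on) is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(dependencies):
    """Group todos into batches; todos within a batch do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Each batch could then be dispatched concurrently, with the next batch starting only after every todo in the previous one has finished.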

12. Conclusion

Why This Simplified Plan Works

1. Simplicity

  • 6 nodes vs 11 nodes
  • Linear progression vs complex branches
  • One concept: todo iteration

2. Proof of Concept

  • Tests actual user scenario: building apps
  • Shows OpenHands can create complex software
  • Demonstrates autonomous development

3. Practical Value

  • Solves real problem: "I want to build an app"
  • Can be used immediately after proof
  • Scalable to many application types

4. Foundation for Growth

  • Can add retry logic later (Phase 4)
  • Can add parallel execution (Phase 4)
  • Can add UI (Phase 4)

Expected Outcome

After implementation, we'll have:

  • Working system that builds full-stack apps
  • Proof OpenHands can execute complex development tasks
  • Reusable workflow for any MVP
  • Clear path to production features

Most Important: We'll prove that OpenHands can build complete applications autonomously through structured todos. This is the key innovation that makes this system valuable.


Appendix: Implementation Checklist

Pre-Implementation

  • Review this plan with stakeholder
  • Approve simplified approach
  • Allocate 4-5 hours for implementation
  • Prepare test repository

During Implementation

  • Test each step before moving to next
  • Document any deviations from plan
  • Keep commit history clean
  • Verify at each checkpoint

Post-Implementation

  • Run full E2E test
  • Document results
  • Create user guide
  • Plan Phase 4 enhancements

Files to Create/Modify

  • /home/bam/claude/mvp-factory/SIMPLIFIED_PHASE3_PLAN.md (this file)
  • n8n workflow: "Todo-Based MVP Builder"
  • Test repository: todo-app-mvp-test
  • TODO.md (generated by system)

Simplified Phase 3 Plan - Ready for Approval
Estimated Implementation: 4-5 hours
Expected Outcome: Autonomous full-stack app builder