Initial commit - MVP project setup
Created by AI Dev Factory init-mvp-project.sh
commit cf33cc08c1
@ -0,0 +1,31 @@
# Dependencies
node_modules/
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
venv/
.venv/
env/
.env

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Build
dist/
build/
*.egg-info/

# Test
.coverage
.pytest_cache/
*.log
@ -0,0 +1,268 @@
# Image Description AI - System Architecture

## High-Level System Design

The Image Description AI is a client-side single-page application (SPA) that enables users to upload images and receive AI-generated descriptions using the Minimax API. The architecture follows a modern, responsive web application pattern with clean separation of concerns.

```
┌─────────────────────────────────────────────────────────────┐
│                   Client-Side Application                   │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Upload    │  │   Preview   │  │   AI Description    │  │
│  │  Component  │  │  Component  │  │       Display       │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                      Application State                      │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │    File     │  │    Image    │  │    API Response     │  │
│  │ Validation  │  │ Processing  │  │       Handler       │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                         Minimax API                         │
│      https://api.minimax.io/v1/text/chatcompletion_v2       │
└─────────────────────────────────────────────────────────────┘
```

## Technology Choices and Rationale

### Core Technologies
- **HTML5**: Semantic markup for accessibility and modern web standards
- **CSS3**: Modern styling with Flexbox/Grid for responsive layouts
- **Vanilla JavaScript**: Lightweight, no framework overhead, fast loading
- **File API**: Native browser API for file handling and validation

### UI Framework Decision: Vanilla JavaScript vs React
**Chosen: Vanilla JavaScript**
- **Rationale**:
  - Single HTML file requirement simplifies deployment
  - Minimal bundle size improves load times
  - No build process needed
  - Direct DOM manipulation gives precise control
  - Sufficient for the application's complexity level

### Styling Approach
- **CSS Grid & Flexbox**: Modern, flexible layouts
- **CSS Custom Properties**: Maintainable theming
- **Mobile-First Responsive Design**: Works across all device sizes
- **CSS Animations**: Smooth transitions and loading states

## Database Schema

**Not Applicable**: This is a client-side only application with no persistent data storage. All processing is transient and happens in memory.

## API Endpoints

### External API Integration

**Endpoint**: `https://api.minimax.io/v1/text/chatcompletion_v2`
- **Method**: POST
- **Authentication**: Bearer token (API key)
- **Content-Type**: application/json

**Request Structure**:
```json
{
  "model": "MiniMax-M2",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please provide a detailed description of this image in English"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "..."
          }
        }
      ]
    }
  ],
  "max_tokens": 500,
  "temperature": 0.7
}
```
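
As a minimal sketch of how this payload might be assembled in the browser code, the helper below builds the request body from an image data URL. The name `buildDescriptionRequest` is hypothetical, and passing the image as a `data:` URL in `image_url.url` is an assumption, since the example above elides the actual `url` value.

```javascript
// Hypothetical helper (not part of the original project): builds the request
// body shown above. Sending the image as a data URL is an assumption.
const PROMPT = "Please provide a detailed description of this image in English";

function buildDescriptionRequest(imageDataUrl) {
  if (typeof imageDataUrl !== "string" || !imageDataUrl.startsWith("data:image/")) {
    throw new TypeError("Expected an image data URL");
  }
  return {
    model: "MiniMax-M2",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: PROMPT },
          { type: "image_url", image_url: { url: imageDataUrl } }
        ]
      }
    ],
    max_tokens: 500,
    temperature: 0.7
  };
}
```

The returned object would be passed through `JSON.stringify` before being sent as the POST body.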

**Response Structure**:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "MiniMax-M2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This image shows a beautiful sunset over..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 32,
    "total_tokens": 47
  }
}
```
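
A response of this shape could be unpacked roughly as follows. This is a sketch only: `extractDescription` is a hypothetical name, and it assumes the description always lives in `choices[0].message.content` as in the example above.

```javascript
// Hypothetical helper: pulls the description text out of a response shaped
// like the example above and fails loudly on anything unexpected.
function extractDescription(response) {
  const content = response?.choices?.[0]?.message?.content;
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("API response did not contain a description");
  }
  return content.trim();
}
```

Throwing on a missing or empty description lets the caller route the failure through the same error path as network and API errors.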

## File Structure

```
image-description-ai/
├── index.html              # Main application file (self-contained)
├── assets/
│   ├── styles/
│   │   └── main.css        # (Optional separate CSS file)
│   └── scripts/
│       └── main.js         # (Optional separate JS file)
├── README.md               # Project documentation
└── docs/
    ├── ARCHITECTURE.md     # This file
    └── TASKS.md            # Implementation tasks
```

### Single File Implementation
For production deployment as specified, the complete application exists in one HTML file:
- `index.html` - Contains all HTML, CSS, and JavaScript inline
- No external dependencies or build process required
- Immediate browser deployment ready

## Component Interactions

### 1. File Upload Flow
```
User Input  →  File Selection  →  Validation       →  Preview Display
     ↓              ↓                  ↓                    ↓
Drag&Drop   →  File API        →  Size/Type Check  →  Image Preview
Click       →  Reader          →  Convert to       →  Update UI
            →  Base64          →  Base64           →  State Update
```
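
The size/type check in this flow can be sketched as a small pure function. The names `validateImageFile`, `ALLOWED_TYPES`, and `MAX_SIZE_BYTES` are hypothetical, the MIME types are the standard ones for the JPG, PNG, and WebP formats named in the requirements, and reading "5MB" as 5 × 1024 × 1024 bytes is an assumption. The function only reads `type` and `size`, so it accepts a browser `File` or any object with those fields.

```javascript
// Hypothetical validation helper for the "Size/Type Check" step.
const ALLOWED_TYPES = ["image/jpeg", "image/png", "image/webp"];
const MAX_SIZE_BYTES = 5 * 1024 * 1024; // assumes 5MB means 5 MiB

function validateImageFile(file) {
  if (!file || !ALLOWED_TYPES.includes(file.type)) {
    return { ok: false, error: "Please select a valid image file (JPG, PNG, WebP)" };
  }
  if (file.size > MAX_SIZE_BYTES) {
    return { ok: false, error: "File size must be less than 5MB" };
  }
  return { ok: true, error: null };
}
```

Returning a result object instead of throwing keeps the upload handler simple: it can show `error` under the upload area or proceed to the preview step when `ok` is true.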

### 2. AI Processing Flow
```
Generate Click  →  API Request   →  Loading State  →  Response Handling
      ↓                 ↓                ↓                   ↓
Validate        →  Construct     →  Show Spinner   →  Display Result
Image           →  Payload       →  Disable UI     →  Handle Errors
      ↓                 ↓                ↓                   ↓
State Check     →  Send POST     →  Timeout        →  Success/Failure
                →  Minimax API   →  Management     →  UI Update
```
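
The "Send POST" and "Timeout Management" steps might look roughly like the sketch below, which uses the standard `fetch` and `AbortController` APIs. The function name `postWithTimeout` and the injectable `fetchImpl` option are hypothetical (the latter exists only so the function can be exercised without a network), and the 30-second default is an assumed timeout value.

```javascript
// Hypothetical sketch: POST the payload and abort if no response arrives
// within timeoutMs. fetchImpl defaults to the global fetch.
const API_URL = "https://api.minimax.io/v1/text/chatcompletion_v2";

async function postWithTimeout(payload, apiKey, { timeoutMs = 30000, fetchImpl = fetch } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(API_URL, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify(payload),
      signal: controller.signal
    });
    if (!response.ok) {
      // Attach the HTTP status so callers can tell API errors apart.
      const error = new Error(`API request failed with status ${response.status}`);
      error.status = response.status;
      throw error;
    }
    return await response.json();
  } finally {
    clearTimeout(timer); // release the timer on success and on failure
  }
}
```

The caller would set the loading state before awaiting this function and clear it in its own `finally` block, so the UI is re-enabled whether the request succeeds, fails, or times out.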

### 3. Error Handling Flow
```
Any Error   →  Error Handler  →  User Notification  →  Recovery Option
    ↓               ↓                   ↓                    ↓
Network     →  Categorize     →  Clear Message      →  Reset/Retry
API         →  Error Type     →  Visual Alert       →  State Reset
File        →  Log Details    →  Action Required    →  User Guidance
Validation  →  Store State    →  UX Feedback        →  Continue Flow
```
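
One way to implement the "Categorize" step is a single function that maps a thrown error to a category, a user-facing message, and a retry hint. Everything here is illustrative: `categorizeError`, the category names, the messages, and the convention that API errors carry a numeric `status` property and validation errors carry `kind: "validation"` are all assumptions. The two facts it relies on are that `fetch` rejects with a `TypeError` on network failure and that an aborted request rejects with an error named `AbortError`.

```javascript
// Hypothetical error categorizer for the flow above.
function categorizeError(error) {
  if (error && error.name === "AbortError") {
    return { category: "network", message: "Processing is taking longer than expected", canRetry: true };
  }
  if (error instanceof TypeError) {
    // fetch() rejects with a TypeError when the network request itself fails.
    return { category: "network", message: "Network error. Check your connection and try again.", canRetry: true };
  }
  if (error && typeof error.status === "number") {
    return {
      category: "api",
      message: `The AI service returned an error (status ${error.status}).`,
      canRetry: error.status >= 500 // assume server-side errors may be transient
    };
  }
  if (error && error.kind === "validation") {
    return { category: "validation", message: error.message, canRetry: false };
  }
  return { category: "unknown", message: "Something went wrong. Please try again.", canRetry: true };
}
```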

## Data Flow Architecture

### Application State Management
```javascript
// Initial application state. Trailing comments give the type each field holds.
const appState = {
  currentImage: {
    file: null,          // File | null
    base64: null,        // string | null
    preview: null,       // string | null
    metadata: {
      name: "",          // string
      size: 0,           // number
      type: ""           // string
    }
  },
  apiStatus: {
    isProcessing: false, // boolean
    lastError: null,     // string | null
    requestId: null      // string | null
  },
  ui: {
    dragOver: false,     // boolean
    showPreview: false,  // boolean
    showResults: false   // boolean
  }
};
```

### Event-Driven Architecture
- **File Input Events**: Handle drag&drop, click-to-browse, file selection
- **Validation Events**: File size, type, and format checking
- **API Events**: Request initiation, response handling, error management
- **UI Events**: Loading states, animations, user feedback

## Security Considerations

### Client-Side Security
- **Input Validation**: Strict file type and size checking
- **XSS Prevention**: Sanitized content display
- **API Key Management**: Client-side exposure (note: production should use server-side proxy)
- **HTTPS Only**: Secure transmission to Minimax API
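
For the XSS prevention item, the simplest approach in the browser is usually to assign the description through `textContent` rather than `innerHTML`, so the text is never parsed as markup. Where the text has to be interpolated into an HTML string, an escaping helper along these lines could be used. The name `escapeHtml` is hypothetical, and this is a sketch rather than a complete sanitizer.

```javascript
// Hypothetical helper: escapes the five characters that are significant in
// HTML text and attribute values. "&" is replaced first to avoid
// double-escaping the entities produced by the later replacements.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```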

### Production Recommendations
1. **Server-Side API Proxy**: Move API calls to backend to hide API keys
2. **Rate Limiting**: Prevent API abuse
3. **File Scanning**: Server-side malware detection
4. **Content Security Policy**: Additional XSS protection

## Performance Optimizations

### Client-Side Optimizations
- **Lazy Loading**: Load UI components on demand
- **Debounced Validation**: Reduce unnecessary processing
- **Memory Management**: Clean up base64 strings after use
- **Progressive Enhancement**: Core functionality works without JavaScript

### API Optimizations
- **Request Compression**: Minimize payload size
- **Timeout Management**: Prevent hanging requests
- **Retry Logic**: Handle transient network failures
- **Caching**: Avoid duplicate API calls for same images
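
The retry item could be implemented with a small wrapper that retries an async operation with exponential backoff. This is a sketch under stated assumptions: `withRetry`, its option names, the default of two retries, and the 500 ms base delay are all hypothetical, and the `sleep` option is injectable only so the delay schedule can be checked without actually waiting.

```javascript
// Hypothetical sketch: run an async operation, retrying on failure with
// exponentially growing delays (baseDelayMs, 2x, 4x, ...).
async function withRetry(operation, options = {}) {
  const {
    retries = 2,
    baseDelayMs = 500,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
  } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError; // every attempt failed
}
```

In practice the wrapper should only be applied to failures that are plausibly transient, such as network errors or 5xx responses; retrying a validation or authentication failure just repeats the same error.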

## Scalability Considerations

### Current Architecture Limits
- **Client-Only Processing**: Limited by user's device capabilities
- **File Size Constraints**: 5MB limit for practical performance
- **API Rate Limits**: Dependent on Minimax service limits

### Future Enhancements
- **Backend Integration**: Server-side processing and API management
- **Batch Processing**: Multiple image handling
- **User Accounts**: Save and manage image descriptions
- **Advanced Features**: Multiple language support, custom prompts

## Browser Compatibility

### Supported Features
- **File API**: IE10+ and all modern browsers
- **Base64 Encoding**: Universal browser support
- **CSS Grid/Flexbox**: All modern browsers; IE11 supports Flexbox with known bugs and only an older, `-ms-`-prefixed Grid syntax
- **Fetch API**: All modern browsers; not supported in Internet Explorer without a polyfill

### Fallback Strategies
- **Older Browsers**: Graceful degradation with polyfills
- **No JavaScript**: Basic form submission (limited functionality)
- **Network Issues**: Offline mode with queued requests

## Deployment Architecture

### Static File Deployment
- **CDN Ready**: Single HTML file suitable for any CDN
- **Zero Dependencies**: No npm packages or build process
- **Instant Deployment**: Upload and serve immediately
- **Version Control**: Simple Git-based version management

### Environment Configuration
- **Development**: Direct API calls with test keys
- **Staging**: Mirror production with environment-specific settings
- **Production**: Server-side API proxy recommended for security
@ -0,0 +1,43 @@
Create a clean, modern web application that allows users to upload an image and get an AI-generated description using the Minimax API.

**Requirements:**

1. **Frontend Interface:**
   - Single page application with a clean, centered layout
   - File upload area (drag-and-drop support + click to browse)
   - Image preview after upload
   - "Generate Description" button
   - Loading state while processing
   - Display area for the AI-generated description
   - Option to upload a new image after getting results

2. **Technical Implementation:**
   - Use vanilla JavaScript, HTML, and CSS (or React if you prefer)
   - Handle image file validation (accept common formats: jpg, png, webp)
   - Convert uploaded image to base64 for API submission
   - Make POST request to Minimax API endpoint
   - Handle API responses and errors gracefully
   - Display clear error messages if something goes wrong

3. **Minimax API Integration:**
   - Endpoint: `https://api.minimax.io/v1/text/chatcompletion_v2`
   - Use model: `MiniMax-M2`
   - Send image as base64 in the messages array
   - Prompt: "Please provide a detailed description of this image in English"
   - Handle API key securely (note: for production, this should be handled server-side)

4. **UI/UX Details:**
   - Responsive design that works on mobile and desktop
   - Professional color scheme (suggest modern blues/grays)
   - Smooth transitions and loading animations
   - Clear visual feedback for all user actions

5. **Error Handling:**
   - File size validation (max 5MB recommended)
   - File type validation
   - API error handling with user-friendly messages
   - Network error handling

Please create a complete, production-ready single HTML file with inline CSS and JavaScript that I can immediately use in a browser.
@ -0,0 +1,174 @@
# Image Description AI - Implementation Tasks

## Task 1: Create Basic HTML Structure and Styling Foundation
**Objective**: Establish the foundational HTML structure with semantic markup and responsive CSS framework
**Deliverables**:
- Complete HTML skeleton with proper DOCTYPE and meta tags
- Responsive CSS Grid layout system for the main application container
- Modern color scheme implementation (blues/grays as specified)
- Typography system with readable fonts and proper spacing
- Mobile-first responsive breakpoints

**Acceptance Criteria**:
- [ ] HTML validates as HTML5 standard
- [ ] Layout is responsive across mobile (320px+), tablet (768px+), and desktop (1024px+)
- [ ] Color scheme uses professional blue/gray palette with proper contrast ratios
- [ ] Typography is legible across all device sizes
- [ ] Basic layout structure includes: header, main upload area, preview section, results section
- [ ] CSS is modular with clear section organization (layout, components, utilities)

## Task 2: Implement File Upload Interface with Drag & Drop
**Objective**: Create an intuitive file upload interface supporting both drag-and-drop and click-to-browse functionality
**Deliverables**:
- Drag-and-drop zone with visual feedback states (drag over, drop, default)
- Hidden file input element for click-to-browse functionality
- Visual upload area with icon, instructional text, and file format specifications
- File type icon display for different image formats
- Hover and focus states for accessibility

**Acceptance Criteria**:
- [ ] Drag-and-drop zone visually responds to drag events with proper styling
- [ ] Click-to-browse opens file dialog and triggers file selection
- [ ] Upload area shows clear instructions: "Drag an image here or click to browse"
- [ ] Supported formats are displayed: JPG, PNG, WebP
- [ ] Accessibility: Keyboard navigation and screen reader support
- [ ] Visual feedback on hover/focus with smooth transitions

## Task 3: Add File Validation and Error Handling System
**Objective**: Implement comprehensive file validation to ensure only valid images are processed
**Deliverables**:
- File type validation (accept only jpg, jpeg, png, webp)
- File size validation (maximum 5MB with user-friendly error messages)
- Validation feedback system with clear error messages
- File metadata extraction (name, size, type)
- Reset functionality to clear errors and start over

**Acceptance Criteria**:
- [ ] Invalid file types show error: "Please select a valid image file (JPG, PNG, WebP)"
- [ ] Files over 5MB show error: "File size must be less than 5MB"
- [ ] Error messages display in red text below upload area
- [ ] Successful validation clears previous errors
- [ ] Validation occurs immediately upon file selection
- [ ] Files with valid extensions but invalid content are caught
- [ ] Reset button clears all errors and upload area state
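
The criterion about files with valid extensions but invalid content can be approached by checking the file's leading bytes against the well-known signatures of the three accepted formats. The sketch below is one possible way to do that; `sniffImageType` is a hypothetical name, and it only inspects the header, so it does not prove the rest of the file is a decodable image.

```javascript
// Hypothetical helper: identifies JPEG, PNG, and WebP from the first bytes
// of a file. `bytes` is a Uint8Array (or plain array) of at least 12 bytes.
function sniffImageType(bytes) {
  const matches = (signature, offset = 0) =>
    signature.every((byte, i) => bytes[offset + i] === byte);

  if (matches([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (matches([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  // WebP is a RIFF container: "RIFF" at byte 0 and "WEBP" at byte 8.
  if (matches([0x52, 0x49, 0x46, 0x46]) && matches([0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return null; // unknown or truncated content
}
```

In the browser, the header bytes could be read with `new Uint8Array(await file.slice(0, 12).arrayBuffer())` and the result compared against `file.type` before the preview step.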

## Task 4: Implement Image Preview Functionality
**Objective**: Display uploaded image with proper sizing and formatting for user confirmation
**Deliverables**:
- Image preview container with proper aspect ratio handling
- Image resizing and optimization for preview (max 400px width/height)
- Base64 encoding of the selected image for API submission
- Metadata display (filename, file size, dimensions)
- Replace/change image functionality

**Acceptance Criteria**:
- [ ] Image preview displays within 2 seconds of file selection
- [ ] Preview maintains aspect ratio without distortion
- [ ] Large images are scaled appropriately for preview
- [ ] Base64 encoding completes successfully and is stored in memory
- [ ] Image metadata is extracted and displayed (filename, size in MB, dimensions)
- [ ] "Change Image" button allows uploading a different file
- [ ] Preview clears when starting over
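
The preview sizing rule (fit within 400px on either side, keep the aspect ratio, never upscale) reduces to a single scale factor. A minimal sketch, with the hypothetical name `fitWithin`:

```javascript
// Hypothetical helper: computes preview dimensions that fit inside a
// maxSize x maxSize box while preserving aspect ratio. Images that already
// fit are returned unchanged (the scale is capped at 1).
function fitWithin(width, height, maxSize = 400) {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError("width and height must be positive numbers");
  }
  const scale = Math.min(1, maxSize / width, maxSize / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale)
  };
}
```

For example, a 1600×1200 image has a scale of min(1, 0.25, 0.333…) = 0.25, which gives a 400×300 preview.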

## Task 5: Integrate Minimax API for Image Description Generation
**Objective**: Connect to Minimax API and implement the core AI description generation functionality
**Deliverables**:
- API request construction with proper payload format
- Base64 image embedding in the request body
- Proper error handling for network issues and API responses
- Response parsing and extraction of the AI-generated description
- API key management (client-side with security notes)

**Acceptance Criteria**:
- [ ] API request uses correct endpoint: `https://api.minimax.io/v1/text/chatcompletion_v2`
- [ ] Request payload includes model: "MiniMax-M2"
- [ ] Base64 image is properly formatted in the messages array
- [ ] Prompt "Please provide a detailed description of this image in English" is included
- [ ] Successful API response extracts the description text
- [ ] API errors are handled gracefully with user-friendly messages
- [ ] Network timeouts are handled with appropriate error messaging
- [ ] Loading state is shown during API calls

## Task 6: Create Loading States and User Feedback System
**Objective**: Implement comprehensive loading and feedback mechanisms to enhance user experience
**Deliverables**:
- Loading spinner and progress indicator during API calls
- Status messages for different processing stages
- Button state management (disabled during processing)
- Timeout handling with user notification
- Success and error state animations

**Acceptance Criteria**:
- [ ] Loading spinner appears immediately when "Generate Description" is clicked
- [ ] "Generate Description" button is disabled during processing to prevent duplicate requests
- [ ] Status message shows: "Analyzing image with AI..."
- [ ] Processing timeout (30 seconds) shows: "Processing is taking longer than expected"
- [ ] Success animation plays when description is generated
- [ ] Error state shows appropriate error message with red styling
- [ ] Loading states have smooth transitions and professional appearance

## Task 7: Display AI Description Results with Formatting
**Objective**: Present the AI-generated description in a clean, readable format with additional functionality
**Deliverables**:
- Results display area with proper typography and spacing
- Text formatting and paragraph handling for long descriptions
- Option to copy description to clipboard
- "Generate New Description" functionality for the same image
- "Upload New Image" reset functionality
- Responsive results layout

**Acceptance Criteria**:
- [ ] Description displays in a readable format with proper line breaks
- [ ] Long descriptions are scrollable if they exceed viewport
- [ ] "Copy to Clipboard" button works and shows confirmation
- [ ] "Generate New Description" button triggers new API call
- [ ] "Upload New Image" button clears all data and returns to upload state
- [ ] Results area is visually distinct from upload area
- [ ] Typography is large enough to read comfortably on mobile devices

## Task 8: Final Polish, Testing, and Production Readiness
**Objective**: Complete final testing, optimizations, and prepare for immediate browser deployment
**Deliverables**:
- Comprehensive error handling for all edge cases
- Performance optimization for large images and slow networks
- Cross-browser compatibility testing
- Accessibility improvements (ARIA labels, keyboard navigation)
- Final code organization and documentation
- Single HTML file consolidation with inline CSS and JavaScript

**Acceptance Criteria**:
- [ ] Application works in Chrome, Firefox, Safari, and Edge
- [ ] All functionality works on mobile devices (iOS and Android)
- [ ] Images up to 5MB process within 30 seconds on average connections
- [ ] Clear error messages for all failure scenarios (network, API, validation)
- [ ] Complete keyboard navigation support
- [ ] ARIA labels and semantic HTML for screen readers
- [ ] No console errors or warnings
- [ ] Single HTML file contains all code and loads immediately in any modern browser
- [ ] Professional appearance with smooth animations and transitions
- [ ] File size under 50KB for fast loading

## Implementation Notes

### Task Dependencies
- Tasks 1-2 can be implemented in parallel
- Task 3 depends on Task 2 (validation needs upload interface)
- Task 4 depends on Task 3 (preview needs validation)
- Task 5 depends on Task 4 (API needs base64 image)
- Task 6 depends on Task 5 (loading states during API calls)
- Task 7 depends on Task 6 (results display after processing)
- Task 8 depends on all previous tasks (final testing and polish)

### Technical Considerations
- Use vanilla JavaScript ES6+ features for modern browser support
- Implement CSS custom properties for maintainable theming
- Follow progressive enhancement principles
- Maintain separation of concerns within the single file
- Include comprehensive error boundaries

### Testing Approach
- Test with various image sizes and formats
- Test error scenarios (large files, invalid types, network issues)
- Verify responsive behavior across devices
- Validate accessibility with screen readers
- Performance test with slow network connections