docs: complete project research
Files: - STACK.md - FEATURES.md - ARCHITECTURE.md - PITFALLS.md - SUMMARY.md Key findings: - Stack: FastAPI 0.135.1 + React 19 + Vite 7 + Tailwind v4, single-container deploy - Architecture: FastAPI serves React SPA via catch-all, file-based storage (Docker volume), LLMService with retry/backoff - Critical pitfall: All 9 pitfalls map to Phase 1 — Italian prompts, Canva field constants, UTF-8 BOM, root_path config, per-item bulk isolation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
492
.planning/research/ARCHITECTURE.md
Normal file
492
.planning/research/ARCHITECTURE.md
Normal file
@@ -0,0 +1,492 @@
|
|||||||
|
# Architecture Research
|
||||||
|
|
||||||
|
**Domain:** Content Automation System (FastAPI + React SPA, file-based storage, LLM integration)
|
||||||
|
**Researched:** 2026-03-07
|
||||||
|
**Confidence:** HIGH
|
||||||
|
|
||||||
|
## Standard Architecture
|
||||||
|
|
||||||
|
### System Overview
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────────────┐
|
||||||
|
│ EXTERNAL LAYER │
|
||||||
|
│ │
|
||||||
|
│ lab.mlhub.it/postgenerator/ │
|
||||||
|
│ nginx lab-router (strips /postgenerator/ prefix, forwards to :8000) │
|
||||||
|
└─────────────────────────────────────┬───────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌──────────────────────────────────────▼──────────────────────────────┐
|
||||||
|
│ SINGLE DOCKER CONTAINER │
|
||||||
|
│ │
|
||||||
|
│ ┌─────────────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ FastAPI (port 8000) │ │
|
||||||
|
│ │ │ │
|
||||||
|
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
|
||||||
|
│ │ │ /api/prompts│ │ /api/calendar│ │ /api/generate│ │ │
|
||||||
|
│ │ │ Router │ │ Router │ │ Router │ │ │
|
||||||
|
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬────────┘ │ │
|
||||||
|
│ │ │ │ │ │ │
|
||||||
|
│ │ ┌──────▼───────┐ ┌──────▼───────┐ ┌──────▼────────┐ │ │
|
||||||
|
│ │ │ PromptService│ │CalendarService│ │ LLMService │ │ │
|
||||||
|
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬────────┘ │ │
|
||||||
|
│ │ │ │ │ │ │
|
||||||
|
│ │ ┌──────▼──────────────────▼──────────────────▼────────┐ │ │
|
||||||
|
│ │ │ Storage Layer │ │ │
|
||||||
|
│ │ │ /data/prompts/*.txt /data/outputs/*.csv │ │ │
|
||||||
|
│ │ │ /data/config/*.json /data/swipe-files/*.json │ │ │
|
||||||
|
│ │ └───────────────────────────────────────────────────────┘ │ │
|
||||||
|
│ │ │ │
|
||||||
|
│ │ [GET /*, /assets/*] → SPAStaticFiles → /app/dist/ │ │
|
||||||
|
│ │ (catch-all route serves React index.html for SPA routing) │ │
|
||||||
|
│ └─────────────────────────────────────────────────────────────┘ │
|
||||||
|
│ │
|
||||||
|
└──────────────────────────────────────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌────────▼──────────┐
|
||||||
|
│ Claude API │
|
||||||
|
│ (Anthropic) │
|
||||||
|
│ External HTTPS │
|
||||||
|
└────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### Component Responsibilities
|
||||||
|
|
||||||
|
| Component | Responsibility | Boundary |
|
||||||
|
|-----------|----------------|----------|
|
||||||
|
| nginx lab-router | Strip `/postgenerator/` prefix, forward to container port 8000 | External → Container |
|
||||||
|
| FastAPI main.py | Mount routers, serve React SPA via catch-all, configure root_path | App entry point |
|
||||||
|
| API Routers | Validate HTTP requests, call services, return responses | HTTP → Business logic |
|
||||||
|
| Services | Business logic, orchestration, no HTTP concerns | Business logic → Storage |
|
||||||
|
| LLMService | Claude API calls with retry, JSON validation, prompt loading | Business logic → External API |
|
||||||
|
| PromptService | Load/list/save .txt prompt files from /prompts/ directory | Business logic → Filesystem |
|
||||||
|
| CalendarService | Generate weekly/monthly post schedules from campaign config | Business logic → Domain logic |
|
||||||
|
| CSVBuilder | Transform generated content into Canva Bulk Create format | Business logic → File output |
|
||||||
|
| SwipeFileManager | Read/write curated content collections from JSON | Business logic → Filesystem |
|
||||||
|
| Storage layer | JSON config files, CSV outputs, prompt .txt files, swipe files | Filesystem |
|
||||||
|
| React SPA | UI: form inputs, results display, calendar views, file downloads | Browser → API |
|
||||||
|
|
||||||
|
## Recommended Project Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
postgenerator/
|
||||||
|
├── backend/
|
||||||
|
│ ├── main.py # FastAPI app, router includes, root_path config, SPA mount
|
||||||
|
│ ├── config.py # Settings from env vars (paths, API keys)
|
||||||
|
│ ├── routers/
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── prompts.py # GET/POST/DELETE /api/prompts
|
||||||
|
│ │ ├── formats.py # GET /api/formats (carousel, post, story)
|
||||||
|
│ │ ├── calendar.py # POST /api/calendar/generate
|
||||||
|
│ │ ├── campaigns.py # CRUD /api/campaigns
|
||||||
|
│ │ ├── generate.py # POST /api/generate (LLM call endpoint)
|
||||||
|
│ │ ├── csv.py # POST /api/csv/build, GET /api/csv/download/{id}
|
||||||
|
│ │ └── swipe.py # CRUD /api/swipe-files
|
||||||
|
│ ├── services/
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── llm_service.py # Claude API calls, retry logic, JSON validation
|
||||||
|
│ │ ├── prompt_service.py # Load/list/save .txt files from /data/prompts/
|
||||||
|
│ │ ├── calendar_service.py # Generate date-indexed content schedules
|
||||||
|
│ │ ├── csv_builder.py # Build Canva Bulk Create CSV (max 300 rows, 150 cols)
|
||||||
|
│ │ └── swipe_service.py # Read/write swipe file JSON collections
|
||||||
|
│ ├── schemas/
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── prompt.py # Pydantic models for prompt request/response
|
||||||
|
│ │ ├── calendar.py # CalendarRequest, CalendarResponse
|
||||||
|
│ │ ├── generate.py # GenerateRequest, GeneratedPost, GenerateResponse
|
||||||
|
│ │ └── csv.py # CSVBuildRequest, CSVRow schema
|
||||||
|
│ └── data/ # Runtime data (Docker volume mount)
|
||||||
|
│ ├── prompts/ # Editable .txt prompt templates
|
||||||
|
│ │ ├── carousel_hook.txt
|
||||||
|
│ │ ├── carousel_slides.txt
|
||||||
|
│ │ └── caption_cta.txt
|
||||||
|
│ ├── outputs/ # Generated CSV files (named by timestamp)
|
||||||
|
│ ├── campaigns/ # JSON campaign configs
|
||||||
|
│ └── swipe-files/ # JSON swipe file collections
|
||||||
|
├── frontend/
|
||||||
|
│ ├── src/
|
||||||
|
│ │ ├── App.tsx
|
||||||
|
│ │ ├── pages/
|
||||||
|
│ │ │ ├── PromptManager.tsx
|
||||||
|
│ │ │ ├── CalendarView.tsx
|
||||||
|
│ │ │ ├── Generator.tsx
|
||||||
|
│ │ │ └── SwipeFile.tsx
|
||||||
|
│ │ ├── components/
|
||||||
|
│ │ └── api/ # API client (axios or fetch wrappers)
|
||||||
|
│ │ └── client.ts # Base URL, request helpers
|
||||||
|
│ ├── dist/ # Build output — FastAPI serves from here
|
||||||
|
│ └── vite.config.ts # base: '/postgenerator/' for subpath SPA routing
|
||||||
|
├── Dockerfile # Multi-stage: Node build → Python runtime
|
||||||
|
├── docker-compose.yml
|
||||||
|
└── .env # ANTHROPIC_API_KEY, DATA_PATH=/data
|
||||||
|
```
|
||||||
|
|
||||||
|
### Structure Rationale
|
||||||
|
|
||||||
|
- **routers/ vs services/:** Routers own HTTP (request parsing, status codes, response shaping). Services own business logic with no HTTP imports. This separation makes services testable independently.
|
||||||
|
- **schemas/:** Pydantic models live separately from logic. Shared between router (validation) and service (type hints).
|
||||||
|
- **data/ as Docker volume:** Prompts and outputs persist across container restarts. Mount as named volume in docker-compose. This is the "database" for a file-based system.
|
||||||
|
- **frontend/dist/ inside container:** React build output is copied into the container during Docker build. FastAPI serves it via SPAStaticFiles catch-all.
|
||||||
|
|
||||||
|
## Architectural Patterns
|
||||||
|
|
||||||
|
### Pattern 1: FastAPI Serving React SPA via Catch-All Route
|
||||||
|
|
||||||
|
**What:** FastAPI mounts React's build output with a custom StaticFiles handler that falls back to `index.html` for any path not found — enabling React Router client-side navigation.
|
||||||
|
|
||||||
|
**When to use:** Single container deployment where React and FastAPI coexist. Eliminates CORS complexity and separate web server process.
|
||||||
|
|
||||||
|
**Trade-offs:** API routes must be prefixed (e.g., `/api/`) to avoid colliding with the SPA catch-all. The SPA catch-all must be registered LAST after all API routers.
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```python
|
||||||
|
# backend/main.py
|
||||||
|
from fastapi import FastAPI
|
||||||
|
from fastapi.staticfiles import StaticFiles
|
||||||
|
from starlette.responses import FileResponse
|
||||||
|
import os
|
||||||
|
|
||||||
|
app = FastAPI(root_path="/postgenerator") # Required for subpath deployment
|
||||||
|
|
||||||
|
# API routers first
|
||||||
|
app.include_router(prompts_router, prefix="/api/prompts")
|
||||||
|
app.include_router(calendar_router, prefix="/api/calendar")
|
||||||
|
app.include_router(generate_router, prefix="/api/generate")
|
||||||
|
app.include_router(csv_router, prefix="/api/csv")
|
||||||
|
app.include_router(swipe_router, prefix="/api/swipe")
|
||||||
|
|
||||||
|
# SPA catch-all MUST be last
|
||||||
|
class SPAStaticFiles(StaticFiles):
|
||||||
|
async def get_response(self, path: str, scope):
|
||||||
|
try:
|
||||||
|
return await super().get_response(path, scope)
|
||||||
|
except Exception:
|
||||||
|
# Fall back to index.html for SPA client-side routing
|
||||||
|
return await super().get_response("index.html", scope)
|
||||||
|
|
||||||
|
app.mount("/", SPAStaticFiles(directory="frontend/dist", html=True), name="spa")
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 2: LLM Service with Retry and Structured Output
|
||||||
|
|
||||||
|
**What:** Wrap all Claude API calls in a single service class. Use Anthropic's structured outputs (GA since Nov 2025 for Claude Sonnet/Opus) to guarantee JSON schema compliance. Fall back to prompt-based JSON enforcement with validation and retry for older approaches.
|
||||||
|
|
||||||
|
**When to use:** Any LLM integration in production. Retry + validation prevents cascading failures from transient Claude API errors or malformed responses.
|
||||||
|
|
||||||
|
**Trade-offs:** Structured outputs add latency (grammar compilation). Retry with exponential backoff adds latency on failure. Worth it for reliability. Max 3 retries before returning 503 to client.
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```python
|
||||||
|
# backend/services/llm_service.py
|
||||||
|
import anthropic
|
||||||
|
import time
|
||||||
|
from typing import Type
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
class LLMService:
|
||||||
|
def __init__(self, api_key: str, max_retries: int = 3, base_delay: float = 1.0):
|
||||||
|
self.client = anthropic.Anthropic(api_key=api_key)
|
||||||
|
self.max_retries = max_retries
|
||||||
|
self.base_delay = base_delay
|
||||||
|
|
||||||
|
def generate(self, prompt: str, response_schema: Type[BaseModel]) -> BaseModel:
|
||||||
|
"""Call Claude with retry. Returns validated Pydantic model instance."""
|
||||||
|
last_error = None
|
||||||
|
for attempt in range(self.max_retries):
|
||||||
|
try:
|
||||||
|
message = self.client.messages.create(
|
||||||
|
model="claude-sonnet-4-5",
|
||||||
|
max_tokens=4096,
|
||||||
|
messages=[{"role": "user", "content": prompt}],
|
||||||
|
)
|
||||||
|
raw = message.content[0].text
|
||||||
|
return response_schema.model_validate_json(raw)
|
||||||
|
except (anthropic.APIError, anthropic.RateLimitError) as e:
|
||||||
|
last_error = e
|
||||||
|
time.sleep(self.base_delay * (2 ** attempt))
|
||||||
|
except Exception as e:
|
||||||
|
raise # Don't retry validation errors — bad prompt
|
||||||
|
raise RuntimeError(f"Claude API failed after {self.max_retries} attempts: {last_error}")
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 3: File-Based Storage with Explicit Path Management
|
||||||
|
|
||||||
|
**What:** All storage operations go through service classes that know the data directory layout. No scattered `open()` calls in routers or business logic. Path constants defined once in `config.py`.
|
||||||
|
|
||||||
|
**When to use:** This project. File-based storage is the right choice here — no concurrent writes, data is config-like (prompts, campaigns), outputs are ephemeral CSV downloads.
|
||||||
|
|
||||||
|
**Trade-offs:** No transactions. Concurrent writes from multiple users could corrupt JSON files (acceptable for single-user tool). If multi-user needed later, add file locking (fcntl) or migrate to SQLite.
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```python
|
||||||
|
# backend/config.py
|
||||||
|
from pathlib import Path
|
||||||
|
import os
|
||||||
|
|
||||||
|
DATA_PATH = Path(os.getenv("DATA_PATH", "./backend/data"))
|
||||||
|
PROMPTS_PATH = DATA_PATH / "prompts"
|
||||||
|
OUTPUTS_PATH = DATA_PATH / "outputs"
|
||||||
|
CAMPAIGNS_PATH = DATA_PATH / "campaigns"
|
||||||
|
SWIPE_PATH = DATA_PATH / "swipe-files"
|
||||||
|
|
||||||
|
# backend/services/prompt_service.py
|
||||||
|
from backend.config import PROMPTS_PATH
|
||||||
|
|
||||||
|
class PromptService:
|
||||||
|
def list_prompts(self) -> list[str]:
|
||||||
|
return [f.stem for f in PROMPTS_PATH.glob("*.txt")]
|
||||||
|
|
||||||
|
def get_prompt(self, name: str) -> str:
|
||||||
|
path = PROMPTS_PATH / f"{name}.txt"
|
||||||
|
if not path.exists():
|
||||||
|
raise FileNotFoundError(f"Prompt '{name}' not found")
|
||||||
|
return path.read_text(encoding="utf-8")
|
||||||
|
|
||||||
|
def save_prompt(self, name: str, content: str) -> None:
|
||||||
|
(PROMPTS_PATH / f"{name}.txt").write_text(content, encoding="utf-8")
|
||||||
|
```
|
||||||
|
|
||||||
|
### Pattern 4: Calendar/Campaign Generation Engine
|
||||||
|
|
||||||
|
**What:** Calendar generation is pure Python — no LLM involved. Takes a campaign config (topics, frequency, date range, format mix) and produces a date-indexed schedule. Then LLM Generator fills each slot.
|
||||||
|
|
||||||
|
**When to use:** Separate the "what to generate" (calendar) from "generate it" (LLM). This way the calendar can be previewed, edited, and reused without burning API credits.
|
||||||
|
|
||||||
|
**Trade-offs:** Two-step UX (plan calendar → generate content) is more powerful but slightly more complex than one-shot generation.
|
||||||
|
|
||||||
|
**Example data flow:**
|
||||||
|
```
|
||||||
|
CampaignConfig (topics, freq, date_range, format_mix)
|
||||||
|
↓ CalendarService.generate()
|
||||||
|
CalendarSlots: [{date, topic, format, status: "pending"}, ...]
|
||||||
|
↓ User reviews calendar, confirms
|
||||||
|
↓ GenerateService.fill_slot(slot, prompt_name)
|
||||||
|
GeneratedContent: [{slot, hook, slides: [...], caption, cta}, ...]
|
||||||
|
↓ CSVBuilder.build(content_list)
|
||||||
|
CSV file ready for Canva Bulk Create download
|
||||||
|
```
|
||||||
|
|
||||||
|
## Data Flow
|
||||||
|
|
||||||
|
### Request Flow: LLM Content Generation
|
||||||
|
|
||||||
|
```
|
||||||
|
User (React)
|
||||||
|
│ POST /api/generate {prompt_name, topic, format, slot_count}
|
||||||
|
▼
|
||||||
|
generate.py Router
|
||||||
|
│ Validate with GenerateRequest schema
|
||||||
|
│ Load prompt text via PromptService
|
||||||
|
│ Build final prompt string (template + user params)
|
||||||
|
▼
|
||||||
|
LLMService.generate(prompt, GenerateResponse schema)
|
||||||
|
│ Call Claude API (with retry)
|
||||||
|
│ Parse and validate JSON response
|
||||||
|
▼
|
||||||
|
GenerateResponse (list of GeneratedPost)
|
||||||
|
▼
|
||||||
|
generate.py Router
|
||||||
|
│ Return 200 with JSON response
|
||||||
|
▼
|
||||||
|
React SPA
|
||||||
|
│ Render results, offer CSV download
|
||||||
|
▼
|
||||||
|
POST /api/csv/build {posts: [...]}
|
||||||
|
▼
|
||||||
|
CSVBuilder.build()
|
||||||
|
│ Write CSV to /data/outputs/{timestamp}.csv
|
||||||
|
▼
|
||||||
|
GET /api/csv/download/{timestamp}
|
||||||
|
│ FileResponse streaming download
|
||||||
|
▼
|
||||||
|
User downloads CSV → uploads to Canva Bulk Create
|
||||||
|
```
|
||||||
|
|
||||||
|
### Request Flow: Calendar Generation
|
||||||
|
|
||||||
|
```
|
||||||
|
User (React)
|
||||||
|
│ POST /api/calendar/generate {campaign_config}
|
||||||
|
▼
|
||||||
|
calendar.py Router
|
||||||
|
│ Validate CalendarRequest
|
||||||
|
▼
|
||||||
|
CalendarService.generate(config)
|
||||||
|
│ Pure Python: iterate date range
|
||||||
|
│ Apply frequency rules (3x/week, mix of formats)
|
||||||
|
│ Assign topics from rotation
|
||||||
|
▼
|
||||||
|
CalendarResponse: list of CalendarSlot
|
||||||
|
▼
|
||||||
|
React renders calendar grid
|
||||||
|
│ User edits slots, confirms
|
||||||
|
│ User triggers bulk generation per slot
|
||||||
|
▼
|
||||||
|
LLM Generator loop (one API call per slot)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Request Flow: Prompt Management
|
||||||
|
|
||||||
|
```
|
||||||
|
User (React)
|
||||||
|
│ GET /api/prompts → list all prompts
|
||||||
|
│ GET /api/prompts/{name} → get content
|
||||||
|
│ PUT /api/prompts/{name} → save edited prompt
|
||||||
|
▼
|
||||||
|
prompts.py Router
|
||||||
|
▼
|
||||||
|
PromptService
|
||||||
|
│ Read/write /data/prompts/*.txt
|
||||||
|
▼
|
||||||
|
File system (Docker volume)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key Data Flows Summary
|
||||||
|
|
||||||
|
1. **Prompt → LLM → CSV:** Template text (from file) + user params → Claude API → structured JSON → CSV rows → Canva Bulk Create download
|
||||||
|
2. **Campaign Config → Calendar → Bulk Generate:** Campaign settings → date-slot schedule → per-slot LLM calls → aggregated CSV
|
||||||
|
3. **Swipe File:** Curated content JSON stored as reference — read-only during generation, write-only during curation
|
||||||
|
|
||||||
|
## Subpath Deployment: Critical Configuration
|
||||||
|
|
||||||
|
Deploying at `/postgenerator/` requires coordinated config at three levels:
|
||||||
|
|
||||||
|
### 1. nginx lab-router (strips prefix)
|
||||||
|
```nginx
|
||||||
|
location /postgenerator/ {
|
||||||
|
proxy_pass http://lab-postgenerator-app:8000/; # Trailing slash strips prefix
|
||||||
|
proxy_set_header Host $host;
|
||||||
|
proxy_set_header X-Real-IP $remote_addr;
|
||||||
|
proxy_set_header X-Forwarded-Proto https;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. FastAPI (root_path for OpenAPI docs)
|
||||||
|
```python
|
||||||
|
app = FastAPI(root_path="/postgenerator")
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. React/Vite (base path for assets)
|
||||||
|
```typescript
|
||||||
|
// vite.config.ts
|
||||||
|
export default defineConfig({
|
||||||
|
base: '/postgenerator/',
|
||||||
|
})
|
||||||
|
```
|
||||||
|
React Router must also use `<BrowserRouter basename="/postgenerator">`.
|
||||||
|
|
||||||
|
### 4. React API client (relative path)
|
||||||
|
```typescript
|
||||||
|
// frontend/src/api/client.ts
|
||||||
|
const API_BASE = '/postgenerator/api' // Absolute path including subpath
|
||||||
|
```
|
||||||
|
|
||||||
|
## Scaling Considerations
|
||||||
|
|
||||||
|
| Scale | Architecture Adjustments |
|
||||||
|
|-------|--------------------------|
|
||||||
|
| Single user (MVP) | Current architecture — file-based storage, single container, acceptable |
|
||||||
|
| 2-5 concurrent users | Add file locking on JSON writes (fcntl/portalocker). Current CSV output naming by timestamp may collide — use UUID instead |
|
||||||
|
| 10+ users or teams | Migrate JSON config to SQLite (drop-in with no infra change). Add task queue (Celery + Redis) for async LLM calls — they're slow (5-15s each) |
|
||||||
|
| Production-grade | Separate frontend container, PostgreSQL, background worker for LLM generation with WebSocket progress updates |
|
||||||
|
|
||||||
|
### Scaling Priorities
|
||||||
|
|
||||||
|
1. **First bottleneck:** LLM call latency. Each Claude API call takes 5-15 seconds. If user triggers bulk calendar generation (20 posts), that's 100-300 seconds synchronously. Fix: move to async background tasks with polling endpoint before any other optimization.
|
||||||
|
2. **Second bottleneck:** File write conflicts. Two simultaneous saves to the same JSON campaign file corrupts it. Fix: file locking or SQLite.
|
||||||
|
|
||||||
|
## Anti-Patterns
|
||||||
|
|
||||||
|
### Anti-Pattern 1: Catch-All Route Registered Before API Routes
|
||||||
|
|
||||||
|
**What people do:** Mount the SPAStaticFiles before including API routers.
|
||||||
|
|
||||||
|
**Why it's wrong:** The catch-all intercepts all requests including API calls. Every `/api/generate` request returns `index.html` instead of the actual handler.
|
||||||
|
|
||||||
|
**Do this instead:** Always register all API routers first, SPAStaticFiles last. FastAPI matches routes in registration order.
|
||||||
|
|
||||||
|
### Anti-Pattern 2: Hardcoding Paths Instead of Using Config
|
||||||
|
|
||||||
|
**What people do:** Scatter `open("/data/prompts/foo.txt")` calls throughout services and routers.
|
||||||
|
|
||||||
|
**Why it's wrong:** Impossible to test locally (path doesn't exist). Impossible to change deployment path without grep-replace. Docker volume mount path change breaks everything.
|
||||||
|
|
||||||
|
**Do this instead:** Single `config.py` with `DATA_PATH = Path(os.getenv("DATA_PATH", "./backend/data"))`. All services import path constants from config.
|
||||||
|
|
||||||
|
### Anti-Pattern 3: LLM Calls in Routers
|
||||||
|
|
||||||
|
**What people do:** Call `anthropic.messages.create()` directly inside the FastAPI route handler.
|
||||||
|
|
||||||
|
**Why it's wrong:** Untestable without mocking the entire Anthropic client. Retry logic duplicated if called from multiple routes. Error handling scattered.
|
||||||
|
|
||||||
|
**Do this instead:** All Claude API calls go through `LLMService`. Router calls `llm_service.generate(prompt, schema)`. LLMService can be mocked in tests.
|
||||||
|
|
||||||
|
### Anti-Pattern 4: React Base Path Mismatch
|
||||||
|
|
||||||
|
**What people do:** Build React with default base `/` but deploy at `/postgenerator/`. Or configure `base: '/postgenerator/'` but use absolute paths like `/api/generate` in fetch calls.
|
||||||
|
|
||||||
|
**Why it's wrong:** Asset 404s (JS/CSS files request `/assets/index.js` instead of `/postgenerator/assets/index.js`). API calls fail because they don't include the subpath.
|
||||||
|
|
||||||
|
**Do this instead:** `vite.config.ts` sets `base: '/postgenerator/'`. React Router uses `basename="/postgenerator"`. API client uses the full path `/postgenerator/api`.
|
||||||
|
|
||||||
|
### Anti-Pattern 5: Synchronous Bulk LLM Generation Without Feedback
|
||||||
|
|
||||||
|
**What people do:** Loop over 20 calendar slots, call Claude 20 times synchronously, return a 200 after 3 minutes.
|
||||||
|
|
||||||
|
**Why it's wrong:** Browser timeouts (default 60s). User has no progress feedback. Single failure aborts entire batch.
|
||||||
|
|
||||||
|
**Do this instead (MVP version):** Generate one slot at a time with streaming response or server-sent events. Client calls `/api/generate/slot` in a loop with progress bar. Each call is independent — failure is isolated.
|
||||||
|
|
||||||
|
## Integration Points
|
||||||
|
|
||||||
|
### External Services
|
||||||
|
|
||||||
|
| Service | Integration Pattern | Notes |
|
||||||
|
|---------|---------------------|-------|
|
||||||
|
| Anthropic Claude API | HTTP via `anthropic` Python SDK. Sync client in service layer. | ANTHROPIC_API_KEY from env. claude-sonnet-4-5 recommended for structured output reliability. Rate limit: 429 handled in retry loop. |
|
||||||
|
| Canva Bulk Create | No direct API. Generate CSV file, user uploads manually to Canva. | CSV constraints: max 300 rows, max 150 columns, headers must match Canva template field names. Image URLs not supported — text fields only in MVP. |
|
||||||
|
|
||||||
|
### Internal Boundaries
|
||||||
|
|
||||||
|
| Boundary | Communication | Notes |
|
||||||
|
|----------|---------------|-------|
|
||||||
|
| React SPA ↔ FastAPI | REST JSON over HTTP. No WebSockets in MVP. | All API routes under `/api/` prefix. Vite dev proxy handles CORS in development. In production: same origin, no CORS needed. |
|
||||||
|
| LLMService ↔ PromptService | Direct Python call. LLMService receives prompt string, does not load files itself. | Router loads prompt via PromptService, passes text to LLMService. Keeps LLMService pure (no file I/O). |
|
||||||
|
| CSVBuilder ↔ Storage | CSVBuilder writes to `/data/outputs/`. Returns filename. Router returns download URL. | Files accumulate — no auto-cleanup in MVP. Add manual delete or TTL later. |
|
||||||
|
| CalendarService ↔ LLMService | CalendarService generates slots (pure Python). Router then calls LLMService per slot. | CalendarService has zero LLM coupling. Can generate and display calendar without burning API credits. |
|
||||||
|
|
||||||
|
## Build Order Implications
|
||||||
|
|
||||||
|
The component dependencies suggest this implementation sequence:
|
||||||
|
|
||||||
|
1. **Foundation first:** Docker setup, FastAPI skeleton, root_path config, nginx routing, React Vite project with correct base path. Verify the plumbing works before writing any business logic.
|
||||||
|
|
||||||
|
2. **Storage layer second:** Config.py paths, data directory structure, PromptService (read/write .txt files). This is the dependency for everything else.
|
||||||
|
|
||||||
|
3. **LLMService third:** Retry logic, JSON validation. Can be tested with a single hardcoded prompt before routing exists.
|
||||||
|
|
||||||
|
4. **Feature services in dependency order:**
|
||||||
|
- PromptService (no dependencies)
|
||||||
|
- CalendarService (no LLM dependency — pure Python)
|
||||||
|
- CSVBuilder (depends on knowing the output schema)
|
||||||
|
- LLMService (depends on PromptService for templates)
|
||||||
|
- SwipeService (independent, lowest priority)
|
||||||
|
|
||||||
|
5. **API routers last:** Wire services to HTTP. Each router is thin — validates input, calls service, returns output.
|
||||||
|
|
||||||
|
6. **React UI per feature:** Build UI for each backend feature after that backend feature is working. Don't build UI speculatively.
|
||||||
|
|
||||||
|
## Sources
|
||||||
|
|
||||||
|
- FastAPI official docs — Behind a Proxy / root_path: https://fastapi.tiangolo.com/advanced/behind-a-proxy/
|
||||||
|
- FastAPI official docs — Bigger Applications: https://fastapi.tiangolo.com/tutorial/bigger-applications/
|
||||||
|
- FastAPI official docs — Static Files: https://fastapi.tiangolo.com/tutorial/static-files/
|
||||||
|
- Serving React from FastAPI (SPAStaticFiles pattern): https://davidmuraya.com/blog/serving-a-react-frontend-application-with-fastapi/
|
||||||
|
- FastAPI + React single container: https://dakdeniz.medium.com/fastapi-react-dockerize-in-single-container-e546e80b4e4d
|
||||||
|
- Anthropic Structured Outputs (Nov 2025): https://techbytes.app/posts/claude-structured-outputs-json-schema-api/
|
||||||
|
- Canva Bulk Create CSV requirements: https://www.canva.com/help/bulk-create/
|
||||||
|
- FastAPI best practices (service layer): https://orchestrator.dev/blog/2025-1-30-fastapi-production-patterns/
|
||||||
|
|
||||||
|
---
|
||||||
|
*Architecture research for: PostGenerator — Instagram carousel automation system*
|
||||||
|
*Researched: 2026-03-07*
|
||||||
226
.planning/research/FEATURES.md
Normal file
226
.planning/research/FEATURES.md
Normal file
@@ -0,0 +1,226 @@
|
|||||||
|
# Feature Research
|
||||||
|
|
||||||
|
**Domain:** Social media content automation / Instagram carousel bulk generator (B2B)
|
||||||
|
**Researched:** 2026-03-07
|
||||||
|
**Confidence:** MEDIUM-HIGH (ecosystem surveyed, product is novel niche — partial direct comparisons)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
PostGenerator is not a generic social media scheduler or a simple AI writer. It sits at the intersection of three product categories:
|
||||||
|
|
||||||
|
1. **Bulk content generation tools** (PostNitro, aiCarousels, Templated) — generate individual carousels
|
||||||
|
2. **Editorial calendar / content planning tools** (CoSchedule, StoryChief, SocialPilot) — plan and schedule
|
||||||
|
3. **LLM prompt management systems** (Agenta, PromptHub, PromptLayer) — manage AI generation
|
||||||
|
|
||||||
|
No direct competitor was found that combines all three with strategic frameworks (Schwartz levels, Persuasion Nurturing, narrative formats) and Canva Bulk Create CSV output. This is a **purpose-built tool**, so feature categorization is based on user expectation from adjacent tools plus the project's specific domain logic.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Feature Landscape
|
||||||
|
|
||||||
|
### Table Stakes (Users Expect These)
|
||||||
|
|
||||||
|
Features that, if missing, make the tool feel broken or incomplete. These are non-negotiable for v1.
|
||||||
|
|
||||||
|
| Feature | Why Expected | Complexity | Notes |
|
||||||
|
|---------|--------------|------------|-------|
|
||||||
|
| **Generate a 13-post editorial cycle** | Core value proposition — the entire reason the tool exists | HIGH | Orchestrates all strategic layers: Persuasion Nurturing distribution, Schwartz levels, narrative formats, niches. The central "one-click calendar" function. |
|
||||||
|
| **Canva-compatible CSV export** | Without this, the tool has no usable output | MEDIUM | Header must exactly match Canva Bulk Create schema. Each row = one carousel. 8 slide columns per carousel. Metadata columns included for analysis. |
|
||||||
|
| **LLM content generation per slide** | Users expect AI to produce actual copy, not just metadata | HIGH | Claude API integration. JSON-structured output per carousel with all 8 slides. Prompt must produce consistent slide structure. |
|
||||||
|
| **Format selector (tipo_contenuto x schwartz → formato)** | Core strategic logic — wrong format selection breaks campaign coherence | MEDIUM | Mapping table: content type + awareness level → narrative format (PAS, AIDA, BAB, etc.). Can be rules-based initially, no ML needed. |
|
||||||
|
| **Niche rotation (B2B verticals)** | Content must feel relevant to different target audiences | LOW | 50% generic PMI, 50% rotating through dentists/lawyers/ecommerce/local/agencies. Simple round-robin or weighted random. |
|
||||||
|
| **Topic generation (auto + manual override)** | Users need starting points; experts need control | MEDIUM | LLM generates topic suggestions per niche/type; user can override before generation. Hybrid model prevents "blank page" paralysis. |
|
||||||
|
| **Web UI (not CLI)** | Modern tool expectation; Michele explicitly chose Web UI | HIGH | FastAPI backend + React frontend. Minimum: calendar view, generate button, output review, CSV download. |
|
||||||
|
| **Prompt editor (file-based, editable via UI)** | Content quality depends on prompt quality; prompts must be tweakable | MEDIUM | Prompts stored as files; UI renders and allows editing without code. One prompt per combination (tipo x formato). |
|
||||||
|
| **Generation output review before export** | Users must see what was generated before committing to CSV | MEDIUM | Per-carousel preview: show all slide content, metadata tags, image keywords. Allow re-generation of individual carousels. |
|
||||||
|
| **Campaign phase tagging** | Metadata in CSV must include attira/cattura/coinvolgi/converti phase | LOW | Assigned during cycle planning, not AI-generated. Rules-based mapping from content type. |
|
||||||
|
| **Suggested publish dates** | Calendar output needs dates, even if publishing is manual | LOW | Auto-assign based on cycle start date + post index. Two carousels per week = 6-7 week cycle for 13 posts. |
|
||||||
|
|
||||||
|
### Differentiators (Competitive Advantage)
|
||||||
|
|
||||||
|
Features that no generic tool offers. This is where PostGenerator earns its existence.
|
||||||
|
|
||||||
|
| Feature | Value Proposition | Complexity | Notes |
|
||||||
|
|---------|-------------------|------------|-------|
|
||||||
|
| **Persuasion Nurturing cycle orchestration** | Ensures the 13-post mix is always strategically balanced (4 valore, 2 storytelling, 2 news, 3 riprova, 1 coinvolgimento, 1 promozione) | MEDIUM | The generator enforces mix rules, not the user. Output is never an ad-hoc list of posts — it's a coherent 2-week nurturing arc. |
|
||||||
|
| **Schwartz awareness level assignment** | Each post targets the right audience mindset for that funnel stage | MEDIUM | L5→L1 mapping integrated into prompt selection. Most tools ignore where the audience is in their buyer journey. |
|
||||||
|
| **7 narrative format library** | PAS, AIDA, BAB, Listicle, Storytelling/Eroe, Dato+Implicazione, Obiezione+Risposta — all optimized for carousel format | HIGH | Each format needs its own prompt structure because slide decomposition differs. PAS has 3-act structure; Listicle has variable item count. |
|
||||||
|
| **8-slide carousel structure enforcement** | Structured 8-slide schema (Cover, Problema, Contesto, Sviluppo A/B/C, Sintesi, CTA) ensures design-ready output | MEDIUM | Claude must output content fitting exactly the 8-slide structure. CSV columns are pre-mapped to slide positions. Canva template slots match exactly. |
|
||||||
|
| **B2B niche-aware copy** | Same topic, different framing for dentists vs. lawyers vs. ecommerce. Dramatically increases perceived relevance | MEDIUM | Niche injected into prompt context. Each niche has distinct pain points, vocabulary, and objections. |
|
||||||
|
| **Swipe File for idea capture** | Quick capture of external inspiration (ads, posts, headlines) that can influence future topic generation | LOW | Simple CRUD for saving ideas, snippets, references. Optional tagging by format or niche. Feed into topic selection as context. |
|
||||||
|
| **Image keyword generation per slide** | Every slide gets 1-2 search keywords for stock photo sourcing in Canva | LOW | Extracted by LLM during content generation. Optionally resolved to actual URLs via Unsplash API (when key available). |
|
||||||
|
| **Copywriting rules enforcement** | Built-in guardrails: "cosa fare" not "come farlo", tono diretto-provocatorio, nessun gergo, focus su imprenditori italiani | MEDIUM | Encoded in system prompt and per-format templates. Not user-configurable (by design — this is the brand voice). |
|
||||||
|
| **Single-post generation mode** | Generate one carousel on demand with full parameter control (tipo, formato, nicchia, livello Schwartz) | MEDIUM | Useful for ad-hoc content, testing, or filling gaps in an existing calendar. Reuses same generation engine as bulk mode. |
|
||||||
|
| **Campaign naming and grouping** | Each 13-post cycle is a named campaign with its own output folder and CSV | LOW | Enables history browsing: "Campagna Marzo settimana 1", "Campagna Aprile dentisti". Simple folder/file naming convention. |
|
||||||
|
|
||||||
|
### Anti-Features (Things to Deliberately NOT Build)
|
||||||
|
|
||||||
|
Features that seem valuable but would undermine the product's purpose, add disproportionate complexity, or belong in Phase 2+ only.
|
||||||
|
|
||||||
|
| Feature | Why Requested | Why Problematic | Alternative |
|
||||||
|
|---------|---------------|-----------------|-------------|
|
||||||
|
| **Direct Instagram publishing** | "Why download a CSV if I can post directly?" | Requires Instagram Business API + Meta App Review + OAuth. Adds compliance surface. Publishing quality control (visual review) is critical and belongs in Canva — that's the whole point. | Keep the Canva workflow. The CSV+Canva+manual-publish chain is deliberate, not a limitation. Document this clearly. |
|
||||||
|
| **Automatic scheduling calendar** | Every content tool has a scheduler | Scheduling is a solved problem (Buffer, Later, Metricool). Building it duplicates commodity software. Also: Canva doesn't support scheduling carousels via its own API in all cases. | Suggest publish dates in CSV. Let user paste into their preferred scheduler. |
|
||||||
|
| **Analytics / performance tracking** | "Show me which posts performed best" | Requires Instagram Insights API, engagement data storage, attribution logic. This is an entirely separate product domain. Premature before content engine is validated. | Out of scope per PROJECT.md. Phase 2 if content engine proves valuable. |
|
||||||
|
| **Multi-user / team collaboration** | "Share with my VA" | Adds authentication, role management, permission systems, conflict resolution. High complexity for a personal tool. | Single-user by design. If team use is needed later, add basic HTTP auth first. |
|
||||||
|
| **Real-time LLM streaming** | "Show text appearing as it's generated" | Adds websocket/SSE complexity to backend. For batch generation of 13 posts, streaming UX is confusing (13 parallel streams?). | Generate all, show loading state, display results. Simpler and actually better UX for batch workflows. |
|
||||||
|
| **Automatic A/B testing of prompts** | "Which prompt version performs better?" | Requires performance data linkage back from Instagram (see Analytics above). Without actual engagement data, A/B testing is cosmetic. | Let user manually edit prompts and observe output quality. Add versioning if needed. |
|
||||||
|
| **Brand voice training / fine-tuning** | "Make it sound more like me" | Fine-tuning Claude is not available via API. RAG-based voice imitation adds significant complexity. Claude with well-engineered prompts in Italian is already adequate. | Encode brand voice in system prompt. Offer prompt editor for adjustments. |
|
||||||
|
| **Image generation (AI)** | "Generate images instead of keywords" | AI image generation (DALL-E, Midjourney) produces generic or uncanny results for B2B professional content. Adds cost and quality uncertainty. Canva has its own image tools. | Generate Unsplash keywords. Let Canva handle image sourcing with its built-in library. |
|
||||||
|
| **Relational database** | "Use PostgreSQL for scalability" | No multi-user, no concurrent writes, no complex queries. File system is simpler, version-controllable, and inspectable. Premature optimization for a personal tool. | File system: prompts/ outputs/ data/ per PROJECT.md decision. Revisit only if scale demands it. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Feature Dependencies
|
||||||
|
|
||||||
|
```
|
||||||
|
[Calendar Generator]
|
||||||
|
├──requires──> [Format Selector]
|
||||||
|
│ └──requires──> [Persuasion Nurturing Rules]
|
||||||
|
│ └──requires──> [Schwartz Level Mapping]
|
||||||
|
│ └──requires──> [Narrative Format Library]
|
||||||
|
├──requires──> [LLM Content Generation]
|
||||||
|
│ └──requires──> [Prompt Editor / File Store]
|
||||||
|
│ └──requires──> [Niche Context Injection]
|
||||||
|
├──requires──> [CSV Builder]
|
||||||
|
│ └──requires──> [Canva Header Schema]
|
||||||
|
│ └──requires──> [Image Keyword Generator]
|
||||||
|
└──requires──> [Output Review UI]
|
||||||
|
|
||||||
|
[Single Post Generator]
|
||||||
|
└──requires──> [LLM Content Generation] (reuses same engine)
|
||||||
|
└──requires──> [Format Selector] (reuses same logic)
|
||||||
|
|
||||||
|
[Swipe File]
|
||||||
|
└──enhances──> [Topic Generation] (swipe items can inform topic context)
|
||||||
|
|
||||||
|
[Image Keyword Generator]
|
||||||
|
└──enhances──> [Unsplash API Fetch] (optional, only if API key present)
|
||||||
|
|
||||||
|
[Output Review UI]
|
||||||
|
└──requires──> [Generation History] (per-campaign output storage)
|
||||||
|
|
||||||
|
[Prompt Editor]
|
||||||
|
└──conflicts──> [Copywriting Rules Enforcement]
|
||||||
|
(user can break rules via prompt edits — document that editing overrides guardrails)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Dependency Notes
|
||||||
|
|
||||||
|
- **Calendar Generator requires Format Selector:** Without the mapping logic (tipo x schwartz → formato), the generator cannot assign narrative formats to each of the 13 posts in the cycle.
|
||||||
|
- **LLM Content Generation requires Prompt Editor:** Prompts are the intelligence layer. If prompts are hardcoded and not editable, output quality cannot be improved without code changes.
|
||||||
|
- **CSV Builder requires Canva Header Schema:** The CSV header must match Canva's template placeholders exactly. This schema is fixed upfront and determines the entire data model.
|
||||||
|
- **Single Post Generator reuses Calendar engine:** Build once, expose in two modes. Do not create a separate code path for single-post — it creates drift.
|
||||||
|
- **Swipe File enhances Topic Generation:** Swipe items can be passed as context to the LLM when generating topics, but the dependency is loose (topic gen works without swipe file).
|
||||||
|
- **Image Keyword Generator enhances Unsplash Fetch:** Keywords are always generated. Unsplash fetch is an optional enrichment step gated on API key availability.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## MVP Definition
|
||||||
|
|
||||||
|
### Launch With (v1)
|
||||||
|
|
||||||
|
Minimum set to validate the core content engine. Everything below must work together for a usable output.
|
||||||
|
|
||||||
|
- [ ] **Calendar Generator** — Produces 13-post cycle with correct Persuasion Nurturing distribution
|
||||||
|
- [ ] **Format Selector** — Maps tipo_contenuto x schwartz_level → narrative_format deterministically
|
||||||
|
- [ ] **LLM Content Generation** — Claude API generates all 8 slides per carousel as structured JSON
|
||||||
|
- [ ] **Prompt File Store + Editor** — Prompts editable via UI without code deployment
|
||||||
|
- [ ] **Image Keyword Generation** — At least 1-2 keywords per slide (no Unsplash fetch in v1)
|
||||||
|
- [ ] **CSV Builder** — Output matches Canva Bulk Create header exactly, downloads as file
|
||||||
|
- [ ] **Output Review UI** — Show generated carousels before export; allow regenerating individual posts
|
||||||
|
- [ ] **Single Post Generator** — On-demand generation with parameter override (useful for testing)
|
||||||
|
- [ ] **Swipe File** — Capture and retrieve inspiration items (simple CRUD, no LLM integration in v1)
|
||||||
|
- [ ] **Campaign History** — Named output folders; browse and re-download past generations
|
||||||
|
|
||||||
|
### Add After Validation (v1.x)
|
||||||
|
|
||||||
|
Add once the content engine is proven and output quality is validated through actual use.
|
||||||
|
|
||||||
|
- [ ] **Unsplash Integration** — Resolve image keywords to actual URLs when API key is present; trigger: Michele obtains Unsplash API key
|
||||||
|
- [ ] **Swipe File → Topic Context** — Pass swipe items to LLM during topic generation; trigger: swipe file in regular use
|
||||||
|
- [ ] **Campaign performance notes** — Free-text notes per campaign ("this cycle performed well for dentists"); trigger: after first real publishing cycle
|
||||||
|
- [ ] **Bulk topic input** — Paste multiple topics, generate entire calendar per topic; trigger: workflow bottleneck identified
|
||||||
|
|
||||||
|
### Future Consideration (v2+)
|
||||||
|
|
||||||
|
Defer until product-market fit confirmed and use patterns established.
|
||||||
|
|
||||||
|
- [ ] **Analytics linkage** — Import engagement data from Instagram Insights; defer: requires platform API + post-publishing data
|
||||||
|
- [ ] **Multi-campaign comparison** — Compare content mixes across campaigns; defer: needs history depth
|
||||||
|
- [ ] **Team / shared access** — Basic auth for sharing with VA or client; defer: personal tool for now
|
||||||
|
- [ ] **LinkedIn adaptation** — Repurpose carousel content for LinkedIn carousel format; defer: different platform mechanics
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Feature Prioritization Matrix
|
||||||
|
|
||||||
|
| Feature | User Value | Implementation Cost | Priority |
|
||||||
|
|---------|------------|---------------------|----------|
|
||||||
|
| Calendar Generator (13-post cycle) | HIGH | HIGH | P1 |
|
||||||
|
| LLM Content Generation (8 slides/carousel) | HIGH | HIGH | P1 |
|
||||||
|
| CSV Builder (Canva-compatible) | HIGH | MEDIUM | P1 |
|
||||||
|
| Format Selector (tipo x schwartz → formato) | HIGH | LOW | P1 |
|
||||||
|
| Prompt File Store + Editor | HIGH | MEDIUM | P1 |
|
||||||
|
| Output Review UI | HIGH | MEDIUM | P1 |
|
||||||
|
| Niche rotation (B2B verticals) | HIGH | LOW | P1 |
|
||||||
|
| Image Keyword Generation | MEDIUM | LOW | P1 |
|
||||||
|
| Single Post Generator | MEDIUM | LOW | P1 |
|
||||||
|
| Swipe File (CRUD) | MEDIUM | LOW | P1 |
|
||||||
|
| Campaign History / naming | MEDIUM | LOW | P1 |
|
||||||
|
| Unsplash API fetch | MEDIUM | LOW | P2 |
|
||||||
|
| Swipe File → Topic context injection | MEDIUM | MEDIUM | P2 |
|
||||||
|
| Campaign performance notes | LOW | LOW | P2 |
|
||||||
|
| Bulk topic input | LOW | MEDIUM | P3 |
|
||||||
|
| Analytics linkage | HIGH | HIGH | P3 |
|
||||||
|
|
||||||
|
**Priority key:**
|
||||||
|
- P1: Must have for launch
|
||||||
|
- P2: Should have, add when possible
|
||||||
|
- P3: Nice to have, future consideration
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Competitor Feature Analysis
|
||||||
|
|
||||||
|
No direct competitor found that matches PostGenerator's specific combination. Closest adjacent tools:
|
||||||
|
|
||||||
|
| Feature | PostNitro | aiCarousels | StoryChief | PostGenerator (ours) |
|
||||||
|
|---------|-----------|-------------|------------|----------------------|
|
||||||
|
| Carousel generation | Yes (AI) | Yes (AI) | No | Yes (AI via Claude) |
|
||||||
|
| Bulk generation (batch 13+) | No | No | Partial | Yes (core feature) |
|
||||||
|
| Canva CSV export | No | No | No | Yes (primary output) |
|
||||||
|
| Strategic framework (Schwartz/Nurturing) | No | No | No | Yes (differentiator) |
|
||||||
|
| Editorial calendar | No | No | Yes | Yes (13-post cycle) |
|
||||||
|
| Narrative format library | No | No | No | Yes (7 formats) |
|
||||||
|
| Niche rotation | No | No | No | Yes (B2B verticals) |
|
||||||
|
| Prompt editor | No | No | No | Yes (file-based) |
|
||||||
|
| Multi-platform scheduling | Yes | Yes | Yes | No (deliberately) |
|
||||||
|
| Direct publishing | Yes | No | Yes | No (deliberately) |
|
||||||
|
| Analytics | Yes | No | Yes | No (deliberately) |
|
||||||
|
| Team collaboration | Yes | Yes | Yes | No (single user by design) |
|
||||||
|
|
||||||
|
**Conclusion:** PostGenerator wins on strategic depth and Canva-workflow integration. It loses (deliberately) on scheduling, publishing, and analytics. This is correct positioning for a personal B2B content engine — not a generic social media management platform.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Sources
|
||||||
|
|
||||||
|
- PostNitro feature analysis: https://postnitro.ai/ (WebFetch, 2026-03-07)
|
||||||
|
- aiCarousels feature analysis: https://www.aicarousels.com/ (WebFetch, 2026-03-07)
|
||||||
|
- Canva Bulk Create documentation: https://www.canva.com/help/bulk-create/ (WebSearch, 2026-03-07)
|
||||||
|
- Canva Bulk Create CSV workflow: https://dreamina.capcut.com/resource/canva-bulk-create (WebSearch, 2026-03-07)
|
||||||
|
- B2B content marketing automation trends: https://reliqus.com/b2b-content-marketing-automation-strategy-2025/ (WebSearch, 2026-03-07)
|
||||||
|
- Social media content automation pitfalls: https://metricool.com/biggest-social-media-mistakes/ (WebSearch, 2026-03-07)
|
||||||
|
- AI content pitfalls 2025: https://www.wisedigitalpartners.com/learn/blog/strategy/ai-content-creation-2025-promise-pitfalls-path (WebSearch, 2026-03-07)
|
||||||
|
- LLM prompt management features: https://mirascope.com/blog/prompt-management-system (WebSearch, 2026-03-07)
|
||||||
|
- Editorial calendar B2B tools: https://storychief.io/blog/best-content-calendar-tools (WebSearch, 2026-03-07)
|
||||||
|
- Instagram carousel B2B strategy: https://alliedinsight.com/resources/b2b-instagram-2025-algorithm-guide/ (WebSearch, 2026-03-07)
|
||||||
|
- PostGenerator PROJECT.md: .planning/PROJECT.md (primary source for domain logic)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Feature research for: Instagram carousel automation / B2B content generation (PostGenerator)*
|
||||||
|
*Researched: 2026-03-07*
|
||||||
344
.planning/research/PITFALLS.md
Normal file
344
.planning/research/PITFALLS.md
Normal file
@@ -0,0 +1,344 @@
|
|||||||
|
# Pitfalls Research
|
||||||
|
|
||||||
|
**Domain:** LLM-powered bulk content generation / CSV output for design tools
|
||||||
|
**Project:** PostGenerator — Instagram carousel bulk generation for B2B Italian SME marketing
|
||||||
|
**Researched:** 2026-03-07
|
||||||
|
**Confidence:** HIGH (most pitfalls verified against official docs or multiple confirmed sources)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Critical Pitfalls
|
||||||
|
|
||||||
|
### Pitfall 1: LLM Output That "Looks Valid" But Isn't (Soft Failures)
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
Claude returns HTTP 200 with a valid response, JSON parses successfully, but the content is wrong: truncated mid-sentence, repeated slide text, missing slides in a carousel, or Italian that reads as translated-from-English. These "soft failures" never trigger error handling logic because there is no exception — only bad output silently written to files.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Developers test the happy path, see valid JSON, and assume correctness. Validation is wired only for parse errors, not for semantic/structural correctness. In bulk mode, one bad generation gets buried in a batch of 20 and is only caught by the end user.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Implement a two-level validation layer: (1) JSON schema validation via Pydantic on parse, (2) business rule validation on content — check slide count matches requested count, each slide has non-empty text fields, text lengths are within Canva field limits, no repeated content across slides.
|
||||||
|
- Log structured validation failures separately from API errors so you can track quality drift.
|
||||||
|
- For Italian content: detect obvious English leakage by checking for common English stop words ("the", "and", "of") in output fields that should be Italian.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Output carousels where multiple slides have identical or near-identical text.
|
||||||
|
- Generated `titolo` fields that are suspiciously short (< 5 chars) or over Canva limits.
|
||||||
|
- CSV rows where some cells are empty when the schema required content.
|
||||||
|
- User reports "Canva mappa un campo ma c'e' testo mancante".
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 (Core generation pipeline) — build validation alongside the first API call, not as a later addition.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 2: Canva Bulk Create CSV — Column Names Must Exactly Match Template Placeholders
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
Canva Bulk Create requires CSV column headers to exactly match the placeholder names defined in the Canva template design. A mismatch (case, space, accent, or encoding difference) causes the field to silently fail to map — the placeholder stays empty in the generated design. There is no error message; the design just looks incomplete.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Developers generate CSV column names programmatically from the content schema, but the Canva template was built manually with slightly different placeholder names. The user then has to manually re-map every field in the Canva UI, defeating the automation benefit.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Define placeholder names as a project-level constant (e.g. `CANVA_FIELDS = ["titolo", "sottotitolo", "testo_1", ...]`) that is shared between the template documentation and the CSV generator code.
|
||||||
|
- Include a "validate against template" step in the generation pipeline: before writing the CSV, verify every column name matches the expected constant list.
|
||||||
|
- Document exact Canva placeholder names in the project (e.g., in a `CANVA_TEMPLATE.md` or in the prompt templates themselves).
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- User says "Canva non collega automaticamente i campi, devo farlo a mano ogni volta".
|
||||||
|
- CSV preview in the application shows correct data but Canva designs come out blank.
|
||||||
|
- After a prompt template change, the column name changes and existing Canva templates break.
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 — define and lock the Canva field schema before writing any generation code.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 3: CSV UTF-8 BOM Encoding Breaks Canva / Excel Import
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
The CSV is correctly UTF-8 encoded, but Canva or the user's spreadsheet tool misinterprets it because a BOM (Byte Order Mark) is present or absent. Italian accented characters (à, è, é, ì, ò, ù) appear as `à `, `è` etc. in the design. Alternatively, the file is generated without BOM, and when the user opens it in Excel on Windows before uploading to Canva, Excel re-encodes it to Windows-1252, corrupting the accented characters before they reach Canva.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Python's `csv` module with `encoding='utf-8'` produces UTF-8 without BOM. Excel on Windows expects UTF-8 with BOM (`utf-8-sig`) to auto-detect encoding. Canva itself expects plain UTF-8. This creates a two-failure-mode trap: BOM for Excel users, no-BOM for Canva direct upload.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Generate CSV with `encoding='utf-8-sig'` (UTF-8 with BOM). Canva ignores the BOM; Excel on Windows correctly reads it. This is the "safe default" for this use case.
|
||||||
|
- Always include a charset declaration in the HTTP response header when serving the download: `Content-Type: text/csv; charset=utf-8`.
|
||||||
|
- Test the download with actual Italian content containing `àèéìòù` before declaring the feature complete.
|
||||||
|
- Add a note in the UI: "Apri il CSV in Google Sheets, non in Excel, per evitare problemi di encoding".
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Any accented character appearing as multiple garbled characters in Canva designs.
|
||||||
|
- User reports "i caratteri speciali sono sbagliati".
|
||||||
|
- CSV looks fine in a code editor but breaks when opened in Excel.
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 — encoding must be correct from the first CSV output, not retrofitted later.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 4: FastAPI `root_path` Double-Path Bug Behind Nginx Subpath Proxy
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
FastAPI deployed at `lab.mlhub.it/postgenerator/` requires `root_path` configuration to generate correct OpenAPI/Swagger URLs. However, if `root_path` is set both in `FastAPI()` constructor AND in Uvicorn, the prefix is applied twice, producing paths like `/postgenerator/postgenerator/openapi.json` which returns 404. The API may work correctly while the docs are broken, masking the configuration error.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
The FastAPI `root_path` mechanism is designed for "stripping path proxies" that remove the prefix before forwarding. If the nginx config forwards the full path (including `/postgenerator/`), FastAPI needs root_path to know its prefix but must NOT double-apply it. The interaction between FastAPI app-level root_path and Uvicorn-level root_path is a confirmed multi-year bug with many GitHub issues.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Set `root_path` only via Uvicorn (`--root-path /postgenerator`) or as an environment variable, NOT in the `FastAPI()` constructor.
|
||||||
|
- Configure nginx to strip the prefix before forwarding to the container: `proxy_pass http://container:8000/;` (note trailing slash — nginx strips the location prefix).
|
||||||
|
- Test OpenAPI docs at `lab.mlhub.it/postgenerator/docs` explicitly during initial deploy, not just API calls.
|
||||||
|
- Use `X-Forwarded-Prefix` header instead of root_path if nginx stripping is not feasible.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Swagger UI loads but API calls from the docs return 404.
|
||||||
|
- `/openapi.json` returns 404 while direct API endpoint calls work.
|
||||||
|
- Browser network tab shows requests to `../postgenerator/postgenerator/...`.
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 (deployment scaffolding) — verify proxy configuration before any feature work.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 5: Bulk Generation Without Per-Item State — All-or-Nothing Failure
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
A user requests 20 carousels. The backend loops through 20 API calls sequentially. The 15th call hits a rate limit or network timeout. The entire batch fails, the partial results are lost, and the user must start over from scratch.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Simple sequential loop with a try/except at the top level. No intermediate state is persisted. This is acceptable for a single-item generation but catastrophic for bulk operations where each item has real API cost.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Implement per-item status tracking: each carousel gets a status record (pending / processing / success / failed) stored in the file system or an in-memory dict with a job ID.
|
||||||
|
- On failure, mark the item as `failed` and continue processing the rest. Allow retry of only failed items.
|
||||||
|
- Return partial results: if 15/20 succeed, deliver those 15 in the CSV.
|
||||||
|
- Persist the job state to disk (JSON file per job) so that server restart does not lose progress.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Bulk requests timeout and the user sees no output.
|
||||||
|
- Backend logs show one exception that kills the loop.
|
||||||
|
- User has to re-enter all 20 topics to retry because no intermediate state exists.
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 for basic per-item error isolation; Phase 2+ for full job persistence and retry UI.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 6: Rate Limit Errors Treated as Generic Errors — No Backoff
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
Claude API returns HTTP 429 (rate limit exceeded). The backend catches `Exception`, logs "generation failed", and either retries immediately (hammering the API) or returns an error to the user. In Tier 1 (50 RPM, 8,000 OTPM for Sonnet), a bulk batch of 10-20 carousels can easily hit the OTPM ceiling.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Developers test with 1-2 items and never hit rate limits. Error handling is added generically. The `retry-after` header in the 429 response — which tells you exactly how long to wait — is ignored.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Implement specific 429 handling that reads the `retry-after` response header and waits that exact duration before retrying.
|
||||||
|
- Use exponential backoff with jitter for other transient errors (5xx), but honor `retry-after` precisely for 429.
|
||||||
|
- For bulk jobs: add a configurable delay between consecutive API calls (e.g. 2-3 seconds) to stay within OTPM limits even in Tier 1.
|
||||||
|
- Monitor `anthropic-ratelimit-output-tokens-remaining` response header to throttle proactively.
|
||||||
|
- Consider the Batches API (50% cheaper, separate rate limits) for non-interactive bulk generation.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Backend logs full of 429 errors during any batch larger than 5 items.
|
||||||
|
- Generation speed suddenly drops to zero mid-batch.
|
||||||
|
- Costs appear lower than expected (actually requests are failing, not succeeding).
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 — implement from the first API integration, not as a later optimization.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 7: Prompt Templates With Hard-Coded Assumptions That Break Silently
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
The prompt template instructs Claude to "generate 5 slides". Later, the user configures 7 slides per carousel. The template still says 5, but the schema expects 7. Claude generates 5 slides. The schema validator accepts it (minimum not enforced). CSV has 5 slide columns with data and 2 empty. Canva design has 2 blank slides.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Prompt template is written with specific numbers embedded as literals rather than as injected variables. When configuration changes, only the schema/code is updated, not the template. File-based prompt management makes this disconnect invisible — there is no compile-time check.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Make all variable parameters injectable: `Genera {{num_slides}} slide`, not "Genera 5 slide".
|
||||||
|
- Implement a template validation step at startup: parse all templates, identify all `{{variable}}` placeholders, and verify every placeholder has a corresponding runtime value.
|
||||||
|
- Use a template rendering test in CI: render each template with test values and verify output matches expected format.
|
||||||
|
- Keep a `TEMPLATE_VARIABLES.md` that documents every variable each template expects.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Generated carousels have inconsistent slide counts.
|
||||||
|
- Template file was last modified weeks ago but configuration was changed last week.
|
||||||
|
- New team member added a config option but forgot to update the prompt template.
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 (prompt system foundation) — variable injection must be architectural, not an afterthought.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 8: Italian Language — Prompts Written in English Produce "Translated" Italian
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
The system prompt is written in English and asks Claude to "generate content in Italian". Claude generates Italian that is grammatically correct but sounds translated: unnatural phrasing, English idioms translated literally, formal register too stiff for SME B2B social media, or UK English business vocabulary instead of Italian business vocabulary.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Claude is primarily English-trained. When given English instructions to produce Italian output, it "thinks in English and translates". The result passes spell-check but fails the native speaker smell test. B2B content for Italian entrepreneurs has specific vocabulary and register expectations (professional but not academic, action-oriented, concrete benefits).
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Write the system prompt IN Italian (not "write in Italian" from an English prompt). Italian-language instructions produce more natural Italian output.
|
||||||
|
- Provide Italian examples in the few-shot section: show Claude an example carousel with the exact tone, vocabulary, and structure you want.
|
||||||
|
- Define explicit tone guidelines in Italian: e.g. "Usa il tu, sii diretto, parla di benefici concreti, evita il jargon tecnico".
|
||||||
|
- Include a list of Italian B2B vocabulary to use/avoid.
|
||||||
|
- Note: Italian prompts cost ~2x more tokens than English equivalents — factor this into cost estimates.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Generated Italian text uses "fare business" when native would say "fare affari".
|
||||||
|
- Carousel titles that sound like SEO headlines rather than Instagram hooks.
|
||||||
|
- Formality that oscillates randomly (tu/lei mixed, formal/informal register).
|
||||||
|
- User feedback: "sembra scritto con Google Translate".
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 — prompt language and tone are foundational decisions, hard to fix post-deployment without regenerating all content.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 9: React Frontend API URL Hardcoded to Absolute Path — Breaks Behind Subpath Proxy
|
||||||
|
|
||||||
|
**What goes wrong:**
|
||||||
|
React frontend makes API calls to `/api/generate`. Behind nginx at `lab.mlhub.it/postgenerator/`, the actual path is `lab.mlhub.it/postgenerator/api/generate`. The frontend sends requests to `lab.mlhub.it/api/generate` which is a different nginx location (or 404). Works in local development, breaks immediately in production.
|
||||||
|
|
||||||
|
**Why it happens:**
|
||||||
|
Developers use Create React App / Vite proxy in development where `/api` works fine. In production, the subpath prefix is not accounted for. `REACT_APP_API_URL` environment variable is set to `/api` (absolute) instead of a relative or base-path-aware URL.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
- Use the Vite/CRA `VITE_API_BASE_URL` env var set to an empty string for development (proxy handles it) and `/postgenerator/api` for production builds.
|
||||||
|
- Alternatively: serve the React app and FastAPI from the same container with nginx internal routing — browser sees single origin, no CORS, no subpath math.
|
||||||
|
- Test the production build (not dev server) against the nginx proxy during Phase 1 deployment, before any feature work.
|
||||||
|
- In nginx config, ensure `/postgenerator/api/` proxies to FastAPI and `/postgenerator/` serves the React static files.
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- API calls return 404 or hit the wrong service in production but work in `npm run dev`.
|
||||||
|
- Browser network tab shows requests to `/api/generate` without the `/postgenerator` prefix.
|
||||||
|
- CORS errors in production (requests going to wrong origin because URL resolution failed).
|
||||||
|
|
||||||
|
**Phase to address:** Phase 1 deployment scaffolding — must be tested before any other development.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Technical Debt Patterns
|
||||||
|
|
||||||
|
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|
||||||
|
|----------|-------------------|----------------|-----------------|
|
||||||
|
| Store all files in flat directory without job subdirectories | Simple to implement | Hundreds of files mixed together, no cleanup possible | Never — use job-scoped directories from day 1 |
|
||||||
|
| Hardcode slide count (5) in prompt template | Fast to write | Config changes break output silently | Never — always inject as variable |
|
||||||
|
| Single try/except around entire generation loop | Simple error handling | One failure kills entire batch | Never for bulk — per-item isolation is required |
|
||||||
|
| Write CSV as UTF-8 without BOM | Python default | Italian accents corrupt in Excel | Never — use `utf-8-sig` always |
|
||||||
|
| Set `root_path` in both FastAPI and Uvicorn | Seems comprehensive | Double-path 404 bug in docs and some calls | Never — set in one place only |
|
||||||
|
| Generate all carousels before validating any | Defers complexity | User waits for all, then gets bulk failure | Acceptable in MVP if per-item failure isolation exists |
|
||||||
|
| English-language system prompt with "write in Italian" | Easier to write | Translated-sounding Italian output | Never for public-facing product |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Integration Gotchas
|
||||||
|
|
||||||
|
| Integration | Common Mistake | Correct Approach |
|
||||||
|
|-------------|----------------|------------------|
|
||||||
|
| Claude API structured outputs | Using schema with `minimum`/`maximum` constraints and expecting them to be enforced | Constraints are stripped from schema sent to Claude; use Pydantic validation post-response to enforce ranges |
|
||||||
|
| Claude API rate limits | Treating 429 as generic error, retrying immediately | Read `retry-after` header; honor exactly; add inter-request delay for bulk jobs |
|
||||||
|
| Canva Bulk Create | Column names generated from code schema diverge from template placeholders | Lock column names as constants shared between code and Canva template documentation |
|
||||||
|
| Canva Bulk Create | Uploading CSV with row count > 300 | Canva Bulk Create supports max 300 rows per upload batch; split large batches |
|
||||||
|
| Claude API OTPM | Generating 20 carousels at full speed in Tier 1 (8,000 OTPM) | Add configurable delay between calls; consider Batches API for non-interactive generation |
|
||||||
|
| FastAPI + nginx subpath | Setting `root_path` in FastAPI constructor when nginx does NOT strip prefix | Set root_path only in Uvicorn; configure nginx to strip prefix OR forward full path but not both |
|
||||||
|
| React + FastAPI in same Docker container | CORS configuration in FastAPI needed even for same-origin | Use nginx internal routing so browser sees single origin; CORS becomes irrelevant |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Performance Traps
|
||||||
|
|
||||||
|
| Trap | Symptoms | Prevention | When It Breaks |
|
||||||
|
|------|----------|------------|----------------|
|
||||||
|
| Sequential API calls for bulk | 20 carousels take 20x single time; UI shows "loading" with no feedback | Add progress tracking per item; consider async generation with status polling | Any bulk request > 3 items |
|
||||||
|
| No prompt caching for repeated system prompt | Each call sends full system prompt, consuming ITPM quota | Use `cache_control: ephemeral` on system prompt; same prompt cached for ~5 min | At Tier 1 OTPM limits with 5+ items in batch |
|
||||||
|
| CSV generated in memory for large batches | Memory spike, potential OOM in 256MB container | Stream CSV rows to disk as each carousel is generated | Batches > 50 carousels |
|
||||||
|
| File storage without cleanup | Disk fills up on VPS over time | Implement TTL-based cleanup for generated files (e.g., delete after 24h) | After ~1000 generation jobs depending on file size |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security Mistakes
|
||||||
|
|
||||||
|
| Mistake | Risk | Prevention |
|
||||||
|
|---------|------|------------|
|
||||||
|
| Exposing Claude API key in frontend environment variables | Key leaked via browser DevTools, used for unauthorized API calls | Keep API key server-side only; frontend never sees it |
|
||||||
|
| No validation of user-provided topic/industry input before injecting into prompt | Prompt injection: user crafts input that overrides system prompt instructions | Sanitize and length-limit user inputs; wrap user content in explicit delimiters in prompt |
|
||||||
|
| Storing generated CSV files accessible at predictable URLs | Users can download others' generated content | Use UUID-based job IDs for file paths; validate job ownership before serving |
|
||||||
|
| No API rate limiting on the FastAPI endpoint | Unlimited calls = unlimited Claude API cost | Implement per-IP or per-session rate limiting on the generation endpoint |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## UX Pitfalls
|
||||||
|
|
||||||
|
| Pitfall | User Impact | Better Approach |
|
||||||
|
|---------|-------------|-----------------|
|
||||||
|
| No progress feedback during bulk generation | User sees spinner for 60+ seconds, assumes crash, refreshes, loses job | Show per-carousel progress (item 3/10 completed) with estimated time remaining |
|
||||||
|
| "Generation failed" error with no actionable info | User does not know if it was their input, the API, or a bug | Distinguish: "API limit reached, retry in X seconds" vs "Input too long" vs "Unexpected error (ID: xxx)" |
|
||||||
|
| CSV download triggers browser "open with" dialog | User opens CSV in Excel, encoding corrupts Italian text, blames the tool | Set `Content-Disposition: attachment; filename="carousels.csv"` and add note about Google Sheets |
|
||||||
|
| Generated content not previewable before CSV download | User discovers quality issues only after importing to Canva | Show a text preview of at least the first carousel before offering CSV download |
|
||||||
|
| No way to regenerate a single carousel | User must redo entire batch to fix one bad result | Allow per-item regeneration from the results view |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## "Looks Done But Isn't" Checklist
|
||||||
|
|
||||||
|
- [ ] **CSV encoding:** Contains actual Italian accented characters (àèéìòù) — verify download in both browser and Excel, not just code editor
|
||||||
|
- [ ] **Canva field mapping:** Test actual Canva import with the generated CSV, not just "the columns look right"
|
||||||
|
- [ ] **Rate limit handling:** Test with a batch of 10+ items to trigger actual rate limits in Tier 1
|
||||||
|
- [ ] **Subpath routing:** Test production build (Docker container) at nginx subpath, not local `npm run dev`
|
||||||
|
- [ ] **FastAPI docs:** Verify Swagger UI at `/postgenerator/docs` works AND API calls from within Swagger work
|
||||||
|
- [ ] **Italian quality:** Have a native Italian speaker review generated content, not just verify it is grammatically Italian
|
||||||
|
- [ ] **Partial failure:** Kill the process mid-batch and verify partial results are not lost
|
||||||
|
- [ ] **Large batch:** Test with 20 carousels (near Tier 1 limits) for both correctness and timing
|
||||||
|
- [ ] **Prompt variable injection:** Change slide count in config, verify prompt template reflects the change, verify output slide count matches
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Recovery Strategies
|
||||||
|
|
||||||
|
| Pitfall | Recovery Cost | Recovery Steps |
|
||||||
|
|---------|---------------|----------------|
|
||||||
|
| CSV encoding corruption | LOW | Change Python writer to `utf-8-sig`, regenerate affected CSVs |
|
||||||
|
| Canva field name mismatch | MEDIUM | Redefine constant, update all templates referencing old names, regenerate |
|
||||||
|
| FastAPI double root_path | LOW | Remove `root_path` from FastAPI constructor, set only in Uvicorn, redeploy |
|
||||||
|
| Italian quality issues in prompt | HIGH | Rewrite system prompt in Italian, add few-shot examples, re-evaluate all previously generated content |
|
||||||
|
| Bulk pipeline data loss (no per-item state) | HIGH | Requires architectural change to generation loop; cannot fix without refactor |
|
||||||
|
| Prompt template variable mismatch | MEDIUM | Add validation step, fix template, test all templates in sequence |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Pitfall-to-Phase Mapping
|
||||||
|
|
||||||
|
| Pitfall | Prevention Phase | Verification |
|
||||||
|
|---------|------------------|--------------|
|
||||||
|
| Soft failures in LLM output (Pitfall 1) | Phase 1: Generation pipeline | Validation rejects malformed output in unit tests |
|
||||||
|
| Canva column name mismatch (Pitfall 2) | Phase 1: Define Canva field constants | Manual Canva import test with generated CSV |
|
||||||
|
| CSV UTF-8 BOM encoding (Pitfall 3) | Phase 1: First CSV output | Open downloaded CSV in Excel on Windows — no garbled chars |
|
||||||
|
| FastAPI root_path double bug (Pitfall 4) | Phase 1: Deployment scaffolding | Swagger UI works at `/postgenerator/docs` in Docker |
|
||||||
|
| Bulk batch all-or-nothing failure (Pitfall 5) | Phase 1: Per-item error isolation | Force mid-batch failure, verify partial results saved |
|
||||||
|
| Rate limit no-backoff (Pitfall 6) | Phase 1: API client layer | Batch of 10 items completes without 429 crashing the job |
|
||||||
|
| Prompt template hardcoded values (Pitfall 7) | Phase 1: Prompt template system | Change slide count config, verify template output changes |
|
||||||
|
| Italian quality / translated-sounding (Pitfall 8) | Phase 1: Prompt engineering | Native Italian speaker review before any user testing |
|
||||||
|
| React API URL subpath bug (Pitfall 9) | Phase 1: Deployment scaffolding | Production build test at nginx subpath URL |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Sources
|
||||||
|
|
||||||
|
- [Claude API Rate Limits — official documentation](https://platform.claude.com/docs/en/api/rate-limits) (HIGH confidence — official, accessed 2026-03-07)
|
||||||
|
- [Claude Structured Outputs — official documentation](https://platform.claude.com/docs/en/build-with-claude/structured-outputs) (HIGH confidence — official, accessed 2026-03-07)
|
||||||
|
- [Canva Bulk Create — Help Center](https://www.canva.com/help/bulk-create/) (MEDIUM confidence — official product page, row/column limits confirmed)
|
||||||
|
- [Canva Bulk Create Data Autofill — Help Center](https://www.canva.com/help/bulk-create-data-autofill/) (MEDIUM confidence — 150 field limit confirmed)
|
||||||
|
- [FastAPI Behind a Proxy — official docs](https://fastapi.tiangolo.com/advanced/behind-a-proxy/) (HIGH confidence — official)
|
||||||
|
- [FastAPI root_path double-path issue — GitHub Discussion #9018](https://github.com/fastapi/fastapi/discussions/9018) (HIGH confidence — confirmed bug with multiple occurrences)
|
||||||
|
- [FastAPI Incorrect root_path duplicated prefixes — GitHub Discussion #11977](https://github.com/fastapi/fastapi/discussions/11977) (HIGH confidence)
|
||||||
|
- [Building Reliable LLM Pipelines: Error Handling Patterns — ilovedevops](https://ilovedevops.substack.com/p/building-reliable-llm-pipelines-error) (MEDIUM confidence — community, corroborates official rate limit docs)
|
||||||
|
- [Retries, fallbacks, circuit breakers in LLM apps — Portkey](https://portkey.ai/blog/retries-fallbacks-and-circuit-breakers-in-llm-apps/) (MEDIUM confidence — practitioner guide)
|
||||||
|
- [Non-English Languages Prompt Engineering Trade-offs — LinkedIn](https://www.linkedin.com/pulse/non-english-languages-prompt-engineering-trade-offs-giorgio-robino) (MEDIUM confidence — Italian token cost 2x confirmed)
|
||||||
|
- [Evalita-LLM: Benchmarking LLMs on Italian — arXiv](https://arxiv.org/html/2502.02289v1) (MEDIUM confidence — research confirming Italian LLM quality issues)
|
||||||
|
- [Canva CSV upload tips — Create Stimulate](https://createstimulate.com/blogs/news/canva-tips-for-uploading-csv-files-using-bulk-create) (LOW confidence — community blog, limited technical detail)
|
||||||
|
- [Opening CSV UTF-8 files in Excel — Microsoft Support](https://support.microsoft.com/en-us/office/opening-csv-utf-8-files-correctly-in-excel-8a935af5-3416-4edd-ba7e-3dfd2bc4a032) (HIGH confidence — official Microsoft, confirms BOM requirement)
|
||||||
|
|
||||||
|
---
|
||||||
|
*Pitfalls research for: PostGenerator — LLM-powered Instagram carousel bulk generation (B2B Italian SME)*
|
||||||
|
*Researched: 2026-03-07*
|
||||||
415
.planning/research/STACK.md
Normal file
415
.planning/research/STACK.md
Normal file
@@ -0,0 +1,415 @@
|
|||||||
|
# Stack Research
|
||||||
|
|
||||||
|
**Domain:** Python content automation backend + React SPA frontend (bulk social media content generation)
|
||||||
|
**Researched:** 2026-03-07
|
||||||
|
**Confidence:** HIGH (all versions verified against PyPI and npm registries as of March 2026)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Recommended Stack
|
||||||
|
|
||||||
|
### Core Technologies
|
||||||
|
|
||||||
|
| Technology | Version | Purpose | Why Recommended |
|
||||||
|
|------------|---------|---------|-----------------|
|
||||||
|
| Python | 3.12+ | Runtime | LTS, full async support, matches Docker slim images |
|
||||||
|
| FastAPI | 0.135.1 | API framework + static file serving | Native async, Pydantic v2 integration, StaticFiles for SPA, OpenAPI auto-docs. Current standard for Python REST APIs. |
|
||||||
|
| Uvicorn | 0.41.0 | ASGI server | Production-grade async server for FastAPI. "Standard" install includes uvicorn + extras. |
|
||||||
|
| Pydantic | 2.12.5 | Data validation + settings | FastAPI depends on it. v2 is 5-50x faster than v1. Required for request/response schemas. |
|
||||||
|
| anthropic (SDK) | 0.84.0 | Claude API client | Official Anthropic Python SDK. Supports sync/async, streaming via `client.messages.stream()`. |
|
||||||
|
| React | 19.2.4 | Frontend SPA | Latest stable. New concurrent features, improved hooks. Ecosystem standard. |
|
||||||
|
| Vite | 7.x (latest stable) | Frontend build tool | Faster than Webpack/CRA. Native ES modules in dev. First-party React plugin. Single-command build for production. |
|
||||||
|
| Tailwind CSS | 4.2.1 | Utility CSS | v4 is major rewrite: 5x faster builds, no tailwind.config.js required, native @tailwindcss/vite plugin. |
|
||||||
|
|
||||||
|
### Supporting Libraries
|
||||||
|
|
||||||
|
#### Python Backend
|
||||||
|
|
||||||
|
| Library | Version | Purpose | When to Use |
|
||||||
|
|---------|---------|---------|-------------|
|
||||||
|
| python-dotenv | 1.2.2 | Environment variable loading | Load `.env` in local dev. In Docker, use compose `environment:` directly. |
|
||||||
|
| python-multipart | 0.0.22 | Form/file upload parsing | Required by FastAPI for `UploadFile` and `Form` parameters. |
|
||||||
|
| httpx | 0.28.1 | Async HTTP client | Calling Unsplash API. Native async, HTTP/2 support, replaces requests for async code. |
|
||||||
|
| aiofiles | 24.x | Async file I/O | Reading/writing prompt files and generated CSVs without blocking the event loop. |
|
||||||
|
|
||||||
|
**CSV generation:** Use Python's built-in `csv` module (stdlib). No external dependency needed. This is the correct choice for structured CSV output to Canva. Pandas/Polars are overkill — the project generates structured rows, not analyzes datasets.
|
||||||
|
|
||||||
|
**Prompt file management:** Use Python's built-in `pathlib.Path` + `json`/`yaml` stdlib. No external library needed for file-based prompt storage.
|
||||||
|
|
||||||
|
#### React Frontend
|
||||||
|
|
||||||
|
| Library | Version | Purpose | When to Use |
|
||||||
|
|---------|---------|---------|-------------|
|
||||||
|
| @tailwindcss/vite | 4.x | Tailwind v4 Vite plugin | Required for Tailwind v4 integration. Replaces PostCSS config. |
|
||||||
|
| shadcn/ui | latest CLI | UI component collection | Copy-owned components built on Radix UI + Tailwind. Use for forms, tables, modals, progress indicators. Not an npm dep — CLI copies source. |
|
||||||
|
| @radix-ui/react-* | latest | Accessible primitives | Pulled in automatically by shadcn/ui. Headless, accessible. |
|
||||||
|
| react-router-dom | 7.x | Client-side routing | SPA navigation between generation wizard, history, settings pages. |
|
||||||
|
| @tanstack/react-query | 5.x | Server state + async data | Handles API call lifecycle (loading, error, cache). Essential for polling long-running generation jobs. |
|
||||||
|
| lucide-react | latest | Icons | shadcn/ui default icon set. Consistent with component library. |
|
||||||
|
|
||||||
|
### Development Tools
|
||||||
|
|
||||||
|
| Tool | Purpose | Notes |
|
||||||
|
|------|---------|-------|
|
||||||
|
| Docker | Container runtime | Multi-stage build: Node stage builds React, Python stage serves everything |
|
||||||
|
| docker compose | Local dev + VPS deploy | Single `docker-compose.yml` for both environments with env overrides |
|
||||||
|
| Vite dev proxy | Dev CORS bypass | `vite.config.ts` proxy `/api` to `localhost:8000` during development |
|
||||||
|
| TypeScript | Frontend type safety | Enabled by default in Vite React template. Catches API contract errors at compile time. |
|
||||||
|
| ESLint + prettier | Code quality | Vite template includes eslint. Add prettier for formatting. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
### Backend
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Create and activate venv
|
||||||
|
python -m venv .venv
|
||||||
|
source .venv/bin/activate # Linux/Mac
|
||||||
|
# .venv\Scripts\activate # Windows
|
||||||
|
|
||||||
|
# Core
|
||||||
|
pip install fastapi[standard]==0.135.1
|
||||||
|
pip install anthropic==0.84.0
|
||||||
|
pip install httpx==0.28.1
|
||||||
|
pip install python-dotenv==1.2.2
|
||||||
|
pip install aiofiles
|
||||||
|
|
||||||
|
# python-multipart is included in fastapi[standard]
|
||||||
|
# pydantic is included in fastapi[standard]
|
||||||
|
# uvicorn is included in fastapi[standard]
|
||||||
|
|
||||||
|
# Generate requirements.txt
|
||||||
|
pip freeze > requirements.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
**Note:** `fastapi[standard]` installs FastAPI + uvicorn[standard] + python-multipart + pydantic + email-validator. This is the recommended install per FastAPI docs as of v0.100+. Do NOT install `fastapi-slim` — it was deprecated.
|
||||||
|
|
||||||
|
### Frontend
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Create Vite + React + TypeScript project
|
||||||
|
npm create vite@latest frontend -- --template react-ts
|
||||||
|
cd frontend
|
||||||
|
|
||||||
|
# Install Tailwind v4 with Vite plugin
|
||||||
|
npm install tailwindcss @tailwindcss/vite
|
||||||
|
|
||||||
|
# Install routing and data fetching
|
||||||
|
npm install react-router-dom @tanstack/react-query
|
||||||
|
|
||||||
|
# Install shadcn/ui (interactive CLI)
|
||||||
|
npx shadcn@latest init
|
||||||
|
|
||||||
|
# Add components as needed
|
||||||
|
npx shadcn@latest add button input table progress select
|
||||||
|
|
||||||
|
# Dev dependencies
|
||||||
|
npm install -D typescript @types/react @types/react-dom
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tailwind v4 Configuration
|
||||||
|
|
||||||
|
In `vite.config.ts`:
|
||||||
|
```typescript
|
||||||
|
import { defineConfig } from 'vite'
|
||||||
|
import react from '@vitejs/plugin-react'
|
||||||
|
import tailwindcss from '@tailwindcss/vite'
|
||||||
|
|
||||||
|
export default defineConfig({
|
||||||
|
plugins: [
|
||||||
|
react(),
|
||||||
|
tailwindcss(),
|
||||||
|
],
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
In `src/index.css` (no `tailwind.config.js` needed in v4):
|
||||||
|
```css
|
||||||
|
@import "tailwindcss";
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture: Single-Container Deploy
|
||||||
|
|
||||||
|
FastAPI serves both the API and the built React SPA. This eliminates CORS configuration and simplifies Docker deployment on the VPS.
|
||||||
|
|
||||||
|
**Pattern:**
|
||||||
|
```python
|
||||||
|
# main.py
|
||||||
|
from fastapi import FastAPI
|
||||||
|
from fastapi.staticfiles import StaticFiles
|
||||||
|
from fastapi.responses import FileResponse
|
||||||
|
import os
|
||||||
|
|
||||||
|
app = FastAPI()
|
||||||
|
|
||||||
|
# API routes first
|
||||||
|
app.include_router(api_router, prefix="/api")
|
||||||
|
|
||||||
|
# SPA catch-all: serve index.html for any non-API route
|
||||||
|
@app.get("/{full_path:path}")
|
||||||
|
async def serve_spa(full_path: str):
|
||||||
|
static_dir = "static"
|
||||||
|
file_path = os.path.join(static_dir, full_path)
|
||||||
|
if os.path.isfile(file_path):
|
||||||
|
return FileResponse(file_path)
|
||||||
|
return FileResponse(os.path.join(static_dir, "index.html"))
|
||||||
|
|
||||||
|
# Mount static files (CSS, JS, assets)
|
||||||
|
app.mount("/assets", StaticFiles(directory="static/assets"), name="assets")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Multi-stage Dockerfile:**
|
||||||
|
```dockerfile
|
||||||
|
# Stage 1: Build React frontend
|
||||||
|
FROM node:22-slim AS frontend-builder
|
||||||
|
WORKDIR /app/frontend
|
||||||
|
COPY frontend/package*.json ./
|
||||||
|
RUN npm ci
|
||||||
|
COPY frontend/ .
|
||||||
|
RUN npm run build
|
||||||
|
|
||||||
|
# Stage 2: Python runtime
|
||||||
|
FROM python:3.12-slim
|
||||||
|
WORKDIR /app
|
||||||
|
COPY requirements.txt .
|
||||||
|
RUN pip install --no-cache-dir -r requirements.txt
|
||||||
|
COPY backend/ .
|
||||||
|
# Copy built frontend into FastAPI static directory
|
||||||
|
COPY --from=frontend-builder /app/frontend/dist ./static
|
||||||
|
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Claude API Integration
|
||||||
|
|
||||||
|
**Recommended model:** `claude-sonnet-4-5` for content generation (best cost/quality ratio for structured text output at volume).
|
||||||
|
|
||||||
|
**Sync pattern** (for bulk generation — simpler, no streaming needed for batch jobs):
|
||||||
|
```python
|
||||||
|
import anthropic
|
||||||
|
|
||||||
|
client = anthropic.Anthropic() # reads ANTHROPIC_API_KEY from env
|
||||||
|
|
||||||
|
response = client.messages.create(
|
||||||
|
model="claude-sonnet-4-5",
|
||||||
|
max_tokens=2000,
|
||||||
|
messages=[
|
||||||
|
{"role": "user", "content": prompt}
|
||||||
|
]
|
||||||
|
)
|
||||||
|
content = response.content[0].text
|
||||||
|
```
|
||||||
|
|
||||||
|
**Streaming pattern** (for real-time UI feedback during generation):
|
||||||
|
```python
|
||||||
|
with client.messages.stream(
|
||||||
|
model="claude-sonnet-4-5",
|
||||||
|
max_tokens=2000,
|
||||||
|
messages=[{"role": "user", "content": prompt}]
|
||||||
|
) as stream:
|
||||||
|
for text in stream.text_stream:
|
||||||
|
yield text # SSE to frontend
|
||||||
|
```
|
||||||
|
|
||||||
|
**Async pattern** (for FastAPI async endpoints):
|
||||||
|
```python
|
||||||
|
async_client = anthropic.AsyncAnthropic()
|
||||||
|
|
||||||
|
async def generate_content(prompt: str) -> str:
|
||||||
|
response = await async_client.messages.create(
|
||||||
|
model="claude-sonnet-4-5",
|
||||||
|
max_tokens=2000,
|
||||||
|
messages=[{"role": "user", "content": prompt}]
|
||||||
|
)
|
||||||
|
return response.content[0].text
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Unsplash API Integration
|
||||||
|
|
||||||
|
**Rate limits:**
|
||||||
|
- Demo (free) tier: **50 requests/hour**
|
||||||
|
- Production (after manual approval): 5,000 requests/hour
|
||||||
|
|
||||||
|
**Implication for bulk generation:** With 50 req/hour limit and bulk carousel generation, implement:
|
||||||
|
1. Local image cache (save URLs/download images locally after first fetch)
|
||||||
|
2. Deduplication (don't re-fetch same search term if cached)
|
||||||
|
3. Fallback to cached results when limit approached
|
||||||
|
|
||||||
|
**Python integration via httpx** (no dedicated library needed — Unsplash REST API is simple):
|
||||||
|
```python
|
||||||
|
import httpx
|
||||||
|
|
||||||
|
async def search_unsplash(query: str, count: int = 1) -> list[dict]:
|
||||||
|
async with httpx.AsyncClient() as client:
|
||||||
|
response = await client.get(
|
||||||
|
"https://api.unsplash.com/search/photos",
|
||||||
|
params={"query": query, "per_page": count},
|
||||||
|
headers={"Authorization": f"Client-ID {UNSPLASH_ACCESS_KEY}"}
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
return response.json()["results"]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why not python-unsplash library:** `python-unsplash` (v1.2.5, last updated Nov 2023) and `pyunsplash` (v2020) are both poorly maintained. The Unsplash REST API is simple enough that `httpx` direct calls are cleaner, more maintainable, and don't add a stale dependency.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## CSV Generation for Canva Bulk Create
|
||||||
|
|
||||||
|
Use Python's stdlib `csv` module — no additional dependency:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import csv
|
||||||
|
import io
|
||||||
|
|
||||||
|
def generate_canva_csv(rows: list[dict], fields: list[str]) -> str:
|
||||||
|
"""Generate CSV string formatted for Canva Bulk Create."""
|
||||||
|
output = io.StringIO()
|
||||||
|
writer = csv.DictWriter(output, fieldnames=fields)
|
||||||
|
writer.writeheader()
|
||||||
|
writer.writerows(rows)
|
||||||
|
return output.getvalue()
|
||||||
|
```
|
||||||
|
|
||||||
|
**FastAPI endpoint for CSV download:**
|
||||||
|
```python
|
||||||
|
from fastapi.responses import StreamingResponse
|
||||||
|
|
||||||
|
@app.get("/api/export/{job_id}/csv")
|
||||||
|
async def export_csv(job_id: str):
|
||||||
|
csv_content = get_job_csv(job_id) # retrieve generated CSV
|
||||||
|
return StreamingResponse(
|
||||||
|
io.StringIO(csv_content),
|
||||||
|
media_type="text/csv",
|
||||||
|
headers={"Content-Disposition": f"attachment; filename=carousels_{job_id}.csv"}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## File-Based Prompt Management
|
||||||
|
|
||||||
|
No database. Prompt templates stored as files:
|
||||||
|
|
||||||
|
```
|
||||||
|
backend/
|
||||||
|
prompts/
|
||||||
|
frameworks/
|
||||||
|
persuasion_nurturing.txt
|
||||||
|
schwartz_awareness.txt
|
||||||
|
niches/
|
||||||
|
tech_saas.txt
|
||||||
|
finance.txt
|
||||||
|
system_prompt.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
**Access pattern:**
|
||||||
|
```python
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
PROMPTS_DIR = Path(__file__).parent / "prompts"
|
||||||
|
|
||||||
|
def load_prompt(category: str, name: str) -> str:
|
||||||
|
prompt_path = PROMPTS_DIR / category / f"{name}.txt"
|
||||||
|
return prompt_path.read_text(encoding="utf-8")
|
||||||
|
|
||||||
|
def list_prompts(category: str) -> list[str]:
|
||||||
|
return [p.stem for p in (PROMPTS_DIR / category).glob("*.txt")]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Advantages over DB:** Zero setup, version-controlled with git, easy to edit, diff-friendly. Appropriate for this scale (tens of prompt files, not millions of records).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Alternatives Considered
|
||||||
|
|
||||||
|
| Recommended | Alternative | When to Use Alternative |
|
||||||
|
|-------------|-------------|-------------------------|
|
||||||
|
| FastAPI | Django REST Framework | When you need Django ORM, admin panel, or large team with Django expertise |
|
||||||
|
| FastAPI | Flask | Flask is simpler but lacks native async, auto OpenAPI docs, and Pydantic integration |
|
||||||
|
| React + Vite | Next.js | When you need SSR, SEO, or ISR. Overkill for an internal B2B tool served on a subpath. |
|
||||||
|
| Tailwind v4 | Tailwind v3 | Only if targeting environments where v4's new CSS syntax causes issues. v4 is stable as of 2025. |
|
||||||
|
| shadcn/ui | MUI / Chakra UI | When you need a full pre-styled design system without customization. shadcn is better for owned components. |
|
||||||
|
| stdlib csv | pandas | When doing data analysis, not just writing structured rows. pandas adds ~30MB to container. |
|
||||||
|
| httpx direct | python-unsplash | Only if Unsplash API complexity grows significantly (pagination, upload, etc.) |
|
||||||
|
| Single container | Separate nginx + Python | When frontend and backend scale independently, or need different update cadences. Overkill for this project. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What NOT to Use
|
||||||
|
|
||||||
|
| Avoid | Why | Use Instead |
|
||||||
|
|-------|-----|-------------|
|
||||||
|
| `fastapi-slim` | Deprecated as of recent FastAPI versions. Missing extras. | `fastapi[standard]` |
|
||||||
|
| `create-react-app` | Unmaintained since 2023. No Vite, slow builds, outdated webpack. | `npm create vite@latest` |
|
||||||
|
| `requests` library | Sync-only. Blocks FastAPI async event loop when called from async endpoints. | `httpx` with `AsyncClient` |
|
||||||
|
| `openai` library for Claude | Wrong SDK. Anthropic has official `anthropic` SDK with better type safety and streaming helpers. | `anthropic` SDK |
|
||||||
|
| Tailwind v3 PostCSS config | v3 config is still valid but v4 drops `tailwind.config.js` requirement and is significantly faster. | Tailwind v4 + `@tailwindcss/vite` |
|
||||||
|
| `python-unsplash` / `pyunsplash` | Both stale (2020-2023), no async support, thin wrappers adding maintenance risk. | `httpx` direct calls |
|
||||||
|
| `uvicorn` without `[standard]` | Base uvicorn lacks websocket support and other extensions. | `uvicorn[standard]` (included in `fastapi[standard]`) |
|
||||||
|
| Separate nginx container | Adds operational complexity without benefit for single-service subpath deploy. | FastAPI serves static files directly. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Stack Patterns by Variant
|
||||||
|
|
||||||
|
**For development (local):**
|
||||||
|
- Run FastAPI with `uvicorn main:app --reload` on port 8000
|
||||||
|
- Run Vite dev server on port 5173 with proxy: `vite.config.ts` → `proxy: { '/api': 'http://localhost:8000' }`
|
||||||
|
- No Docker needed locally
|
||||||
|
|
||||||
|
**For VPS subpath deployment (`lab.mlhub.it/postgenerator/`):**
|
||||||
|
- React must be built with `base: '/postgenerator/'` in `vite.config.ts`
|
||||||
|
- FastAPI must handle requests prefixed with `/postgenerator/` (via reverse proxy or app root_path)
|
||||||
|
- Set `root_path="/postgenerator"` in uvicorn or via `--root-path /postgenerator` flag
|
||||||
|
- lab-router nginx handles the subpath routing
|
||||||
|
|
||||||
|
**For bulk generation (many carousels in one job):**
|
||||||
|
- Use FastAPI `BackgroundTasks` for async job processing
|
||||||
|
- Store job state in a JSON file (file-based, no DB)
|
||||||
|
- Frontend polls `/api/jobs/{id}/status` with react-query
|
||||||
|
- Streaming SSE optional for real-time token display
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Version Compatibility
|
||||||
|
|
||||||
|
| Package | Compatible With | Notes |
|
||||||
|
|---------|-----------------|-------|
|
||||||
|
| fastapi==0.135.1 | pydantic>=2.0, starlette>=0.46.0 | v0.100+ requires pydantic v2. pydantic v1 not supported. |
|
||||||
|
| anthropic==0.84.0 | Python>=3.8 | Fully supports async via `AsyncAnthropic()`. |
|
||||||
|
| tailwindcss==4.2.1 | @tailwindcss/vite==4.x | v4 requires the Vite plugin, NOT PostCSS plugin (different from v3). |
|
||||||
|
| react==19.2.4 | react-router-dom>=7, @tanstack/react-query>=5 | React 19 has breaking changes from 18. Verify shadcn/ui component compatibility. |
|
||||||
|
| shadcn/ui | react>=18, tailwindcss>=3 | shadcn/ui officially supports Tailwind v4 since early 2025. Use `npx shadcn@latest init`. |
|
||||||
|
| python-multipart==0.0.22 | Python>=3.10 | Required for FastAPI file uploads. Requires py3.10+. |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Sources
|
||||||
|
|
||||||
|
- [PyPI - fastapi](https://pypi.org/project/fastapi/) — v0.135.1 verified March 1, 2026 (HIGH confidence)
|
||||||
|
- [PyPI - anthropic](https://pypi.org/project/anthropic/) — v0.84.0 verified February 25, 2026 (HIGH confidence)
|
||||||
|
- [PyPI - uvicorn](https://pypi.org/project/uvicorn/) — v0.41.0 verified February 16, 2026 (HIGH confidence)
|
||||||
|
- [PyPI - pydantic](https://pypi.org/project/pydantic/) — v2.12.5 verified (HIGH confidence)
|
||||||
|
- [PyPI - python-dotenv](https://pypi.org/project/python-dotenv/) — v1.2.2 verified March 1, 2026 (HIGH confidence)
|
||||||
|
- [PyPI - httpx](https://pypi.org/project/httpx/) — v0.28.1 verified December 6, 2024 (HIGH confidence)
|
||||||
|
- [PyPI - python-multipart](https://pypi.org/project/python-multipart/) — v0.0.22 verified January 25, 2026 (HIGH confidence)
|
||||||
|
- [npm - react](https://www.npmjs.com/package/react) — v19.2.4 (HIGH confidence, per WebSearch)
|
||||||
|
- [npm - tailwindcss](https://www.npmjs.com/package/tailwindcss) — v4.2.1 verified ~February 2026 (HIGH confidence)
|
||||||
|
- [npm - vite](https://vite.dev/releases) — v7.x confirmed stable 2026 (MEDIUM confidence — WebSearch shows 7.3.1, v8 beta in progress)
|
||||||
|
- [Anthropic Streaming Docs](https://platform.claude.com/docs/en/api/messages-streaming) — Streaming patterns verified directly (HIGH confidence)
|
||||||
|
- [Unsplash API Docs](https://unsplash.com/documentation) — Rate limits 50 req/hr demo, 5000 prod (HIGH confidence)
|
||||||
|
- [FastAPI Static Files Docs](https://fastapi.tiangolo.com/tutorial/static-files/) — SPA serving pattern (HIGH confidence)
|
||||||
|
- [Tailwind CSS v4 Announcement](https://tailwindcss.com/blog/tailwindcss-v4) — v4 features and migration (HIGH confidence)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Stack research for: PostGenerator — Instagram carousel automation (Python + FastAPI + React + Claude API)*
|
||||||
|
*Researched: 2026-03-07*
|
||||||
190
.planning/research/SUMMARY.md
Normal file
190
.planning/research/SUMMARY.md
Normal file
@@ -0,0 +1,190 @@
|
|||||||
|
# Project Research Summary
|
||||||
|
|
||||||
|
**Project:** PostGenerator — Instagram Carousel Bulk Generator for B2B Italian SME Marketing
|
||||||
|
**Domain:** LLM-powered social media content automation (Python + FastAPI + React SPA)
|
||||||
|
**Researched:** 2026-03-07
|
||||||
|
**Confidence:** HIGH
|
||||||
|
|
||||||
|
## Executive Summary
|
||||||
|
|
||||||
|
PostGenerator is a purpose-built content automation tool with no direct competitor. It sits at the intersection of bulk carousel generation, strategic editorial planning (Persuasion Nurturing + Schwartz awareness levels), and Canva Bulk Create workflow — a combination that no existing product (PostNitro, aiCarousels, StoryChief) offers. The recommended architecture is a single-container FastAPI + React SPA deployment: Python FastAPI handles both the REST API and serves the built React frontend, eliminating CORS configuration and simplifying VPS deployment under the lab-router nginx subpath at `lab.mlhub.it/postgenerator/`. The entire data layer is file-based (prompt .txt files, JSON configs, CSV outputs as Docker volume) — no database, no Supabase, no external services beyond the Claude API.
|
||||||
|
|
||||||
|
The core generation pipeline — calendar planning, LLM content generation, CSV export — has clear and well-researched implementation patterns. The stack is modern and stable: FastAPI 0.135.1 with Pydantic v2, React 19 with Vite 7, Tailwind CSS v4, and the official Anthropic SDK. The critical engineering challenge is not the technology but the orchestration: generating 13 carousels in a single batch, each requiring an independent Claude API call, with per-item error isolation, rate-limit handling, and structured JSON output validation — all before a reliable CSV is delivered.
|
||||||
|
|
||||||
|
The most severe risks are concentrated in Phase 1 and are all preventable with upfront decisions: Italian-language prompts (not English "write in Italian"), UTF-8 BOM CSV encoding, Canva field name constants locked before any generation code, per-item failure isolation in the generation loop, and the FastAPI `root_path` subpath deployment configured correctly from day one. All 9 critical pitfalls identified map to Phase 1, which makes the foundation phase the highest-stakes phase of the entire project.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Key Findings
|
||||||
|
|
||||||
|
### Recommended Stack
|
||||||
|
|
||||||
|
The stack is a single-container Python/React application. FastAPI serves as both the API server and static file host for the React SPA build output. The React frontend is built with Vite (base path set to `/postgenerator/`), served via a SPAStaticFiles catch-all route mounted last in FastAPI after all API routers. No separate nginx container, no CORS, no database. All persistent data lives in a Docker volume mounted at `/data/`.
|
||||||
|
|
||||||
|
The Claude API is accessed via the official `anthropic` Python SDK (v0.84.0), using `claude-sonnet-4-5` as the generation model. Unsplash image search is called directly via `httpx` (async) — no third-party Unsplash library (both available ones are stale). CSV generation uses Python's stdlib `csv` module with `utf-8-sig` encoding.
|
||||||
|
|
||||||
|
**Core technologies:**
|
||||||
|
- **Python 3.12 + FastAPI 0.135.1**: API framework + SPA file server — native async, Pydantic v2, auto OpenAPI docs, StaticFiles for React
|
||||||
|
- **Anthropic SDK 0.84.0**: Claude API client — official, async-capable, streaming support; use `claude-sonnet-4-5` for best cost/quality on structured text
|
||||||
|
- **React 19 + Vite 7 + TypeScript**: Frontend SPA — fast builds, native ESM dev server, subpath-aware with `base: '/postgenerator/'`
|
||||||
|
- **Tailwind CSS v4 + shadcn/ui**: UI styling — v4 drops `tailwind.config.js`, 5x faster builds, shadcn components copy-owned (no vendor lock)
|
||||||
|
- **TanStack Query v5**: API state management — essential for polling bulk generation job status
|
||||||
|
- **httpx 0.28.1**: Async HTTP client — for Unsplash API calls; replaces `requests` in async FastAPI context
|
||||||
|
- **stdlib `csv` + `pathlib`**: File generation and prompt management — no pandas, no external deps for these operations
|
||||||
|
|
||||||
|
### Expected Features
|
||||||
|
|
||||||
|
PostGenerator's feature set is entirely defined by the Persuasion Nurturing framework logic, not by generic social tool conventions. The Calendar Generator (13-post cycle with enforced mix: 4 valore, 2 storytelling, 2 news, 3 riprova, 1 coinvolgimento, 1 promozione) and the Canva CSV exporter are the two non-negotiable core deliverables. Everything else either enables or enriches them.
|
||||||
|
|
||||||
|
**Must have (table stakes — v1):**
|
||||||
|
- **13-post Calendar Generator** — core value proposition; enforces Persuasion Nurturing distribution automatically
|
||||||
|
- **LLM Content Generation (8 slides/carousel)** — Claude generates all slides as structured JSON; Schwartz levels + narrative formats applied per slot
|
||||||
|
- **Canva-compatible CSV export** — headers must exactly match Canva Bulk Create template placeholders; `utf-8-sig` encoding mandatory
|
||||||
|
- **Format Selector (tipo x Schwartz → formato)** — deterministic rules-based mapping; 7 narrative formats (PAS, AIDA, BAB, Listicle, Eroe, Dato, Obiezione)
|
||||||
|
- **Prompt File Store + Editor** — file-based .txt prompts editable via UI without code deployment
|
||||||
|
- **Output Review UI** — preview generated carousels before CSV export; allow per-item regeneration
|
||||||
|
- **Single Post Generator** — on-demand mode reusing the same generation engine; essential for testing
|
||||||
|
- **Swipe File (CRUD)** — capture inspiration items; simple JSON store, no LLM integration in v1
|
||||||
|
- **Campaign History** — named output folders; browse and re-download past generations
|
||||||
|
|
||||||
|
**Should have (differentiators — v1.x post validation):**
|
||||||
|
- **Unsplash API integration** — resolve image keywords to actual URLs (gated on API key presence)
|
||||||
|
- **Swipe File → Topic context injection** — pass swipe items as LLM context during topic generation
|
||||||
|
- **Campaign performance notes** — free-text notes per campaign for qualitative tracking
|
||||||
|
|
||||||
|
**Defer (v2+):**
|
||||||
|
- Direct Instagram publishing — requires Meta App Review, OAuth; Canva workflow is deliberate, not a limitation
|
||||||
|
- Analytics linkage — separate product domain; premature before content engine is validated
|
||||||
|
- Multi-user / team collaboration — personal tool by design; add basic HTTP auth only if explicitly needed
|
||||||
|
- AI image generation — uncanny for B2B professional content; Canva's own library is better
|
||||||
|
|
||||||
|
### Architecture Approach
|
||||||
|
|
||||||
|
The architecture is a three-layer system: nginx lab-router (prefix stripping) → single FastAPI container (API + SPA) → file system Docker volume (prompts, outputs, campaigns, swipe files). The application is organized into routers (HTTP), services (business logic), and schemas (Pydantic models). Services have zero HTTP imports; routers have zero business logic. The LLMService wraps all Claude API calls with retry, exponential backoff, and Pydantic JSON validation. CalendarService is pure Python with zero LLM dependency — calendar slots can be previewed and edited before burning API credits. CSVBuilder writes to disk, not memory, to avoid OOM on large batches.
|
||||||
|
|
||||||
|
**Major components:**
|
||||||
|
1. **FastAPI main.py** — mounts routers, configures `root_path="/postgenerator"` (via Uvicorn only), registers SPAStaticFiles catch-all last
|
||||||
|
2. **LLMService** — Claude API calls with retry (max 3), exponential backoff, `retry-after` header honor on 429, Pydantic JSON validation
|
||||||
|
3. **CalendarService** — pure Python; generates date-indexed 13-post schedule from campaign config; no LLM coupling
|
||||||
|
4. **PromptService** — reads/writes `.txt` files from `/data/prompts/`; templates use injectable variables (`{{num_slides}}` not literals)
|
||||||
|
5. **CSVBuilder** — transforms GeneratedContent list into Canva Bulk Create CSV using locked CANVA_FIELDS constant; writes to `/data/outputs/`
|
||||||
|
6. **SwipeService** — CRUD for JSON collections in `/data/swipe-files/`
|
||||||
|
7. **React SPA** — pages: Generator, CalendarView, PromptManager, SwipeFile; TanStack Query for polling; API client uses `/postgenerator/api` absolute path
|
||||||
|
|
||||||
|
### Critical Pitfalls
|
||||||
|
|
||||||
|
All 9 critical pitfalls identified map to Phase 1. Phase 2+ is comparatively low-risk. The top 5 requiring upfront architectural decisions:
|
||||||
|
|
||||||
|
1. **Italian prompt language** — Write system prompts IN Italian (not English + "write in Italian"). English instructions produce grammatically correct but translated-sounding Italian. This is a foundational decision; retrofitting is expensive. Italian prompts cost ~2x tokens — factor into cost estimates.
|
||||||
|
2. **Canva CSV column name mismatch** — Lock `CANVA_FIELDS = ["titolo", "sottotitolo", ...]` as a project constant before writing any generation code. Column names must exactly match Canva template placeholders. Mismatch causes silent empty fields in Canva designs — no error message.
|
||||||
|
3. **CSV encoding (UTF-8 BOM)** — Use `encoding='utf-8-sig'` always. Python's default UTF-8 without BOM causes Excel on Windows to misinterpret Italian accented characters (`à`, `è` → `Ã`, `è`) before the user uploads to Canva.
|
||||||
|
4. **FastAPI `root_path` double-path bug** — Set `root_path` ONLY via Uvicorn (`--root-path /postgenerator`), never in the `FastAPI()` constructor. If set in both places, paths double (`/postgenerator/postgenerator/...`). Nginx must strip prefix with trailing slash: `proxy_pass http://container:8000/;`.
|
||||||
|
5. **Bulk generation all-or-nothing failure** — Implement per-item status tracking from the first generation loop. Each carousel must be independently stored (status: pending/success/failed) so partial batches are recoverable. A single exception must not kill the entire batch.
|
||||||
|
|
||||||
|
Additional critical: Rate limit handling (Pitfall 6) — read `retry-after` header on 429 exactly; add configurable inter-request delay (2-3s) for bulk jobs to avoid OTPM ceiling at Tier 1 (8,000 tokens/min for Sonnet). Prompt template variable injection (Pitfall 7) — never hardcode slide counts or other config values in templates; use `{{variable}}` placeholders validated at startup.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implications for Roadmap
|
||||||
|
|
||||||
|
Based on research, suggested phase structure:
|
||||||
|
|
||||||
|
### Phase 1: Foundation + Core Generation Pipeline
|
||||||
|
**Rationale:** All 9 critical pitfalls are Phase 1 concerns. The plumbing (subpath routing, CSV encoding, Italian prompts, per-item error isolation) must be correct before any feature work. Architecture research confirms: build Docker → storage layer → LLMService → routers in that order. A working generation pipeline that produces a valid Canva CSV is the minimum viable product.
|
||||||
|
**Delivers:** Working end-to-end flow — Docker container deployed at `lab.mlhub.it/postgenerator/`, LLM generates 8-slide carousels, CSV downloads with correct encoding, Italian prompts produce native-quality copy.
|
||||||
|
**Addresses features:** Calendar Generator (13-post cycle), Format Selector, LLM Content Generation, CSV Builder, Image Keyword Generation
|
||||||
|
**Avoids:** Pitfalls 1-9 — all preventable with upfront decisions in this phase
|
||||||
|
**Research flag:** Standard patterns — FastAPI + Docker + Claude API are well-documented. No additional phase research needed.
|
||||||
|
|
||||||
|
### Phase 2: Prompt Management + Output Review
|
||||||
|
**Rationale:** Prompt quality is the primary lever for improving content quality after the pipeline works. The editor must exist before any iteration on content quality. Output review (preview before CSV export + per-item regeneration) is the UX prerequisite for confident publishing.
|
||||||
|
**Delivers:** Prompt editor UI (list/view/edit .txt files via web), output review UI (carousel preview before download), per-item regeneration, Single Post Generator mode
|
||||||
|
**Implements:** PromptService (full CRUD via UI), GenerateResponse preview components, single-post generation endpoint (reuses calendar engine)
|
||||||
|
**Addresses features:** Prompt File Store + Editor, Output Review UI, Single Post Generator
|
||||||
|
**Research flag:** Standard patterns — file-based CRUD and React form components. No additional research needed.
|
||||||
|
|
||||||
|
### Phase 3: Campaign History + Swipe File
|
||||||
|
**Rationale:** After the core pipeline and prompt management are working, the organizational layer (named campaigns, history browsing, inspiration capture) makes the tool usable as a sustained workflow rather than a one-shot generator.
|
||||||
|
**Delivers:** Named campaign management (create/browse/re-download past generations), Swipe File CRUD (capture inspiration items, tag by format/niche)
|
||||||
|
**Implements:** CampaignService (JSON metadata per campaign), SwipeService (CRUD JSON collections), Campaign History page, Swipe File page in React
|
||||||
|
**Addresses features:** Campaign History, Swipe File
|
||||||
|
**Research flag:** Standard patterns — JSON file CRUD, React list/form components. No research needed.
|
||||||
|
|
||||||
|
### Phase 4: Enrichment + Polish
|
||||||
|
**Rationale:** After core workflow is validated through real use, optional enrichments add value without blocking the primary workflow. Unsplash integration is gated on API key availability. Swipe-to-topic context injection adds intelligence once swipe file has real content.
|
||||||
|
**Delivers:** Unsplash keyword-to-URL resolution (when API key present), Swipe File → topic context injection, campaign performance notes, UI polish and UX improvements based on real use feedback
|
||||||
|
**Implements:** Unsplash API via httpx (async), topic generation context injection, free-text campaign notes field
|
||||||
|
**Addresses features:** Unsplash Integration (v1.x), Swipe File → Topic Context, Campaign Performance Notes
|
||||||
|
**Research flag:** Unsplash API integration may need rate limit strategy research (50 req/hr demo, 5000/hr production). Otherwise standard patterns.
|
||||||
|
|
||||||
|
### Phase Ordering Rationale
|
||||||
|
|
||||||
|
- **Phases 1-2 are sequential by hard dependency:** LLMService and CSVBuilder must exist before PromptEditor or OutputReview can function. Building review UI before generation pipeline is speculative work.
|
||||||
|
- **Phase 3 is independent of Phase 2:** Swipe File and Campaign History have no LLM coupling (per CalendarService isolation pattern). They could ship alongside Phase 2, but are lower priority than prompt editing.
|
||||||
|
- **Phase 4 is additive:** Unsplash and context injection enhance existing features without requiring architectural changes. They are safe to defer until core workflow is validated.
|
||||||
|
- **The calendar generation flow has a natural two-step UX** (plan calendar → generate content) documented in ARCHITECTURE.md. Phase 1 should implement both steps — the split is a feature, not a limitation.
|
||||||
|
|
||||||
|
### Research Flags
|
||||||
|
|
||||||
|
Phases likely needing deeper research during planning:
|
||||||
|
- **Phase 4 (Unsplash):** Verify current rate limits and caching strategy for demo vs. production tier. API is simple but rate limit enforcement needs validation with actual bulk generation volumes.
|
||||||
|
|
||||||
|
Phases with standard patterns (skip research-phase):
|
||||||
|
- **Phase 1:** FastAPI + Docker + Claude API + Vite subpath deployment — all well-documented with official sources
|
||||||
|
- **Phase 2:** File-based CRUD + React form components — standard CRUD patterns
|
||||||
|
- **Phase 3:** JSON file management + React list/detail pages — standard patterns
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Confidence Assessment
|
||||||
|
|
||||||
|
| Area | Confidence | Notes |
|
||||||
|
|------|------------|-------|
|
||||||
|
| Stack | HIGH | All versions verified against PyPI and npm registries as of March 2026. FastAPI 0.135.1, Anthropic 0.84.0, React 19.2.4, Tailwind 4.2.1 confirmed. Vite 7 MEDIUM (7.3.1 confirmed, v8 beta in progress but not yet stable). |
|
||||||
|
| Features | MEDIUM-HIGH | Domain is novel — no direct competitor exists combining all three aspects (bulk generation + Persuasion Nurturing + Canva CSV). Feature categorization is sound but based on adjacent tools and domain logic rather than direct market research. |
|
||||||
|
| Architecture | HIGH | FastAPI + React SPA single container is a well-established pattern with multiple official sources. Subpath deployment patterns verified against official FastAPI docs and confirmed GitHub issues. |
|
||||||
|
| Pitfalls | HIGH | 9 critical pitfalls identified, most verified against official docs (FastAPI GitHub, Anthropic rate limit docs, Microsoft Excel UTF-8 BOM support). Italian LLM quality confirmed by academic research (arXiv) and practitioner sources. |
|
||||||
|
|
||||||
|
**Overall confidence:** HIGH
|
||||||
|
|
||||||
|
### Gaps to Address
|
||||||
|
|
||||||
|
- **Canva field schema:** The exact placeholder names used in the Canva template are not yet defined. These must be locked as project constants (`CANVA_FIELDS`) in Phase 1 before writing any generation code. This is the most critical unresolved dependency — it determines the CSV schema and the LLM output schema simultaneously.
|
||||||
|
- **Anthropic Tier 1 limits in bulk context:** The 8,000 OTPM limit for claude-sonnet-4-5 at Tier 1 needs validation against actual bulk generation token usage. A 13-carousel batch at 8 slides each could approach or exceed the limit without inter-request delays. The `retry-after` + delay strategy in Phase 1 should be tested with a real batch of 13+ items before shipping.
|
||||||
|
- **Italian prompt quality baseline:** Written system prompts in Italian are foundational but cannot be validated until actual generation output is reviewed by a native Italian speaker. Plan for a prompt iteration cycle within Phase 1 before moving to Phase 2.
|
||||||
|
- **Vite 7 version stability:** Vite 7 is confirmed stable (7.3.1) but v8 beta is in progress. If Vite 7 minor versions introduce breaking changes during development, shadcn/ui compatibility should be re-verified.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Sources
|
||||||
|
|
||||||
|
### Primary (HIGH confidence)
|
||||||
|
- [PyPI - fastapi 0.135.1](https://pypi.org/project/fastapi/) — version, install options, `fastapi[standard]` recommendation
|
||||||
|
- [PyPI - anthropic 0.84.0](https://pypi.org/project/anthropic/) — version, async support, streaming
|
||||||
|
- [Anthropic API Docs - Rate Limits](https://platform.claude.com/docs/en/api/rate-limits) — Tier 1 limits (50 RPM, 8,000 OTPM for Sonnet), `retry-after` header
|
||||||
|
- [Anthropic Docs - Structured Outputs](https://platform.claude.com/docs/en/build-with-claude/structured-outputs) — GA Nov 2025, schema constraints stripped
|
||||||
|
- [FastAPI Docs - Behind a Proxy](https://fastapi.tiangolo.com/advanced/behind-a-proxy/) — root_path mechanism
|
||||||
|
- [FastAPI GitHub - root_path double-path bug #9018, #11977](https://github.com/fastapi/fastapi/discussions/9018) — confirmed bug, prevention strategy
|
||||||
|
- [FastAPI Docs - Static Files](https://fastapi.tiangolo.com/tutorial/static-files/) — SPAStaticFiles pattern
|
||||||
|
- [Canva Help - Bulk Create](https://www.canva.com/help/bulk-create/) — max 300 rows, 150 columns, placeholder name matching
|
||||||
|
- [Tailwind CSS v4 Announcement](https://tailwindcss.com/blog/tailwindcss-v4) — v4 features, Vite plugin
|
||||||
|
- [Microsoft Support - CSV UTF-8 BOM in Excel](https://support.microsoft.com/en-us/office/opening-csv-utf-8-files-correctly-in-excel-8a935af5-3416-4edd-ba7e-3dfd2bc4a032) — BOM requirement confirmed
|
||||||
|
- [npm - react 19.2.4](https://www.npmjs.com/package/react) — version confirmed
|
||||||
|
- [npm - tailwindcss 4.2.1](https://www.npmjs.com/package/tailwindcss) — version confirmed
|
||||||
|
|
||||||
|
### Secondary (MEDIUM confidence)
|
||||||
|
- [PostNitro feature analysis](https://postnitro.ai/) — competitor feature mapping
|
||||||
|
- [aiCarousels feature analysis](https://www.aicarousels.com/) — competitor feature mapping
|
||||||
|
- [StoryChief editorial calendar](https://storychief.io/blog/best-content-calendar-tools) — adjacent tool features
|
||||||
|
- [arXiv - Evalita-LLM Italian benchmarks](https://arxiv.org/html/2502.02289v1) — Italian LLM quality research
|
||||||
|
- [Portkey - LLM retry patterns](https://portkey.ai/blog/retries-fallbacks-and-circuit-breakers-in-llm-apps/) — retry/backoff patterns
|
||||||
|
- [Unsplash API Docs](https://unsplash.com/documentation) — rate limits (50 req/hr demo, 5000/hr production)
|
||||||
|
- FastAPI + React single container patterns: multiple practitioner blog posts (davidmuraya.com, dakdeniz.medium.com)
|
||||||
|
|
||||||
|
### Tertiary (LOW confidence)
|
||||||
|
- [Create Stimulate - Canva CSV upload tips](https://createstimulate.com/blogs/news/canva-tips-for-uploading-csv-files-using-bulk-create) — community blog; BOM/encoding advice corroborated by Microsoft source
|
||||||
|
- [LinkedIn - Non-English prompt engineering](https://www.linkedin.com/pulse/non-english-languages-prompt-engineering-trade-offs-giorgio-robino) — Italian token cost 2x estimate; corroborated by Italian token length observation
|
||||||
|
|
||||||
|
---
|
||||||
|
*Research completed: 2026-03-07*
|
||||||
|
*Ready for roadmap: yes*
|
||||||
Reference in New Issue
Block a user