# Project Research Summary
**Project:** PostGenerator — Instagram Carousel Bulk Generator for B2B Italian SME Marketing
**Domain:** LLM-powered social media content automation (Python + FastAPI + React SPA)
**Researched:** 2026-03-07
**Confidence:** HIGH
## Executive Summary
PostGenerator is a purpose-built content automation tool with no direct competitor. It sits at the intersection of bulk carousel generation, strategic editorial planning (Persuasion Nurturing + Schwartz awareness levels), and Canva Bulk Create workflow — a combination that no existing product (PostNitro, aiCarousels, StoryChief) offers. The recommended architecture is a single-container FastAPI + React SPA deployment: Python FastAPI handles both the REST API and serves the built React frontend, eliminating CORS configuration and simplifying VPS deployment under the lab-router nginx subpath at `lab.mlhub.it/postgenerator/`. The entire data layer is file-based (prompt .txt files, JSON configs, CSV outputs as Docker volume) — no database, no Supabase, no external services beyond the Claude API.
The core generation pipeline — calendar planning, LLM content generation, CSV export — has clear and well-researched implementation patterns. The stack is modern and stable: FastAPI 0.135.1 with Pydantic v2, React 19 with Vite 7, Tailwind CSS v4, and the official Anthropic SDK. The critical engineering challenge is not the technology but the orchestration: generating 13 carousels in a single batch, each requiring an independent Claude API call, with per-item error isolation, rate-limit handling, and structured JSON output validation — all before a reliable CSV is delivered.
The most severe risks are concentrated in Phase 1, and all are preventable with upfront decisions: system prompts written in Italian (rather than English prompts that merely instruct the model to "write in Italian"), UTF-8 BOM CSV encoding, Canva field name constants locked before any generation code is written, per-item failure isolation in the generation loop, and FastAPI `root_path` subpath deployment configured correctly from day one. All 9 critical pitfalls identified map to Phase 1, which makes the foundation phase the highest-stakes phase of the project.

---
## Key Findings
### Recommended Stack
The stack is a single-container Python/React application. FastAPI serves as both the API server and static file host for the React SPA build output. The React frontend is built with Vite (base path set to `/postgenerator/`), served via a SPAStaticFiles catch-all route mounted last in FastAPI after all API routers. No separate nginx container, no CORS, no database. All persistent data lives in a Docker volume mounted at `/data/`.
The Claude API is accessed via the official `anthropic` Python SDK (v0.84.0), using `claude-sonnet-4-5` as the generation model. Unsplash image search is called directly via `httpx` (async) — no third-party Unsplash library (both available ones are stale). CSV generation uses Python's stdlib `csv` module with `utf-8-sig` encoding.
**Core technologies:**
- **Python 3.12 + FastAPI 0.135.1**: API framework + SPA file server — native async, Pydantic v2, auto OpenAPI docs, StaticFiles for React
- **Anthropic SDK 0.84.0**: Claude API client — official, async-capable, streaming support; use `claude-sonnet-4-5` for best cost/quality on structured text
- **React 19 + Vite 7 + TypeScript**: Frontend SPA — fast builds, native ESM dev server, subpath-aware with `base: '/postgenerator/'`
- **Tailwind CSS v4 + shadcn/ui**: UI styling — v4 drops `tailwind.config.js`, 5x faster builds, shadcn components copy-owned (no vendor lock)
- **TanStack Query v5**: API state management — essential for polling bulk generation job status
- **httpx 0.28.1**: Async HTTP client — for Unsplash API calls; replaces `requests` in async FastAPI context
- **stdlib `csv` + `pathlib`**: File generation and prompt management — no pandas, no external deps for these operations
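As a minimal sketch of the stdlib-only CSV path listed above, the following writes a file with `utf-8-sig` encoding from a fixed field list. The two entries in `CANVA_FIELDS` and the `write_canva_csv` helper name are placeholders chosen for illustration; the real field names have to match the placeholders of the actual Canva template.

```python
import csv
import tempfile
from pathlib import Path

# Placeholder field list (assumption): the real names must match the Canva template.
CANVA_FIELDS = ["titolo", "sottotitolo"]


def write_canva_csv(path: Path, rows: list[dict[str, str]]) -> None:
    """Write rows to disk with a UTF-8 BOM so Excel on Windows decodes accents correctly."""
    # newline="" lets the csv module control line endings itself.
    with path.open("w", encoding="utf-8-sig", newline="") as handle:
        # DictWriter raises ValueError if a row carries a key outside CANVA_FIELDS.
        writer = csv.DictWriter(handle, fieldnames=CANVA_FIELDS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "carousel.csv"
        write_canva_csv(out, [{"titolo": "Perché funziona", "sottotitolo": "È più semplice"}])
        print(out.read_bytes()[:3])  # the three BOM bytes
```

Writing straight to a file handle, rather than building the whole CSV in memory, keeps the sketch in line with the disk-based approach the summary describes for CSVBuilder.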
### Expected Features
PostGenerator's feature set is entirely defined by the Persuasion Nurturing framework logic, not by generic social tool conventions. The Calendar Generator (13-post cycle with enforced mix: 4 valore, 2 storytelling, 2 news, 3 riprova, 1 coinvolgimento, 1 promozione) and the Canva CSV exporter are the two non-negotiable core deliverables. Everything else either enables or enriches them.
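The enforced mix can be expressed as plain data and checked without any LLM call. The sketch below is illustrative only: the counts come from the paragraph above, while the `build_cycle` helper and its round-robin ordering are assumptions, not the project's actual scheduling rule.

```python
from collections import Counter

# Post-type counts for one cycle, as stated in the summary (4+2+2+3+1+1 = 13).
CYCLE_MIX = {
    "valore": 4,
    "storytelling": 2,
    "news": 2,
    "riprova": 3,
    "coinvolgimento": 1,
    "promozione": 1,
}


def build_cycle(mix: dict[str, int]) -> list[str]:
    """Spread post types across the cycle round-robin (hypothetical ordering rule)."""
    remaining = dict(mix)
    slots: list[str] = []
    while any(remaining.values()):
        for tipo in mix:
            if remaining[tipo] > 0:
                slots.append(tipo)
                remaining[tipo] -= 1
    return slots


slots = build_cycle(CYCLE_MIX)
# The distribution is enforced by construction; verify it anyway.
assert len(slots) == 13
assert Counter(slots) == Counter(CYCLE_MIX)
```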
**Must have (table stakes — v1):**
- **13-post Calendar Generator** — core value proposition; enforces Persuasion Nurturing distribution automatically
- **LLM Content Generation (8 slides/carousel)** — Claude generates all slides as structured JSON; Schwartz levels + narrative formats applied per slot
- **Canva-compatible CSV export** — headers must exactly match Canva Bulk Create template placeholders; `utf-8-sig` encoding mandatory
- **Format Selector (tipo x Schwartz → formato)** — deterministic rules-based mapping; 7 narrative formats (PAS, AIDA, BAB, Listicle, Eroe, Dato, Obiezione)
- **Prompt File Store + Editor** — file-based .txt prompts editable via UI without code deployment
- **Output Review UI** — preview generated carousels before CSV export; allow per-item regeneration
- **Single Post Generator** — on-demand mode reusing the same generation engine; essential for testing
- **Swipe File (CRUD)** — capture inspiration items; simple JSON store, no LLM integration in v1
- **Campaign History** — named output folders; browse and re-download past generations

**Should have (differentiators — v1.x post validation):**
- **Unsplash API integration** — resolve image keywords to actual URLs (gated on API key presence)
- **Swipe File → Topic context injection** — pass swipe items as LLM context during topic generation
- **Campaign performance notes** — free-text notes per campaign for qualitative tracking

**Defer (v2+):**
- Direct Instagram publishing — requires Meta App Review, OAuth; Canva workflow is deliberate, not a limitation
- Analytics linkage — separate product domain; premature before content engine is validated
- Multi-user / team collaboration — personal tool by design; add basic HTTP auth only if explicitly needed
- AI image generation — uncanny for B2B professional content; Canva's own library is better
### Architecture Approach
The architecture is a three-layer system: nginx lab-router (prefix stripping) → single FastAPI container (API + SPA) → file system Docker volume (prompts, outputs, campaigns, swipe files). The application is organized into routers (HTTP), services (business logic), and schemas (Pydantic models). Services have zero HTTP imports; routers have zero business logic. The LLMService wraps all Claude API calls with retry, exponential backoff, and Pydantic JSON validation. CalendarService is pure Python with zero LLM dependency — calendar slots can be previewed and edited before burning API credits. CSVBuilder writes to disk, not memory, to avoid OOM on large batches.
**Major components:**
1. **FastAPI main.py** — mounts routers, configures `root_path="/postgenerator"` (via Uvicorn only), registers SPAStaticFiles catch-all last
2. **LLMService** — Claude API calls with retry (max 3), exponential backoff, `retry-after` header honor on 429, Pydantic JSON validation
3. **CalendarService** — pure Python; generates date-indexed 13-post schedule from campaign config; no LLM coupling
4. **PromptService** — reads/writes `.txt` files from `/data/prompts/`; templates use injectable variables (`{{num_slides}}` not literals)
5. **CSVBuilder** — transforms GeneratedContent list into Canva Bulk Create CSV using locked CANVA_FIELDS constant; writes to `/data/outputs/`
6. **SwipeService** — CRUD for JSON collections in `/data/swipe-files/`
7. **React SPA** — pages: Generator, CalendarView, PromptManager, SwipeFile; TanStack Query for polling; API client uses `/postgenerator/api` absolute path
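The retry behaviour attributed to LLMService above (up to 3 retries, exponential backoff, honoring `retry-after` on 429) can be sketched with the standard library alone. This is an illustration under assumptions: `RateLimited` is a hypothetical stand-in for whatever exception the real API client raises on a 429, and `call_with_retry` is not the project's actual function.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the LLM provider."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from the retry-after header, if present


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimited up to max_retries times after the first attempt."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller mark this item as failed
            if exc.retry_after is not None:
                delay = exc.retry_after  # the server's hint wins over our own backoff
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice for testability: a test can record the requested delays instead of actually waiting.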
### Critical Pitfalls
All 9 critical pitfalls identified map to Phase 1. Phase 2+ is comparatively low-risk. The top 5 requiring upfront architectural decisions:
1. **Italian prompt language** — Write system prompts IN Italian (not English + "write in Italian"). English instructions produce grammatically correct but translated-sounding Italian. This is a foundational decision; retrofitting is expensive. Italian prompts cost ~2x tokens — factor into cost estimates.
2. **Canva CSV column name mismatch** — Lock `CANVA_FIELDS = ["titolo", "sottotitolo", ...]` as a project constant before writing any generation code. Column names must exactly match Canva template placeholders. Mismatch causes silent empty fields in Canva designs — no error message.
3. **CSV encoding (UTF-8 BOM)** — Use `encoding='utf-8-sig'` always. Python's default UTF-8 without BOM causes Excel on Windows to misinterpret Italian accented characters (for example, `è` shows up as `Ã¨`) before the user uploads to Canva.
4. **FastAPI `root_path` double-path bug** — Set `root_path` ONLY via Uvicorn (`--root-path /postgenerator`), never in the `FastAPI()` constructor. If set in both places, paths double (`/postgenerator/postgenerator/...`). Nginx must strip prefix with trailing slash: `proxy_pass http://container:8000/;`.
5. **Bulk generation all-or-nothing failure** — Implement per-item status tracking from the first generation loop. Each carousel must be independently stored (status: pending/success/failed) so partial batches are recoverable. A single exception must not kill the entire batch.

Two further critical pitfalls:

- **Rate limit handling (Pitfall 6)** — honor the `retry-after` header on 429 responses exactly, and add a configurable inter-request delay (2-3s) for bulk jobs to stay under the Tier 1 OTPM ceiling (8,000 output tokens/min for Sonnet).
- **Prompt template variable injection (Pitfall 7)** — never hardcode slide counts or other config values in templates; use `{{variable}}` placeholders validated at startup.
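One way to validate `{{variable}}` placeholders before any generation runs is sketched below. The helper names, the `topic` variable, and the Italian template text are all hypothetical; only the `{{num_slides}}` placeholder style comes from this summary.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def missing_variables(template: str, variables: dict[str, object]) -> set[str]:
    """Return placeholder names used in the template but absent from variables."""
    return {name for name in PLACEHOLDER.findall(template) if name not in variables}


def render(template: str, variables: dict[str, object]) -> str:
    """Substitute placeholders, refusing to render if any variable is undefined."""
    missing = missing_variables(template, variables)
    if missing:
        raise KeyError(f"undefined template variables: {sorted(missing)}")
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), template)


# Hypothetical template text; a real one would be read from a prompt .txt file.
template = "Genera un carosello di {{num_slides}} slide sul tema {{topic}}."
prompt = render(template, {"num_slides": 8, "topic": "CRM"})
```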
---
## Implications for Roadmap
Based on research, suggested phase structure:
### Phase 1: Foundation + Core Generation Pipeline
**Rationale:** All 9 critical pitfalls are Phase 1 concerns. The plumbing (subpath routing, CSV encoding, Italian prompts, per-item error isolation) must be correct before any feature work. Architecture research confirms: build Docker → storage layer → LLMService → routers in that order. A working generation pipeline that produces a valid Canva CSV is the minimum viable product.
**Delivers:** Working end-to-end flow — Docker container deployed at `lab.mlhub.it/postgenerator/`, LLM generates 8-slide carousels, CSV downloads with correct encoding, Italian prompts produce native-quality copy.
**Addresses features:** Calendar Generator (13-post cycle), Format Selector, LLM Content Generation, CSV Builder, Image Keyword Generation
**Avoids:** Pitfalls 1-9 — all preventable with upfront decisions in this phase
**Research flag:** Standard patterns — FastAPI + Docker + Claude API are well-documented. No additional phase research needed.
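The per-item error isolation named in the Phase 1 rationale could look roughly like the loop below. It is a sketch under assumptions: `ItemResult`, `run_batch`, and the `generate` callable are illustrative names, and a real implementation would also persist each result to disk as it completes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ItemResult:
    slot: int
    status: str = "pending"  # pending | success | failed
    content: Optional[dict] = None
    error: Optional[str] = None


def run_batch(slots: list[int], generate: Callable[[int], dict]) -> list[ItemResult]:
    """Generate every slot independently; one failure never aborts the batch."""
    results = [ItemResult(slot=slot) for slot in slots]
    for item in results:
        try:
            item.content = generate(item.slot)
            item.status = "success"
        except Exception as exc:  # deliberately broad: isolate any failure to this item
            item.status = "failed"
            item.error = str(exc)
    return results


def failed_slots(results: list[ItemResult]) -> list[int]:
    """Slots that can be offered for per-item regeneration."""
    return [item.slot for item in results if item.status == "failed"]
```

Because each item carries its own status, a partial batch stays usable: the successful carousels can be exported while the failed slots are retried individually.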
### Phase 2: Prompt Management + Output Review
**Rationale:** Prompt quality is the primary lever for improving content quality after the pipeline works. The editor must exist before any iteration on content quality. Output review (preview before CSV export + per-item regeneration) is the UX prerequisite for confident publishing.
**Delivers:** Prompt editor UI (list/view/edit .txt files via web), output review UI (carousel preview before download), per-item regeneration, Single Post Generator mode
**Implements:** PromptService (full CRUD via UI), GenerateResponse preview components, single-post generation endpoint (reuses the same generation engine as the calendar flow)
**Addresses features:** Prompt File Store + Editor, Output Review UI, Single Post Generator
**Research flag:** Standard patterns — file-based CRUD and React form components. No additional research needed.
### Phase 3: Campaign History + Swipe File
**Rationale:** After the core pipeline and prompt management are working, the organizational layer (named campaigns, history browsing, inspiration capture) makes the tool usable as a sustained workflow rather than a one-shot generator.
**Delivers:** Named campaign management (create/browse/re-download past generations), Swipe File CRUD (capture inspiration items, tag by format/niche)
**Implements:** CampaignService (JSON metadata per campaign), SwipeService (CRUD JSON collections), Campaign History page, Swipe File page in React
**Addresses features:** Campaign History, Swipe File
**Research flag:** Standard patterns — JSON file CRUD, React list/form components. No research needed.
### Phase 4: Enrichment + Polish
**Rationale:** After core workflow is validated through real use, optional enrichments add value without blocking the primary workflow. Unsplash integration is gated on API key availability. Swipe-to-topic context injection adds intelligence once swipe file has real content.
**Delivers:** Unsplash keyword-to-URL resolution (when API key present), Swipe File → topic context injection, campaign performance notes, UI polish and UX improvements based on real use feedback
**Implements:** Unsplash API via httpx (async), topic generation context injection, free-text campaign notes field
**Addresses features:** Unsplash Integration (v1.x), Swipe File → Topic Context, Campaign Performance Notes
**Research flag:** Unsplash API integration may need rate limit strategy research (50 req/hr demo, 5000/hr production). Otherwise standard patterns.
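Until the rate limit strategy is researched, one simple option is to cache keyword lookups and stop calling the API once a request budget is spent. The sketch below is an assumption-laden illustration: `KeywordResolver`, its `fetch` callable, and the fixed per-run budget are hypothetical, and it does not model the hourly reset of the real limit.

```python
from typing import Callable, Dict, Optional


class KeywordResolver:
    """Cache keyword -> URL lookups and respect a fixed request budget."""

    def __init__(self, fetch: Callable[[str], str], budget: int = 50) -> None:
        self._fetch = fetch  # hypothetical callable that performs the actual API request
        self._budget = budget  # 50 mirrors the demo-tier figure quoted above
        self._cache: Dict[str, str] = {}
        self.requests_made = 0

    def resolve(self, keyword: str) -> Optional[str]:
        key = keyword.strip().lower()
        if key in self._cache:
            return self._cache[key]  # repeated keywords cost no request
        if self.requests_made >= self._budget:
            return None  # budget spent: the caller keeps the raw keyword instead
        self.requests_made += 1
        self._cache[key] = self._fetch(key)
        return self._cache[key]
```

Returning `None` rather than raising keeps image resolution optional, which matches the summary's framing of Unsplash as an enrichment that must not block CSV export.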
### Phase Ordering Rationale
- **Phases 1-2 are sequential by hard dependency:** LLMService and CSVBuilder must exist before PromptEditor or OutputReview can function. Building review UI before generation pipeline is speculative work.
- **Phase 3 is independent of Phase 2:** Swipe File and Campaign History have no LLM coupling (per CalendarService isolation pattern). They could ship alongside Phase 2, but are lower priority than prompt editing.
- **Phase 4 is additive:** Unsplash and context injection enhance existing features without requiring architectural changes. They are safe to defer until core workflow is validated.
- **The calendar generation flow has a natural two-step UX** (plan calendar → generate content) documented in ARCHITECTURE.md. Phase 1 should implement both steps — the split is a feature, not a limitation.
### Research Flags
Phases likely needing deeper research during planning:
- **Phase 4 (Unsplash):** Verify current rate limits and caching strategy for demo vs. production tier. API is simple but rate limit enforcement needs validation with actual bulk generation volumes.

Phases with standard patterns (skip research-phase):
- **Phase 1:** FastAPI + Docker + Claude API + Vite subpath deployment — all well-documented with official sources
- **Phase 2:** File-based CRUD + React form components — standard CRUD patterns
- **Phase 3:** JSON file management + React list/detail pages — standard patterns
---
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All versions verified against PyPI and npm registries as of March 2026. FastAPI 0.135.1, Anthropic 0.84.0, React 19.2.4, Tailwind 4.2.1 confirmed. Vite 7 MEDIUM (7.3.1 confirmed, v8 beta in progress but not yet stable). |
| Features | MEDIUM-HIGH | Domain is novel: no direct competitor exists combining all three aspects (bulk generation + Persuasion Nurturing + Canva CSV). Feature categorization is sound but based on adjacent tools and domain logic rather than direct market research. |
| Architecture | HIGH | FastAPI + React SPA single container is a well-established pattern with multiple official sources. Subpath deployment patterns verified against official FastAPI docs and confirmed GitHub issues. |
| Pitfalls | HIGH | 9 critical pitfalls identified, most verified against official docs (FastAPI GitHub, Anthropic rate limit docs, Microsoft Excel UTF-8 BOM support). Italian LLM quality confirmed by academic research (arXiv) and practitioner sources. |

**Overall confidence:** HIGH
### Gaps to Address
- **Canva field schema:** The exact placeholder names used in the Canva template are not yet defined. These must be locked as project constants (`CANVA_FIELDS`) in Phase 1 before writing any generation code. This is the most critical unresolved dependency — it determines the CSV schema and the LLM output schema simultaneously.
- **Anthropic Tier 1 limits in bulk context:** The 8,000 OTPM limit for claude-sonnet-4-5 at Tier 1 needs validation against actual bulk generation token usage. A 13-carousel batch at 8 slides each could approach or exceed the limit without inter-request delays. The `retry-after` + delay strategy in Phase 1 should be tested with a real batch of 13+ items before shipping.
- **Italian prompt quality baseline:** Written system prompts in Italian are foundational but cannot be validated until actual generation output is reviewed by a native Italian speaker. Plan for a prompt iteration cycle within Phase 1 before moving to Phase 2.
- **Vite 7 version stability:** Vite 7 is confirmed stable (7.3.1) but v8 beta is in progress. If Vite 7 minor versions introduce breaking changes during development, shadcn/ui compatibility should be re-verified.
---
## Sources
### Primary (HIGH confidence)
- [PyPI - fastapi 0.135.1](https://pypi.org/project/fastapi/) — version, install options, `fastapi[standard]` recommendation
- [PyPI - anthropic 0.84.0](https://pypi.org/project/anthropic/) — version, async support, streaming
- [Anthropic API Docs - Rate Limits](https://platform.claude.com/docs/en/api/rate-limits) — Tier 1 limits (50 RPM, 8,000 OTPM for Sonnet), `retry-after` header
- [Anthropic Docs - Structured Outputs](https://platform.claude.com/docs/en/build-with-claude/structured-outputs) — GA Nov 2025, schema constraints stripped
- [FastAPI Docs - Behind a Proxy](https://fastapi.tiangolo.com/advanced/behind-a-proxy/) — root_path mechanism
- [FastAPI GitHub - root_path double-path bug #9018, #11977](https://github.com/fastapi/fastapi/discussions/9018) — confirmed bug, prevention strategy
- [FastAPI Docs - Static Files](https://fastapi.tiangolo.com/tutorial/static-files/) — SPAStaticFiles pattern
- [Canva Help - Bulk Create](https://www.canva.com/help/bulk-create/) — max 300 rows, 150 columns, placeholder name matching
- [Tailwind CSS v4 Announcement](https://tailwindcss.com/blog/tailwindcss-v4) — v4 features, Vite plugin
- [Microsoft Support - CSV UTF-8 BOM in Excel](https://support.microsoft.com/en-us/office/opening-csv-utf-8-files-correctly-in-excel-8a935af5-3416-4edd-ba7e-3dfd2bc4a032) — BOM requirement confirmed
- [npm - react 19.2.4](https://www.npmjs.com/package/react) — version confirmed
- [npm - tailwindcss 4.2.1](https://www.npmjs.com/package/tailwindcss) — version confirmed
### Secondary (MEDIUM confidence)
- [PostNitro feature analysis](https://postnitro.ai/) — competitor feature mapping
- [aiCarousels feature analysis](https://www.aicarousels.com/) — competitor feature mapping
- [StoryChief editorial calendar](https://storychief.io/blog/best-content-calendar-tools) — adjacent tool features
- [arXiv - Evalita-LLM Italian benchmarks](https://arxiv.org/html/2502.02289v1) — Italian LLM quality research
- [Portkey - LLM retry patterns](https://portkey.ai/blog/retries-fallbacks-and-circuit-breakers-in-llm-apps/) — retry/backoff patterns
- [Unsplash API Docs](https://unsplash.com/documentation) — rate limits (50 req/hr demo, 5000/hr production)
- FastAPI + React single container patterns: multiple practitioner blog posts (davidmuraya.com, dakdeniz.medium.com)
### Tertiary (LOW confidence)
- [Create Stimulate - Canva CSV upload tips](https://createstimulate.com/blogs/news/canva-tips-for-uploading-csv-files-using-bulk-create) — community blog; BOM/encoding advice corroborated by Microsoft source
- [LinkedIn - Non-English prompt engineering](https://www.linkedin.com/pulse/non-english-languages-prompt-engineering-trade-offs-giorgio-robino) — Italian token cost 2x estimate; corroborated by Italian token length observation
---
*Research completed: 2026-03-07*
*Ready for roadmap: yes*