scriptflow/docs/plans/2026-03-11-upload-extract-episodes-design.md
Song367 b49d703e3c
One-click conversion mode optimization
2026-03-11 21:53:41 +08:00


# Upload And Extract Episodes Design
**Context**
The conversion mode currently accepts only manual text input in the left textarea. The app already has a Doubao streaming integration pattern for script generation, and the extracted content should feed back into the existing `sourceText` flow rather than replacing the rest of the conversion pipeline.
**Decision**
Adopt a client-side upload flow for four file types: Word (`.docx`), text (`.txt`), PDF (`.pdf`), and Markdown (`.md`). After upload, the app will read the file in the browser, send the raw text to a new Doubao extraction call using `doubao-seed-1-6-flash-250828`, and stream the model output directly into the left-side source textarea.
**Behavior**
- The source input area becomes a hybrid input surface: manual typing still works, and file upload is added alongside it.
- Upload immediately starts extraction without requiring the user to click `立即转换成剧本` ("Convert to script now").
- The extraction model is instructed to identify each episode and return the original script content 1:1 with no rewriting, normalization, cleanup, or omission.
- The streamed extraction result overwrites `sourceText` progressively so the user can see the result arrive in real time.
- Existing conversion generation stays separate. After extraction completes, the user can still click the existing conversion button to continue with the current workflow.
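
The progressive overwrite described above could be handled by a small sink that the streaming callback feeds. This is a minimal sketch rather than the app's actual code: `createExtractionSink` and the `setSourceText` callback are hypothetical names, and the stream is assumed to deliver plain text deltas in order.

```typescript
// Hypothetical helper (not from the design doc): accumulates streamed
// deltas and overwrites the source text with the full result so far.
type SetSourceText = (text: string) => void;

interface ExtractionSink {
  push(delta: string): void; // call once per streamed chunk
  text(): string;            // everything received so far
}

function createExtractionSink(setSourceText: SetSourceText): ExtractionSink {
  let accumulated = "";
  return {
    push(delta: string): void {
      accumulated += delta;
      // Overwrite, never append to, whatever the textarea held before:
      // the first chunk already replaces any earlier manual input.
      setSourceText(accumulated);
    },
    text(): string {
      return accumulated;
    },
  };
}
```

Each chunk handler would call `sink.push(delta)`. Because the sink only writes `sourceText` and never triggers conversion, the existing conversion button keeps its current role.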
**Parsing Strategy**
- `.txt` and `.md`: read with `File.text()`.
- `.docx`: parse in-browser with a document-text extraction library.
- `.pdf`: parse in-browser with a PDF text extraction library.
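
The parsing strategy above amounts to a dispatch on file extension. The sketch below assumes hypothetical names (`detectFileKind`, `readUploadedText`, `BinaryParsers`) and injects the `.docx` and `.pdf` parsers as plain functions, since the design does not name specific libraries.

```typescript
// All names here are illustrative; the design only fixes the four extensions.
type FileKind = "txt" | "md" | "docx" | "pdf";
const SUPPORTED_KINDS: FileKind[] = ["txt", "md", "docx", "pdf"];

// Minimal structural view of the browser File object used by this sketch.
interface UploadedFile {
  name: string;
  text(): Promise<string>;
  arrayBuffer(): Promise<ArrayBuffer>;
}

// Binary parsers are injected so the library choice stays open.
interface BinaryParsers {
  docx(data: ArrayBuffer): Promise<string>;
  pdf(data: ArrayBuffer): Promise<string>;
}

// Returns the supported kind for a file name, or null if unsupported.
function detectFileKind(fileName: string): FileKind | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null;
  const ext = fileName.slice(dot + 1).toLowerCase();
  for (const kind of SUPPORTED_KINDS) {
    if (kind === ext) return kind;
  }
  return null;
}

// Throws synchronously for unsupported files so the caller can show the
// error state before any read starts.
function readUploadedText(file: UploadedFile, parsers: BinaryParsers): Promise<string> {
  const kind = detectFileKind(file.name);
  if (kind === "txt" || kind === "md") return file.text();
  if (kind === "docx") return file.arrayBuffer().then((data) => parsers.docx(data));
  if (kind === "pdf") return file.arrayBuffer().then((data) => parsers.pdf(data));
  throw new Error("Unsupported file type: " + file.name);
}
```

A browser `File` satisfies `UploadedFile` structurally, so it can be passed in directly. Keeping the parsers behind an interface also makes it possible to load them lazily.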
**UI And State**
- Add upload affordance, accepted-file hint, extraction loading state, and extraction error state in conversion mode.
- Preserve `sourceText` local persistence.
- Keep manual editing enabled after extraction.
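
One way to model the loading and error states listed above is a small reducer. This is a sketch under assumptions: the state shape, the event names, and `extractionReducer` are hypothetical and not taken from the existing codebase.

```typescript
// Hypothetical state model for the upload/extraction UI.
type ExtractionState =
  | { status: "idle" }
  | { status: "extracting"; fileName: string }
  | { status: "error"; message: string };

type ExtractionEvent =
  | { type: "uploadStarted"; fileName: string }
  | { type: "extractionFinished" }
  | { type: "extractionFailed"; message: string };

function extractionReducer(state: ExtractionState, event: ExtractionEvent): ExtractionState {
  switch (event.type) {
    case "uploadStarted":
      // A new upload always starts extraction and clears any earlier error.
      return { status: "extracting", fileName: event.fileName };
    case "extractionFinished":
      // Ignore stray completion events when nothing is running.
      return state.status === "extracting" ? { status: "idle" } : state;
    case "extractionFailed":
      return state.status === "extracting"
        ? { status: "error", message: event.message }
        : state;
  }
}
```

Nothing in this state disables the textarea, which matches the requirement that manual editing stays available. `sourceText` persistence lives outside this reducer and is left untouched.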
**AI Contract**
The new extraction API will:
- use Doubao only
- stream results
- instruct the model to output episode-separated original content only
- avoid any transformations beyond episode boundary recognition
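
The contract above might translate into a request builder like the following. This is a sketch under loud assumptions: it assumes an OpenAI-style chat payload (`model`, `stream`, `messages`), which may differ from what the app's existing Doubao integration sends, and the prompt wording and the `=== EPISODE n ===` delimiter are purely illustrative. Only the model name comes from this design.

```typescript
// Assumed payload shape; adjust to the existing Doubao integration.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ExtractionRequest {
  model: string;
  stream: boolean;
  messages: ChatMessage[];
}

const EXTRACTION_MODEL = "doubao-seed-1-6-flash-250828";

// Illustrative prompt only; the delimiter format is a hypothetical choice.
const EXTRACTION_SYSTEM_PROMPT = [
  "Identify each episode in the user's text.",
  "Before each episode, output a line of the form: === EPISODE n ===",
  "Then output that episode's original content exactly as given.",
  "Do not rewrite, normalize, clean up, summarize, or omit anything.",
  "Output nothing except delimiter lines and original content.",
].join("\n");

function buildExtractionRequest(rawText: string): ExtractionRequest {
  return {
    model: EXTRACTION_MODEL,
    stream: true, // results are always streamed
    messages: [
      { role: "system", content: EXTRACTION_SYSTEM_PROMPT },
      // The raw text is passed through unmodified so the model sees
      // exactly what was read from the file.
      { role: "user", content: rawText },
    ],
  };
}
```

Keeping the prompt in a single constant makes the "no transformations" rules easy to review and tighten.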
**Risks**
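- PDF text extraction quality depends on document structure: scanned pages without a text layer yield no text, and multi-column layouts can come out in the wrong reading order.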
- Even with strict prompting, model-based extraction is probabilistic, so the prompt must strongly prohibit edits and define a deterministic output format.
- Browser-side parsing adds dependency and bundle-size cost.
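
Because the extraction is probabilistic, a cheap client-side check can flag output that is not verbatim. The sketch below assumes a hypothetical `=== EPISODE n ===` delimiter line and a hypothetical helper `findNonVerbatimLines`. It catches rewritten or reordered lines only, not omissions.

```typescript
// Hypothetical delimiter format; the design only requires that one be defined.
const EPISODE_DELIMITER = /^=== EPISODE \d+ ===$/;

// Returns every extracted line that cannot be found in the raw text at or
// after the position of the previously matched line.
function findNonVerbatimLines(rawText: string, extracted: string): string[] {
  const suspicious: string[] = [];
  let cursor = 0;
  for (const line of extracted.split("\n")) {
    const trimmed = line.trim();
    // Blank lines and episode delimiters are not part of the source text.
    if (trimmed === "" || EPISODE_DELIMITER.test(trimmed)) continue;
    const found = rawText.indexOf(trimmed, cursor);
    if (found === -1) {
      suspicious.push(trimmed);
    } else {
      cursor = found + trimmed.length;
    }
  }
  return suspicious;
}
```

A non-empty result could drive a warning next to the textarea rather than block the user, since the extracted text remains editable.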