Upload And Extract Episodes Design
Context
The conversion mode currently accepts only manual text input in the left textarea. The app already has a Doubao streaming integration pattern for script generation, and the extracted content should feed back into the existing sourceText flow rather than replacing the rest of the conversion pipeline.
Decision
Adopt a client-side upload flow for four file types: Word (.docx), text (.txt), PDF (.pdf), and Markdown (.md). After upload, the app will read the file in the browser, send the raw text to a new Doubao extraction call using doubao-seed-1-6-flash-250828, and stream the model output directly into the left-side source textarea.
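As an illustration of the four accepted file types, the check below maps a file name to an upload kind by extension. This is a minimal sketch: the names `UploadKind`, `ACCEPTED_EXTENSIONS`, `ACCEPT_ATTRIBUTE`, and `detectUploadKind` are hypothetical and do not come from the existing app.

```typescript
// Hypothetical helper: map a file name to one of the four accepted kinds.
type UploadKind = "docx" | "txt" | "pdf" | "md";

const ACCEPTED_EXTENSIONS: Record<string, UploadKind> = {
  ".docx": "docx",
  ".txt": "txt",
  ".pdf": "pdf",
  ".md": "md",
};

// Could be used as the value of the `accept` attribute on a file input.
const ACCEPT_ATTRIBUTE = Object.keys(ACCEPTED_EXTENSIONS).join(",");

function detectUploadKind(fileName: string): UploadKind | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null; // no extension at all
  const ext = fileName.slice(dot).toLowerCase(); // case-insensitive match
  return ACCEPTED_EXTENSIONS[ext] ?? null;
}
```

Rejecting unknown extensions before any parsing starts keeps the error state simple: the user sees an unsupported-file message instead of a parser failure.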
Behavior
- The source input area becomes a hybrid input surface: manual typing still works, and file upload is added alongside it.
- Upload immediately starts extraction without requiring the user to click 立即转换成剧本 ("Convert to Script Now").
- The extraction model is instructed to identify each episode and return the original script content 1:1 with no rewriting, normalization, cleanup, or omission.
- The streamed extraction result overwrites sourceText progressively so the user can see the result arrive in real time.
- Existing conversion generation stays separate. After extraction completes, the user can still click the existing conversion button to continue with the current workflow.
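The progressive overwrite of sourceText could look roughly like the following. It is a sketch only: `createProgressiveWriter` and the `setSourceText` callback are hypothetical names, and the code assumes the stream delivers plain text deltas that the caller passes in one at a time.

```typescript
// Hypothetical accumulator: each streamed delta is appended internally, and the
// full text so far is written back, so the textarea always shows the latest result.
function createProgressiveWriter(setSourceText: (value: string) => void) {
  let accumulated = "";
  return {
    push(delta: string): void {
      accumulated += delta;
      setSourceText(accumulated); // overwrite the state with the whole text so far
    },
    result(): string {
      return accumulated;
    },
  };
}

// Usage with stand-in chunks; a real caller would push each delta as it arrives.
const writtenValues: string[] = [];
const demoWriter = createProgressiveWriter((value) => {
  writtenValues.push(value);
});
demoWriter.push("Episode 1\n");
demoWriter.push("INT. ROOM");
```

Writing the full accumulated string on every delta, rather than appending to whatever the textarea holds, means a previous manual draft is replaced as soon as the first chunk arrives.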
Parsing Strategy
- .txt and .md: read with File.text().
- .docx: parse in-browser with a document-text extraction library.
- .pdf: parse in-browser with a PDF text extraction library.
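One way to organize the per-type parsing is a single dispatch function with the .docx and .pdf libraries injected as adapters. The sketch below assumes nothing about which libraries are chosen: `UploadedFile`, `BinaryExtractors`, and `readUploadAsText` are hypothetical names, and `UploadedFile` is a structural stand-in for the browser `File` members actually used.

```typescript
// Structural stand-in for the browser File members this sketch needs.
interface UploadedFile {
  name: string;
  text(): Promise<string>;
  arrayBuffer(): Promise<ArrayBuffer>;
}

// Hypothetical adapters around whichever .docx / .pdf libraries are chosen.
interface BinaryExtractors {
  docx(data: ArrayBuffer): Promise<string>;
  pdf(data: ArrayBuffer): Promise<string>;
}

async function readUploadAsText(
  file: UploadedFile,
  extractors: BinaryExtractors,
): Promise<string> {
  const ext = file.name.slice(file.name.lastIndexOf(".")).toLowerCase();
  switch (ext) {
    case ".txt":
    case ".md":
      return file.text(); // plain text needs no extra library
    case ".docx":
      return extractors.docx(await file.arrayBuffer());
    case ".pdf":
      return extractors.pdf(await file.arrayBuffer());
    default:
      throw new Error(`Unsupported file type: ${file.name}`);
  }
}
```

Injecting the extractors keeps the dispatch logic testable without the parsing libraries and leaves room to load them lazily.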
UI And State
- Add upload affordance, accepted-file hint, extraction loading state, and extraction error state in conversion mode.
- Preserve sourceText local persistence.
- Keep manual editing enabled after extraction.
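The loading and error states listed above can be modeled as a small reducer. This is a sketch under assumptions: the state and event names (`ExtractionState`, `ExtractionEvent`, `extractionReducer`) are hypothetical, and the app may already use a different state-management pattern.

```typescript
type ExtractionState =
  | { status: "idle" }
  | { status: "extracting"; fileName: string }
  | { status: "error"; fileName: string; message: string };

type ExtractionEvent =
  | { type: "uploadStarted"; fileName: string }
  | { type: "extractionFinished" }
  | { type: "extractionFailed"; message: string };

function extractionReducer(
  state: ExtractionState,
  event: ExtractionEvent,
): ExtractionState {
  switch (event.type) {
    case "uploadStarted":
      // A new upload always replaces any earlier error or in-flight state.
      return { status: "extracting", fileName: event.fileName };
    case "extractionFinished":
      return { status: "idle" };
    case "extractionFailed":
      // Only an in-flight extraction can fail; ignore stray failure events.
      return state.status === "extracting"
        ? { status: "error", fileName: state.fileName, message: event.message }
        : state;
  }
}
```

No state disables the textarea, which matches the requirement that manual editing stays enabled after extraction.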
AI Contract
The new extraction API will:
- use Doubao only
- stream results
- instruct the model to output episode-separated original content only
- avoid any transformations beyond episode boundary recognition
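The contract above might translate into a request builder like the one below. The payload shape (a `model`, a `stream` flag, and a `messages` array with system and user roles) is an assumption modeled on common chat-completion APIs, not a confirmed description of the Doubao endpoint, and the prompt wording is illustrative only. `buildExtractionRequest` and the interface names are hypothetical.

```typescript
const EXTRACTION_MODEL = "doubao-seed-1-6-flash-250828";

// Illustrative prompt: restates the contract as explicit prohibitions.
const EXTRACTION_SYSTEM_PROMPT = [
  "You split a script into episodes.",
  "Return the original text of each episode exactly as written.",
  "Do not rewrite, normalize, clean up, summarize, or omit anything.",
  "The only thing you may add is a boundary marker between episodes.",
].join("\n");

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Assumed request shape; adjust to the real API the app already calls.
interface ExtractionRequest {
  model: string;
  stream: true;
  messages: ChatMessage[];
}

function buildExtractionRequest(rawText: string): ExtractionRequest {
  return {
    model: EXTRACTION_MODEL,
    stream: true, // the contract requires streamed results
    messages: [
      { role: "system", content: EXTRACTION_SYSTEM_PROMPT },
      { role: "user", content: rawText }, // raw text is passed through unmodified
    ],
  };
}
```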
Risks
- PDF text extraction quality depends on document structure.
- Even with strict prompting, model-based extraction is probabilistic, so the prompt must strongly prohibit edits and define a deterministic output format.
- Browser-side parsing adds dependency and bundle-size cost.
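Because the model output is probabilistic, a cheap client-side check can flag lines that do not appear verbatim in the uploaded text. The sketch below assumes a hypothetical output format in which episodes are separated by marker lines such as `=== EPISODE 1 ===`; this design does not define the real format. It catches rewritten or reordered lines, but not omissions.

```typescript
// Hypothetical episode delimiter; the actual output format is not specified here.
const EPISODE_MARKER = /^=== EPISODE \d+ ===$/;

// Returns the extracted lines that cannot be found, in order, in the source text.
function findNonVerbatimLines(source: string, extracted: string): string[] {
  const problems: string[] = [];
  let cursor = 0; // search position, so lines must also appear in source order
  for (const line of extracted.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || EPISODE_MARKER.test(trimmed)) continue;
    const index = source.indexOf(trimmed, cursor);
    if (index < 0) {
      problems.push(trimmed); // rewritten, invented, or out-of-order line
    } else {
      cursor = index + trimmed.length;
    }
  }
  return problems;
}
```

A non-empty result could be shown as a warning next to the textarea rather than blocking the workflow, since the user can still edit the text manually.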