Upload And Extract Episodes Design

Context

The conversion mode currently accepts only manual text input in the left textarea. The app already has a Doubao streaming integration pattern for script generation, and the extracted content should feed back into the existing sourceText flow rather than replacing the rest of the conversion pipeline.

Decision

Adopt a client-side upload flow for four file types: Word (.docx), text (.txt), PDF (.pdf), and Markdown (.md). After upload, the app will read the file in the browser, extract its raw text, send that text to a new Doubao extraction call using the doubao-seed-1-6-flash-250828 model, and stream the model output directly into the left-side source textarea.

Behavior

The source input area becomes a hybrid input surface: manual typing still works, and file upload is added alongside it.
Upload immediately starts extraction without requiring the user to click the 立即转换成剧本 ("Convert to script now") button.
The extraction model is instructed to identify each episode and return the original script content 1:1 with no rewriting, normalization, cleanup, or omission.
The streamed extraction result overwrites sourceText progressively so the user can see the result arrive in real time.
Existing conversion generation stays separate. After extraction completes, the user can still click the existing conversion button to continue with the current workflow.
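
The progressive overwrite described above could look roughly like the sketch below. It assumes the extraction stream is exposed as an AsyncIterable of text chunks; streamIntoSourceText and setSourceText are hypothetical names for illustration, not the app's actual functions.

```typescript
// Hypothetical helper: consumes streamed text chunks and pushes the
// accumulated result into sourceText after every chunk, so the textarea
// always shows everything received so far.
async function streamIntoSourceText(
  chunks: AsyncIterable<string>,
  setSourceText: (text: string) => void, // assumed state setter
): Promise<string> {
  let accumulated = "";
  for await (const chunk of chunks) {
    accumulated += chunk;
    // The first call replaces whatever sourceText held before the upload.
    setSourceText(accumulated);
  }
  return accumulated;
}
```

Passing the full accumulated string on each update, rather than appending a delta, keeps the textarea consistent even if an update is dropped or reordered by the UI layer.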

Parsing Strategy

.txt and .md: read with File.text().
.docx: parse in-browser with a document-text extraction library.
.pdf: parse in-browser with a PDF text extraction library.
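
As a minimal sketch of the routing implied by this strategy, the helper below maps a file name to one of the three parsing paths. ParserKind and pickParser are hypothetical names, and the actual .docx and .pdf libraries are left unspecified, as in this document.

```typescript
// Hypothetical names for illustration; not part of the existing app.
type ParserKind = "text" | "docx" | "pdf";

// A Map (rather than a plain object) avoids accidental matches on
// inherited keys such as "constructor".
const PARSER_BY_EXTENSION = new Map<string, ParserKind>([
  ["txt", "text"], // read with File.text()
  ["md", "text"], // read with File.text()
  ["docx", "docx"], // in-browser document-text extraction library
  ["pdf", "pdf"], // in-browser PDF text extraction library
]);

// Returns the parsing path for a file name, or null for unsupported types.
function pickParser(fileName: string): ParserKind | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null; // no extension at all
  const ext = fileName.slice(dot + 1).toLowerCase();
  return PARSER_BY_EXTENSION.get(ext) ?? null;
}
```

A null result would map to the extraction error state rather than an attempt to parse an unsupported file.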

UI And State

Add upload affordance, accepted-file hint, extraction loading state, and extraction error state in conversion mode.
Preserve sourceText local persistence.
Keep manual editing enabled after extraction.

AI Contract

The new extraction API will:

use Doubao only
stream results
instruct the model to output episode-separated original content only
avoid any transformations beyond episode boundary recognition
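
One way to express this contract is a small request builder, sketched below. The chat-style payload shape (model, stream, messages) is an assumption and may not match the app's existing Doubao integration; the delimiter string and all type and function names are hypothetical. Only the model ID comes from this document.

```typescript
// Assumed payload shape; the real Doubao request used by the app may differ.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ExtractionRequest {
  model: string;
  stream: boolean;
  messages: ChatMessage[];
}

// Model ID taken from the Decision section.
const EXTRACTION_MODEL = "doubao-seed-1-6-flash-250828";

// Hypothetical delimiter that gives the output a fixed, parseable format.
const EPISODE_DELIMITER = "=== EPISODE ===";

function buildExtractionRequest(rawText: string): ExtractionRequest {
  const systemPrompt = [
    "You split a script into episodes.",
    "Copy the original text of each episode exactly, character for character.",
    "Do not rewrite, normalize, clean up, summarize, or omit anything.",
    `Before each episode, output a line containing only ${EPISODE_DELIMITER}`,
    "Output nothing else: no commentary and no headings of your own.",
  ].join("\n");

  return {
    model: EXTRACTION_MODEL,
    stream: true, // results are streamed into sourceText
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: rawText }, // raw text is passed through untouched
    ],
  };
}
```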

Risks

PDF text extraction quality depends on document structure.
Even with strict prompting, model-based extraction is probabilistic, so the prompt must strongly prohibit edits and define a deterministic output format.
Browser-side parsing adds dependency and bundle-size cost.
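
Because the extraction is probabilistic, a cheap client-side check can flag output that is not a verbatim copy. The sketch below is one possible mitigation, not part of the decided design: it assumes the model separates episodes with a known delimiter line (a hypothetical format) and compares source and output after removing delimiters and all whitespace.

```typescript
// Hypothetical check: true when the extracted text, minus delimiters and
// whitespace, is character-for-character identical to the source.
function isVerbatimExtraction(
  source: string,
  extracted: string,
  delimiter: string, // assumed episode separator emitted by the model
): boolean {
  const stripWhitespace = (s: string): string => s.replace(/\s+/g, "");
  const withoutDelimiters = extracted.split(delimiter).join("");
  return stripWhitespace(withoutDelimiters) === stripWhitespace(source);
}
```

The check ignores whitespace differences on purpose, so it detects rewrites and omissions but not changed line breaks. It also reports a mismatch if the source itself contains the delimiter string, so the delimiter should be something unlikely to appear in a script.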
