Pipeline · four stages

From noisy references to a written related-work section.

Each stage of the pipeline has its own failure modes. Pick a stage to see its raw inputs, the transforms applied, and the resulting record, all shown on real sample records from the dataset.

Phase 01 · Live

Cleaning

Normalize · Validate · Drop

Raw scraped records carry Unicode noise, table-of-contents text posing as abstracts, missing text, and related-work sections that contain only citations. This stage normalizes the text, flags invalid records, and drops them.
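To make the three steps concrete, here is a minimal Python sketch of what a normalize, validate, drop pass could look like. It is an illustration under stated assumptions, not the pipeline's actual code: the field names `abstract` and `related_work`, the `MIN_WORDS` threshold, the reason labels, and both heuristics (numbered headings as a table-of-contents signal, numeric bracket markers as citations) are hypothetical.

```python
import re
import unicodedata

# Hypothetical threshold and patterns, chosen only for illustration.
MIN_WORDS = 5
NUMBERED_HEADING = re.compile(r"(?:^|\s)\d+(?:\.\d+)*\.?\s+[A-Z][A-Za-z]+")
CITATION_MARKER = re.compile(r"\[\d+(?:\s*[,\-\u2013]\s*\d+)*\]")  # numeric style only


def normalize(text):
    """NFKC-fold, drop Unicode category-C characters, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text or "")
    text = re.sub(r"\s", " ", text)  # turn tabs and newlines into spaces first
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("C"))
    return re.sub(r" +", " ", text).strip()


def looks_like_toc(text):
    """Flag text with three or more numbered headings such as '2 Related Work'."""
    return len(NUMBERED_HEADING.findall(text)) >= 3


def is_citation_only(text):
    """Flag text with fewer than MIN_WORDS words once citation markers are removed."""
    stripped = CITATION_MARKER.sub(" ", text)
    return len(re.findall(r"[A-Za-z]{2,}", stripped)) < MIN_WORDS


def clean_record(record):
    """Return (normalized record, list of reasons to drop it)."""
    abstract = normalize(record.get("abstract"))
    related = normalize(record.get("related_work"))
    reasons = []
    if not abstract or not related:
        reasons.append("missing_text")
    if abstract and looks_like_toc(abstract):
        reasons.append("toc_abstract")
    if related and is_citation_only(related):
        reasons.append("citation_only_related_work")
    cleaned = {**record, "abstract": abstract, "related_work": related}
    return cleaned, reasons


def clean_dataset(records):
    """Keep only records that pass every check."""
    kept = []
    for record in records:
        cleaned, reasons = clean_record(record)
        if not reasons:
            kept.append(cleaned)
    return kept
```

Returning the reasons alongside the record, rather than a bare keep or drop decision, keeps each removal traceable to the check that triggered it.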


Phase 02 · Coming next

Extractive

Salient sentence selection

Pick the sentences from each cited abstract that carry the reasoning most relevant to the query paper.
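The selection method for this stage is not specified here, so the following Python sketch is only a stand-in that shows the shape of the task. It scores each sentence of a cited abstract by Jaccard word overlap with the query abstract and keeps the top `k`. The function names, the stopword list, and the choice of lexical overlap are all assumptions made for illustration.

```python
import re

# Small illustrative stopword list; a real system would likely use a fuller one.
STOPWORDS = {"the", "and", "for", "was", "with", "that", "this", "from", "are"}


def tokens(text):
    """Lowercased content words longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def select_salient(cited_abstract, query_abstract, k=2):
    """Return the k sentences with the highest Jaccard overlap, in original order."""
    query = tokens(query_abstract)
    scored = []
    for idx, sentence in enumerate(split_sentences(cited_abstract)):
        sent_tokens = tokens(sentence)
        union = sent_tokens | query
        score = len(sent_tokens & query) / len(union) if union else 0.0
        scored.append((score, idx, sentence))
    # Highest score first; earlier sentences win ties.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]
```

Lexical overlap is only a proxy for "carries the relevant reasoning"; an embedding-based or learned scorer could replace the scoring line without changing the rest of the function.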


Phase 03 · Coming next

Abstractive

Draft the paragraph

Generate a fluent related-work paragraph from the extracted evidence and the query abstract.


Phase 04 · Coming next

Compare

Model vs. ground truth

Set the generated paragraph next to the paper's actual related-work section, side by side.
