Is your feature request related to a problem?
Currently, there is no workflow to run LLM evaluations across multiple model configurations on the same dataset, which limits users' ability to compare model performance effectively.
Describe the solution you'd like
Implement a multi-step Assessment module that includes:
- Dataset upload
- Column mapping
- Prompt and config selection
- Review
- Results tab with retry and export support
- Real-time status updates via SSE
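The multi-step flow above could be driven by a small step state machine. A minimal sketch, assuming nothing about the actual codebase (step names and the `nextStep` helper are illustrative, not taken from the project):

```typescript
// Hypothetical step sequence for the Assessment wizard.
type AssessmentStep = "upload" | "mapping" | "config" | "review" | "results";

const STEP_ORDER: AssessmentStep[] = [
  "upload",   // dataset upload
  "mapping",  // column mapping
  "config",   // prompt and config selection
  "review",   // review before run
  "results",  // results tab with retry and export
];

// Advance only when the current step is complete; otherwise stay put.
// The final step never advances.
function nextStep(current: AssessmentStep, complete: boolean): AssessmentStep {
  const i = STEP_ORDER.indexOf(current);
  if (!complete || i === STEP_ORDER.length - 1) return current;
  return STEP_ORDER[i + 1];
}
```

Keeping the order in a single array makes it easy to render a progress indicator and to gate navigation per step.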
Why is this enhancement needed?
This enhancement enables side-by-side comparison of model configurations on the same dataset. Because the module is self-contained under app/assessment/, its concerns stay separate from the rest of the app.
Original issue
Describe the current behavior
No workflow exists to run LLM evaluations across multiple model configurations on the same dataset.
Describe the enhancement you'd like
A multi-step Assessment module covering: dataset upload, column mapping, prompt + config selection, review, and a results tab with retry and export support. Real-time status updates via SSE.
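For the real-time status updates, SSE messages arrive as newline-delimited `event:`/`data:` fields. A minimal parser sketch for one message block (the event payloads the server would emit are an assumption, not part of the issue):

```typescript
// Parsed form of one SSE message block. The server-side event names
// (e.g. "row_completed") are hypothetical.
interface StatusEvent {
  event: string;
  data: string;
}

// Parse a single SSE message block per the event-stream format:
// "event:" sets the event name, "data:" lines accumulate the payload.
function parseSseMessage(raw: string): StatusEvent {
  let event = "message"; // default event name per the SSE spec
  const dataLines: string[] = [];
  for (const line of raw.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
  }
  return { event, data: dataLines.join("\n") };
}
```

In the browser, `EventSource` handles this parsing automatically; a hand-rolled parser like this would only be needed for testing or non-browser consumers.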
Why is this enhancement needed?
Enables side-by-side comparison of model configs on the same dataset. The module is self-contained under app/assessment/, keeping its concerns separate from the rest of the app.
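One plausible layout for the self-contained module (file and folder names are hypothetical, shown only to illustrate the separation of concerns):

```
app/assessment/
  page.tsx        # wizard entry point
  upload/         # dataset upload step
  mapping/        # column mapping step
  config/         # prompt and config selection
  review/         # pre-run review
  results/        # results tab (retry, export)
```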