Download

Download the dataset.

Five training-ready formats. Each file is generated live when you click. Drop into trl.DPOTrainer, openai.fine_tuning, Axolotl, or LLaMA-Factory.

SFT · chat format

OpenAI · HuggingFace · TRL

Public human responses scoring ≥ 75/100, as { messages: [...] } per line.

0 rows

Insufficient data

SFT · ShareGPT

Axolotl · LLaMA-Factory

Same threshold, ShareGPT conventions.

0 rows

Insufficient data

SFT · Alpaca

Stanford Alpaca · LoRA

{ instruction, input, output } per line.

0 rows

Insufficient data

DPO · preference pairs

TRL DPOTrainer

{ prompt, chosen, rejected }; chosen beats rejected by ≥ 5 points.

0 rows

Insufficient data

Raw

Everything

Every scenario, response, per-criterion judgment with rationales.

0 rows

Insufficient data

Licensing

Released for research. Contributors consented to anonymous public release. Please do not use the corpus to train systems that manipulate emotionally vulnerable users.