SFT · chat format
OpenAI · HuggingFace · TRLPublic human responses scoring ≥ 75/100, as { messages: [...] } per line.
26 rows
Five training-ready formats. Each file is generated live when you click. Drop into trl.DPOTrainer, openai.fine_tuning, Axolotl, or LLaMA-Factory.
Public human responses scoring ≥ 75/100, as { messages: [...] } per line.
26 rows
Same threshold, ShareGPT conventions.
26 rows
{ instruction, input, output } per line.
26 rows
{ prompt, chosen, rejected }; chosen beats rejected by ≥ 5 points.
340 rows
Every scenario, response, per-criterion judgment with rationales.
214 rows
Released for research. Contributors consented to anonymous public release. Please do not use the corpus to train systems that manipulate emotionally vulnerable users.