Death By Knowledge

Towards Smarter Computers: Training Small Models to Use the Terminal
August 12, 2025
TL;DR

I bring you:
- Sandboxing environment to train shell agents: repo
- A dataset of 14k shell tasks for training, hosted on Hugging Face
- Synthetic data pipeline for generating customizable task datasets
- Batteries-included script to run RL with the above and your model of choice
Intro

Earlier this year I started messing around more with “agents”, so much so that it ended up changing what I do at work (cool).