Towards Smarter Computers: Training Small Models to Use the Terminal
TL;DR
I bring you:
- A sandboxed environment for training shell agents: repo
- A dataset of 14k shell tasks for training, hosted on HF
- A synthetic data pipeline for generating customizable task datasets
- A batteries-included script to run RL with the above and your model of choice
Intro
Earlier this year I started messing around more with “agents”, so much so that it ended up changing what I do at work (cool).