Browse Skills

Yt Research

v1.0.0

Jeremy Longshore
5

Wandb Experiment Logger

v1.0.0

Jeremy Longshore
4

Creating GitHub Issues From Web Research

v1.0.0

Jeremy Longshore
4

Setting Up Experiment Tracking

v1.0.0

Jeremy Longshore
3

Meeting Prep

v1.0.0

Prepare briefings for today's meetings: attendee research, email history, past meeting notes, LinkedIn, and company context. Use when running the daily meeting prep cron, or when the user asks to prepare for meetings, review who they're meeting with, or get context on upcoming calls.

Jeremy Longshore
6

Setting Up Experiment Tracking

v1.0.0

Jeremy Longshore
2

Creating GitHub Issues From Web Research

v1.0.0

Jeremy Longshore
2

Evaluating LLMs Harness

v1.0.0

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.

Orchestra Research
4

Transformer Lens Interpretability

v1.0.0

Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.

Orchestra Research
5

Guidance

v1.0.0

Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance, Microsoft Research's constrained generation framework.

Orchestra Research
4

Huggingface Tokenizers

v1.0.0

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.

Orchestra Research
8

Autoresearch

v1.0.0

Orchestrates end-to-end autonomous AI research projects using a two-loop architecture. The inner loop runs rapid experiment iterations with clear optimization targets. The outer loop synthesizes results, identifies patterns, and steers research direction. Routes to domain-specific skills for execution, supports continuous agent operation via Claude Code /loop and OpenClaw heartbeat, and produces research presentations and papers. Use when starting a research project, running autonomous experi...

Orchestra Research
4
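
The two-loop architecture named in the Autoresearch description can be illustrated with a toy sketch. This is not the skill's implementation: every name here (`run_experiment`, `inner_loop`, `outer_loop`, the single numeric `setting`, and the 0.1 optimum) is a hypothetical stand-in, assuming an experiment can be reduced to one score to maximize.

```python
# Toy two-loop sketch. All names and numbers are hypothetical illustrations,
# not part of the Autoresearch skill.

def run_experiment(setting: float) -> float:
    """Stand-in for one experiment: the score peaks when setting == 0.1."""
    return -(setting - 0.1) ** 2


def inner_loop(center: float, width: float) -> list[tuple[float, float]]:
    """Inner loop: quick iterations against one clear optimization target."""
    offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)
    candidates = [center + width * k for k in offsets]
    return [(c, run_experiment(c)) for c in candidates]


def outer_loop(center: float = 0.6, width: float = 0.4, rounds: int = 3):
    """Outer loop: synthesize each batch of results and steer the next one."""
    history = []
    for _ in range(rounds):
        results = inner_loop(center, width)
        history.extend(results)
        # Synthesis step: keep the best setting, then narrow the search.
        center, _ = max(results, key=lambda r: r[1])
        width /= 2
    return center, history


if __name__ == "__main__":
    best, history = outer_loop()
    print(f"best setting ~ {best:.3f} after {len(history)} experiments")
```

The split keeps the inner loop cheap and single-purpose, while the decision about where to search next lives only in the outer loop.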