vt-c-repo-evaluate

Evaluate external GitHub repositories for toolkit-relevant patterns using a tiered safety protocol. Generates evaluation reports and intake proposals. Defaults to API-only analysis (no clone).

Plugin: core-standards
Category: Other
Command: /vt-c-repo-evaluate


/vt-c-repo-evaluate — Repository Evaluation Safety Protocol

Safely evaluate external repositories for patterns, architecture, and toolkit-relevant learnings without executing untrusted code.

When to Use

  • Evaluating an open-source repo for patterns to adopt
  • Qualifying a repository linked in an intake proposal
  • Researching a library's source code before adding as a dependency
  • Comparing implementation approaches across projects

Safety Levels

Level               Method                                            Risk    Requires
Level 1 (default)   GitHub API only — no local code                   None    Nothing
Level 2             Safe clone: --no-checkout, inspect hooks first    Low     User approval
Level 3             Sandboxed execution in container                  Medium  Explicit opt-in + Docker
Level 4             Dependency audit before install                   Medium  Explicit opt-in

Default is always Level 1. Never escalate without asking.

Invocation

/vt-c-repo-evaluate owner/repo                      # Level 1 API analysis
/vt-c-repo-evaluate https://github.com/owner/repo   # Also accepts full URLs
/vt-c-repo-evaluate owner/repo --level 2             # Safe clone with inspection
/vt-c-repo-evaluate owner/repo --level 4             # Dependency audit

Execution Instructions

Step 0: Parse Input

Extract owner/repo from the argument. Accepted formats:

  • owner/repo
  • https://github.com/owner/repo
  • https://github.com/owner/repo.git
  • github.com/owner/repo

If no argument is provided, ask for the repository URL.
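The normalization above can be sketched in shell. `normalize_repo` is a hypothetical helper name for illustration, not part of the command itself:

```shell
# Reduce any accepted repo reference to the canonical "owner/repo" form.
normalize_repo() {
  local ref="$1"
  ref="${ref%.git}"          # drop a trailing .git suffix
  ref="${ref#https://}"      # drop the scheme, if present
  ref="${ref#github.com/}"   # drop the host, if present
  # keep only the first two path segments: owner and repo
  echo "$ref" | cut -d/ -f1,2
}
```

All four accepted formats collapse to the same `owner/repo` string, so later `gh api` calls need only one code path.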

Step 1: Level 1 — API-Only Analysis (Default)

Perform all analysis via gh api without cloning:

# 1a. Repository metadata
gh api repos/{owner}/{repo} --jq '{
  name, full_name, description, language,
  stargazers_count, forks_count, open_issues_count,
  license: .license.spdx_id,
  created_at, updated_at, pushed_at,
  default_branch, archived, disabled
}'

# 1b. Top contributors
gh api repos/{owner}/{repo}/contributors --jq '.[0:5] | .[] | "\(.login) (\(.contributions) commits)"'

# 1c. Recursive directory tree (single call, check truncation + list paths)
TREE_JSON=$(gh api "repos/{owner}/{repo}/git/trees/HEAD?recursive=1")
echo "$TREE_JSON" | jq -r '.truncated'
echo "$TREE_JSON" | jq -r '.tree[].path' | head -n 500

# 1d. Key files content (read via API, no clone)
# Priority order: entry points, config, architecture docs, CI
for file in README.md CLAUDE.md package.json pyproject.toml Gemfile go.mod Cargo.toml Dockerfile .github/workflows/ci.yml ARCHITECTURE.md; do
  gh api "repos/{owner}/{repo}/contents/$file" --jq '.content' 2>/dev/null | base64 -d
done

# 1e. Languages breakdown
gh api repos/{owner}/{repo}/languages

# 1f. Recent commits (last 10)
gh api repos/{owner}/{repo}/commits --jq '.[0:10] | .[] | {sha: .sha[0:8], message: .commit.message | split("\n")[0], date: .commit.author.date}'
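The `pushed_at` field from 1a can feed a simple freshness signal for the report. This is an illustrative heuristic, not part of the protocol; the 365-day threshold and the helper name `repo_staleness` are assumptions:

```shell
# Classify a repo as stale or active from its last-push time.
# Both arguments are Unix epoch seconds.
repo_staleness() {
  local pushed_epoch="$1" now_epoch="$2"
  local age_days=$(( (now_epoch - pushed_epoch) / 86400 ))
  if [ "$age_days" -gt 365 ]; then
    echo "stale ($age_days days)"
  else
    echo "active ($age_days days)"
  fi
}
```

On GNU systems the ISO 8601 `pushed_at` value can be converted with `date -d "$pushed_at" +%s` and the current time with `date +%s`.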

Output a structured report:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Repository Evaluation: {owner}/{repo}
Level: 1 (API-only, no local code)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Overview:
  Description: {description}
  Language: {language} | License: {license}
  Stars: {stars} | Forks: {forks} | Last push: {pushed_at}

Structure:
  {top-level file listing}

Key Findings:
  - {pattern observations from README, config files}
  - {architecture observations from file tree}
  - {dependency observations from manifest files}

Toolkit Relevance:
  - {specific patterns worth adopting}
  - {skill improvement ideas}

Step 1b: Toolkit Relevance Analysis

Compare discovered patterns against the VisiTrans toolkit inventory:

  1. Load toolkit inventory: List skills from plugins/core-standards/skills/ and agents from plugins/core-standards/agents/ (use Glob)
  2. Classify each discovered pattern into three categories:

Applicable Patterns — patterns the toolkit could adopt:

  • New capability not currently in any toolkit skill
  • Better implementation of an existing capability
  • Reusable architecture pattern (decorators, middleware, evaluators, pipelines)
  • For each: cite the specific source file, explain what it does, and identify which toolkit skill/agent would benefit

Inapplicable Patterns — patterns that do not fit:

  • Technology-specific to a stack VisiTrans does not use
  • Architecture mismatch (e.g., server-based when the toolkit is CLI-native)
  • Already implemented by an existing toolkit skill (reference it)
  • For each: a brief rationale

Security Concerns — anything noteworthy:

  • Hardcoded credentials or tokens in source
  • Overly permissive file operations or network calls
  • Dependency count, age, and maintenance status
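For the hardcoded-credentials check, a coarse grep can surface obvious leaks once files are available locally (Level 2 and above). The token patterns below (AWS access key ID, GitHub PAT, private key headers) are illustrative and far from exhaustive, and `scan_secrets` is a hypothetical helper:

```shell
# Recursively scan a checked-out tree for common credential shapes.
# -I skips binary files, -n prints line numbers.
scan_secrets() {
  local dir="$1"
  grep -rInE '(AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----)' \
    "$dir" 2>/dev/null
}
```

Any hit belongs in the Security Concerns list verbatim, with file and line number.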

Step 2: Offer Level Escalation

After Level 1 report, ask if deeper analysis is needed:

Use AskUserQuestion:

  • Done — Level 1 is sufficient → proceed to Step 7 (report generation)
  • Level 2 — Safe clone for deeper file inspection → proceed to Step 3
  • Level 4 — Dependency audit → proceed to Step 5

Step 3: Level 2 — Safe Clone (Requires Approval)

Pre-clone: Check for stale temp directories from prior runs:

ls -d /tmp/eval-* 2>/dev/null

If stale directories are found, use AskUserQuestion:

  • Clean up stale directories → rm -rf /tmp/eval-*
  • Keep them and continue

Pre-clone safety checks:

# 3a. Clone WITHOUT checkout (no post-clone hooks run)
git clone --no-checkout --depth 1 https://github.com/{owner}/{repo}.git /tmp/eval-{repo}

cd /tmp/eval-{repo}

# 3b. Inspect hooks BEFORE checkout
ls -la .git/hooks/
for hook in .git/hooks/*; do
  if [ -f "$hook" ] && [ -x "$hook" ]; then
    echo "=== EXECUTABLE HOOK: $hook ==="
    head -20 "$hook"
  fi
done

# 3c. Check for dangerous files in tree (before checkout)
git ls-tree -r --name-only HEAD | grep -iE '(Makefile|setup\.py|setup\.cfg|postinstall|preinstall|\.sh$|\.bat$|\.cmd$|\.ps1$)'

Report hook findings to user before proceeding:

Hook inspection results:
  Executable hooks found: {count}
  {list any non-standard hooks}

Potentially dangerous files in tree:
  {list matches}

Proceed with checkout?

Use AskUserQuestion:

  • Yes, checkout is safe → run git checkout
  • No, abort → clean up and exit

# 3d. After checkout, analyze deeper
# Read specific files of interest (architecture, patterns, configs)
# NEVER run any scripts, build commands, or install commands

Step 4: Level 2 — Deep Analysis

With files checked out, analyze:

  1. Architecture patterns: Directory structure, module organization, dependency injection
  2. Skill patterns: Look for CLAUDE.md, agent configurations, skill definitions
  3. Configuration patterns: CI/CD, linting, testing setup
  4. Security patterns: Auth, input validation, secret management

Output findings appended to the Level 1 report.

Step 4b: Level 3 — Sandboxed Execution (Manual, Requires Approval)

Only offer if Level 2 was completed and user wants to run code.

Use AskUserQuestion with an elevated warning:

  • Show Docker commands — "Level 3 provides manual Docker commands for running code in a read-only container. You execute these commands yourself in a separate terminal."
  • Skip Level 3 — "Proceed without sandboxed execution."

If approved, display commands (do NOT execute them):

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Sandboxed Execution Commands (Manual)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

IMPORTANT: The repo is mounted as READ-ONLY (:ro).
No changes can be written back to your filesystem.

Python projects:
  docker run --rm -it -v /tmp/eval-{repo}:/repo:ro python:3.11-slim bash
  # Inside: cp -r /repo /work && cd /work && pip install -r requirements.txt && python -m pytest

Node.js projects:
  docker run --rm -it -v /tmp/eval-{repo}:/repo:ro node:20-slim bash
  # Inside: cp -r /repo /work && cd /work && npm ci && npm test

General inspection:
  docker run --rm -it -v /tmp/eval-{repo}:/repo:ro ubuntu:22.04 bash
  # Inside: find /repo -type f -name "*.py" | head -20

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Run these in a separate terminal, then report findings here.

Ask user to report findings. Include them in the evaluation report.

Step 5: Level 4 — Dependency Audit (Requires Approval)

Audit dependencies without installing them:

# For Node.js projects
npx better-npm-audit audit --level moderate 2>/dev/null || npm audit --json

# For Python projects
pip-audit --requirement requirements.txt 2>/dev/null

# For Ruby projects
bundle audit check --no-update 2>/dev/null

Report findings with severity counts.
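One way to produce those severity counts from `npm audit --json` output, which nests them under `.metadata.vulnerabilities` in modern npm (v7+). `audit_severity_summary` is a hypothetical helper:

```shell
# Print one "severity: count" line per entry in the audit metadata.
# Reads npm audit JSON on stdin.
audit_severity_summary() {
  jq -r '.metadata.vulnerabilities | to_entries[] | "\(.key): \(.value)"'
}
```

Example: `npm audit --json | audit_severity_summary` yields lines such as `moderate: 3`.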

Step 6: Cleanup

Always clean up cloned repos:

rm -rf /tmp/eval-{repo}
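To make safety rule 6 harder to get wrong, the deletion can be guarded so it refuses any path outside the /tmp/eval-* namespace. This is a defensive sketch; `safe_cleanup` is a hypothetical helper, not a required part of the protocol:

```shell
# Delete an evaluation directory, but only if it lives under /tmp/eval-.
safe_cleanup() {
  case "$1" in
    /tmp/eval-?*)
      rm -rf "$1"
      echo "removed $1"
      ;;
    *)
      echo "refusing to remove $1" >&2
      return 1
      ;;
  esac
}
```

A mistyped or empty path then fails loudly instead of deleting something outside the evaluation sandbox.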

Step 7: Generate Evaluation Report

Write the evaluation report to intake/evaluations/YYYY-MM-DD-{repo-name}.md:

---
type: evaluation
date: YYYY-MM-DD
repository: https://github.com/{owner}/{repo}
level: {N}
patterns_found: {total_count}
applicable_patterns: {applicable_count}
---

# Repository Evaluation: {owner}/{repo}

**Date**: YYYY-MM-DD
**Evaluator**: /vt-c-repo-evaluate (Level {N})
**License**: {license}

## Repository Metadata

| Field | Value |
|-------|-------|
| Stars | {N} |
| Language | {lang} |
| License | {license} |
| Last Push | {date} |
| Contributors | {N} |
| Open Issues | {N} |

## Directory Overview

{tree summary with notable structural patterns}

## Key Files Analyzed

{list of files read with brief purpose of each}

## Applicable Patterns

### Pattern: {name}
- **Source**: `{file}`
- **What**: {description}
- **Toolkit Relevance**: {which skill/agent benefits}
- **Evidence**: {code snippet or quote}

## Inapplicable Patterns

- **{name}**: {rationale for exclusion}

## Security Assessment

{security observations from analysis}

## Dependency Analysis

{if Level 4 was run: dependency table and audit results}

## Recommended Next Steps

{concrete recommendations}

Step 8: Generate Intake Proposals

For each applicable pattern identified in Step 1b (maximum 5 per evaluation):

  1. If more than 5 applicable patterns are found, use AskUserQuestion:
     • Generate top 5 by relevance — auto-select the 5 most impactful
     • Let me choose — present the full list for user selection

  2. For each selected pattern, write a proposal file to intake/pending/from-research/YYYY-MM-DD-eval-{pattern-slug}.md:

---
type: research-finding
date: YYYY-MM-DD
source: repo-evaluate
source_path: intake/evaluations/YYYY-MM-DD-{repo-name}.md
target_skill: "{best matching existing skill, or 'new: vt-c-{name}'}"
severity: medium
category: new-pattern
toolkit_proposal_status: open
tags: [{repo-name}, {pattern-keywords}]
---

# Proposal: {Pattern Name}

**Source**: {owner}/{repo} — `{source_file}`
**Type**: research-finding
**Priority**: MEDIUM
**Evaluated by**: /vt-c-repo-evaluate

## Finding

{What the pattern does, with evidence from the source repo}

## Toolkit Gap

{What capability is missing or could be improved}

**Evidence**: {Reference to specific toolkit skill/agent}

## Suggestion

- [ ] {Specific action item 1}
- [ ] {Specific action item 2}

## Action

- [ ] Review
- [ ] Implement
Then display a summary:
    Generated {N} intake proposals:
    - intake/pending/from-research/YYYY-MM-DD-eval-{slug-1}.md
    - ...
    
    Run /vt-c-toolkit-review to process these proposals.
    
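The {pattern-slug} in those filenames can be derived mechanically from the pattern name. `pattern_slug` and `proposal_path` are hypothetical helper names for illustration:

```shell
# Lowercase the name, collapse runs of non-alphanumerics to hyphens,
# and trim leading/trailing hyphens.
pattern_slug() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g; s/^-+|-+$//g'
}

# Build the full proposal path using today's date (YYYY-MM-DD).
proposal_path() {
  echo "intake/pending/from-research/$(date +%F)-eval-$(pattern_slug "$1").md"
}
```

For example, a pattern named "Retry With Backoff" slugs to `retry-with-backoff`, keeping filenames stable and shell-safe.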

Step 9: Final Summary

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Evaluation Complete: {owner}/{repo}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Safety Level: {N} ({level name})
Patterns Found: {total}
  Applicable: {N} (proposals generated)
  Inapplicable: {N}
Security: {summary}

Report: intake/evaluations/YYYY-MM-DD-{repo-name}.md
Proposals: {N} files in intake/pending/from-research/

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

If Level 2 was used: confirm /tmp/eval-{repo} deleted.

Safety Rules (Non-Negotiable)

  1. NEVER run make, npm install, pip install, bundle install, or any build/install command on an untrusted repo
  2. NEVER execute scripts from an untrusted repo (setup.py, shell scripts, Makefiles)
  3. NEVER clone without --no-checkout at Level 2
  4. NEVER escalate beyond Level 1 without explicit user approval
  5. ALWAYS inspect hooks before checkout
  6. ALWAYS clean up /tmp/eval-* directories when done
  7. NEVER copy files from evaluated repos into the toolkit without user review

Integration

Works With

  • /vt-c-inbox-qualify: For repos dropped into the intake inbox
  • /vt-c-research-ingest: For repos identified during research scans
  • /vt-c-content-evaluate: For evaluating repo documentation quality
  • /vt-c-toolkit-review: Processes the intake proposals generated by this skill