Repo Evaluator finds high-signal engineering work in a target company's repository history, scores each repository's benchmark readiness, and helps you decide which codebases are worth licensing for a company-specific SWE-bench-style dataset.
The pipeline below is the non-technical version you can share with PE stakeholders:
```mermaid
flowchart LR
    A["1) You select target repositories"] --> B["2) Repo Evaluator reads code + PR history"]
    B --> C["3) It filters for high-signal engineering work"]
    C --> D["4) It scores each repo for benchmark readiness"]
    D --> E["5) It outputs candidate SWE-bench style tasks"]
    E --> F["6) Your team licenses approved codebases"]
```
Under the hood, the end-to-end technical flow looks like this:

```mermaid
flowchart TD
    U["User"] --> SEL["/select<br/>Pick repositories"]
    SEL --> DASH["/evaluate<br/>EvaluationDashboard"]
    DASH --> API["POST /api/evaluate/stream"]
    API --> AUTH["Read gh_token from cookie"]
    AUTH --> MODAL["Forward request to MODAL_EVALUATE_URL"]
    MODAL --> WEB["Modal web_app()<br/>FastAPI streaming endpoint"]
    WEB --> PART["Create stream partition (UUID)"]
    PART --> SPAWN["Spawn evaluate_single_repo() per repo"]
    SPAWN --> CLONE["clone_repo()"]
    CLONE --> EVAL["RepoEvaluator.evaluate()"]
    EVAL --> RM["RepoAnalyzer.analyze()<br/>repo quality score"]
    EVAL --> PR["PRAnalyzer.analyze_prs()<br/>accept/reject merged PRs"]
    PR --> F2P["Optional _run_f2p_analysis()<br/>validate F2P/P2P tests"]
    RM --> SCORE["overall = 0.6 * repo + 0.4 * PR acceptance"]
    PR --> SCORE
    F2P --> SCORE
    SCORE --> RESULT["to_json() + snake_to_camel + derive_verdict"]
    SPAWN --> QUEUE["Modal Queue partition"]
    QUEUE --> WEB
    SPAWN --> LOGS["Emit log/progress/result events"]
    LOGS --> QUEUE
    RESULT --> QUEUE
    WEB --> SSE["Emit SSE: log | progress | result | complete"]
    SSE --> API
    API --> DASH
    DASH --> PARSE["parseSSEStream()"]
    PARSE --> UI["Update per-repo logs, phase, progress"]
    PARSE --> STORE["saveResults() in sessionStorage"]
    STORE --> RES["/results and /results/[owner]/[repo]"]
```
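To make the streaming path concrete, here is a minimal sketch of the server side of that loop. It assumes FastAPI; the endpoint path, `evaluate_single_repo`, the per-request stream partition, and the `log | progress | result | complete` event types come from the diagram above, while the payload shape, the in-process queue standing in for the Modal Queue, the inline (non-spawned) worker calls, and all emitted values are illustrative assumptions, not the actual implementation.

```python
import json
import queue as _queue
import uuid

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

# Hypothetical in-process stand-in for the Modal Queue: each evaluation
# run gets its own partition key, and workers push events into it.
_partitions: dict[str, _queue.Queue] = {}


def evaluate_single_repo(repo: str, partition: str) -> None:
    """Illustrative worker. The real version is spawned on Modal and emits
    log/progress/result events while cloning and evaluating the repo."""
    q = _partitions[partition]
    q.put({"type": "log", "repo": repo, "message": "cloning"})
    q.put({"type": "progress", "repo": repo, "percent": 100})
    q.put({"type": "result", "repo": repo, "overallScore": 72.5})  # dummy score


@app.post("/api/evaluate/stream")
def evaluate_stream(payload: dict):
    repos = payload["repos"]
    partition = str(uuid.uuid4())  # one stream partition per request
    _partitions[partition] = _queue.Queue()

    # The real service spawns these concurrently on Modal and drains the
    # queue as events arrive; this sketch runs them inline for simplicity.
    for repo in repos:
        evaluate_single_repo(repo, partition)

    def sse():
        q = _partitions[partition]
        done = 0
        while done < len(repos):  # each worker emits exactly one result
            event = q.get()
            if event["type"] == "result":
                done += 1
            yield f"data: {json.dumps(event)}\n\n"  # SSE framing
        yield 'data: {"type": "complete"}\n\n'  # final SSE event

    return StreamingResponse(sse(), media_type="text/event-stream")
```

On the client, `parseSSEStream()` consumes these `data:` frames one event at a time, which is what lets the dashboard update per-repo logs and progress before the full batch finishes.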
Scoring model: `overall = 0.6 * repo_quality_score + 0.4 * pr_acceptance_rate_percent`
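A quick sketch of that blend with a worked example, assuming both inputs are on a 0-100 scale (the formula implies this, but the section doesn't state it). The verdict thresholds here are illustrative assumptions; the real `derive_verdict` cutoffs aren't documented in this section.

```python
def overall_score(repo_quality_score: float, pr_acceptance_rate_percent: float) -> float:
    """Weighted blend from the scoring model: 60% repo quality, 40% PR acceptance."""
    return 0.6 * repo_quality_score + 0.4 * pr_acceptance_rate_percent


def derive_verdict(score: float) -> str:
    """Illustrative thresholds only; the real cutoffs live in the service."""
    if score >= 70:
        return "recommended"
    if score >= 40:
        return "borderline"
    return "not recommended"


# e.g. repo quality 80 with a 60% PR acceptance rate:
# 0.6 * 80 + 0.4 * 60 = 48 + 24 = 72.0 -> "recommended" under these assumed cutoffs
print(derive_verdict(overall_score(80, 60)))
```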