common-skills/skills/deep-dive/evals/evals.json

{
  "skill_name": "deep-dive",
  "evals": [
    {
      "id": 1,
      "prompt": "Help me understand the Redis codebase. I want to know its architecture, how the event loop works, and the key data structures it uses internally.",
      "expected_output": "A structured report covering: a Redis overview (in-memory data store, single-threaded event loop); an architecture diagram showing the ae event loop, networking layer, command dispatcher, and persistence modules; an explanation of core data structures (SDS, dict, ziplist/listpack, skiplist); a sequence diagram for a SET command; and further reading pointing to specific source files such as ae.c, t_string.c, and dict.c.",
      "assertions": [
        "Report includes an Overview section describing Redis as an in-memory data store",
        "Report includes a PlantUML architecture diagram",
        "Report explains the single-threaded event loop (ae)",
        "Report covers at least 3 internal data structures (e.g. SDS, dict, skiplist)",
        "Report includes a Further Reading section with at least 3 actionable items",
        "Report includes at least one PlantUML sequence diagram"
      ]
    },
    {
      "id": 2,
      "prompt": "I just joined a team working on a REST API built with FastAPI. Here's the project README:\n\n# OrderService\nA FastAPI service managing e-commerce orders. Uses PostgreSQL via SQLAlchemy, Redis for caching, and Celery for async tasks. Auth via JWT.\n\n## Endpoints\n- POST /orders — create order\n- GET /orders/{id} — get order\n- PATCH /orders/{id}/status — update status\n- GET /orders?user_id=X — list orders\n\n## Models\nOrder: id, user_id, status (pending/confirmed/shipped/delivered), items (JSON), created_at\n\nHelp me understand this service.",
      "expected_output": "A report covering: an overview of the OrderService purpose and stack (FastAPI, PostgreSQL, Redis, Celery, JWT); an architecture diagram showing the components and their connections; a data model (ER) diagram for the Order entity; sequence diagrams for at least the POST /orders and PATCH /orders/{id}/status flows; an API reference table for all 4 endpoints; notes on JWT auth, Redis caching strategy, and Celery async task usage; and further reading recommendations.",
      "assertions": [
        "Report includes an Overview section mentioning FastAPI, PostgreSQL, Redis, Celery, and JWT",
        "Report includes a PlantUML architecture or component diagram",
        "Report includes a PlantUML data model diagram showing the Order entity",
        "Report includes a PlantUML sequence diagram for at least one endpoint flow",
        "Report includes an API reference section covering all 4 endpoints",
        "Report mentions JWT authentication",
        "Report includes a Further Reading section"
      ]
    },
    {
      "id": 3,
      "prompt": "Give me a deep dive on the Kafka consumer group protocol. I need to understand how rebalancing works, what the group coordinator does, and the difference between eager and cooperative rebalancing.",
      "expected_output": "A report covering: a Kafka consumer group overview; an architecture diagram showing brokers, the group coordinator, and consumers; an explanation of the group coordinator role (heartbeats, session timeout, offset commits); detailed sequence diagrams for both the eager (stop-the-world) and cooperative (incremental) rebalance protocols; a key concepts glossary (consumer group, partition assignment, rebalance, heartbeat, session.timeout.ms); known trade-offs between the two rebalance strategies; and further reading.",
      "assertions": [
        "Report includes an Overview section explaining consumer groups and their purpose",
        "Report includes a PlantUML diagram showing brokers, group coordinator, and consumers",
        "Report explains the group coordinator role",
        "Report covers both eager and cooperative rebalancing with their differences",
        "Report includes at least one PlantUML sequence diagram showing a rebalance flow",
        "Report includes a Key Concepts section with relevant terminology",
        "Report includes a Known Limitations or Trade-offs section comparing the two strategies",
        "Report includes a Further Reading section"
      ]
    },
    {
      "id": 4,
      "prompt": "I need to understand this Go file quickly:\n\n```go\npackage worker\n\ntype Job struct {\n    ID      string\n    Payload []byte\n    Retries int\n}\n\ntype Worker struct {\n    queue   chan Job\n    done    chan struct{}\n    handler func(Job) error\n}\n\nfunc New(concurrency int, handler func(Job) error) *Worker {\n    w := &Worker{\n        queue:   make(chan Job, 100),\n        done:    make(chan struct{}),\n        handler: handler,\n    }\n    for i := 0; i < concurrency; i++ {\n        go w.loop()\n    }\n    return w\n}\n\nfunc (w *Worker) Submit(j Job) { w.queue <- j }\n\nfunc (w *Worker) Stop() { close(w.done) }\n\nfunc (w *Worker) loop() {\n    for {\n        select {\n        case j := <-w.queue:\n            if err := w.handler(j); err != nil && j.Retries > 0 {\n                j.Retries--\n                w.queue <- j\n            }\n        case <-w.done:\n            return\n        }\n    }\n}\n```",
      "expected_output": "A report covering: an overview of the worker pool pattern the file implements; an architecture/component description of the Job and Worker structs and their roles; a sequence diagram showing the Submit -> loop -> handler -> retry flow; an explanation of the concurrency model (goroutines, buffered channel, done channel for shutdown); key concepts (worker pool, buffered channel backpressure, retry with decrement); known limitations (no graceful drain on Stop, fixed buffer size, no dead-letter queue); and further reading suggestions.",
      "assertions": [
        "Report identifies this as a worker pool / job queue pattern",
        "Report explains the role of the queue channel and done channel",
        "Report includes a PlantUML sequence or activity diagram showing the job processing flow including retry",
        "Report explains the concurrency model (goroutines spawned in New)",
        "Report identifies at least 2 limitations (e.g. no graceful shutdown drain, fixed buffer, no DLQ)",
        "Report includes a Further Reading section"
      ]
    }
  ]
}