{

Podpedia App — ADR-007/010 Pipeline Experiment (Run 3)

Date: May 15, 2026
Type: Local baseline, end-to-end pipeline latency against Docker Compose + Ollama


Summary

Third run of the ADR-007/010 pipeline experiment. Previous runs used Go micro-benchmarks and were inconclusive due to incomplete baselines. This run pivoted to end-to-end pipeline latency against the actual Docker Compose deployment with a real Ollama LLM.

Status: ✅ Confirmed — local deployment is viable and performant.


Test Setup

Parameter               | Value
Deployment              | Docker Compose (localhost:8080 backend, localhost:5173 frontend)
Entity extraction model | qwen3.5:latest (7B, ~42 tok/s)
Query synthesis model   | qwen3.6:27b (Q5_K_M, ~35 tok/s)
Hardware                | RTX 3090 (24 GB), AMD Ryzen 3700X, 64 GB RAM
Input text              | 42 words: "Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976..."
Trials                  | 5 ingest, 1 query
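
A minimal sketch of how end-to-end latency for trials like these could be measured. Only the backend address (localhost:8080) comes from the setup above; the `/api/ingest` path, the `{"text": ...}` payload shape, and the helper names `time_trials`, `post_json`, and `run_ingest_trials` are assumptions for illustration, not the harness actually used in this run.

```python
import json
import time
import urllib.request
from typing import Callable, List


def time_trials(call: Callable[[], object], n: int) -> List[float]:
    """Run `call` n times and return each wall-clock latency in milliseconds."""
    latencies_ms = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def post_json(url: str, payload: dict, timeout_s: float = 300.0) -> bytes:
    """POST a JSON body and return the raw response bytes."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read()


def run_ingest_trials(text: str, n: int = 5) -> List[float]:
    """Time n ingest requests against the local backend.

    The endpoint path and payload shape are hypothetical.
    """
    url = "http://localhost:8080/api/ingest"  # assumed path, not from the report
    return time_trials(lambda: post_json(url, {"text": text}), n)
```

Calling `run_ingest_trials(input_text)` with the Docker Compose stack running would return five latencies in milliseconds. The first call would include any model-loading cost, so it is worth reporting separately from the warm trials.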

Results

Ingest Pipeline Latency (5 trials)

Trial | Latency (ms) | Nodes | Notes
1     | 52,688       | 11    | Cold start (model loading)
2     | 48,645       | 10    | Warm
3     | 46,632       | 11    | Warm
4     | 38,529       | 11    | Warm
5     | 40,550       | 11    | Warm

p50: 46,632 ms · p95: 52,688 ms · Min: 38,529 ms · Max: 52,688 ms
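
The summary statistics can be reproduced from the five trial latencies. The report does not say which percentile method was used; the sketch below assumes the nearest-rank definition, which yields the same p50 and p95. With only five samples, p95 is simply the maximum, so it reflects the cold-start trial rather than a tail estimate.

```python
import math
from typing import List


def percentile_nearest_rank(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of data at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Ingest latencies in ms, trials 1-5 from the table above.
ingest_ms = [52688, 48645, 46632, 38529, 40550]

p50 = percentile_nearest_rank(ingest_ms, 50)  # rank 3 of 5
p95 = percentile_nearest_rank(ingest_ms, 95)  # rank 5 of 5, i.e. the max
print(f"p50={p50} ms, p95={p95} ms, min={min(ingest_ms)} ms, max={max(ingest_ms)} ms")
```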

Graph-RAG Query Latency

Query                | Latency (ms)
"Who founded Apple?" | 21,669

Response: "Apple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne."

Entity Extraction Quality

The pipeline correctly classified all entities:

Entity        | Type          | Correct?
Apple Inc.    | Organization  | Yes
Steve Jobs    | Person        | Yes
Steve Wozniak | Person        | Yes
Ronald Wayne  | Person        | Yes
Tim Cook      | Person        | Yes
iPhone        | Product       | Yes
iPad          | Product       | Yes
Mac           | Product       | Yes
Cupertino     | Location      | Yes
April 1976    | Date/Temporal | Yes
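
One way to make the classification check repeatable is to score extracted types against a hand-labeled reference. The sketch below is an illustration only: the function name `type_accuracy` is hypothetical, and the report does not include a separate reference set, so the reference here is a stand-in copied from the extracted types.

```python
from typing import Dict


def type_accuracy(extracted: Dict[str, str], reference: Dict[str, str]) -> float:
    """Fraction of reference entities whose extracted type matches the reference type."""
    if not reference:
        raise ValueError("reference must not be empty")
    hits = sum(1 for name, typ in reference.items() if extracted.get(name) == typ)
    return hits / len(reference)


# Entity types as listed in the table above.
extracted = {
    "Apple Inc.": "Organization",
    "Steve Jobs": "Person",
    "Steve Wozniak": "Person",
    "Ronald Wayne": "Person",
    "Tim Cook": "Person",
    "iPhone": "Product",
    "iPad": "Product",
    "Mac": "Product",
    "Cupertino": "Location",
    "April 1976": "Date/Temporal",
}

# Stand-in for an independently hand-labeled reference (assumption: not part of the report).
reference = dict(extracted)

accuracy = type_accuracy(extracted, reference)
print(f"type accuracy: {accuracy:.0%}")
```

A missing or mislabeled entity lowers the score, which makes regressions visible when the extraction model or prompt changes.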

Relationships extracted:


Analysis

Convergence Pattern

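Trial 1 (cold, 52.7s) includes the overhead of loading the 7B extraction model (qwen3.5:latest) into VRAM. Once warm (trials 2-5), latency settles into a 38.5-48.6s band with a mean of 43.6s. The cold trial is 4.0-14.2s slower than the individual warm trials and about 9.1s slower than the warm mean, which is plausible as model loading time for a 7B Q4 model on an RTX 3090.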
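
The cold-versus-warm gap can be computed directly from the five ingest trials. A minimal sketch, treating trial 1 as the only cold trial as noted in the results table:

```python
from statistics import mean

# Ingest latencies in ms from the results table.
cold_ms = 52688                          # trial 1 (cold start)
warm_ms = [48645, 46632, 38529, 40550]   # trials 2-5 (warm)

warm_mean_ms = mean(warm_ms)
gap_vs_mean_ms = cold_ms - warm_mean_ms      # cold overhead relative to the warm mean
gap_smallest_ms = cold_ms - max(warm_ms)     # against the slowest warm trial
gap_largest_ms = cold_ms - min(warm_ms)      # against the fastest warm trial

print(f"warm mean: {warm_mean_ms / 1000:.1f}s")
print(f"cold overhead vs warm mean: {gap_vs_mean_ms / 1000:.1f}s")
print(f"cold overhead range: {gap_smallest_ms / 1000:.1f}s to {gap_largest_ms / 1000:.1f}s")
```

With a single cold trial and a 10s spread among the warm trials, the overhead figure is indicative only; more cold starts would be needed to separate load time from run-to-run variance.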

Why 7B for Extraction, 27B for Query

Comparison with Previous Runs


Conclusion

The local Docker Compose + Ollama deployment is confirmed as viable for development. The ingest pipeline completes in 39-49s warm (mean 43.6s) and 53s cold; the query completes in ~22s. Entity extraction quality is high: all entity types were correctly identified and all relationships were valid.


See also: Podpedia Local Deployment, Qwen3.6:27b Capability Evaluation

}