Reverse engineering River Raid with Claude, Ghidra, and MCP
Connecting Claude to Ghidra via MCP to reverse engineer River Raid. A test of AI agents against 6502 assembly, memory mapping, and 80s game logic.
Independent evaluation and training for the AI agent ecosystem. Real-world complexity through simulation environments where agents face multi-hour tasks.
Large-scale RL datasets with tuned difficulty distributions. Cheat-proof reward functions. Teach skills scarce in public data (e.g. dependency hell, distributed system debugging).
Measure quality and uncover blind spots. Pick optimal models, tune prompts in a fast-changing world. Benchmark against competitors. Win deals and deliver on performance promises.
Independent verification of what actually works. Design processes based on real capabilities, not marketing hype. ROI-driven deployment decisions. Move from FOMO to measurable P&L impact.
Explore our research on AI agents, benchmarking, and evaluation
Many vendors pitch AI SRE. We tested 14 models across 11 programming languages; even the best ones struggle to instrument code with OpenTelemetry, the leading open-source standard.
Prompts are specs, not code. This influences git workflows for vibe coding: tracking LLM prompts in GitHub repositories, managing commit messages, and debugging non-deterministic AI outputs.
The Quesma database gateway IP has been acquired by Hydrolix to ensure continued support.