
Tau² Benchmark: How a Prompt Rewrite Boosted GPT-5-mini by 22%
We expected small models to be fast, but our benchmarks revealed a common reliability trap. Here’s our deep dive on finding and fixing it.
Quesma is a Slack-native operator for Grafana that turns prompts into dashboards, manages alerts, and speeds incident troubleshooting without forcing a new platform. Coming October 2025.
Your tools become smarter overnight. Supports 100+ data sources including PromQL and ClickHouse SQL. Keep your data where it lives. Natural adoption.
Create stunning charts that tell a story. Investigate outages with clear timelines, KPI overlays, and annotated reports, just by describing what you need and selecting the best result.
Open-source and free to install. Casual users can create powerful dashboards. Self-documenting setup in markdown. Reproducible with configuration as code.
Explore our thoughts on data, dashboards, and AI
A one-line command to diagnose server health. Uses Nix to fetch tools without sudo and an LLM to summarize the output. No installation required.
Deep dive into the Tau² benchmark that goes beyond LLM evaluation to reveal innovative methodologies for testing AI agentic systems in realistic scenarios. Learn how this framework can transform how we test AI-powered software.
Our database proxy for seamless migration between Elasticsearch and ClickHouse.