Today we're releasing the core of Digital Tap AI's cloud optimization agent framework as open source. The repository is live at github.com/digital-tap, and everything you need to start optimizing your Databricks, EMR, Dataproc, and Kubernetes clusters is included.
This wasn't an obvious decision. We spent 18 months building these agents. They're the foundation of a product that saves our customers millions of dollars annually. Giving them away for free seems, on the surface, like a terrible business strategy.
It's actually the best business decision we've made. Here's why.
Why Open Source?
1. Cloud Optimization Shouldn't Be a Black Box
When an autonomous agent decides to hibernate your production cluster, you need to understand why. When it migrates workloads from spot instances, you need to see the decision logic. When it right-sizes your instances, you need to verify it's making the right tradeoffs.
Closed-source optimization tools ask you to trust them with your production infrastructure based on marketing claims. Open-source tools let you read the code, understand the algorithms, and verify the behavior before you deploy.
For enterprises running critical data infrastructure, that transparency isn't a nice-to-have — it's a requirement. Security teams want to audit. Platform engineers want to understand. Compliance teams want to verify. Open source satisfies all of them.
2. The Real Value Is in the Platform, Not the Agents
The individual agents — idle detection, right-sizing, spot orchestration — are valuable, but they're not the whole product. The Digital Tap platform adds:
- Agent coordination — 27 agents working together through a shared intelligence layer, making decisions that account for each other's actions
- Predictive ML models — Trained on your specific usage patterns across weeks of historical data, achieving prediction accuracy that improves over time
- Enterprise dashboard — Real-time visibility into savings, waste, utilization, and water impact across all platforms
- Compliance and audit — Complete decision logs, approval workflows, and integration with enterprise governance tools
- Managed infrastructure — We run and scale the optimization platform so you don't have to
- Savings guarantee — 3-4× your subscription cost in verified savings, or a full refund
Open-sourcing the agents doesn't commoditize our product — it demonstrates its foundation. Teams that try the open-source agents and see 15-20% savings naturally want to explore what the full platform with coordinated agents and ML models can deliver (typically 35-42%).
3. The Problem Is Too Important for One Company
$44.5 billion in annual cloud waste is an industry-scale problem. Even if Digital Tap becomes the dominant optimization platform, we'll serve thousands of organizations at most. There are hundreds of thousands of companies wasting money on idle clusters.
Open-sourcing our agents means any organization, including those that would never buy a commercial product, can start reducing waste today. A startup with a $5K/month Databricks bill can deploy our idle detection agent and save up to $1,500/month without paying anything.
That's good for the industry, good for the planet (less wasted energy and water), and ultimately good for us — because it establishes Digital Tap as the standard for cloud optimization.
What's in the Open-Source Release
The release includes the core agent framework and five production-ready agents. Everything is Apache 2.0 licensed.
Core Framework
digital-tap/
├── core/
│   ├── agent.py          # Base agent class with lifecycle management
│   ├── scheduler.py      # Agent scheduling and coordination
│   ├── metrics.py        # Metric collection and aggregation
│   ├── config.py         # Configuration management
│   └── connectors/       # Platform connectors
│       ├── databricks.py
│       ├── emr.py
│       ├── dataproc.py
│       └── kubernetes.py
├── agents/
│   ├── idle_detection/   # Detect and hibernate idle clusters
│   ├── right_sizing/     # Analyze and recommend instance changes
│   ├── spot_manager/     # Basic spot instance lifecycle management
│   ├── cost_anomaly/     # Detect unusual cost patterns
│   └── tag_enforcer/     # Ensure cost allocation tags exist
├── cli/                  # Command-line interface
├── tests/
└── docs/
Agent 1: Idle Detection
The idle detection agent monitors cluster utilization via platform APIs and takes action when clusters sit idle beyond a configurable threshold. It supports Databricks, EMR, Dataproc, and Kubernetes.
# Install and run the idle detection agent
pip install digital-tap
# Configure your Databricks connection
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-token"
# Run the idle detection agent
digital-tap agent run idle_detection \
  --idle-threshold 10m \
  --action hibernate \
  --dry-run  # Remove for production
In dry-run mode, the agent reports what it would do without taking action — perfect for building confidence before enabling automation.
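To make the threshold and dry-run behavior concrete, here is a minimal Python sketch of the decision step. It is not the code shipped in the repository: `ClusterStatus`, `find_idle`, `plan_actions`, and the field names are hypothetical, and the sketch assumes a connector can report when each cluster last did work.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ClusterStatus:
    """Hypothetical snapshot a platform connector might return."""
    cluster_id: str
    last_activity: datetime  # last time the cluster ran a job or query


def find_idle(clusters, idle_threshold, now):
    """Return clusters that have been idle for at least idle_threshold."""
    return [c for c in clusters if now - c.last_activity >= idle_threshold]


def plan_actions(clusters, idle_threshold, now, dry_run=True):
    """Build a list of (cluster_id, action) pairs for idle clusters.

    In dry-run mode the action is only reported ("would_hibernate").
    A real agent would call the platform API in the live case; this
    sketch just labels the action and never touches a cluster.
    """
    action = "would_hibernate" if dry_run else "hibernate"
    return [(c.cluster_id, action) for c in find_idle(clusters, idle_threshold, now)]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    clusters = [
        ClusterStatus("etl-nightly", now - timedelta(minutes=45)),
        ClusterStatus("adhoc-sql", now - timedelta(minutes=3)),
    ]
    # Mirrors --idle-threshold 10m with --dry-run
    for cluster_id, action in plan_actions(clusters, timedelta(minutes=10), now):
        print(f"{cluster_id}: {action}")
```

Keeping the planning step separate from the step that acts on a cluster is one way to make dry-run output match what live mode would do, since both paths share the same idle check.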
Agent 2: Right-Sizing
The right-sizing agent analyzes actual CPU, memory, and I/O utilization over configurable time windows and generates right-sizing recommendations. For Databricks, it can also apply those recommendations automatically through cluster policies.
# Run right-sizing analysis
digital-tap agent run right_sizing \
  --lookback 7d \
  --target-utilization 70 \
  --output recommendations.json
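One plausible reading of a 70% utilization target is "pick the smallest size where peak usage sits at or below 70%". The Python sketch below illustrates that idea for CPU only. The function names and the choice of the 95th percentile as "peak" are assumptions made for illustration, not the agent's documented algorithm.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_vcpus(current_vcpus, cpu_util_samples, target_utilization=70):
    """Suggest a vCPU count that puts p95 CPU usage at or below the target.

    cpu_util_samples are percentages (0-100) observed on the current size
    during the lookback window.
    """
    p95 = percentile(cpu_util_samples, 95)
    # vCPUs busy at p95, divided by the fraction we are willing to use
    needed = current_vcpus * p95 / target_utilization
    return max(1, math.ceil(needed))


if __name__ == "__main__":
    # A 16-vCPU node that peaks at 30% CPU over the lookback window
    samples = [12, 18, 25, 30, 22, 15, 28, 30, 20, 17]
    print(recommend_vcpus(16, samples, target_utilization=70))
```

A production recommendation would also need to weigh memory and I/O, and round to instance sizes the cloud provider actually offers.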
Agent 3: Spot Manager
Manages spot instance lifecycle with automatic fallback to on-demand. Monitors spot interruption warnings and handles graceful task migration. This is the community version — the full platform adds predictive interruption avoidance.
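The fallback behavior described above can be sketched as follows. This is an illustrative Python model, not the agent's implementation: `Node`, `drain_warned_nodes`, and the `provision_on_demand` callback are hypothetical names, and real interruption warnings would come from the cloud provider rather than a boolean field.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of a worker node."""
    name: str
    capacity_type: str                 # "spot" or "on_demand"
    slots: int                         # max concurrent tasks
    tasks: list = field(default_factory=list)
    interruption_warning: bool = False


def drain_warned_nodes(nodes, provision_on_demand):
    """Move tasks off nodes that received an interruption warning.

    Tasks go to healthy nodes with free slots first. When none are left,
    provision_on_demand() is called to obtain a fallback node.
    Returns the nodes that remain in service.
    """
    healthy = [n for n in nodes if not n.interruption_warning]
    for warned in (n for n in nodes if n.interruption_warning):
        while warned.tasks:
            task = warned.tasks.pop()
            target = next((n for n in healthy if len(n.tasks) < n.slots), None)
            if target is None:
                target = provision_on_demand()  # fall back to on-demand capacity
                healthy.append(target)
            target.tasks.append(task)
    return healthy


if __name__ == "__main__":
    warned = Node("spot-1", "spot", 2, ["task-a", "task-b"], interruption_warning=True)
    spare = Node("spot-2", "spot", 1)
    remaining = drain_warned_nodes(
        [warned, spare], lambda: Node("od-1", "on_demand", 4)
    )
    for node in remaining:
        print(node.name, node.capacity_type, node.tasks)
```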
Agent 4: Cost Anomaly Detection
Uses statistical analysis to detect unusual cost patterns — a sudden spike in cluster count, an instance type change that doubles cost, or a new resource that wasn't budgeted. Sends alerts via webhook, Slack, or email.
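The post does not say which statistical method the agent uses. As a simple stand-in, the Python sketch below flags a day's cost when it sits several standard deviations above the recent mean (a z-score test). The function name and the default threshold of 3 are assumptions for illustration.

```python
from statistics import mean, stdev


def is_cost_anomaly(history, today, z_threshold=3.0):
    """Return True if today's cost is unusually high versus recent history.

    history: recent daily costs in dollars (at least two values needed)
    today:   today's cost in dollars
    """
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: treat any increase as anomalous
        return today > mu
    return (today - mu) / sigma > z_threshold


if __name__ == "__main__":
    last_week = [100, 102, 98, 101, 99, 100, 100]
    print(is_cost_anomaly(last_week, 101))  # ordinary day
    print(is_cost_anomaly(last_week, 180))  # sudden spike
```

A z-score is easy to reason about but sensitive to seasonality, so weekly or monthly billing cycles would likely need a more robust baseline.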
Agent 5: Tag Enforcer
Ensures every cluster and resource has the required cost allocation tags (cost center, team, environment, project). Can warn, auto-tag with defaults, or terminate untagged resources based on policy.
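The three policy modes can be sketched in a few lines of Python. The tag keys, the `enforce_tags` function, and the `"unassigned"` placeholder are hypothetical choices for this example. The actual agent's configuration format may differ.

```python
# Hypothetical key names for the four required tags mentioned above
REQUIRED_TAGS = ("cost_center", "team", "environment", "project")


def enforce_tags(resource_tags, policy="warn", defaults=None):
    """Decide what to do with one resource based on its tags.

    Returns (action, tags), where action is one of
    "ok", "warn", "auto_tag", or "terminate".
    """
    missing = [t for t in REQUIRED_TAGS if not resource_tags.get(t)]
    if not missing:
        return "ok", dict(resource_tags)
    if policy == "warn":
        return "warn", dict(resource_tags)
    if policy == "auto_tag":
        patched = dict(resource_tags)
        for tag in missing:
            # Fill each missing tag from defaults, or mark it for follow-up
            patched[tag] = (defaults or {}).get(tag, "unassigned")
        return "auto_tag", patched
    if policy == "terminate":
        return "terminate", dict(resource_tags)
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    tags = {"team": "data-eng", "environment": "dev"}
    print(enforce_tags(tags, policy="warn"))
    print(enforce_tags(tags, policy="auto_tag", defaults={"project": "sandbox"}))
```

Starting with the warn policy and moving to auto-tagging or termination later follows the same build-confidence-first approach as dry-run mode.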
How to Get Started
Getting started takes about 5 minutes:
- Install the package: pip install digital-tap
- Configure your platform connection (service principal for Databricks, IAM role for EMR, etc.)
- Run an agent in dry-run mode to see what it would do
- Review the output and adjust thresholds
- Enable live mode when you're confident
The documentation at github.com/digital-tap includes quick-start guides for each platform, a configuration reference, and deployment guides for running agents as Kubernetes services or cron jobs.
Contributing
We welcome contributions. The areas where we'd particularly love community input:
- New platform connectors — Snowflake, BigQuery, Redshift, and other data platforms
- New agents — Storage optimization, network cost analysis, reserved instance recommendations
- Improved heuristics — Better idle detection for specific workload patterns
- Testing — More platforms, more edge cases, more environments
- Documentation — Deployment guides, best practices, case studies
Every contribution that ships in the open-source release also benefits Digital Tap platform customers — it's a positive-sum game. Contributors get recognition, the community gets better tools, and the platform gets a stronger foundation.
Open Source vs. Platform: Which Should You Use?
Here's an honest comparison:
Use the open-source agents if:
- Your cloud data spend is under $50K/month
- You have platform engineers who can deploy and manage the agents
- You want to start with basic optimization and grow from there
- You need to audit the code before trusting it with production infrastructure
Use the Digital Tap platform if:
- Your spend exceeds $50K/month (at that scale, the savings are expected to more than cover the subscription cost)
- You want coordinated multi-agent optimization with ML-powered prediction
- You need enterprise features: audit logs, approval workflows, compliance
- You want a savings guarantee and managed infrastructure
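For a rough sense of scale, the savings ranges quoted earlier in this post (15-20% for the open-source agents, typically 35-42% for the full platform) can be applied to a monthly bill. The Python sketch below is simple arithmetic on those quoted percentages. It ignores subscription cost, which this post does not state, and actual results will vary by workload.

```python
OPEN_SOURCE_PCT = (15, 20)  # range quoted in this post for the open-source agents
PLATFORM_PCT = (35, 42)     # "typical" range quoted for the full platform


def monthly_savings_range(monthly_spend, pct_range):
    """Return (low, high) estimated monthly savings in dollars."""
    low_pct, high_pct = pct_range
    return monthly_spend * low_pct / 100, monthly_spend * high_pct / 100


if __name__ == "__main__":
    spend = 50_000  # the $50K/month threshold used in the comparison above
    for label, pct in (("open source", OPEN_SOURCE_PCT), ("platform", PLATFORM_PCT)):
        low, high = monthly_savings_range(spend, pct)
        print(f"{label}: ${low:,.0f} to ${high:,.0f} per month")
```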
Many of our best customers started with the open-source agents, saw the value, and upgraded to the platform when their infrastructure grew. That's exactly the journey we designed.
"Open source isn't about giving away value. It's about proving value — transparently, verifiably, and at scale. The best products don't need to hide their code."
What's Next
This is version 1.0 of the open-source release. Over the coming months, we'll be adding:
- Kubernetes-native deployment — Helm charts for running agents as K8s services
- Terraform provider — Infrastructure-as-code for agent configuration
- Additional agents — Storage optimization and network cost analysis
- Community ML models — Pre-trained prediction models that work without the full platform
We believe cloud optimization should be accessible to every organization, regardless of size or budget. Open-sourcing our agents is our commitment to that belief. Star the repo, try the agents, and let us know what you think.