James Bland e88609d724
feat: crud-pricing initial implementation
Complete CRUD service for AWS pricing operations - single source of truth.

Features:
- Dual pricing model (retail + account-specific with auto EDP/PPA detection)
- Get/Put pricing operations with intelligent caching
- AWS Pricing API integration for public list prices
- AWS Cost Explorer integration for account-specific pricing
- Access counting for self-learning 14-day refresh
- Query most-accessed instances (powers smart refresh)
- TTL: 30 days (retail), 7 days (account-specific)

Architecture:
- All other lambdas use this for pricing operations
- No direct DynamoDB access from other components
- Consistent schema enforcement
- Complete IAM setup for Pricing API, Cost Explorer, STS

Infrastructure:
- Complete Terraform configuration
- Full CI/CD pipeline (Jenkinsfile)
- Comprehensive documentation
- Production-ready scaffolding

Part of Phase 1 - foundation for pricing system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 04:20:56 -05:00


# crud-pricing - AWS Pricing CRUD Operations
**Single source of truth for all AWS pricing operations.**
## What It Does
This Lambda is the centralized CRUD service for AWS pricing data:
- **Get/Put pricing** in DynamoDB cache
- **Fetch from AWS Pricing API** (retail public prices)
- **Fetch from Cost Explorer** (account-specific with auto EDP/PPA!)
- **Access counting** for self-learning refresh
- **Query most-accessed instances** for smart 14-day refresh
## Why This Exists
**Problem:** Multiple lambdas touching pricing table = schema duplication
**Solution:** Single CRUD Lambda pattern
- All pricing operations go through one place
- Consistent schema enforcement
- Single place to evolve pricing logic
- Reusable across all components
## Architecture
```
tool-pricing-query (MCP tool)        }
enrichment-server-pricing (Lambda)   } → crud-pricing (CRUD) → DynamoDB
tool-pricing-refresh (Scheduler)     }          ↓
                                         AWS Pricing API
                                         AWS Cost Explorer
```
**Used by:**
- `tool-pricing-query` - MCP tool wrapper for agents
- `enrichment-server-pricing` - Server pricing enrichment
- `tool-pricing-refresh` - 14-day automated refresh
**This Lambda is NOT registered with AgentCore** - it's internal CRUD only.
## Operations
### 1. Get - Retrieve pricing (with optional AWS fetch)
```json
{
  "operation": {
    "type": "get",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1",
    "pricingType": "retail",
    "fetchIfMissing": true
  }
}
```
**Response:**
```json
{
  "pricing": {
    "instanceType": "m6g.xlarge",
    "region": "us-east-1",
    "pricingType": "retail",
    "ec2Pricing": {
      "vcpus": 4,
      "memoryGb": 16.0,
      "onDemand": {
        "hourly": 0.154,
        "monthly": 112.42
      },
      "reserved": {
        "standard": {
          "1yr": {
            "allUpfront": {
              "effectiveMonthly": 72.27,
              "totalUpfront": 867.24
            }
          }
        }
      }
    },
    "accessCount": 127,
    "lastUpdated": "2025-11-27T00:00:00Z"
  },
  "cacheStatus": "hit"
}
```
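The figures in this sample response are consistent with simple arithmetic: `112.42` is `0.154 × 730`, and `72.27` is `867.24 / 12`. The sketch below reproduces that, assuming a 730-hour month (8,760 hours / 12); the constant and function names are illustrative and not taken from the service code.

```rust
/// Hours per month used to turn an hourly rate into a monthly figure.
/// Assumption: 730 (8,760 / 12), which matches 0.154 -> 112.42 in the sample.
const HOURS_PER_MONTH: f64 = 730.0;

/// Round to two decimal places (cents).
fn round_cents(value: f64) -> f64 {
    (value * 100.0).round() / 100.0
}

/// Monthly on-demand cost derived from the hourly rate.
fn monthly_from_hourly(hourly: f64) -> f64 {
    round_cents(hourly * HOURS_PER_MONTH)
}

/// Effective monthly cost of an all-upfront reservation spread over its term.
fn effective_monthly(total_upfront: f64, term_months: u32) -> f64 {
    round_cents(total_upfront / f64::from(term_months))
}
```

For example, `monthly_from_hourly(0.154)` gives the sample's `monthly` value, and `effective_monthly(867.24, 12)` gives its `effectiveMonthly` value.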
### 2. Put - Store pricing data
```json
{
  "operation": {
    "type": "put",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1",
    "pricingType": "retail",
    "pricingData": { /* PricingData object */ }
  }
}
```
### 3. ListCommon - Get most-accessed instances
```json
{
  "operation": {
    "type": "listCommon",
    "limit": 50,
    "minAccessCount": 5
  }
}
```
**Response:**
```json
{
  "instances": [
    {
      "instanceType": "m6g.xlarge",
      "region": "us-east-1",
      "accessCount": 347,
      "lastAccessed": "2025-11-27T10:30:00Z",
      "lastUpdated": "2025-11-20T00:00:00Z"
    }
  ],
  "count": 50
}
```
**Used by tool-pricing-refresh to discover common instances!**
### 4. IncrementAccess - Track usage
```json
{
  "operation": {
    "type": "incrementAccess",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1"
  }
}
```
### 5. QueryAwsApi - Direct AWS Pricing API call
```json
{
  "operation": {
    "type": "queryAwsApi",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1"
  }
}
```
### 6. QueryCostExplorer - Account-specific pricing (with EDP/PPA!)
```json
{
  "operation": {
    "type": "queryCostExplorer",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1",
    "awsAccountId": "123456789012",
    "roleArn": "arn:aws:iam::123456789012:role/pathfinder-pricing-access"
  }
}
```
**This automatically includes EDP/PPA discounts** by querying actual costs from the customer's account!
## Dual Pricing Model
### Retail Pricing (Public AWS List Prices)
**Cache Key:** `PK=INSTANCE#m6g.xlarge, SK=REGION#us-east-1#RETAIL`
**Source:** AWS Pricing API
**Use when:**
- No customer AWS account access
- Planning/estimates for new customers
- Baseline pricing comparisons
**TTL:** 30 days
### Account-Specific Pricing (Includes EDP/PPA Automatically!)
**Cache Key:** `PK=INSTANCE#m6g.xlarge, SK=REGION#us-east-1#ACCOUNT#123456789012`
**Source:** AWS Cost Explorer (from customer account)
**Use when:**
- Customer grants IAM role access
- Need actual cost projections
- Want automatic EDP/PPA discount detection
**TTL:** 7 days (more current)
**How EDP/PPA Auto-Detection Works:**
1. Assume IAM role into customer's AWS account (requires trust policy)
2. Query Cost Explorer for past 30 days of actual usage
3. Calculate average hourly rate from real costs
4. This rate automatically includes any EDP/PPA discounts!
5. No manual discount configuration needed
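Steps 3 and 4 above come down to dividing total cost by total usage hours. The following is a minimal sketch of that calculation, assuming the Cost Explorer results have already been reduced to one cost and one usage-hours figure per day; the `DailyUsage` type and both function names are hypothetical and not part of the service.

```rust
/// One day of Cost Explorer data for a single instance type.
/// Hypothetical shape: the real service types may differ.
struct DailyUsage {
    cost_usd: f64,    // cost billed for the day, discounts already applied
    usage_hours: f64, // instance-hours consumed that day
}

/// Average hourly rate over the window: total cost / total hours.
/// Returns None when there is no usage, so the caller can fall back to retail.
fn effective_hourly_rate(days: &[DailyUsage]) -> Option<f64> {
    let total_cost: f64 = days.iter().map(|d| d.cost_usd).sum();
    let total_hours: f64 = days.iter().map(|d| d.usage_hours).sum();
    if total_hours > 0.0 {
        Some(total_cost / total_hours)
    } else {
        None
    }
}

/// Discount implied by comparing the effective rate with the retail rate,
/// expressed as a percentage of retail.
fn implied_discount_pct(retail_hourly: f64, effective_hourly: f64) -> f64 {
    (1.0 - effective_hourly / retail_hourly) * 100.0
}
```

With 30 days of 24 hours each at $3.3264 per day, the effective rate is about $0.1386 per hour, which is roughly a 10% discount against the $0.154 retail rate. The `None` case corresponds to the `No usage data found` error, where falling back to retail pricing is the suggested mitigation.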
## DynamoDB Schema
### Pricing Items
```typescript
{
  // Keys
  "PK": "INSTANCE#m6g.xlarge",
  "SK": "REGION#us-east-1#RETAIL", // or "REGION#us-east-1#ACCOUNT#123456789012"

  // GSI for access queries
  "GSI1PK": "PRICING",
  "accessCount": 127, // Incremented on each query

  // Pricing data (stored as JSON)
  "pricingData": "{...}", // Full PricingData struct serialized

  // Metadata (for querying)
  "instanceType": "m6g.xlarge",
  "region": "us-east-1",
  "pricingType": "retail",
  "lastUpdated": "2025-11-27T00:00:00Z",
  "lastAccessed": "2025-11-27T10:30:00Z",
  "firstCached": "2025-11-01T08:00:00Z",

  // TTL (epoch seconds): lastUpdated + 30 days for retail, + 7 days for account-specific
  "expiresAt": 1766793600 // 2025-12-27T00:00:00Z
}
```
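The `PK`/`SK` pair in this schema can be derived from the instance type, region, and pricing type. A minimal sketch follows, assuming the key formats shown above; the `PricingType` enum and `cache_key` function are illustrative names, not the actual code in `db.rs`.

```rust
/// Which flavor of pricing a cache item holds (illustrative type).
enum PricingType {
    Retail,
    AccountSpecific { aws_account_id: String },
}

/// Build the (PK, SK) pair for a pricing item, following the formats
/// `INSTANCE#<type>` and `REGION#<region>#RETAIL` / `REGION#<region>#ACCOUNT#<id>`.
fn cache_key(instance_type: &str, region: &str, pricing_type: &PricingType) -> (String, String) {
    let pk = format!("INSTANCE#{instance_type}");
    let sk = match pricing_type {
        PricingType::Retail => format!("REGION#{region}#RETAIL"),
        PricingType::AccountSpecific { aws_account_id } => {
            format!("REGION#{region}#ACCOUNT#{aws_account_id}")
        }
    };
    (pk, sk)
}
```

Keeping key construction in one function is one way to get the consistent schema enforcement this service is meant to provide: retail and account-specific items for the same instance share a partition and differ only in the sort key.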
### GSI: AccessCountIndex
**Purpose:** Query most-accessed instances for refresh
**Keys:** `GSI1PK=PRICING, accessCount (range key, numeric)`
**Usage:**
```rust
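use aws_sdk_dynamodb::types::AttributeValue;
// Get top 50 most-accessed instances (`client` is an aws_sdk_dynamodb::Client)
let result = client
    .query()
    .table_name(table_name)
    .index_name("AccessCountIndex")
    .key_condition_expression("GSI1PK = :pk")
    .expression_attribute_values(":pk", AttributeValue::S("PRICING".to_string()))
    .scan_index_forward(false) // Descending by accessCount
    .limit(50)
    .send()
    .await?;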
```
## Self-Learning Refresh Strategy
**How it works:**
1. **Enrichment drives caching**
- Server created → enrichment queries pricing
- Cache miss → fetch from AWS API → cache for 30 days
- Cache hit → fast response
2. **Access counting tracks popularity**
- Every query increments `accessCount`
- Popular instances accumulate high counts
- Unpopular instances stay at low counts
3. **14-day refresh uses actual patterns**
- Query top 50 by `accessCount`
- Refresh only what's actually being used
- Adapts automatically as usage changes
**Result:** Cache reflects real usage patterns, no hardcoded lists!
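The selection in step 3 can be illustrated in memory. In the service itself the `AccessCountIndex` GSI does the ordering; the sketch below only mirrors the `limit` and `minAccessCount` semantics of `listCommon` on a plain vector, and the `AccessRecord` type and `refresh_candidates` function are hypothetical names.

```rust
/// Minimal view of a cached item for refresh selection (illustrative type).
#[derive(Debug, Clone, PartialEq)]
struct AccessRecord {
    instance_type: String,
    region: String,
    access_count: u64,
}

/// Pick refresh candidates: drop rarely used entries, sort by access count
/// (highest first), and keep at most `limit` of them.
fn refresh_candidates(
    mut records: Vec<AccessRecord>,
    limit: usize,
    min_access_count: u64,
) -> Vec<AccessRecord> {
    records.retain(|r| r.access_count >= min_access_count);
    records.sort_by(|a, b| b.access_count.cmp(&a.access_count));
    records.truncate(limit);
    records
}
```

Because the list is recomputed from current counts on every refresh run, an instance type that stops being queried eventually drops out of the top 50 without any configuration change.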
## IAM Requirements
### This Lambda Needs:
**DynamoDB:**
```json
{
  "Effect": "Allow",
  "Action": [
    "dynamodb:GetItem",
    "dynamodb:PutItem",
    "dynamodb:UpdateItem",
    "dynamodb:Query"
  ],
  "Resource": [
    "arn:aws:dynamodb:REGION:ACCOUNT:table/pathfinder-ENV-pricing",
    "arn:aws:dynamodb:REGION:ACCOUNT:table/pathfinder-ENV-pricing/index/AccessCountIndex"
  ]
}
```
**AWS Pricing API:**
```json
{
  "Effect": "Allow",
  "Action": [
    "pricing:GetProducts",
    "pricing:DescribeServices"
  ],
  "Resource": "*"
}
```
**STS (for assuming roles):**
```json
{
  "Effect": "Allow",
  "Action": "sts:AssumeRole",
  "Resource": "arn:aws:iam::*:role/pathfinder-pricing-access"
}
```
### Customer Account Needs:
**IAM Role:** `pathfinder-pricing-access`
**Permissions:**
```json
{
  "Effect": "Allow",
  "Action": [
    "ce:GetCostAndUsage",
    "ce:GetCostForecast"
  ],
  "Resource": "*"
}
```
**Trust Policy:**
```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::OUR-ACCOUNT:role/crud-pricing-lambda-role"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "pathfinder-unique-external-id"
      }
    }
  }]
}
```
## Development
### Build
```bash
./build.sh
# Or manually:
cargo lambda build --release --arm64 --output-format zip
```
### Test
```bash
# Run unit tests
cargo test
# Test locally
cargo run -- '{
  "operation": {
    "type": "get",
    "instanceType": "m6g.xlarge",
    "region": "us-east-1",
    "pricingType": "retail",
    "fetchIfMissing": true
  }
}'
```
### Deploy
```bash
cd terraform
terraform init
terraform apply \
  -var="environment=dev" \
  -var="aws_region=us-east-1"
```
## Testing Deployed Lambda
### Get Retail Pricing (Cache Miss → Fetch from API)
```bash
# AWS CLI v2 needs --cli-binary-format raw-in-base64-out to accept a raw JSON payload
aws lambda invoke \
  --function-name airun-pathfinder-crud-pricing-dev \
  --cli-binary-format raw-in-base64-out \
  --payload '{
    "operation": {
      "type": "get",
      "instanceType": "m6g.xlarge",
      "region": "us-east-1",
      "pricingType": "retail",
      "fetchIfMissing": true
    }
  }' \
  output.json
cat output.json | jq .
```
### Get Account-Specific Pricing (With EDP/PPA)
```bash
aws lambda invoke \
  --function-name airun-pathfinder-crud-pricing-dev \
  --cli-binary-format raw-in-base64-out \
  --payload '{
    "operation": {
      "type": "queryCostExplorer",
      "instanceType": "m6g.xlarge",
      "region": "us-east-1",
      "awsAccountId": "123456789012",
      "roleArn": "arn:aws:iam::123456789012:role/pathfinder-pricing-access"
    }
  }' \
  output.json
cat output.json | jq '.body.pricing.ec2Pricing.onDemand'
# Shows actual pricing with EDP/PPA discount included!
```
### List Most-Accessed Instances
```bash
aws lambda invoke \
  --function-name airun-pathfinder-crud-pricing-dev \
  --cli-binary-format raw-in-base64-out \
  --payload '{
    "operation": {
      "type": "listCommon",
      "limit": 50,
      "minAccessCount": 5
    }
  }' \
  output.json
cat output.json | jq '.body.instances | .[0:10]'
# Shows top 10 most-accessed instances
```
## Environment Variables
```hcl
environment_variables = {
  RUST_LOG   = "info"
  TABLE_NAME = "pathfinder-dev-pricing"
}
```
## Integration Examples
### From tool-pricing-query
```rust
// tool-pricing-query calls crud-pricing instead of direct DynamoDB
let payload = json!({
    "operation": {
        "type": "get",
        "instanceType": instance_type,
        "region": region,
        "pricingType": "retail",
        "fetchIfMissing": true
    }
});
let result = lambda_client
    .invoke()
    .function_name("airun-pathfinder-crud-pricing-dev")
    .payload(Blob::new(serde_json::to_vec(&payload)?))
    .send()
    .await?;
```
### From enrichment-server-pricing
```rust
// Enrichment lambda calls crud-pricing to get pricing
let payload = json!({
    "operation": {
        "type": "get",
        "instanceType": "m6g.xlarge",
        "region": "us-east-1",
        "pricingType": "accountSpecific",
        "awsAccountId": project.aws_account_id,
        "fetchIfMissing": true
    }
});
let pricing = invoke_crud_pricing(payload).await?;
```
### From tool-pricing-refresh
```rust
// Refresh lambda discovers common instances
// Step 1: Get most-accessed
let common = invoke_crud_pricing(json!({
    "operation": {
        "type": "listCommon",
        "limit": 50,
        "minAccessCount": 5
    }
})).await?;
// Step 2: Refresh each one
for instance in common.instances {
    invoke_crud_pricing(json!({
        "operation": {
            "type": "get",
            "instanceType": instance.instanceType,
            "region": instance.region,
            "pricingType": "retail",
            "fetchIfMissing": true // Forces refresh
        }
    })).await?;
}
```
## Cache Strategy
### Retail Pricing
- **TTL:** 30 days
- **Refresh:** 14-day automatic (top 50 instances)
- **Coverage:** All instance types (on-demand fetch)
### Account-Specific Pricing
- **TTL:** 7 days (shorter for more current costs)
- **Refresh:** Not automatic (requires account access)
- **Coverage:** Only instances with usage in customer account
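The two TTLs translate into an `expiresAt` value in epoch seconds. Here is a small sketch of that calculation, assuming the timestamp is computed from the write time; the function names are illustrative.

```rust
const SECONDS_PER_DAY: u64 = 86_400;

/// TTL in days for each pricing type, as stated in this README.
fn ttl_days(is_account_specific: bool) -> u64 {
    if is_account_specific { 7 } else { 30 }
}

/// `expiresAt` value (epoch seconds) for an item written at `now_epoch_secs`.
fn expires_at(now_epoch_secs: u64, is_account_specific: bool) -> u64 {
    now_epoch_secs + ttl_days(is_account_specific) * SECONDS_PER_DAY
}

/// True once the item has passed its TTL.
fn is_expired(expires_at: u64, now_epoch_secs: u64) -> bool {
    now_epoch_secs >= expires_at
}
```

For an item written at 2025-11-27T00:00:00Z (`1764201600`), the retail expiry is `1766793600` and the account-specific expiry is `1764806400`. DynamoDB removes TTL-expired items in the background rather than at the exact expiry time, so a read-side check like `is_expired` may still be worth having.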
### Access Counting
- Incremented on every Get operation
- Powers self-learning refresh
- GSI allows efficient queries
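The interaction between cache lookup, access counting, and `fetchIfMissing` can be sketched with an in-memory map standing in for DynamoDB. This is an assumption-laden illustration, not the service's implementation: `CachedPrice`, `CacheStatus`, and `get_pricing` are hypothetical names, the `fetch` closure stands in for the AWS Pricing API call, and counting the first fetch as one access is a choice made for this example.

```rust
use std::collections::HashMap;

/// Simplified cache entry (illustrative; the real item holds full PricingData).
#[derive(Debug, Clone, PartialEq)]
struct CachedPrice {
    hourly: f64,
    access_count: u64,
}

#[derive(Debug, PartialEq)]
enum CacheStatus {
    Hit,
    Miss,
}

/// Look up a price, count the access, and optionally fetch on a miss.
fn get_pricing<F>(
    cache: &mut HashMap<String, CachedPrice>,
    key: &str,
    fetch_if_missing: bool,
    fetch: F,
) -> Result<(CachedPrice, CacheStatus), String>
where
    F: FnOnce() -> Result<f64, String>,
{
    // Cache hit: bump the counter and return the stored entry.
    if let Some(entry) = cache.get_mut(key) {
        entry.access_count += 1;
        return Ok((entry.clone(), CacheStatus::Hit));
    }
    // Cache miss without permission to fetch: report the documented error.
    if !fetch_if_missing {
        return Err("Pricing not found and fetch_if_missing=false".to_string());
    }
    // Cache miss with fetch: call the upstream source and store the result.
    let hourly = fetch()?;
    let entry = CachedPrice { hourly, access_count: 1 };
    cache.insert(key.to_string(), entry.clone());
    Ok((entry, CacheStatus::Miss))
}
```

Calling `get_pricing` twice for the same key with `fetch_if_missing = true` yields a `Miss` followed by a `Hit`, and the second call never invokes `fetch`.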
## Pricing API Details
### AWS Pricing API (Retail)
**What we get:**
- OnDemand hourly/monthly rates
- Reserved pricing (Standard/Convertible, 1yr/3yr, all payment options)
- Instance specs (vCPUs, memory, architecture)
**What we DON'T get:**
- Spot pricing (need EC2 Spot Price API)
- Customer-specific discounts (need Cost Explorer)
**Rate Limits:**
- 1,000,000 requests/month free
- Then $0.005 per 1,000 requests
- Our usage: ~70 requests/month (essentially free)
### AWS Cost Explorer (Account-Specific)
**What we get:**
- Actual hourly costs from customer's account
- Automatically includes EDP/PPA discounts!
- Real usage-based pricing
**What we DON'T get:**
- Instance specs (we fetch separately from Pricing API)
- Reserved/Spot breakdowns (just blended costs)
**Requirements:**
- Must assume role into customer account
- Customer must have Cost Explorer enabled
- Needs 30 days of usage history
## Error Handling
| Error | Cause | Mitigation |
|-------|-------|------------|
| `Pricing not found and fetch_if_missing=false` | Cache miss, no fetch | Set `fetchIfMissing=true` |
| `AWS Pricing API call failed` | API throttling, network issue | Retry with exponential backoff |
| `Failed to assume role` | IAM permissions, invalid role ARN | Check trust policy and role exists |
| `No usage data found` | No historical usage in account | Fall back to retail pricing |
| `No pricing found for instance` | Invalid instance type | Validate instance type exists |
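The "retry with exponential backoff" mitigation in the table can be sketched as a small generic helper. This version is synchronous and uses only the standard library to stay self-contained; inside the async Lambda handler, a non-blocking sleep from the async runtime would be the more natural choice. The function name and parameters are illustrative.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each failure.
/// `op` receives the 1-based attempt number and is always called at least once.
fn retry_with_backoff<T, E, F>(max_attempts: u32, base_delay: Duration, mut op: F) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Delay grows as base, 2x base, 4x base, ...
                sleep(base_delay * 2u32.pow(attempt - 1));
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap the Pricing API request in the closure, for example `retry_with_backoff(4, Duration::from_millis(200), |_| fetch_price())`, where `fetch_price` is a placeholder for the real call. Adding random jitter to the delay is a common refinement when many callers may retry at once.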
## Performance
**Expected latencies:**
- Cache hit: 10-20ms (DynamoDB query)
- Cache miss (Pricing API): 2-3 seconds (API call + parse)
- Cost Explorer: 3-5 seconds (assume role + API call)
**Access count update:** Async (doesn't add latency)
## Code Structure
```
src/
├── main.rs # Lambda handler and operation router
├── models.rs # Data types and schemas
├── db.rs # DynamoDB CRUD operations
├── aws_pricing.rs # AWS Pricing API client
└── cost_explorer.rs # Cost Explorer client (EDP/PPA detection)
```
**Total:** ~1,100 lines of well-structured Rust
## Testing Strategy
**Unit Tests:**
- [ ] Test all DynamoDB operations
- [ ] Test pricing API response parsing
- [ ] Test Cost Explorer response parsing
- [ ] Test key generation (retail vs account-specific)
- [ ] Test expiration logic
- [ ] Test access counting
**Integration Tests:**
- [ ] Test full Get operation (cache miss → fetch → cache)
- [ ] Test access count increment
- [ ] Test ListCommon with various filters
- [ ] Test account-specific pricing flow
## Monitoring
**CloudWatch Metrics to Add:**
```
// Pseudocode: custom metrics to publish under the "CrudPricing" namespace
putMetric("CrudPricing", {
  "CacheHitRate": hits / total * 100,   // percent
  "AvgResponseTime": avg_latency_ms,    // milliseconds
  "ApiCallCount": api_calls,
  "AccessCountUpdates": count_updates,
});
```
**Alarms:**
- High cache miss rate (>20%)
- Slow response time (>5s)
- API failures (>5 in 5 minutes)
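The thresholds above can be expressed as a simple check over counters collected in one evaluation window (five minutes, to match the API-failure alarm). In practice CloudWatch alarms would do this evaluation; the sketch only makes the thresholds concrete, and the `WindowStats` type and alarm names are hypothetical.

```rust
/// Counters collected over one evaluation window (illustrative names).
struct WindowStats {
    cache_hits: u64,
    cache_misses: u64,
    max_response_ms: u64,
    api_failures: u64,
}

/// Cache miss rate as a percentage; 0.0 when there was no traffic.
fn miss_rate_pct(stats: &WindowStats) -> f64 {
    let total = stats.cache_hits + stats.cache_misses;
    if total == 0 {
        0.0
    } else {
        // Convert before dividing so integer division does not truncate to 0.
        stats.cache_misses as f64 / total as f64 * 100.0
    }
}

/// Names of the alarms that would fire for this window.
fn firing_alarms(stats: &WindowStats) -> Vec<&'static str> {
    let mut alarms = Vec::new();
    if miss_rate_pct(stats) > 20.0 {
        alarms.push("HighCacheMissRate");
    }
    if stats.max_response_ms > 5_000 {
        alarms.push("SlowResponse");
    }
    if stats.api_failures > 5 {
        alarms.push("ApiFailures");
    }
    alarms
}
```

A window with 70 hits, 30 misses, a 6-second worst-case response, and 2 API failures would raise the miss-rate and slow-response alarms but not the API-failure one.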
## Related Components
- `tool-pricing-query` - MCP tool that wraps this CRUD service
- `enrichment-server-pricing` - Uses this to get pricing during enrichment
- `tool-pricing-refresh` - Uses ListCommon to discover instances to refresh
- `pathfinder-dev-pricing` - DynamoDB table managed by this service
## Next Steps
After deployment:
1. Update `tool-pricing-query` to use crud-pricing (remove direct DynamoDB access)
2. Create `enrichment-server-pricing` that uses crud-pricing
3. Create `tool-pricing-refresh` that uses ListCommon operation
4. Update DynamoDB table with AccessCountIndex GSI
See `PRICING-ENHANCEMENTS-PLAN.md` for complete roadmap.