Draft
2 changes: 2 additions & 0 deletions .gitignore
@@ -28,3 +28,5 @@ bottlecap/proptest-regressions

.gitlab/pipeline*
/CLAUDE.md
implementation-plan.md
research-summary.md
2 changes: 1 addition & 1 deletion .gitlab/config.yaml
@@ -9,7 +9,7 @@ outputFiles:
datasources:
flavors:
url: .gitlab/datasources/flavors.yaml

environments:
url: .gitlab/datasources/environments.yaml

93 changes: 66 additions & 27 deletions .gitlab/templates/pipeline.yaml.tpl
@@ -324,48 +324,83 @@ signed layer bundle:
- mkdir -p datadog_extension-signed-bundle-${CI_JOB_ID}
- cp .layers/datadog_extension-*.zip datadog_extension-signed-bundle-${CI_JOB_ID}

# Integration Tests - Build Java Lambda function
build java lambda:
# Integration Tests - Build Lambda functions in parallel by runtime

build java lambdas:
stage: integration-tests
image: registry.ddbuild.io/images/docker:27.3.1
tags: ["docker-in-docker:arm64"]
rules:
- when: on_success
needs: []
cache:
key: maven-cache-${CI_COMMIT_REF_SLUG}
paths:
- integration-tests/.cache/maven/
artifacts:
expire_in: 1 hour
paths:
- integration-tests/lambda/*/target/
script:
- cd integration-tests
- ./scripts/build-java.sh

build dotnet lambdas:
stage: integration-tests
image: registry.ddbuild.io/images/docker:27.3.1
tags: ["docker-in-docker:arm64"]
rules:
- when: on_success
needs: []
cache:
key: nuget-cache-${CI_COMMIT_REF_SLUG}
paths:
- integration-tests/.cache/nuget/
artifacts:
expire_in: 1 hour
paths:
- integration-tests/lambda/*/bin/
script:
- cd integration-tests
- ./scripts/build-dotnet.sh

build python lambdas:
stage: integration-tests
image: registry.ddbuild.io/images/docker:27.3.1
tags: ["docker-in-docker:arm64"]
rules:
- when: on_success
needs: []
cache:
key: pip-cache-${CI_COMMIT_REF_SLUG}
paths:
- integration-tests/.cache/pip/
artifacts:
expire_in: 1 hour
paths:
- integration-tests/lambda/base-java/target/
- integration-tests/lambda/*/package/
script:
- cd integration-tests/lambda/base-java
- docker run --rm --platform linux/arm64
-v "$(pwd)":/workspace
-w /workspace
maven:3.9-eclipse-temurin-21-alpine
mvn clean package

# Integration Tests - Build .NET Lambda function
build dotnet lambda:
- cd integration-tests
- ./scripts/build-python.sh

build node lambdas:
stage: integration-tests
image: registry.ddbuild.io/images/docker:27.3.1
tags: ["docker-in-docker:arm64"]
rules:
- when: on_success
needs: []
cache:
key: npm-cache-${CI_COMMIT_REF_SLUG}
paths:
- integration-tests/.cache/npm/
artifacts:
expire_in: 1 hour
paths:
- integration-tests/lambda/base-dotnet/bin/
- integration-tests/lambda/*/node_modules/
script:
- cd integration-tests/lambda/base-dotnet
- docker run --rm --platform linux/arm64
-v "$(pwd)":/workspace
-w /workspace
mcr.microsoft.com/dotnet/sdk:8.0-alpine
sh -c "apk add --no-cache zip &&
dotnet tool install -g Amazon.Lambda.Tools || true &&
export PATH=\"\$PATH:/root/.dotnet/tools\" &&
dotnet lambda package -o bin/function.zip --function-architecture arm64"
- cd integration-tests
- ./scripts/build-node.sh

# Integration Tests - Publish arm64 layer with integration test prefix
publish integration layer (arm64):
@@ -405,11 +440,15 @@ integration-deploy:
- when: on_success
needs:
- publish integration layer (arm64)
- build java lambda
- build dotnet lambda
- build java lambdas
- build dotnet lambdas
- build python lambdas
- build node lambdas
dependencies:
- build java lambda
- build dotnet lambda
- build java lambdas
- build dotnet lambdas
- build python lambdas
- build node lambdas
variables:
IDENTIFIER: ${CI_COMMIT_SHORT_SHA}
AWS_DEFAULT_REGION: us-east-1
@@ -428,7 +467,7 @@ integration-deploy:
- export CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
- export CDK_DEFAULT_REGION=us-east-1
- npm run build
- npx cdk deploy "integ-$IDENTIFIER-*" --require-approval never
- npx cdk deploy "integ-$IDENTIFIER-*" --require-approval never --concurrency 10

# Integration Tests - Run Jest test suite
integration-test:
176 changes: 176 additions & 0 deletions bundling-research.md
@@ -0,0 +1,176 @@
# Lambda Function Bundling Research

## Current State Analysis

### 1. Java Functions (`base-java`, `otlp-java`)
**Bundling Approach:** Pre-built artifacts
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/base-java/target/function.jar')`
- **Build Process:** Maven compilation in Docker
- **Artifacts:** `target/function.jar`
- **Local Build:** `./lambda/base-java/build.sh`
- **CI Build:** Separate pipeline job with Docker-in-Docker
- **Status:** ✅ **Consistent** - works locally and in CI

### 2. .NET Functions (`base-dotnet`, `otlp-dotnet`)
**Bundling Approach:** Pre-built artifacts
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/base-dotnet/bin/function.zip')`
- **Build Process:** dotnet CLI in Docker
- **Artifacts:** `bin/function.zip`
- **Local Build:** `./lambda/base-dotnet/build.sh`
- **CI Build:** Separate pipeline job with Docker-in-Docker
- **Status:** ✅ **Consistent** - works locally and in CI

### 3. Python Functions
#### `base-python`
**Bundling Approach:** Direct directory reference (no build)
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/base-python')`
- **Dependencies:** None (uses Datadog layer)
- **Build Process:** None needed
- **Status:** ✅ Works in CI and locally

#### `otlp-python`
**Bundling Approach:** CDK automatic bundling at deploy time
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/otlp-python', { bundling: {...} })`
- **Dependencies:** `opentelemetry-api`, `opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-http`
- **Bundling:** Docker-based pip install during CDK synthesis
- **Build Process:** None - bundled during `cdk deploy`
- **Local Build:** ❌ No build script
- **CI Build:** ❌ Not pre-built, bundled during deploy
- **Status:** ❌ **FAILS IN CI** - bundling requires Docker at deploy time, but the `integration-deploy` job has no Docker access
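
A preflight check along these lines could surface the problem before `cdk deploy` reaches the bundling step. This is only a sketch: `docker_available` is a hypothetical helper that does not exist in the repository, and it assumes the job would reach a daemon through the `docker` CLI.

```shell
#!/usr/bin/env bash
# Sketch: report whether a Docker daemon is reachable from the current job.
# `docker_available` is a hypothetical helper name, not part of the repository.

docker_available() {
  # The CLI has to be on PATH...
  command -v docker >/dev/null 2>&1 || return 1
  # ...and `docker info` only succeeds when it can talk to a daemon.
  docker info >/dev/null 2>&1
}

if docker_available; then
  echo "Docker reachable: CDK asset bundling can run"
else
  echo "Docker not reachable: CDK asset bundling for otlp-python would fail" >&2
fi
```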

### 4. Node.js Functions
#### `base-node`
**Bundling Approach:** Direct directory reference (no build)
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/base-node')`
- **Dependencies:** None (uses Datadog layer)
- **Build Process:** None needed
- **Status:** ✅ Works in CI and locally

#### `otlp-node`
**Bundling Approach:** Pre-installed dependencies (committed node_modules)
- **CDK Stack:** Uses `lambda.Code.fromAsset('./lambda/otlp-node')`
- **Dependencies:** OpenTelemetry packages in package.json
- **Build Process:** `npm install` already run, `node_modules/` committed to git
- **Local Build:** ❌ No build script (but can run `npm install`)
- **CI Build:** ❌ No pipeline job
- **Status:** ⚠️ **Inconsistent** - works but relies on committed node_modules

## Problems Identified

1. **`otlp-python` breaks CI**: Uses CDK bundling that requires Docker at deploy time, but the `integration-deploy` job doesn't have Docker access
2. **Inconsistent approaches**: Some functions pre-build, some use CDK bundling, some commit dependencies
3. **No local build scripts for Python/Node**: Java and .NET have `build.sh` scripts, but Python and Node don't
4. **Committed node_modules**: `otlp-node` has committed `node_modules/` which is generally an anti-pattern
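
Problem 4 can be checked mechanically. The sketch below asks git whether anything under a given path is tracked. `tracked_under` is a hypothetical helper name, and the path in the usage note is inferred from the function names above.

```shell
#!/usr/bin/env bash
# Sketch: detect whether any file under a path is tracked by git.
# `tracked_under` is a hypothetical helper name.

tracked_under() {
  local repo="$1"
  local path="$2"
  # `git ls-files` lists index entries, so any output means that files
  # under $path are tracked.
  [ -n "$(git -C "$repo" ls-files -- "$path" | head -n 1)" ]
}
```

For example, `tracked_under . integration-tests/lambda/otlp-node/node_modules && echo "node_modules is committed"`. If it is, `git rm -r --cached <path>` stops tracking the directory without deleting the files on disk.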

## Solution Options

### Option 1: Pre-build Everything (Recommended)
**Approach:** Build all functions with dependencies ahead of time, consistent with Java/.NET pattern

**Pros:**
- ✅ Consistent approach across all runtimes
- ✅ Works in CI without Docker during deployment
- ✅ Faster deployments (no bundling during synthesis)
- ✅ Clear separation between build and deploy
- ✅ Works locally with build scripts
- ✅ No committed dependencies

**Cons:**
- ⚠️ Requires changes to Python/Node CDK stacks
- ⚠️ Need to create build scripts for Python/Node
- ⚠️ Need to add CI pipeline jobs for Python/Node

**Implementation:**
1. Create `build.sh` scripts for Python functions
2. Create `build.sh` scripts for Node.js functions
3. Add Python/Node build jobs to GitLab pipeline
4. Update CDK stacks to point to pre-built artifacts
5. Remove CDK bundling from `otlp-python` stack
6. Remove committed `node_modules` from `otlp-node`
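
As a rough illustration of step 1, a Python build script could install dependencies next to the handler and copy the source alongside them. The sketch below makes several assumptions that are not in the source: the helper name `build_python_lambda`, handler files being top-level `*.py` files, and running `pip` directly on the host rather than in Docker. The output directory defaults to `build/`; the pipeline changes in this PR collect `lambda/*/package/`, so `package` could be passed as the second argument instead.

```shell
#!/usr/bin/env bash
# Sketch of a build script for a Python function such as otlp-python.
# Assumptions (not from the source): handler files are top-level *.py files
# and pip runs on the host. `build_python_lambda` is a hypothetical name.

build_python_lambda() {
  local src_dir="$1"
  local out_dir="$src_dir/${2:-build}"

  # Start from a clean output directory so stale dependencies never ship.
  rm -rf "$out_dir"
  mkdir -p "$out_dir"

  # Install dependencies next to the handler, if any are declared.
  if [ -s "$src_dir/requirements.txt" ]; then
    python3 -m pip install --quiet \
      --requirement "$src_dir/requirements.txt" \
      --target "$out_dir" || return 1
  fi

  # Copy the handler source into the same directory.
  cp "$src_dir"/*.py "$out_dir"/
}
```

The CDK stack would then point `Code.fromAsset()` at the output directory instead of using a `bundling` option. Because the CI jobs run on arm64 runners, dependencies with native extensions would need to be built on a matching platform when this is run elsewhere.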

### Option 2: Use CDK Bundling for Everything
**Approach:** Remove pre-build jobs, add CDK bundling to all stacks

**Pros:**
- ✅ Simpler pipeline (no separate build jobs)
- ✅ CDK handles all bundling automatically

**Cons:**
- ❌ Requires Docker during CDK deploy (need docker-in-docker in CI)
- ❌ Slower deployments (bundling during synthesis)
- ❌ Inconsistent with established Java/.NET pattern
- ❌ More complex local development (need Docker)
- ❌ Would require reworking existing Java/.NET stacks

### Option 3: Fix Docker Access in integration-deploy
**Approach:** Keep current mixed approach but give integration-deploy Docker access

**Pros:**
- ✅ Minimal changes to existing setup

**Cons:**
- ❌ Still inconsistent across runtimes
- ❌ Python/Node don't have local build scripts
- ❌ Committed node_modules remains
- ❌ Only fixes the immediate problem, doesn't improve consistency

## Recommended Solution: Option 1

Pre-build all Lambda functions consistently. This matches the established Java/.NET pattern, removes the Docker requirement from `integration-deploy`, and gives every runtime a local build script.

### Implementation Steps:

1. **Create build scripts for Python functions:**
   - `integration-tests/lambda/otlp-python/build.sh`
   - Install dependencies to a directory that will be deployed

2. **Create build scripts for Node.js functions:**
   - `integration-tests/lambda/otlp-node/build.sh`
   - Run `npm install` to generate fresh `node_modules`

3. **Update datasource to include Python/Node functions:**
   - Add to `.gitlab/datasources/lambda-functions.yaml`

4. **Add CI pipeline jobs for Python/Node:**
   - Similar to Java/.NET build jobs
   - Use appropriate Docker images (Python, Node)

5. **Update CDK stacks:**
   - Remove CDK bundling from `otlp-python-stack.ts`
   - Point to pre-built artifact directories
   - Ensure all stacks use consistent `Code.fromAsset()` approach

6. **Clean up:**
   - Remove committed `node_modules` from `otlp-node`
   - Add to `.gitignore`
   - Update documentation
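
For step 2, a Node.js build script might look like the sketch below. It assumes dependencies are installed in place, so `node_modules/` ends up next to `package.json`, and that functions without a `package.json` need no build. `build_node_lambda` is a hypothetical name, and the choice between `npm ci` and `npm install` is an illustration rather than something the source prescribes.

```shell
#!/usr/bin/env bash
# Sketch of a build script for a Node.js function such as otlp-node.
# Assumption: dependencies are installed in place, next to package.json.
# `build_node_lambda` is a hypothetical name.

build_node_lambda() {
  local src_dir="$1"

  # Functions without a package.json have nothing to install.
  if [ ! -f "$src_dir/package.json" ]; then
    echo "no package.json in $src_dir, nothing to build"
    return 0
  fi

  # Remove any stale install so the result reflects package.json only.
  rm -rf "$src_dir/node_modules"

  if [ -f "$src_dir/package-lock.json" ]; then
    # `npm ci` installs exactly what the lockfile pins.
    (cd "$src_dir" && npm ci --omit=dev)
  else
    (cd "$src_dir" && npm install --omit=dev)
  fi
}
```

Calling `build_node_lambda integration-tests/lambda/otlp-node` from the repository root would regenerate `node_modules/`, which is what allows the committed copy to be removed in step 6.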

### Expected File Structure After Implementation:

```
integration-tests/lambda/
├── base-java/
│   ├── build.sh              ✅ Exists
│   └── target/function.jar   (artifact)
├── otlp-java/
│   ├── build.sh              ✅ Exists
│   └── target/function.jar   (artifact)
├── base-dotnet/
│   ├── build.sh              ✅ Exists
│   └── bin/function.zip      (artifact)
├── otlp-dotnet/
│   ├── build.sh              ✅ Exists
│   └── bin/function.zip      (artifact)
├── base-python/
│   └── lambda_function.py    (no dependencies, no build needed)
├── otlp-python/
│   ├── build.sh              ❌ Need to create
│   ├── requirements.txt
│   └── build/                ❌ Will contain bundled code + dependencies
├── base-node/
│   └── index.js              (no dependencies, no build needed)
└── otlp-node/
    ├── build.sh              ❌ Need to create
    ├── package.json
    └── build/                ❌ Will contain bundled code + node_modules
```
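
Since every stack would reference a pre-built path, a missing artifact would only show up when CDK synthesizes the stacks. A small check before `cdk deploy` could catch that earlier. The sketch below uses the paths from the tree above. `missing_artifacts` is a hypothetical helper, and the list would need adjusting if build output lands elsewhere (the pipeline changes in this PR collect `package/` and `node_modules/`).

```shell
#!/usr/bin/env bash
# Sketch: verify that pre-built artifacts exist before running `cdk deploy`.
# Paths follow the expected file structure; `missing_artifacts` is a
# hypothetical helper name.

missing_artifacts() {
  local root="$1"
  local missing=0
  local path
  for path in \
    base-java/target/function.jar \
    otlp-java/target/function.jar \
    base-dotnet/bin/function.zip \
    otlp-dotnet/bin/function.zip \
    otlp-python/build \
    otlp-node/build
  do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $path"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when nothing is missing.
  [ "$missing" -eq 0 ]
}
```

Run as `missing_artifacts integration-tests/lambda || exit 1` at the start of the deploy script, it lists each absent artifact and stops the job before any stack is touched.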
10 changes: 10 additions & 0 deletions integration-tests/.gitignore
@@ -38,3 +38,13 @@ Thumbs.db
# Lambda artifacts
response.json
lambda-bundle.zip

# Lambda build outputs
lambda/*/target/
lambda/*/bin/
lambda/*/obj/
lambda/*/package/
lambda/*/node_modules/

# Build caches
.cache/