@askerNQK (Contributor) commented Oct 1, 2025

Overview of Current Aether CLI Architecture (Rust)

Commands:
login, deploy, logs, list, completions (hidden), and benchmarking modes.
The CLI automatically detects Node.js projects (via package.json), runs npm/yarn/pnpm install and prune in production mode, and packages the source plus node_modules into app-<sha256>.tar.gz.
It computes the SHA256 hash while streaming, generates a per-file manifest, produces an SBOM (legacy format plus optional CycloneDX 1.5), and signs the artifact with Ed25519 if AETHER_SIGNING_KEY is set.
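
The detection step could look roughly like the sketch below. It is illustrative only: the function names, the `PackageManager` enum, and the lockfile precedence (pnpm, then yarn, then npm as fallback) are assumptions and are not taken from the Aether source.

```rust
use std::path::Path;

/// Package managers the CLI is described as supporting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PackageManager {
    Npm,
    Yarn,
    Pnpm,
}

/// Returns `None` when `dir` has no package.json, i.e. it is not treated as
/// a Node.js project. Otherwise picks a package manager from the lockfile
/// that is present. The precedence below is an assumption for illustration.
pub fn detect_package_manager(dir: &Path) -> Option<PackageManager> {
    if !dir.join("package.json").is_file() {
        return None;
    }
    if dir.join("pnpm-lock.yaml").is_file() {
        Some(PackageManager::Pnpm)
    } else if dir.join("yarn.lock").is_file() {
        Some(PackageManager::Yarn)
    } else {
        // No recognised lockfile (or package-lock.json): fall back to npm.
        Some(PackageManager::Npm)
    }
}

/// Builds the artifact file name in the `app-<sha256>.tar.gz` form.
pub fn artifact_name(sha256_hex: &str) -> String {
    format!("app-{}.tar.gz", sha256_hex)
}
```

Keeping detection as a pure function of the directory contents makes it easy to unit-test against a temporary directory without running any package manager.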

Artifact Upload:

  • Legacy: POST /artifacts (multipart, deprecated).

  • Standard: Two-phase presign/complete process with HEAD verification for size/metadata and multipart mode (init/presign-part/complete). Includes idempotency, quota, and retention support.
    After upload, deployment is triggered via POST /deployments using either artifact_url or storage_key.
    Progress display is implemented for large PUT/multipart uploads, with caching support for node_modules based on the lockfile.
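
For the multipart mode, the client has to split the artifact into numbered parts before requesting a presigned URL for each one. The following is a minimal sketch of that planning step only; `Part`, `plan_parts`, and the part size are hypothetical and do not come from the Aether CLI.

```rust
/// One part of a multipart upload: a 1-based part number and the
/// half-open byte range [start, end) it covers in the artifact.
#[derive(Debug, PartialEq, Eq)]
pub struct Part {
    pub number: u32,
    pub start: u64,
    pub end: u64,
}

/// Splits `total_len` bytes into consecutive parts of at most `part_size`
/// bytes each. Returns an empty plan for an empty artifact or a zero part size.
pub fn plan_parts(total_len: u64, part_size: u64) -> Vec<Part> {
    if total_len == 0 || part_size == 0 {
        return Vec::new();
    }
    let mut parts = Vec::new();
    let mut start = 0u64;
    let mut number = 1u32;
    while start < total_len {
        // The last part may be shorter than `part_size`.
        let end = start.saturating_add(part_size).min(total_len);
        parts.push(Part { number, start, end });
        start = end;
        number += 1;
    }
    parts
}
```

With a plan like this, each `Part` would map to one presign-part request followed by a PUT of that byte range, and the complete call would list the parts in order. When the backend is S3, every part except the last must be at least 5 MiB, so a real part size would need to respect that limit.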


Control Plane (Rust + Axum + SQLx + PostgreSQL)

API Endpoints:
health/ready/startupz, artifacts (legacy + presign/complete + multipart + metadata + HEAD existence), deployments (list/get/create/patch), apps (minimal CRUD + public keys), logs (stub), provenance/SBOM/manifest upload and enforcement, Prometheus metrics, OpenAPI JSON + Swagger.

Authentication + RBAC:
Bearer token via environment variable (AETHER_API_TOKENS), with admin guard for write endpoints.
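
As a rough illustration of the token check, the sketch below parses a token list and validates an Authorization header value. The comma-separated format for AETHER_API_TOKENS and all function names are assumptions; the actual middleware may differ.

```rust
use std::collections::HashSet;

/// Parses a comma-separated token list, as might be stored in
/// AETHER_API_TOKENS. The separator is an assumption, not confirmed here.
pub fn parse_tokens(raw: &str) -> HashSet<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|t| !t.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Extracts the token from a `Bearer <token>` Authorization header value.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Returns true only if the header carries a bearer token in the allowed set.
pub fn is_authorized(header_value: Option<&str>, allowed: &HashSet<String>) -> bool {
    header_value
        .and_then(bearer_token)
        .map_or(false, |token| allowed.contains(token))
}
```

In a service, the set would typically be built once at startup from the environment variable and shared with the request guard; a hardened version might also compare tokens in constant time.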

Storage:
Abstracted backend for mock and S3 (via aws-sdk-s3 feature); supports presigned PUT, HEAD for size/metadata, remote hashing with retry/backoff.
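
The retry/backoff behaviour mentioned for remote hashing can be sketched generically as follows. This is a blocking, std-only simplification with a hypothetical helper name; the control plane is async, so its real implementation would presumably use an async sleep, and its attempt count and delays are not stated here.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` until it succeeds or `max_attempts` is reached, doubling the
/// delay after each failure. `op` receives the 1-based attempt number and is
/// always called at least once. Returns the first success or the last error.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap the fallible storage operation (for example, a HEAD request or a remote hash) in the closure and choose `max_attempts` and `initial_delay` to suit the backend.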

Kubernetes:
Applies Deployment objects (kube-rs) in two modes:

  • Non-dev-hot: init container downloads artifact, verifies SHA256, extracts; main container runs node server.js.

  • Dev-hot: sidecar fetcher polls/watches for new digests via pod annotations, downloads artifact, verifies checksum, and performs hot refresh. Includes supervisor script and readiness drain.


Database and Metrics

Full PostgreSQL migrations are in place; the schema matches the README (applications, artifacts, deployments, public_keys, …) and adds extended columns (signature, provenance flags, manifest digest, etc.).
Metrics: extensive counters/gauges/histograms for upload lifecycle, multipart handling, quotas, and HTTP metrics.

Background Tasks:
GC for pending artifacts and failed deployments, SBOM/sign/provenance coverage updates; includes timeouts and can be disabled in tests.


Operator

Rust + kube CRD AetherApp v1 (fields: image, replicas + status).
Tool crd-gen generates CRD YAML.
No full reconciliation logic yet (only CRD definition + generator).

Kubernetes manifests include Control Plane Deployment + Service (namespace aether-system), CRD AetherApp, and example secret for dev-hot pubkey.


Build / Lint / Test Results

  • Build: cargo build --workspace successful.

  • Lint/Clippy: cargo clippy --workspace --all-targets --all-features passes with no warnings.

  • Tests: ⚠️ Partial failure.

    • CLI tests: ✅ PASS (many unit + integration tests on packaging, streaming, SBOM, JSON output all green).

    • Control Plane tests: ❌ FAIL due to PoolTimedOut (database connection error) when DATABASE_URL points to non-existent Postgres.

Diagnosis:
When DATABASE_URL is unset, the test harness uses Testcontainers to spin up ephemeral Postgres. The current environment likely lacks Docker, so Testcontainers cannot start. S3 tests (presign/multipart/remote hash) are feature-gated under “s3” and MINIO_TEST; skipped due to missing local MinIO.
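
The database selection described in the diagnosis amounts to a small decision, sketched below. `TestDb` and `select_test_db` are hypothetical names, and treating an empty DATABASE_URL as unset is an assumption; the real harness may behave differently.

```rust
/// Where the test harness would get its Postgres instance from.
#[derive(Debug, PartialEq, Eq)]
pub enum TestDb {
    /// Connect to the server named by DATABASE_URL.
    External(String),
    /// Start an ephemeral Postgres via Testcontainers (needs Docker).
    Ephemeral,
}

/// Chooses the database source from the value of DATABASE_URL, if any.
/// An empty or whitespace-only value is treated as unset (an assumption).
pub fn select_test_db(database_url: Option<&str>) -> TestDb {
    match database_url.map(str::trim) {
        Some(url) if !url.is_empty() => TestDb::External(url.to_owned()),
        _ => TestDb::Ephemeral,
    }
}
```

Under this model both reported failure paths are visible: an `External` URL that points at a server that does not exist ends in PoolTimedOut, while `Ephemeral` cannot start without Docker.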


Completed Features (MVP Scope)

CLI:
Auto NodeJS detection, dependency installation (production), artifact packaging, manifest/SBOM generation, optional Ed25519 signing, two-phase + multipart uploads, deployment creation, stable JSON output, and caching of node_modules.

Control Plane:
Full artifact ingestion (legacy + presign/complete/multipart), idempotency, quotas, retention, metadata, and verification (size/digest/optional remote hash).
Auth/RBAC with token; middleware for tracing, metrics; deployments CRUD; signature verification; optional SBOM/provenance enforcement.
K8s apply logic with mock-kube tests; detailed dev-hot sidecar fetcher; OpenAPI + Swagger; storage abstraction (mock/S3).
Migrations complete; schema aligned with documentation.


Missing / Incomplete Areas

  • Logs: /apps/{app}/logs endpoint is a stub — no integration with log aggregator or Kubernetes log streaming yet.

  • E2E / Helm: No Helm chart/kustomize for control-plane or RBAC/service account aether-dev-hot; missing YAML definitions for ServiceAccount.

  • Operator: CRD only — no controller logic.

  • Base Image: aether-nodejs:20-slim referenced but not yet built/published.

  • CI/CD & Benchmarks: README mentions badges and baselines but no verified CI run.

  • Control Plane Tests: Fail due to missing Postgres or Docker; harness fallback incomplete.

  • TLS: Expected to be handled by ingress; binary does not configure TLS directly.


Completion Estimates

Aspect | Completion | Reason
-- | -- | --
Technical (CLI, Control Plane, S3, K8s) | 75–80% | Nearly full feature set; missing logs and Helm integration.
Product (End-to-End Deployment Flow) | 70–80% | Works end-to-end via CLI, but lacks Helm and operational polish.
Business (Speed Improvement Goal) | 50–60% | No real-world benchmarks yet; needs production tests.

The Control Plane tests are expected to pass fully once Postgres or Docker is available.


Final Summary

The codebase demonstrates strong depth and coverage for the core MVP — CLI, two-phase/multipart upload, presigned S3, verification, and Kubernetes deployment (including dev-hot).
Missing production-ready components (log streaming, Helm/RBAC, base image pipeline, real E2E measurement) prevent full MVP closure.

Estimated MVP readiness: ~75–80% complete.

@askerNQK askerNQK closed this Oct 7, 2025
@askerNQK askerNQK reopened this Oct 7, 2025
iOS E2E Implementation added 18 commits October 7, 2025 14:53
…orcement, metrics (provenance_emitted_total, sbom_invalid_total)
…agation; provenance + keystore tests; clippy fixes
iOS E2E Implementation added 12 commits October 13, 2025 06:26
…avoid krates panic on aws_lc_rs feature resolution under --all-features; keep bans strict
…in Control-plane S3 tests via AETHER_ENABLE_S3_FULL_CI; docs(issue-11): note S3 CI toggle and steps post AWS hyper 1.x
…cs(issue-11): add latest build time and binary sizes; note AWS hyper 1.x bump is blocked upstream; keep aws crates at 1.x rustls/rt-tokio
…x/hashlink, 0.15 via aws-sdk-s3/lru); to be removed when upstream unifies
…, THROUGHPUT_TOLERANCE); defaults remain 20%/25%
… runner noise; throughput tolerance unchanged
… for control-plane; deny(advisories): temporarily ignore instant/paste and sqlx advisories until deps bump
…w Unicode-3.0 & MPL-2.0; add webpki-roots CDLA exception; run cargo-deny with --all-features clean
…enance, CI/bench/deny/observability, fixes, docs, tests)
…ature and skip when AETHER_STORAGE_MODE!=s3 to prevent false CI failures; no functional code changes
…yle when AETHER_S3_ENDPOINT_URL set for MinIO compatibility; gate S3 integration tests behind s3 feature and skip when not in S3 mode
… localhost; include x-amz-meta-sha256 in presigned headers; gate all S3 integration tests behind feature and skip if not in S3 mode
@askerNQK askerNQK changed the title feat(issue-05): add basic graceful reload (node --watch), sidecar bac… 70-80% Oct 13, 2025
@askerNQK askerNQK changed the title 70-80% 75-80% Oct 13, 2025
askerNQK and others added 16 commits October 13, 2025 22:11
- TDD: add tests/scripts for base image pipeline
- Dockerfile and README for images/aether-nodejs/20-slim
- Makefile targets: base-image-build/scan/sbom/push
- CI workflow: .github/workflows/base-image.yml (GHCR push, monthly rebuild, SBOM, Trivy/Grype scans, optional cosign)
… on success; attach JSON + SARIF + SBOM; cosign post-push
…aether-dev-hot, CI helm lint/template, docs, tests (TDD)
…l) allowlists; gate on HIGH+CRITICAL via env GATE_SEVERITY; honor allowlists in scans; keep Grype non-blocking
…-util/urlencoding; switch test server to axum