Cloud Spectra Gateway - Frequently Asked Questions

This FAQ answers the questions enterprise buyers and AWS Marketplace reviewers ask most often about Cloud Spectra Gateway. Cloud Spectra Gateway deploys entirely into your own AWS account and replaces metered AWS networking and per-token LLM-API spend with a fixed EC2 cost -- "Your Cloud, Off the Meter." There is no vendor control plane: your data, traffic, and prompts stay inside your account boundary.

If a topic is not covered here, see the Quick Start for a guided deployment, the User Guide for the full configuration reference, or the Architecture page for how the components fit together.

graph LR
    subgraph ACCT["Your AWS Account (trust boundary)"]
        direction LR
        APP["Private workloads
(EC2 / EKS / Lambda VPC)"]
        GW["Cloud Spectra Gateway
per-AZ ASG behind GWLB"]
        AWS["AWS services in-account
SSM / ACM / Bedrock"]
        APP -->|"traffic + prompts"| GW
        GW -->|"NAT egress / LLM proxy"| AWS
    end
    GW -->|"internet egress via EIP"| NET["Internet / LLM providers"]
    style ACCT fill:#eef2ff,stroke:#6366f1,color:#312e81
    style GW fill:#d1fae5,stroke:#10b981,color:#065f46
    style NET fill:#fef3c7,stroke:#f59e0b,color:#92400e

Everything inside the dashed boundary runs in your account. Cloud Spectra operates no external service that sees your traffic.

Billing & Cost

How is Cloud Spectra Gateway billed?

Cloud Spectra Gateway is sold through AWS Marketplace as a software product that runs on EC2. The Cloud Spectra software fee is a fixed hourly fee per running instance, charged through your existing AWS bill, in addition to the underlying EC2 instance cost. There is no separate invoice from Cloud Spectra and no usage-based metering on the software itself -- the fee does not change with how much traffic you push, how many gigabytes you NAT, or how many tokens you proxy.

How does this replace metered AWS networking spend?

AWS managed networking services bill on usage. A NAT Gateway charges an hourly rate plus a per-gigabyte data-processing charge; a Network Load Balancer charges per capacity unit. Cloud Spectra Gateway performs the equivalent work -- outbound NAT, inbound port forwarding, L4 load balancing, TLS termination -- on EC2 instances you own. You pay the EC2 instance cost plus the fixed software fee instead of the per-gigabyte and per-capacity-unit meters. Because the software fee does not scale with throughput, the savings grow as your traffic grows.

flowchart LR
    subgraph M["Metered model"]
        NATM["NAT Gateway
hourly + per-GB"]
        NLBM["NLB
per capacity unit"]
        LLMM["LLM APIs
per token"]
    end
    subgraph F["Cloud Spectra fixed model"]
        EC2["EC2 instance cost"]
        SW["Fixed software fee
per running instance"]
    end
    M -->|"replace with"| F
    style NATM fill:#fecaca,stroke:#ef4444,color:#991b1b
    style NLBM fill:#fecaca,stroke:#ef4444,color:#991b1b
    style LLMM fill:#fecaca,stroke:#ef4444,color:#991b1b
    style EC2 fill:#d1fae5,stroke:#10b981,color:#065f46
    style SW fill:#d1fae5,stroke:#10b981,color:#065f46

How does it reduce per-token LLM API spend?

The AI Gateway tier adds an OpenAI-compatible reverse proxy with exact-match response caching and an embedding-based semantic cache. When a prompt has been seen before (or is semantically close to one already answered), the cached response is returned without a new upstream call, so you do not pay the provider's per-token charge again. The AI Gateway tier also supports local in-account inference with vLLM, where serving runs on GPU instances you already pay for rather than a per-token API. See the AI Gateway section for details.

Can I run it on Spot Instances?

Yes. Cloud Spectra Gateway runs in a per-Availability-Zone Auto Scaling Group, so you can choose On-Demand or Spot Instances per your tolerance for interruption. Spot lowers the EC2 portion of the cost; On-Demand gives the most predictable capacity. Because the fleet is horizontally scaled behind a Gateway Load Balancer, the loss of a single Spot instance is absorbed by the surviving instances in that zone.

Cost summary: You pay (1) the EC2 instance cost, On-Demand or Spot, and (2) a fixed per-instance software fee through AWS Marketplace. There are no per-GB, per-capacity-unit, or per-token charges from Cloud Spectra.

Deployment

How do I deploy Cloud Spectra Gateway?

There are three supported paths, all of which run in your own account:

Path	How it works	Best for
Marketplace 1-click	Subscribe on AWS Marketplace, launch the provided CloudFormation template (new-VPC or existing/BYO-VPC).	Fastest first deployment
Terraform	Use the `cloudspectra/cloudspectra` provider (installed through a one-time network mirror block in `~/.terraformrc`) plus standard AWS modules.	Infrastructure-as-code teams
Standalone AMI	Launch the AMI directly. It boots with NAT and the dashboard, with no CloudFormation stack.	Minimal / no-CloudFormation footprints

flowchart TD
    A["Subscribe on AWS Marketplace"] --> B{"Choose deployment path"}
    B -->|"1-click"| C["CloudFormation
new-VPC or existing/BYO-VPC"]
    B -->|"IaC"| D["Terraform
cloudspectra/cloudspectra provider"]
    B -->|"direct"| E["Standalone AMI
boots NAT + dashboard"]
    C --> F["Per-AZ ASG behind GWLB,
EIP endpoint, dashboard on 443"]
    D --> F
    E --> F
    style A fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a
    style F fill:#d1fae5,stroke:#10b981,color:#065f46

Which AWS Regions are supported?

Cloud Spectra Gateway runs in commercial AWS Regions that offer the EC2 instance families and AWS services it relies on -- Elastic IP, SSM Parameter Store, Gateway Load Balancer, and ACM. The AI Gateway's Bedrock integration is available where Amazon Bedrock and your chosen models are available. Confirm Region availability for your selected instance types on the EC2 instance types page.

What is the difference between the new-VPC and existing-VPC templates?

The new-VPC template provisions a fresh VPC with its subnets, route tables, and gateway in one step -- ideal for a clean greenfield deployment or an evaluation. The existing-VPC (BYO-VPC) template deploys the gateway into a VPC and subnets you already operate, so it slots into your established network design and CIDR plan. Both produce the same runtime: a per-AZ ASG behind a Gateway Load Balancer with an Elastic IP endpoint.

What is the standalone AMI for?

The standalone AMI launches the appliance directly with no CloudFormation stack. It boots ready with NAT and the management dashboard, which is useful when you want the smallest possible footprint, are deploying outside a CloudFormation workflow, or want to evaluate the gateway on a single instance before adopting one of the orchestrated paths.

How do I install the Terraform provider?

The Cloud Spectra Terraform provider is distributed through a network mirror. You add a one-time network_mirror block to your ~/.terraformrc, after which terraform init resolves the cloudspectra/cloudspectra provider from that mirror. From there you manage the gateway configuration as code alongside your AWS resources.

How do upgrades work?

Upgrades roll through the Auto Scaling Group fleet. New gateway instances launch from an updated AMI and join the Gateway Load Balancer target set; older instances are drained and retired. Because traffic is balanced across the fleet behind the GWLB, replacement happens instance by instance without taking the data plane offline. Configuration is preserved in SSM Parameter Store, so new instances read the same settings as the ones they replace.

How do I update the operational IAM role over time?

Through CloudFormation, not by the gateway editing your IAM at runtime. The operational (cross-account / home-account) role is deployed from a frozen template; as Cloud Spectra ships features that need new permissions, only the template's parameter values change. After your gateway auto-upgrades it regenerates the policies and, on the dashboard's account view, stages a CloudFormation Change Set on your existing stack against the new release. You open the link, review the diff in the CloudFormation console, and click Execute -- you stay in control of applying the change. (The cloudspectra-setup CLI is an alternative: re-run it with a fresh token to apply the update.) A dashboard permission picker also lets you deselect optional features, which removes their roles on the next stack update; core features are always retained. See the User Guide for the full procedure.

Note: All three deployment paths converge on the same runtime architecture. Choosing a path is about how you orchestrate provisioning, not about which features you get.

Security & Compliance

Does my data leave my AWS account?

No. Cloud Spectra Gateway is deployed into your account, and the data plane runs entirely on instances you own. Traffic that the gateway NATs, load-balances, inspects, or proxies stays inside your VPC and your account boundary. Cloud Spectra operates no external control plane that receives, brokers, or observes your traffic. The only outbound destinations are the ones your own routes and policies send traffic to -- the public internet for NAT egress, or the LLM providers you explicitly configure in the AI Gateway.

Is there a vendor control plane or call-home?

No. There is no Cloud Spectra-hosted control plane. Configuration lives in your account's SSM Parameter Store, management is performed through the in-account dashboard and config API, and the gateway does not depend on an external Cloud Spectra service to forward traffic. This is central to the product's model: your cloud, off the meter, under your control.

flowchart LR
    subgraph YOURS["Your AWS Account"]
        direction TB
        DASH["Dashboard + config API
(in-account)"]
        SSM["SSM Parameter Store
(config of record)"]
        DP["Data plane
NAT / NLB / firewall / AI proxy"]
        DASH --> SSM --> DP
    end
    VENDOR["Cloud Spectra
(software vendor)"]
    VENDOR -. "Marketplace AMI + software fee only" .-> YOURS
    VENDOR -. "NO traffic, NO control plane, NO data egress" .-x DP
    style YOURS fill:#eef2ff,stroke:#6366f1,color:#312e81
    style DP fill:#d1fae5,stroke:#10b981,color:#065f46
    style VENDOR fill:#f1f5f9,stroke:#94a3b8,color:#334155

What IAM permissions does the gateway need?

The gateway runs under a least-privilege IAM role scoped to the resources it actually manages -- for example its own Elastic IP, network interfaces, Auto Scaling and Gateway Load Balancer resources, SSM parameters for its configuration, and, on the AI Gateway tier, Bedrock model invocation. The CloudFormation templates create this role for you, and the permissions are documented so your security team can review them before deployment.

Why do most features need an extra IAM role?

The base CloudFormation template ships an intentionally minimal instance role -- enough for the gateway to boot, associate its Elastic IP, and run outbound NAT on its primary interface, but nothing more. Every other capability (the full per-AZ NAT data plane for your private subnets, Gateway Load Balancer, EIP-pool DNS, flow logs, EventBridge, scaling, teardown, the AI Gateway, and managing other accounts) is gated on a separately deployed operational IAM role -- the cross-account / home-account role. This keeps the always-on base permissions small and lets your security team review and approve the operational permissions on their own. You deploy the operational role once per account from a CloudFormation stack the dashboard pre-fills; without it, most features will not work. See the User Guide for the full setup.

How is the dashboard secured?

The Angular management dashboard is served over HTTPS. TLS is terminated by HAProxy on port 443 using an AWS Certificate Manager (ACM) certificate, so the certificate is issued and stored in your account. You restrict who can reach the dashboard with security groups and the client CIDR you allow at deployment time.

Where do my prompts and traffic live?

In your account. NAT, load-balancing, firewall, and proxy traffic transits the gateway instances in your VPC. LLM prompts sent to the AI Gateway are processed on the gateway in your account; cache entries and audit logs are stored in your account. When you use local vLLM inference with no remote fallback configured, prompts and completions never leave your account at all. See the AI Gateway section for the data-privacy specifics.

For Marketplace reviewers: Cloud Spectra distributes software (AMIs) and charges a fixed Marketplace software fee. It does not receive, store, or proxy customer traffic or prompts. All processing and storage occur within the customer's AWS account.

Networking

Does it replace the AWS NAT Gateway?

Yes. The Network Gateway tier provides source NAT (sNAT) for outbound internet access from private instances, replacing the metered NAT Gateway hourly and per-gigabyte charges. You point a private subnet's default route at the gateway's network interface, and its traffic egresses through the gateway's Elastic IP. It also provides destination NAT (dNAT) for inbound TCP port forwarding to private targets.

Does it replace the AWS Network Load Balancer?

The gateway provides in-appliance Linux IPVS L4 load balancing, kept in sync with an AWS NLB target set, plus TLS termination via HAProxy using an ACM certificate. This covers the common L4 load-balancing and HTTPS-termination patterns customers use a Network Load Balancer for, while keeping the work on instances you own.

How does the per-AZ design avoid cross-AZ data charges?

Cloud Spectra runs one Auto Scaling Group per Availability Zone, and each AZ egresses through its own network interface. When you route each AZ's private subnets to that same AZ's gateway interface, traffic NATs locally instead of hopping to another zone -- which avoids the inter-AZ data-transfer charges that a single centralized appliance would incur. This per-AZ locality is built into the architecture.

graph TD
    subgraph AZ1["Availability Zone A"]
        S1["Private subnets (AZ A)"] --> G1["Gateway ASG (AZ A)
own ENI"]
    end
    subgraph AZ2["Availability Zone B"]
        S2["Private subnets (AZ B)"] --> G2["Gateway ASG (AZ B)
own ENI"]
    end
    G1 -->|"local egress, no cross-AZ fee"| NET["Internet"]
    G2 -->|"local egress, no cross-AZ fee"| NET
    style G1 fill:#d1fae5,stroke:#10b981,color:#065f46
    style G2 fill:#d1fae5,stroke:#10b981,color:#065f46
    style NET fill:#fef3c7,stroke:#f59e0b,color:#92400e

What about source IP -- is the client IP preserved?

For outbound NAT, private instances appear on the internet as the gateway's Elastic IP, which is the expected and desired behavior for a NAT appliance (a stable, allow-list-friendly egress address). For inbound and load-balanced flows, source-IP handling follows the relevant data path -- consult the User Guide for the specifics of each feature so you can match it to your application's needs.

How is high availability achieved?

Each zone's Auto Scaling Group runs behind a Gateway Load Balancer (using the GENEVE protocol) so the fleet scales horizontally and survives the loss of an individual instance: traffic shifts to the surviving instances in the zone. The gateway also maintains an Elastic IP for a stable endpoint. Deploy across multiple Availability Zones for zone-level resilience.

Can I scale the gateway up or out?

Yes -- both vertically and horizontally. You can change the instance size live to add capacity per node, and you can scale out by adding instances behind the Gateway Load Balancer. The per-AZ Auto Scaling Groups handle adding and removing instances within each zone.

AI Gateway

Which LLM providers does the AI Gateway support?

The AI Gateway is an OpenAI-compatible reverse proxy that routes to Amazon Bedrock, OpenAI, and Anthropic. Your clients point their OpenAI base URL at the gateway's endpoint (port 8090), and the gateway forwards to the configured upstream provider. Because the interface follows the OpenAI API shape, most existing SDKs and applications work by changing only the base URL.

flowchart LR
    APP["Your app
(OpenAI base URL -> gateway:8090)"] --> GW["AI Gateway proxy"]
    GW --> CACHE{"Cache hit?"}
    CACHE -->|"exact or semantic match"| HIT["Return cached response
(no provider call)"]
    CACHE -->|"miss"| ROUTE{"Route to provider"}
    ROUTE -->|"local/<model>"| VLLM["vLLM in-account GPU"]
    ROUTE -->|"bedrock"| BR["Amazon Bedrock"]
    ROUTE -->|"openai"| OAI["OpenAI"]
    ROUTE -->|"anthropic"| ANT["Anthropic"]
    style GW fill:#d1fae5,stroke:#10b981,color:#065f46
    style HIT fill:#dbeafe,stroke:#3b82f6,color:#1e3a8a
    style VLLM fill:#ede9fe,stroke:#8b5cf6,color:#5b21b6

How does local vLLM inference keep data in my account?

The AI Gateway tier can serve models locally with vLLM on in-account GPU instances, exposed through the same OpenAI-compatible interface and addressed as local/<model>. When you call a local model and configure no remote fallback, the prompt and completion never leave your account. If a request fails before the first token, the gateway can fall back to a configured remote model according to an overflow policy of queue, spill, or reject; choosing an empty fallback guarantees data stays in-account.

Overflow policy	Behavior on pre-first-token failure
`queue`	Hold the request and wait for local capacity.
`spill`	Fall back to the configured remote model.
`reject`	Return an error rather than leaving the account.

What do the response cache and semantic cache do?

The response cache stores completions keyed on an exact match of the request, so an identical prompt returns the stored answer without a new upstream call. The semantic cache goes further: it uses embeddings to recognize prompts that are similar (not just byte-identical) to ones already answered, raising the hit rate beyond exact match. Both caches live in your account and reduce the number of billable provider calls.

What are the data-privacy properties of prompts sent to the AI Gateway?

Prompts are processed by the gateway running in your account. Cache entries, token-metering counters, and audit logs are stored in your account. When you route to a remote provider (Bedrock, OpenAI, or Anthropic), the request goes to that provider exactly as your configuration directs -- Cloud Spectra is not in that path and does not receive a copy. When you route to a local vLLM model with no remote fallback, nothing leaves your account. You decide, per model and per policy, where each request may go.

Audit and metering: The AI Gateway records token metering and audit logs in your account, giving you a centralized view of LLM usage without sending that telemetry to any third party.

Support & Operations

How do I manage the gateway day to day?

The primary management surface is the Angular dashboard, served over HTTPS on port 443. From it you enable and configure features, review status, and adjust scaling. The same configuration is available through the config API on port 8080 and through the Terraform provider, so you can manage the gateway interactively or as code. The configuration of record is stored in SSM Parameter Store in your account.

Port	Service
`443`	HTTPS management dashboard (TLS via ACM, terminated by HAProxy)
`8080`	Configuration API
`8090`	AI Gateway OpenAI-compatible endpoint AI Gateway
configurable	Forward HTTP proxy (Squid) port
`80`	Redirect to HTTPS

Where are the logs?

Operational logs -- including AI Gateway audit logs and token metering -- are produced and retained within your account, alongside the metrics surfaced in the dashboard. Because the data plane runs on your instances, you can also forward host and service logs into your own observability stack.

Which features are included, and how do tiers relate?

Cloud Spectra Gateway ships in three tiers, each a strict superset of the one below it:

Tier	Adds
Network Gateway	sNAT, dNAT/port forwarding, IPVS L4 NLB, TLS termination (ACM), per-AZ Auto Scaling, vertical + horizontal scaling (GWLB), forward HTTP proxy with caching (Squid).
Security Gateway	Inline Suricata IDS/IPS (NFQUEUE), nftables firewall rules, domain/URL filtering.
AI Gateway	OpenAI-compatible LLM proxy with response caching, local vLLM inference, and semantic cache.

How does scaling behave during operations?

Scaling is handled by the per-AZ Auto Scaling Groups behind the Gateway Load Balancer. You scale out by adding instances to absorb more load and scale in when demand drops; you scale up by selecting a larger instance size. Combined with multi-AZ deployment, this lets the fleet match capacity to demand while remaining resilient to the loss of any single instance.

Next steps: Walk through a guided deployment in the Quick Start, find every configuration option in the User Guide, or see how the pieces connect in Architecture.

Cloud Spectra Gateway -- Frequently Asked Questions v1.0.0

Billing & Cost

How is Cloud Spectra Gateway billed?

How does this replace metered AWS networking spend?

How does it reduce per-token LLM API spend?

Can I run it on Spot Instances?

Deployment

How do I deploy Cloud Spectra Gateway?

Which AWS Regions are supported?

What is the difference between the new-VPC and existing-VPC templates?

What is the standalone AMI for?

How do I install the Terraform provider?

How do upgrades work?

How do I update the operational IAM role over time?

Security & Compliance

Does my data leave my AWS account?

Is there a vendor control plane or call-home?

What IAM permissions does the gateway need?

Why do most features need an extra IAM role?

How is the dashboard secured?

Where do my prompts and traffic live?

Networking

Does it replace the AWS NAT Gateway?

Does it replace the AWS Network Load Balancer?

How does the per-AZ design avoid cross-AZ data charges?

What about source IP -- is the client IP preserved?

How is high availability achieved?

Can I scale the gateway up or out?

AI Gateway

Which LLM providers does the AI Gateway support?

How does local vLLM inference keep data in my account?

What do the response cache and semantic cache do?

What are the data-privacy properties of prompts sent to the AI Gateway?

Support & Operations

How do I manage the gateway day to day?

Where are the logs?

Which features are included, and how do tiers relate?

How does scaling behave during operations?