Summary

The Admin Lambda provides a REST API for forking indexes. Forking is a mechanism for making a copy of a customer's index and (optionally) transparently switching to serve from the copy. This can be used for development and testing on a customer index, for blue-green deployments that change otherwise immutable settings on an index (e.g. model, account, cluster), or for blue-green deployment of infra changes.

Concepts

Index name: the name of a Marqo Cloud index. This is unique per account, and in the Marqo Classic API, it is immutable per index and thus a unique identifier.
Index ID: a general unique identifier for an index (e.g. {system_account_id}-{index_name}). In the ecom world, a request for an index by ID (via the x-marqo-index-id header) does not necessarily map to an index with the specified name.
Index settings: the immutable settings of a Marqo index, sent in the Classic API create index request body.
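
To make the distinction between index name and index ID concrete, here is a minimal sketch. It assumes the example ID format given above; the function names and the routing table are hypothetical and only illustrate that an ID can resolve to an index with a different name.

```python
from typing import Dict


def make_index_id(system_account_id: str, index_name: str) -> str:
    """Build an index ID in the example format {system_account_id}-{index_name}."""
    if not system_account_id or not index_name:
        raise ValueError("system_account_id and index_name are required")
    return f"{system_account_id}-{index_name}"


def resolve_index_name(index_id: str, routing: Dict[str, str]) -> str:
    """Return the index currently serving this ID (hypothetical routing table)."""
    return routing[index_id]


index_id = make_index_id("acct123", "products")
# Assumption: after a cutover, the ID may route to a differently named index.
routing = {index_id: "products-v2"}
print(index_id, "->", resolve_index_name(index_id, routing))
```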

Components

Fork will interact with the following components:

Data:
AccountsTable / UsersAccountsTable
CustomerIndexConfigTable
{env}-EcomIndexSettingsTable
Index query configs table
Merchandising table
Feature flags JSON
Infra:
kops clusters
Multitenant clusters
Workflows
Tests:
Canary tests

Steps

Describe

Get all the necessary details about the source and target.

Most of these details are already covered by Ops/Tactical/Validation (general index readiness) and by Ecom Canary Testing, which validates changes to configs.

Source and target account and cluster details
IDs
Feature flags
Source index details
Name
Settings
Infrastructure
Bespoke infra config (e.g. scaled out API nodes)
URLs
Target index
Exists?
Source ecom index settings
Configs (add_docs_config, collections_config, search_config)
Infrastructure (especially the queue ARN)
Query configs
Configs
Merchandising
Config
Rules
Pixel
Mappings for automatic doc updates

TODO: Decide, in general, how to behave if the target resources/config already exist.

Create

Create a new index with the desired immutable configuration.

Once the queue is created, also deactivate the trigger for the ecom indexer so the queue isn't consumed until we're ready.
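
Deactivating the trigger could look roughly like the sketch below, assuming the ecom indexer is fed by a Lambda SQS event source mapping and that the mapping UUID is already known. The function name is hypothetical; `update_event_source_mapping` is the boto3 Lambda client call it wraps.

```python
from typing import Any


def set_indexer_trigger_enabled(lambda_client: Any, mapping_uuid: str, enabled: bool) -> str:
    """Enable or disable the event source mapping that feeds the ecom indexer.

    lambda_client is expected to be a boto3 Lambda client, i.e.
    boto3.client("lambda"). Returns the mapping state reported by the
    service, such as "Disabling" or "Enabling".
    """
    response = lambda_client.update_event_source_mapping(
        UUID=mapping_uuid,
        Enabled=enabled,
    )
    return response["State"]
```

Passing the client in as an argument keeps the helper easy to exercise with a stub in unit tests, and the same helper can re-enable the trigger once the target is ready to drain its queue.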

Configure

Create or update any mutable configuration (most of the things in "Describe"). Values default to copies of the source index's configuration and can be overridden at fork time.
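
One way to express "copy by default, override on request" is sketched below. It assumes overrides replace whole top-level sections (for example search_config) rather than being deep-merged; that choice and the function name are illustrative, not part of the design.

```python
import copy
from typing import Any, Dict, Optional


def build_target_config(
    source_config: Dict[str, Any],
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Start from a deep copy of the source config, then apply top-level overrides."""
    target = copy.deepcopy(source_config)
    for key, value in (overrides or {}).items():
        # Assumption: an override replaces the entire section it names.
        target[key] = copy.deepcopy(value)
    return target


source = {"search_config": {"limit": 10}, "collections_config": {"enabled": True}}
target = build_target_config(source, {"search_config": {"limit": 20}})
print(target)
```

The deep copies keep the source config untouched, so the same source description can be reused if the configure step is retried.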

Transfer

Once the target index is ready, update the source index's add_docs_config.index_write_aliases to start forking all subsequent writes to the target index.

In parallel, begin the transfer operation for the existing docs (either a manual snapshot to be restored, or the reindexing pipeline).
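
The write-alias update can be sketched as a pure function over the source index's add_docs_config. This assumes index_write_aliases is a list of index names, which this document does not spell out; the helper names are hypothetical.

```python
import copy
from typing import Any, Dict


def add_write_alias(add_docs_config: Dict[str, Any], target_index_name: str) -> Dict[str, Any]:
    """Return a copy of add_docs_config that also forks writes to the target index."""
    updated = copy.deepcopy(add_docs_config)
    # Assumption: index_write_aliases is a list of index names (may be absent).
    aliases = list(updated.get("index_write_aliases") or [])
    if target_index_name not in aliases:
        aliases.append(target_index_name)
    updated["index_write_aliases"] = aliases
    return updated


def remove_write_alias(add_docs_config: Dict[str, Any], target_index_name: str) -> Dict[str, Any]:
    """Return a copy of add_docs_config with the target index removed from the aliases."""
    updated = copy.deepcopy(add_docs_config)
    aliases = updated.get("index_write_aliases") or []
    updated["index_write_aliases"] = [a for a in aliases if a != target_index_name]
    return updated
```

Both helpers give the same result when applied twice, which matters if the step is retried, and the removal helper is the shape a rollback or abort would need.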

Persistence

Fork details are stored in a DynamoDB table called {env}-IndexForksTable. A new record is created for each state change of each fork.

Column	Description
pk	Source system account ID
sk	(Source index name)#(System timestamp, ISO format)
fork_id	Fork ID
status	pending, in_progress, ready, failed, rolled_back, aborted, complete
source_cell_id	Source cell ID
source_system_account_id	Source system account ID
source_index_name	Source index name
target_cell_id	Target cell ID
target_system_account_id	Target system account ID
target_index_name	Target index name
created_at	Timestamp of creation of this record (particular status reached)
updated_at	Timestamp of last update of this record

For each fork, we store:

One record with all the context with which the fork was created.
One record for each status change with timestamps.
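
A minimal sketch of building one such record from the columns above. The UUID-based fork ID and the helper name are assumptions; the timestamp is rendered with a fixed precision so that sort keys for the same index order chronologically as strings.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional

FORK_STATUSES = {
    "pending", "in_progress", "ready", "failed",
    "rolled_back", "aborted", "complete",
}


def new_fork_record(
    source_account_id: str,
    source_index_name: str,
    status: str,
    fork_id: Optional[str] = None,
    now: Optional[datetime] = None,
    **context: Any,
) -> Dict[str, Any]:
    """Build one item for {env}-IndexForksTable describing a single status change."""
    if status not in FORK_STATUSES:
        raise ValueError(f"unknown fork status: {status}")
    moment = now or datetime.now(timezone.utc)
    # Fixed-width ISO timestamp so lexicographic order matches time order.
    timestamp = moment.isoformat(timespec="microseconds")
    record: Dict[str, Any] = dict(context)  # e.g. target_* and *_cell_id fields
    record.update({
        "pk": source_account_id,
        "sk": f"{source_index_name}#{timestamp}",
        "fork_id": fork_id or str(uuid.uuid4()),  # assumption: UUID4 fork IDs
        "status": status,
        "source_system_account_id": source_account_id,
        "source_index_name": source_index_name,
        "created_at": timestamp,
        "updated_at": timestamp,
    })
    return record
```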

Access patterns:

List all forks for a given index ID (account ID + index name) and their latest status
Get the latest status for a given fork ID
List the history of a given fork ID
Create a new fork record
Update the status of a fork (by creating a new record with the same fork ID and a new timestamp)
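
The read patterns above can be sketched over a list of items already fetched for one account (one pk). This runs in memory and is not a DynamoDB query; the function names are hypothetical. With pk and sk as the only keys, the by-fork-ID patterns would probably need a secondary index on fork_id, which is an inference rather than something stated here.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


def fork_history(records: List[Record], fork_id: str) -> List[Record]:
    """All records for one fork, oldest first (sk ends with an ISO timestamp)."""
    return sorted(
        (r for r in records if r["fork_id"] == fork_id),
        key=lambda r: r["sk"],
    )


def latest_status(records: List[Record], fork_id: str) -> Optional[str]:
    """Status from the most recent record of a fork, or None if the fork is unknown."""
    history = fork_history(records, fork_id)
    return history[-1]["status"] if history else None


def latest_by_index(records: List[Record], source_index_name: str) -> Dict[str, Record]:
    """Map each fork ID of the given source index to its most recent record."""
    prefix = f"{source_index_name}#"
    latest: Dict[str, Record] = {}
    for record in sorted(records, key=lambda r: r["sk"]):
        if record["sk"].startswith(prefix):
            latest[record["fork_id"]] = record  # later records overwrite earlier ones
    return latest
```

Because status changes are appended as new records rather than updated in place, the latest status is always derived by taking the last record in timestamp order.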

API

POST /api/v1/accounts/{account_id}/indexes/{index_name}/forks

Create a new fork.

POST /api/v1/accounts/{account_id}/indexes/{index_name}/forks/{fork_id}/cutover

Cut over to the target index by routing all search traffic to it. The source index is left untouched.

POST /api/v1/accounts/{account_id}/indexes/{index_name}/forks/{fork_id}/rollback

Revert the necessary configs to serve all traffic from the source index.

POST /api/v1/accounts/{account_id}/indexes/{index_name}/forks/{fork_id}/cleanup

Check that the target index is successfully serving all traffic, and tear down the source index.

POST /api/v1/accounts/{account_id}/indexes/{index_name}/forks/{fork_id}/abort

Abort the fork, tearing down the target index and restoring the source index to its original state.

Implementation Plan

1. Persistence Layer

Schema Design: Define a DynamoDB table schema for {env}-IndexForksTable to store fork ID, status, source/target details, and step progress.
Service: Create a ForkService to handle CRUD operations for fork records.
Deployment: Deploy the new table to the production environment via admin_stack in CDK.

2. Core Fork Logic (Orchestrator)

Describe & Validation: Implement logic to fetch source index details and validate target index parameters.
Resource Creation: Integrate with IndexSettingsService to create the target index (immutable settings).
Configuration Sync: Implement logic to copy and merge mutable settings (Ecom, Query Configs, Merchandising, Pixel) from source to target.
Write Aliasing: Implement the update of add_docs_config on the source index to alias writes to the target.

3. API Implementation

POST /forks:
Generate Fork ID.
Create initial record in {env}-IndexForksTable.
Trigger the asynchronous fork workflow (likely via Step Functions or async Lambda invocation).
Return Fork ID and pending status.
POST /cutover:
Retrieve fork record.
Verify fork is in ready state.
Update routing configuration (Index Registry/DNS/Gateway) to point search traffic to target.
Update status to complete.
POST /rollback:
Revert write aliases on source index.
Revert search routing if cutover was attempted.
Update status to rolled_back.
POST /cleanup:
Verify traffic is serving correctly on target.
Delete source index resources.
POST /abort:
Revert any changes to source (aliases).
Delete target index resources.
Mark fork as aborted.
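
The status checks in these handlers can be captured in one small guard. Only the cutover precondition (ready) and the resulting statuses complete, rolled_back and aborted come from the steps above; every other allowed starting status in this sketch is an assumption to be confirmed, and the names are hypothetical.

```python
from typing import Dict, Set, Tuple

# action -> (statuses the fork may be in, status after the action)
TRANSITIONS: Dict[str, Tuple[Set[str], str]] = {
    "cutover": ({"ready"}, "complete"),
    # Assumption: rollback is allowed before or after a cutover attempt.
    "rollback": ({"ready", "complete", "failed"}, "rolled_back"),
    # Assumption: abort is allowed any time before cutover completes.
    "abort": ({"pending", "in_progress", "ready", "failed"}, "aborted"),
    # Assumption: cleanup only runs after cutover and leaves the status unchanged.
    "cleanup": ({"complete"}, "complete"),
}


class InvalidForkStateError(Exception):
    """Raised when an action is requested in a status that does not allow it."""


def next_status(action: str, current_status: str) -> str:
    """Return the status a fork moves to, or raise if the action is not allowed."""
    if action not in TRANSITIONS:
        raise ValueError(f"unknown action: {action}")
    allowed, result = TRANSITIONS[action]
    if current_status not in allowed:
        raise InvalidForkStateError(
            f"cannot {action} a fork in status {current_status!r}"
        )
    return result
```

Keeping the table in one place lets each handler call the guard before touching any infrastructure, and makes the open questions about allowed starting statuses easy to review.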

4. Asynchronous Workflow

The fork workflow is orchestrated by a Step Functions state machine (AdminIndexForkWorkflow). Each step invokes the Admin Lambda with a specific action:

fork.ensure_target → fork.configure_target → fork.prepare_transfer → snapshot → restore → fork.activate_target → fork.verify → succeed

Step	Lambda Action	Description
Ensure Target	`fork.ensure_target`	Create target index if missing, single readiness check (SFN retries on `TargetIndexNotReadyError`), validate infra compatibility
Configure Target	`fork.configure_target`	Export source config, import into target, validate post-import export matches
Prepare Transfer	`fork.prepare_transfer`	Disable target SQS ESM, add write alias (source → target)
Snapshot	(cross-account SFN)	Snapshot source index documents
Restore	(cross-account SFN)	Restore snapshot onto target index
Activate Target	`fork.activate_target`	Re-enable target SQS ESM so queued writes drain
Verify	`fork.verify`	Compare search results between source and target, mark READY or FAILED

All steps are idempotent for safe Step Functions retries. Failures mark the fork as FAILED with a descriptive message.
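
The step chain above could be expressed in Amazon States Language roughly as follows. This is a sketch, not the deployed AdminIndexForkWorkflow definition: the payload shape, retry timings, ARNs and helper names are assumptions, and cross-account access and failure handling (marking the fork FAILED) are not modelled.

```python
import json
from typing import Any, Dict, List, Optional


def lambda_task(action: str, function_arn: str, next_state: str,
                retry_on: Optional[List[str]] = None) -> Dict[str, Any]:
    """Task state that invokes the Admin Lambda with a given action."""
    state: Dict[str, Any] = {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {
            "FunctionName": function_arn,
            # Assumption: the Lambda receives the action and the fork ID.
            "Payload": {"action": action, "fork_id.$": "$.fork_id"},
        },
        "ResultPath": None,  # discard the result, pass the input through
        "Next": next_state,
    }
    if retry_on:
        state["Retry"] = [{
            "ErrorEquals": retry_on,
            "IntervalSeconds": 30,  # assumed timings
            "MaxAttempts": 20,
            "BackoffRate": 1.0,
        }]
    return state


def nested_workflow_task(state_machine_arn: str, next_state: str) -> Dict[str, Any]:
    """Task state that runs another state machine and waits for it to finish."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::states:startExecution.sync:2",
        "Parameters": {
            "StateMachineArn": state_machine_arn,
            "Input": {"fork_id.$": "$.fork_id"},
        },
        "ResultPath": None,
        "Next": next_state,
    }


def build_definition(admin_lambda_arn: str, snapshot_sfn_arn: str,
                     restore_sfn_arn: str) -> Dict[str, Any]:
    """Assemble the fork workflow in the order listed in the table above."""
    return {
        "StartAt": "Ensure Target",
        "States": {
            "Ensure Target": lambda_task(
                "fork.ensure_target", admin_lambda_arn, "Configure Target",
                retry_on=["TargetIndexNotReadyError"]),
            "Configure Target": lambda_task(
                "fork.configure_target", admin_lambda_arn, "Prepare Transfer"),
            "Prepare Transfer": lambda_task(
                "fork.prepare_transfer", admin_lambda_arn, "Snapshot"),
            "Snapshot": nested_workflow_task(snapshot_sfn_arn, "Restore"),
            "Restore": nested_workflow_task(restore_sfn_arn, "Activate Target"),
            "Activate Target": lambda_task(
                "fork.activate_target", admin_lambda_arn, "Verify"),
            "Verify": lambda_task("fork.verify", admin_lambda_arn, "Succeed"),
            "Succeed": {"Type": "Succeed"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_definition("ADMIN_LAMBDA_ARN", "SNAPSHOT_SFN_ARN",
                                      "RESTORE_SFN_ARN"), indent=2))
```

Putting the retry on the Ensure Target state is what lets the Lambda perform a single readiness check per invocation and leave the waiting to Step Functions.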

5. Testing

Unit Tests: Test individual components (Service, API models, Logic).
Integration Tests: Test the full flow with mocked infrastructure calls.