Institutional Hotpath Latency

This page is the latency and determinism contract for institutional market makers. It separates the exchange hotpath from proxy/network effects and makes the response semantics explicit enough for an unattended quote engine.

Target

For a market maker quote loop, the target is sub-50 ms p99 from exchange ingress to HTTP/BSL ACK on the provisioned low-latency lane. This target is not a guarantee for public-internet round trips, Cloudflare/proxy paths, full responses, or durable/backoffice workflows.

Measure three separate clocks:

Clock | Meaning | How to measure
Client RTT | Client send to client receive. Includes network, TLS, proxy, and client runtime. | Client-side monotonic timer.
Core ACK | Sequencer request start to ACK response body finalization. | x-mm-perf-total-us or bsl.timing.durationsUs.coreAck.
BSL facade overhead | JSON/BSL response envelope work around the core submit. | bsl.timing.durationsUs.facadeAugment.

If client RTT is high but core ACK is low, optimize network path, DNS/direct endpoint choice, TLS reuse, and proxy hops before changing strategy logic.
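The three clocks can be compared with a small helper. This is a minimal sketch: the function name and the 50% threshold used to flag the network path are illustrative assumptions, not part of the contract.

```python
def attribute_latency(client_rtt_us: int, core_ack_us: int, facade_us: int = 0) -> dict:
    """Split a client-side RTT into exchange-side time and everything else."""
    exchange_us = core_ack_us + facade_us
    # Clamp at zero: the client and exchange clocks are independent timers.
    outside_us = max(client_rtt_us - exchange_us, 0)
    share = outside_us / client_rtt_us if client_rtt_us > 0 else 0.0
    return {
        "exchange_us": exchange_us,
        "outside_us": outside_us,
        "outside_share": share,
        # Assumed heuristic: if most of the RTT is outside the exchange,
        # look at network path, DNS, TLS reuse, and proxy hops first.
        "look_at_network_first": share >= 0.5,
    }


sample = attribute_latency(client_rtt_us=48_000, core_ack_us=3_830, facade_us=380)
```

Here the exchange accounts for 4,210 us of a 48,000 us round trip, so the remaining time belongs to the transport path rather than strategy logic.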

Route selection

Use the connectivity bundle for the actual endpoints assigned to the account:

GET /api/v1/bsl/connectivity
SC-Auth-Version: 2
SC-Key: <institutional apiKeyId>
SC-Nonce: <monotonic nonce>
SC-Timestamp: <unix ms>
SC-Passphrase: <apiPassphrase>
SC-Signature: <hmac>

Hotpath preference:

Preference | Route | Use
1 | BSL Direct TCP/TLS | Lowest overhead native binary session where provisioned.
2 | /api/order-entry/binary | Direct binary HTTP lane.
3 | /api/v1/bsl/orders/compact | BSL compact HTTP facade with institutional HMAC.
4 | Generic platform/proxy path | Integration compatibility only; benchmark separately.

Keep connections warm. Reusing TLS sessions and a warm credential/signature cache matters more than micro-optimizing one isolated request.

Signing path

Do not call POST /api/v1/trading/actions/hash in the steady-state quote loop. That endpoint is a reference and conformance tool. Production quote engines compute the canonical action hash locally, sign locally, and submit the signed batch in one order-entry request. See Local Action Signing and pin the SDK golden vectors in CI.

The common two-request flow adds a full network/API round trip before every quote refresh:

  1. POST /api/v1/trading/actions/hash
  2. sign
  3. POST /api/v1/bsl/orders/compact

Keep it for bring-up and signer validation only.

Live stress reports from scripts/e2e/live-order-entry-stress.cjs expose this split under latencyAttribution:

  • actionBuild.localSigning measures local SDK signing,
  • actionBuild.nonce measures bootstrap/reservation work,
  • submitHttp measures the order-entry request RTT,
  • removedHotpathRoundtrip.endpoint records that /api/v1/trading/actions/hash was intentionally not called in the hotpath.

Use those fields to prove that a post-change run is no longer paying the 200-270 ms server-hash round trip observed in the original integration review.
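A CI check on the stress report could look like the sketch below. The field names come from the report described above, but the nesting (latencyAttribution as a JSON object with a removedHotpathRoundtrip.endpoint string) is an assumption about the artifact layout, so verify it against a real report first.

```python
HASH_ENDPOINT = "/api/v1/trading/actions/hash"


def hotpath_skips_server_hash(report: dict) -> bool:
    """Return True when the report records the hash endpoint as removed from the hotpath."""
    attribution = report.get("latencyAttribution", {})
    removed = attribution.get("removedHotpathRoundtrip", {})
    return removed.get("endpoint") == HASH_ENDPOINT


# Hypothetical report fragment used only to exercise the check.
example_report = {
    "latencyAttribution": {
        "removedHotpathRoundtrip": {"endpoint": HASH_ENDPOINT},
    }
}
```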

ACK and response modes

There are two independent knobs:

Header | Controls | Hotpath setting
X-BSL-Result-Mode | ACK boundary: ack, durable, full. | ack for quote loops.
X-Senticore-Response-Mode | Body verbosity: summary, detailed, full. | detailed during integration; summary only after reconciliation is proven.

ACK boundaries:

Boundary | Server waits for | Client meaning
accepted / BSL ack | Admission boundary only. | Request entered the low-latency lane; not terminal order state.
ingress_wal | Primary ingress WAL append and forced sync. | Safer admission boundary; still not terminal order state.
applied / BSL full | Engine apply / terminal receipts. | Useful for conformance and slow paths; keep out of live quote refresh loops.
primary_durable / durable | Durable boundary. | Backoffice or explicit durability workflows, not low-latency quoting.

Do not run a market-maker quote loop with full responses unless the account has explicitly passed conformance for that mode and the latency budget includes engine terminal wait time.

Recommended submit headers:

X-BSL-Result-Mode: ack
X-Senticore-Response-Mode: detailed
Idempotency-Key: <strategy-request-key>
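A quote engine can centralize these headers so the quote loop cannot accidentally request a slow ACK boundary. This is a minimal sketch: the helper name and the fallback of generating a random key when none is supplied are assumptions, not platform behavior.

```python
import uuid
from typing import Optional

ALLOWED_RESPONSE_MODES = ("summary", "detailed", "full")


def build_submit_headers(strategy_request_key: Optional[str] = None,
                         response_mode: str = "detailed") -> dict:
    """Build hotpath submit headers; the result mode is pinned to ack."""
    if response_mode not in ALLOWED_RESPONSE_MODES:
        raise ValueError(f"unknown response mode: {response_mode!r}")
    # Assumed convention: fall back to a random key when the strategy gives none.
    key = strategy_request_key or uuid.uuid4().hex
    return {
        "X-BSL-Result-Mode": "ack",
        "X-Senticore-Response-Mode": response_mode,
        "Idempotency-Key": key,
    }


headers = build_submit_headers("strategy-42-refresh-0001")
```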

Expected successful shape:

{
  "ok": true,
  "seqs": [123],
  "derivedOrderIds": ["0x..."],
  "acceptedActions": 1,
  "responseMode": "detailed",
  "ackMode": "ingress_wal",
  "ackClass": "ingress_wal",
  "durableLsn": 22376075,
  "durablePendingSeqs": 1,
  "bsl": {
    "reconciliation": {
      "ackIsTerminalState": false,
      "bodyIncludesSeqs": true,
      "bodyIncludesDerivedOrderIds": true,
      "missingDerivedOrderIdCount": 0,
      "missingDerivedOrderIdsRecoveredLocally": 0,
      "missingDerivedOrderIdsFinalMissingAfterLocalDerivation": 0,
      "receiptPath": "/api/mm/action_receipts?account={account}&seqs={seqs}",
      "missingDerivedOrderIdsAction": "derive_locally_from_signed_payload_or_reconcile_by_seq",
      "quoteReplaceSummary": {
        "requestedLegs": [
          {
            "derivedOrderId": "0x...",
            "expectedDerivedOrderId": "0x...",
            "derivedOrderIdSource": "response.derivedOrderIds",
            "responseDerivedOrderIdMissing": false
          }
        ]
      }
    },
    "timing": {
      "durationsUs": {
        "total": 4210,
        "coreAck": 3830,
        "facadeAugment": 380,
        "auth": 120,
        "routing": 540,
        "risk": 0,
        "enqueue": 2100,
        "ackResolution": 0,
        "responseEmit": 90
      }
    }
  }
}

coreAck is the number to compare against the exchange-side sub-50 ms target. total in the BSL envelope includes facade work after the core response. bsl.reconciliation.ackIsTerminalState is false for the low-latency ACK lane. Use seqs[], lastSeq, seqChecksum, local derived order ids, and the receipt/drop-copy paths as the authoritative post-submit reconciliation anchors.
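The fields above can be pulled out of a detailed response with a few defensive lookups. This is a sketch under the assumption that the body matches the example shape; the function name is hypothetical, and the 50,000 us limit mirrors the exchange-side gate. A single request is only one sample, so this flags outliers rather than measuring p99.

```python
ACK_LIMIT_US = 50_000


def summarize_ack(body: dict) -> dict:
    """Extract the reconciliation anchors and core ACK time from a response body."""
    bsl = body.get("bsl", {})
    durations = bsl.get("timing", {}).get("durationsUs", {})
    recon = bsl.get("reconciliation", {})
    core_ack = durations.get("coreAck")
    return {
        "ok": body.get("ok") is True,
        "core_ack_us": core_ack,
        "within_limit": core_ack is not None and core_ack <= ACK_LIMIT_US,
        # An admission ACK is not terminal order state.
        "terminal": bool(recon.get("ackIsTerminalState", False)),
        "seqs": list(body.get("seqs", [])),
        "needs_local_derivation": recon.get("missingDerivedOrderIdCount", 0) > 0,
    }


example_body = {
    "ok": True,
    "seqs": [123],
    "bsl": {
        "reconciliation": {"ackIsTerminalState": False, "missingDerivedOrderIdCount": 0},
        "timing": {"durationsUs": {"total": 4210, "coreAck": 3830, "facadeAugment": 380}},
    },
}
summary = summarize_ack(example_body)
```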

Latency budget

Use this budget when investigating a p99 spike:

Stage | Signal | Typical fix
Client transport | Client-side timer before response headers | Reuse one SDK client per process or strategy shard; keep HTTP/FIX/BSL TCP sessions persistent.
Auth | x-mm-perf-routing-auth-us | Warm HMAC credential cache, monotonic SC-Nonce, persistent session.
Signature verify | x-mm-perf-signature-verify-us | Use self-signed account actions where possible; keep signer sets warm.
Validation/routing | x-mm-perf-routing-validation-us and detailed routing headers | Single account, single shard, valid market, fresh timestamps, low fanout.
Risk/precheck | x-mm-perf-prepare-risk-approval-us, x-mm-perf-prepare-risk-apply-us | Keep quote shapes hotpath-safe; avoid full mode in quote loops.
WAL/enqueue | x-mm-perf-enqueue-us | Keep batches small, direct lane warm, avoid backlog and forced slow-lane syncs.
ACK wait | x-mm-perf-ack-resolution-us | Do not request applied, durable, or full on the quote loop.
Response | x-mm-perf-response-emit-us | Use summary only after stream/drop-copy reconciliation is proven.

Set MM_DETAILED_PERF_RESPONSE_HEADERS=true only for diagnostics. The basic hotpath headers are normally enough for first-pass triage.
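For triage, the x-mm-perf-*-us response headers can be collected into one mapping and ranked. This is a minimal sketch that assumes each header value is an integer number of microseconds; the helper names are illustrative.

```python
from typing import Optional

PREFIX = "x-mm-perf-"
SUFFIX = "-us"


def parse_perf_headers(headers: dict) -> dict:
    """Map stage name to microseconds for every x-mm-perf-*-us header."""
    stages = {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            stage = key[len(PREFIX):-len(SUFFIX)]
            try:
                stages[stage] = int(value)
            except (TypeError, ValueError):
                continue  # Skip values that are not plain integers.
    return stages


def slowest_stage(stages: dict, exclude=("total",)) -> Optional[str]:
    """Return the stage with the largest duration, ignoring aggregate fields."""
    candidates = {k: v for k, v in stages.items() if k not in exclude}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


example_headers = {
    "X-MM-Perf-Total-Us": "4210",
    "x-mm-perf-routing-auth-us": "120",
    "x-mm-perf-enqueue-us": "2100",
    "x-mm-perf-response-emit-us": "90",
    "content-type": "application/json",
}
stages = parse_perf_headers(example_headers)
```

With these sample values the enqueue stage dominates, which points at the WAL/enqueue row of the budget table.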

Batch shape

For sub-50 ms behavior, keep the quote batch boring:

  • one engine account per batch,
  • one market shard where possible,
  • QuoteReplace / SpotQuoteReplace over separate cancel plus place calls,
  • small batches inside the low-latency limit from GET /api/v1/bsl/limits,
  • hotpath-safe order types and post-only quote legs where appropriate,
  • no full result mode on the steady-state quote refresh path.

Read the live limits before the strategy starts:

GET /api/v1/bsl/limits

Important fields:

Field | Meaning
ackModes.lowLatency | Current low-latency ACK boundary for the account/lane.
actions.lowLatencyMaxActions | Maximum batch size that stays in the low-latency class.
actions.singleAccountRequired | Whether a batch may mix engine accounts.
ingress.lowLatencyMaxInflight | Push-side concurrency cap for the low-latency lane.
backlog.ingressWalMaxPendingActions | WAL/backlog pressure signal for admission throttling.
nonceModel.actionNonce | Live action nonce window and recovery rule.
nonceModel.machineAuthNonce | SC-Nonce replay contract for HMAC/agent auth.
errorGuidance.nonceCodes | Machine-readable nonce reject recovery actions.
errorGuidance.requestCodes | Queue, risk, duplicate nonce, and reservation guidance.
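A pre-submit guard can use these limits to keep a batch inside the low-latency class. This is a sketch only: it assumes the dotted field names map to nested JSON objects holding integers and booleans, which should be confirmed against a live GET /api/v1/bsl/limits response. The function name and problem strings are hypothetical.

```python
def check_batch_against_limits(limits: dict, action_accounts: list, inflight: int) -> list:
    """Return a list of reasons the batch would leave the low-latency class.

    action_accounts holds one engine account id per action in the batch.
    inflight is the number of requests currently awaiting an ACK.
    """
    problems = []
    actions = limits.get("actions", {})
    ingress = limits.get("ingress", {})

    max_actions = actions.get("lowLatencyMaxActions")
    if max_actions is not None and len(action_accounts) > max_actions:
        problems.append("batch_too_large")

    if actions.get("singleAccountRequired") and len(set(action_accounts)) > 1:
        problems.append("mixed_engine_accounts")

    max_inflight = ingress.get("lowLatencyMaxInflight")
    if max_inflight is not None and inflight >= max_inflight:
        problems.append("inflight_cap_reached")

    return problems


# Hypothetical limit values, for illustration only.
example_limits = {
    "actions": {"lowLatencyMaxActions": 4, "singleAccountRequired": True},
    "ingress": {"lowLatencyMaxInflight": 8},
}
```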

Nonce rules

Do not mix nonce families:

Nonce | Owner | Retry rule
SC-Nonce | HMAC credential / API key | Monotonic per key. Reuse is an auth replay and must not be retried.
Action payload.nonce | Engine account | Windowed admission. Re-sign with a fresh in-window nonce on stale/replay rejects.
Idempotency-Key | Client request | Same key and same body may replay the response; same key and different body conflicts.

Changing only Idempotency-Key does not fix an action nonce error because the action nonce is inside the signed payload.

Recommended action nonce flow:

  1. Read GET /api/v1/accounts/:engineAccount/bootstrap?fresh=true.
  2. Start from nonceFloor or nextNonce.
  3. Allocate unused in-window nonces locally.
  4. Persist nonce plus signed payload before submit.
  5. On nonce_below_floor, nonce_replayed, or nonce_outside_window, re-read bootstrap/account state and re-sign.
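Steps 2, 3, and 5 of the flow above can be sketched as a small local allocator. The window model here (valid nonces are floor up to floor + window - 1) is an assumption for illustration; take the real window and recovery rule from nonceModel.actionNonce in the limits response. Persisting the nonce with the signed payload (step 4) is left to the caller.

```python
import threading

# Reject codes that call for re-reading bootstrap/account state and re-signing.
NONCE_RESIGN_CODES = frozenset(
    {"nonce_below_floor", "nonce_replayed", "nonce_outside_window"}
)


def should_resign(reject_code: str) -> bool:
    return reject_code in NONCE_RESIGN_CODES


class ActionNonceAllocator:
    """Hands out unused action nonces for one engine account."""

    def __init__(self, start: int, window: int):
        self._lock = threading.Lock()
        self._floor = start    # nonceFloor or nextNonce from bootstrap
        self._next = start
        self._window = window  # assumed window size, see lead-in

    def allocate(self) -> int:
        with self._lock:
            if self._next >= self._floor + self._window:
                raise RuntimeError("nonce window exhausted; re-read bootstrap")
            nonce = self._next
            self._next += 1
            return nonce

    def resync(self, new_start: int) -> None:
        """Apply a fresh bootstrap read without ever reissuing a used nonce."""
        with self._lock:
            self._floor = new_start
            self._next = max(self._next, new_start)
```

The allocator never rewinds on resync, so a nonce that may already be in flight is not handed out a second time.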

QuoteReplace guarantees

QuoteReplace is the market-maker primitive. One signed parent action contains one or more quote legs. Each replacement child order id is deterministic from the signed parent payload and leg index.

Guarantees to rely on:

Contract | Guarantee
Ordering | A leg's cancelOrderId is processed before that leg's replacement order.
Identity | derivedOrderIds[] is deterministic and should match SDK local derivation.
Admission | An ack response means the group reached the configured admission boundary.
Terminal truth | Final open, filled, canceled, or rejected state comes from drop-copy/private stream/receipts/account reads.

Do not assume every replacement order is resting because the HTTP response is 200. A risk reject, market halt, post-only cross, cancel race, or terminal engine reject can still appear after an admission ACK.
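The identity guarantee can be checked on every response by comparing locally derived ids with derivedOrderIds[]. This sketch does not implement the derivation itself, which belongs to the SDK. It assumes ids are hex strings aligned by leg index and compared case-insensitively.

```python
def compare_derived_ids(local_ids: list, response_ids: list) -> list:
    """Return (leg_index, problem) pairs where the response disagrees with local derivation."""
    issues = []
    for index, local_id in enumerate(local_ids):
        if index >= len(response_ids) or not response_ids[index]:
            # The response omitted this id: fall back to the local value
            # and reconcile by seq.
            issues.append((index, "missing_in_response"))
        elif response_ids[index].lower() != local_id.lower():
            issues.append((index, "mismatch"))
    return issues


# Placeholder ids for illustration only.
local = ["0xAAA1", "0xbbb2", "0xccc3"]
from_response = ["0xaaa1", "0xdead"]
issues = compare_derived_ids(local, from_response)
```

A mismatch is a signal to stop quoting and investigate the signer or SDK version, since it means the client and exchange disagree about order identity.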

Reconciliation

Persist before each submit:

Key | Why
Idempotency-Key | Retry correlation.
action nonce | Re-sign/replay diagnosis.
clientOrderId | Strategy identity.
expected derivedOrderIds[] | Cancel/replace and public-book verification.
response seqs[] | Receipt, stream, and replay gap-fill correlation.
market, book, side, price, qty | Local inventory/book model.
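One way to persist these keys is an append-only JSON-lines journal that is synced before the request leaves the process. The record layout, field names, and file format below are illustrative assumptions. The response seqs[] are only known after the ACK, so they start empty and would be written in a follow-up record.

```python
import json
import os
from dataclasses import asdict, dataclass, field


@dataclass
class SubmitRecord:
    idempotency_key: str
    action_nonce: int
    client_order_id: str
    expected_derived_order_ids: list
    market: str
    book: str
    side: str
    price: str  # Kept as strings to avoid float rounding.
    qty: str
    response_seqs: list = field(default_factory=list)  # Filled after the ACK.


def append_record(path: str, record: SubmitRecord) -> None:
    """Append one record and force it to disk before the caller submits."""
    line = json.dumps(asdict(record), sort_keys=True)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()
        os.fsync(fh.fileno())
```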

Reconciliation order:

  1. Private stream or FIX/BSL drop-copy for terminal execution truth.
  2. /api/v1/bsl/accounts/:account/executions?fromSeq=... for replay/gap-fill.
  3. /api/v1/accounts/:engineAccount/orders and /fills for account-state confirmation.
  4. Public market snapshot/order book only to verify visible resting liquidity.

Public book visibility is not account truth. It can lag, aggregate levels, or omit account ownership. Use it to confirm quote visibility after private state has been reconciled.
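Gap detection between submitted seqs and what the private stream or drop-copy delivered can drive the replay step. This is a minimal sketch: the helper names are hypothetical, and whether fromSeq is inclusive is an assumption to confirm against the executions endpoint.

```python
from typing import Optional


def missing_seqs(submitted_seqs: list, observed_seqs: list) -> list:
    """Seqs that were ACKed on submit but not yet seen on the private stream."""
    return sorted(set(submitted_seqs) - set(observed_seqs))


def replay_path(account: str, missing: list) -> Optional[str]:
    """Build the gap-fill request path starting at the lowest missing seq."""
    if not missing:
        return None
    # Assumes fromSeq is inclusive.
    return f"/api/v1/bsl/accounts/{account}/executions?fromSeq={min(missing)}"


gaps = missing_seqs([120, 121, 122, 123], [120, 122])
path = replay_path("acct-1", gaps)
```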

Local verification

Use these checks before handing a new endpoint bundle to a market maker:

cargo test -p senticore-sequencing-core bsl_order_entry_returns_contract_envelope_and_mode_headers -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core bsl_limits_route_exposes_business_line_contract -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core mm_action_batch_binary_quote_replace_applies_compact_action -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core external_gateway_ipc_accepts_atomic_group_as_one_batch_message -- --nocapture --test-threads=1

For latency work, run the dedicated exchange-side ACK gate and label the artifact with the route, host, profile, and result mode:

export RUNTIME_DATABASE_URL=postgres://...
export ENGINE_TRUTH_DATABASE_URL=$RUNTIME_DATABASE_URL

python3 backend/sequencer/scripts/run_exchange_h0_perf_baseline.py \
  --profile institutional_mm_hotpath_ack_smoke \
  --allow-dedicated \
  --artifact-dir backend/sequencer/runtime/perf/institutional-mm-hotpath-smoke

python3 backend/sequencer/scripts/run_exchange_h0_perf_baseline.py \
  --profile institutional_mm_hotpath_ack_50ms \
  --allow-dedicated \
  --artifact-dir backend/sequencer/runtime/perf/institutional-mm-hotpath

The smoke profile validates host wiring and artifact schema only. The 50ms gate fails when exchange-side HTTP ACK p99 is above 50000us. Add --process-timeout-sec <seconds> while debugging a stuck host so the runner writes a bounded failure artifact. Never compare a full or durable ACK run against an ack quote-loop target.
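The pass/fail rule of the 50ms gate can be reproduced offline on a list of ACK samples. This sketch uses the nearest-rank percentile definition, which is an assumption; the runner's own percentile method may differ slightly on small sample counts.

```python
import math


def percentile_nearest_rank(samples_us: list, pct: int) -> int:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples_us:
        raise ValueError("no samples")
    ordered = sorted(samples_us)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def passes_ack_gate(samples_us: list, limit_us: int = 50_000) -> bool:
    """The gate fails only when p99 is above the limit."""
    return percentile_nearest_rank(samples_us, 99) <= limit_us


one_outlier = [1_000] * 99 + [60_000]   # 1% of samples are slow
two_outliers = [1_000] * 98 + [60_000] * 2  # 2% of samples are slow
```

With 100 samples, a single slow request still passes because p99 lands on the 99th ordered sample, while two slow requests push p99 over the limit.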

In the JSON artifact, use requestAckExclusiveBreakdownUs for the first-pass "where did the ACK time go?" attribution. It is non-overlapping. The detailed requestBreakdownUs object includes nested routing and enqueue diagnostics and must not be summed as if every field were exclusive.
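Because the exclusive breakdown is non-overlapping, its fields can be summed and ranked by share. The sketch below assumes requestAckExclusiveBreakdownUs is a flat mapping of stage name to microseconds. If the artifact nests percentiles per stage, select one percentile first. The stage names in the example are invented for illustration.

```python
def rank_exclusive_breakdown(artifact: dict) -> list:
    """Return (stage, microseconds, share_of_total) sorted from slowest to fastest."""
    breakdown = artifact.get("requestAckExclusiveBreakdownUs", {})
    numeric = {
        name: value
        for name, value in breakdown.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
    total = sum(numeric.values())
    ranked = sorted(numeric.items(), key=lambda item: item[1], reverse=True)
    return [(name, value, value / total if total else 0.0) for name, value in ranked]


# Hypothetical artifact fragment; real stage names come from the runner.
example_artifact = {
    "requestAckExclusiveBreakdownUs": {"auth": 100, "enqueue": 300, "responseEmit": 100}
}
ranking = rank_exclusive_breakdown(example_artifact)
```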

Read processTotal and processUninstrumented before optimizing the engine. Those fields isolate the exchange process path. A run with low processTotal and high handlerUninstrumented is pointing at HTTP/Axum/harness wrapper, runtime scheduling, socket, or response decode pressure rather than direct Engine/WAL/order-book work.

The stable-account benchmark profile uses multiple pre-seeded account slots per phase so load generation stays inside the engine action nonce replay window. That is a harness constraint, not a relaxation of production nonce rules: market-maker clients must still allocate action nonces per engine account and recover from nonce rejects by re-reading bootstrap/account state and re-signing.