Institutional Hotpath Latency
This page is the latency and determinism contract for institutional market makers. It separates the exchange hotpath from proxy/network effects and makes the response semantics explicit enough for an unattended quote engine.
Target
For a market maker quote loop, the target is sub-50 ms p99 from exchange
ingress to HTTP/BSL ACK on the provisioned low-latency lane. This target is
not a guarantee for public-internet round trips, Cloudflare/proxy paths, full
responses, or durable/backoffice workflows.
Measure three separate clocks:
| Clock | Meaning | How to measure |
|---|---|---|
| Client RTT | Client send to client receive. Includes network, TLS, proxy, and client runtime. | Client-side monotonic timer. |
| Core ACK | Sequencer request start to ACK response body finalization. | x-mm-perf-total-us or bsl.timing.durationsUs.coreAck. |
| BSL facade overhead | JSON/BSL response envelope work around the core submit. | bsl.timing.durationsUs.facadeAugment. |
If client RTT is high but core ACK is low, optimize network path, DNS/direct endpoint choice, TLS reuse, and proxy hops before changing strategy logic.
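The comparison above can be sketched as a small helper. This is a minimal illustration, not part of any SDK: `split_latency` and `timed_call_us` are hypothetical names, and the helper assumes the facade overhead adds to core ACK the way `total` does in the example response on this page.

```python
import time


def split_latency(client_rtt_us: int, core_ack_us: int, facade_us: int) -> dict:
    """Split a client-side RTT into exchange-side and path-side components."""
    exchange_us = core_ack_us + facade_us
    # Whatever the exchange did not account for was spent on the network,
    # TLS, proxy hops, or the client runtime.
    path_us = max(client_rtt_us - exchange_us, 0)
    return {
        "coreAckUs": core_ack_us,
        "facadeUs": facade_us,
        "pathUs": path_us,
        "pathDominates": path_us > exchange_us,
    }


def timed_call_us(fn):
    """Run fn() and return (result, elapsed microseconds) on a monotonic clock."""
    start = time.perf_counter_ns()
    result = fn()
    return result, (time.perf_counter_ns() - start) // 1000
```

When `pathDominates` is true, the time is being spent outside the exchange, so strategy changes are unlikely to help.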
Route selection
Use the connectivity bundle for the actual endpoints assigned to the account:
GET /api/v1/bsl/connectivity
SC-Auth-Version: 2
SC-Key: <institutional apiKeyId>
SC-Nonce: <monotonic nonce>
SC-Timestamp: <unix ms>
SC-Passphrase: <apiPassphrase>
SC-Signature: <hmac>
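A minimal sketch of assembling these headers with a strictly increasing SC-Nonce. The class and function names are hypothetical, seeding the nonce from wall-clock milliseconds is an assumption, and the signature is delegated to a caller-supplied `sign` callable because the canonical string to sign is not specified on this page.

```python
import threading
import time
from typing import Callable, Dict


class MonotonicNonce:
    """Strictly increasing SC-Nonce source for one API key (illustrative)."""

    def __init__(self, clock_ms: Callable[[], int] = lambda: time.time_ns() // 1_000_000):
        self._clock_ms = clock_ms
        self._last = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            # Never go backwards, even if the wall clock does.
            self._last = max(self._last + 1, self._clock_ms())
            return self._last


def connectivity_headers(api_key_id: str, passphrase: str, nonce: int,
                         timestamp_ms: int,
                         sign: Callable[[int, int], str]) -> Dict[str, str]:
    """Build the request headers; sign() stands in for the real HMAC routine."""
    return {
        "SC-Auth-Version": "2",
        "SC-Key": api_key_id,
        "SC-Nonce": str(nonce),
        "SC-Timestamp": str(timestamp_ms),
        "SC-Passphrase": passphrase,
        "SC-Signature": sign(nonce, timestamp_ms),
    }
```

Keeping one nonce source per API key, shared by every thread that signs with that key, avoids accidental reuse.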
Hotpath preference:
| Preference | Route | Use |
|---|---|---|
| 1 | BSL Direct TCP/TLS | Lowest overhead native binary session where provisioned. |
| 2 | /api/order-entry/binary | Direct binary HTTP lane. |
| 3 | /api/v1/bsl/orders/compact | BSL compact HTTP facade with institutional HMAC. |
| 4 | Generic platform/proxy path | Integration compatibility only; benchmark separately. |
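The preference order can be applied mechanically once the bundle is read. The sketch below is illustrative only: the route keys are hypothetical labels, since the shape of the connectivity bundle is not shown on this page.

```python
from typing import Iterable

# Hypothetical labels for the four routes, in the preference order above.
ROUTE_PREFERENCE = [
    "bsl_direct_tcp_tls",   # 1: native binary session
    "binary_http",          # 2: /api/order-entry/binary
    "bsl_compact_http",     # 3: /api/v1/bsl/orders/compact
    "generic_proxy",        # 4: compatibility only
]


def pick_route(provisioned: Iterable[str]) -> str:
    """Return the most preferred route that the account actually has."""
    available = set(provisioned)
    for route in ROUTE_PREFERENCE:
        if route in available:
            return route
    raise LookupError("no known order-entry route is provisioned")
```

A strategy that falls through to the last entry should be benchmarked separately, as the table notes.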
Keep connections warm. Reusing TLS sessions and a warm credential/signature cache matters more than micro-optimizing one isolated request.
Signing path
Do not call POST /api/v1/trading/actions/hash in the steady-state quote loop.
That endpoint is a reference and conformance tool. Production quote engines
compute the canonical action hash locally, sign locally, and submit the signed
batch in one order-entry request. See
Local Action Signing and pin the SDK golden
vectors in CI.
The common two-request flow:
- POST /api/v1/trading/actions/hash
- sign
- POST /api/v1/bsl/orders/compact
adds a full network/API round trip before every quote refresh. Keep it for bring-up and signer validation only.
Live stress reports from scripts/e2e/live-order-entry-stress.cjs expose this
split under latencyAttribution: actionBuild.localSigning measures local SDK
signing, actionBuild.nonce measures bootstrap/reservation work, submitHttp
measures the order-entry request RTT, and
removedHotpathRoundtrip.endpoint records that
/api/v1/trading/actions/hash was intentionally not called in the hotpath.
Use those fields to prove that a post-change run is no longer paying the
200-270 ms server-hash round trip observed in the original integration review.
ACK and response modes
There are two independent knobs:
| Header | Controls | Hotpath setting |
|---|---|---|
| X-BSL-Result-Mode | ACK boundary: ack, durable, full. | ack for quote loops. |
| X-Senticore-Response-Mode | Body verbosity: summary, detailed, full. | detailed during integration; summary only after reconciliation is proven. |
ACK boundaries:
| Boundary | Server waits for | Client meaning |
|---|---|---|
| accepted / BSL ack | Admission boundary only. | Request entered the low-latency lane; not terminal order state. |
| ingress_wal | Primary ingress WAL append and forced sync. | Safer admission boundary; still not terminal order state. |
| applied / BSL full | Engine apply / terminal receipts. | Useful for conformance and slow paths; keep out of live quote refresh loops. |
| primary_durable / durable | Durable boundary. | Backoffice or explicit durability workflows, not low-latency quoting. |
Do not run a market-maker quote loop with full responses unless the account
has explicitly passed conformance for that mode and the latency budget includes
engine terminal wait time.
Recommended submit headers:
X-BSL-Result-Mode: ack
X-Senticore-Response-Mode: detailed
Idempotency-Key: <strategy-request-key>
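One way to keep slow result modes out of the quote loop is to build the headers through a guard. This is a sketch under assumptions: `submit_headers` and its `quote_loop` flag are hypothetical, and only the header names and allowed values come from the tables above.

```python
from typing import Dict

RESULT_MODES = {"ack", "durable", "full"}
RESPONSE_MODES = {"summary", "detailed", "full"}


def submit_headers(idempotency_key: str, result_mode: str = "ack",
                   response_mode: str = "detailed",
                   quote_loop: bool = True) -> Dict[str, str]:
    """Build submit headers and refuse slow ACK boundaries on the quote loop."""
    if result_mode not in RESULT_MODES:
        raise ValueError(f"unknown result mode: {result_mode}")
    if response_mode not in RESPONSE_MODES:
        raise ValueError(f"unknown response mode: {response_mode}")
    # The two knobs are independent, but only "ack" belongs on the hotpath.
    if quote_loop and result_mode != "ack":
        raise ValueError("quote loops must use X-BSL-Result-Mode: ack")
    return {
        "X-BSL-Result-Mode": result_mode,
        "X-Senticore-Response-Mode": response_mode,
        "Idempotency-Key": idempotency_key,
    }
```

Backoffice callers can pass `quote_loop=False` to request `durable` or `full` explicitly.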
Expected successful shape:
{
  "ok": true,
  "seqs": [123],
  "derivedOrderIds": ["0x..."],
  "acceptedActions": 1,
  "responseMode": "detailed",
  "ackMode": "ingress_wal",
  "ackClass": "ingress_wal",
  "durableLsn": 22376075,
  "durablePendingSeqs": 1,
  "bsl": {
    "reconciliation": {
      "ackIsTerminalState": false,
      "bodyIncludesSeqs": true,
      "bodyIncludesDerivedOrderIds": true,
      "missingDerivedOrderIdCount": 0,
      "missingDerivedOrderIdsRecoveredLocally": 0,
      "missingDerivedOrderIdsFinalMissingAfterLocalDerivation": 0,
      "receiptPath": "/api/mm/action_receipts?account={account}&seqs={seqs}",
      "missingDerivedOrderIdsAction": "derive_locally_from_signed_payload_or_reconcile_by_seq",
      "quoteReplaceSummary": {
        "requestedLegs": [
          {
            "derivedOrderId": "0x...",
            "expectedDerivedOrderId": "0x...",
            "derivedOrderIdSource": "response.derivedOrderIds",
            "responseDerivedOrderIdMissing": false
          }
        ]
      }
    },
    "timing": {
      "durationsUs": {
        "total": 4210,
        "coreAck": 3830,
        "facadeAugment": 380,
        "auth": 120,
        "routing": 540,
        "risk": 0,
        "enqueue": 2100,
        "ackResolution": 0,
        "responseEmit": 90
      }
    }
  }
}
coreAck is the number to compare against the exchange-side sub-50 ms target.
total in the BSL envelope includes facade work after the core response.
bsl.reconciliation.ackIsTerminalState is false for the low-latency ACK
lane. Use seqs[], lastSeq, seqChecksum, local derived order ids, and the
receipt/drop-copy paths as the authoritative post-submit reconciliation anchors.
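A quote engine can check each ACK against these rules before trusting it. The function below is a minimal sketch with a hypothetical name and hypothetical problem labels; it reads only fields that appear in the example response, and a single slow request is a signal to investigate rather than a p99 verdict.

```python
from typing import List


def check_ack(resp: dict, expected_ids: List[str],
              budget_us: int = 50_000) -> List[str]:
    """Return a list of problems found in one ack-mode submit response."""
    problems = []
    if resp.get("ok") is not True:
        problems.append("not_ok")
    bsl = resp.get("bsl", {})
    # An admission ACK must never be read as terminal order state.
    if bsl.get("reconciliation", {}).get("ackIsTerminalState") is not False:
        problems.append("ack_marked_terminal")
    # coreAck, not total, is the number tied to the exchange-side target.
    core_ack = bsl.get("timing", {}).get("durationsUs", {}).get("coreAck")
    if core_ack is None:
        problems.append("core_ack_missing")
    elif core_ack >= budget_us:
        problems.append("core_ack_over_budget")
    # Server ids should match what the SDK derived from the signed payload.
    if resp.get("derivedOrderIds") != expected_ids:
        problems.append("derived_order_id_mismatch")
    return problems
```

An empty list means the ACK is consistent with local expectations; it still says nothing about whether the order is resting.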
Latency budget
Use this budget when investigating a p99 spike:
| Stage | Signal | Typical fix |
|---|---|---|
| Client transport | Client-side timer before response headers | Reuse one SDK client per process or strategy shard; keep HTTP/FIX/BSL TCP sessions persistent. |
| Auth | x-mm-perf-routing-auth-us | Warm HMAC credential cache, monotonic SC-Nonce, persistent session. |
| Signature verify | x-mm-perf-signature-verify-us | Use self-signed account actions where possible; keep signer sets warm. |
| Validation/routing | x-mm-perf-routing-validation-us and detailed routing headers | Single account, single shard, valid market, fresh timestamps, low fanout. |
| Risk/precheck | x-mm-perf-prepare-risk-approval-us, x-mm-perf-prepare-risk-apply-us | Keep quote shapes hotpath-safe; avoid full mode in quote loops. |
| WAL/enqueue | x-mm-perf-enqueue-us | Keep batches small, direct lane warm, avoid backlog and forced slow-lane syncs. |
| ACK wait | x-mm-perf-ack-resolution-us | Do not request applied, durable, or full on the quote loop. |
| Response | x-mm-perf-response-emit-us | Use summary only after stream/drop-copy reconciliation is proven. |
Set MM_DETAILED_PERF_RESPONSE_HEADERS=true only for diagnostics. The basic
hotpath headers are always enough for first-pass triage.
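For first-pass triage, the stage headers from the table can be ranked directly. This sketch assumes each header carries an integer number of microseconds as a decimal string; the short stage labels are hypothetical. Some stages may nest inside others, so the result is a pointer to where to look, not an exclusive attribution.

```python
from typing import Mapping, Optional, Tuple

# Header names are taken from the budget table; the short labels are illustrative.
STAGE_HEADERS = {
    "auth": "x-mm-perf-routing-auth-us",
    "signatureVerify": "x-mm-perf-signature-verify-us",
    "validation": "x-mm-perf-routing-validation-us",
    "riskApproval": "x-mm-perf-prepare-risk-approval-us",
    "riskApply": "x-mm-perf-prepare-risk-apply-us",
    "enqueue": "x-mm-perf-enqueue-us",
    "ackWait": "x-mm-perf-ack-resolution-us",
    "responseEmit": "x-mm-perf-response-emit-us",
}


def slowest_stage(headers: Mapping[str, str]) -> Optional[Tuple[str, int]]:
    """Return (stage, microseconds) for the largest reported stage, if any."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    stages = {}
    for stage, header in STAGE_HEADERS.items():
        raw = lowered.get(header)
        if raw is not None:
            stages[stage] = int(raw)
    if not stages:
        return None
    return max(stages.items(), key=lambda item: item[1])
```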
Batch shape
For sub-50 ms behavior, keep the quote batch boring:
- one engine account per batch,
- one market shard where possible,
- QuoteReplace/SpotQuoteReplace over separate cancel plus place calls,
- small batches inside the low-latency limit from GET /api/v1/bsl/limits,
- hotpath-safe order types and post-only quote legs where appropriate,
- no full result mode on the steady-state quote refresh path.
Read the live limits before the strategy starts:
GET /api/v1/bsl/limits
Important fields:
| Field | Meaning |
|---|---|
| ackModes.lowLatency | Current low-latency ACK boundary for the account/lane. |
| actions.lowLatencyMaxActions | Maximum batch size that stays in the low-latency class. |
| actions.singleAccountRequired | Whether a batch may mix engine accounts. |
| ingress.lowLatencyMaxInflight | Push-side concurrency cap for the low-latency lane. |
| backlog.ingressWalMaxPendingActions | WAL/backlog pressure signal for admission throttling. |
| nonceModel.actionNonce | Live action nonce window and recovery rule. |
| nonceModel.machineAuthNonce | SC-Nonce replay contract for HMAC/agent auth. |
| errorGuidance.nonceCodes | Machine-readable nonce reject recovery actions. |
| errorGuidance.requestCodes | Queue, risk, duplicate nonce, and reservation guidance. |
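A pre-submit check against these limits might look like the sketch below. It assumes the dotted field names map to nested JSON objects holding integers and booleans, which this page does not confirm; the function name and reason labels are hypothetical.

```python
from typing import List


def low_latency_violations(limits: dict, batch_accounts: List[str],
                           inflight: int) -> List[str]:
    """List reasons a batch would leave the low-latency class.

    batch_accounts holds the engine account of each action in the batch;
    inflight is the number of requests already outstanding on the lane.
    """
    reasons = []
    actions = limits["actions"]
    if len(batch_accounts) > actions["lowLatencyMaxActions"]:
        reasons.append("too_many_actions")
    if actions["singleAccountRequired"] and len(set(batch_accounts)) > 1:
        reasons.append("mixed_accounts")
    # Sending one more request at the cap would exceed the concurrency limit.
    if inflight >= limits["ingress"]["lowLatencyMaxInflight"]:
        reasons.append("inflight_cap")
    return reasons
```

An empty result means the batch shape is compatible with the limits read at startup; it does not replace server-side admission checks.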
Nonce rules
Do not mix nonce families:
| Nonce | Owner | Retry rule |
|---|---|---|
| SC-Nonce | HMAC credential / API key | Monotonic per key. Reuse is an auth replay and must not be retried. |
| Action payload.nonce | Engine account | Windowed admission. Re-sign with a fresh in-window nonce on stale/replay rejects. |
| Idempotency-Key | Client request | Same key and same body may replay the response; same key and different body conflicts. |
Changing only Idempotency-Key does not fix an action nonce error because the
action nonce is inside the signed payload.
Recommended action nonce flow:
- Read GET /api/v1/accounts/:engineAccount/bootstrap?fresh=true.
- Start from nonceFloor or nextNonce.
- Allocate unused in-window nonces locally.
- Persist nonce plus signed payload before submit.
- On nonce_below_floor, nonce_replayed, or nonce_outside_window, re-read bootstrap/account state and re-sign.
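The flow above can be sketched as a small allocator per engine account. Everything here is illustrative: the class and helper names are hypothetical, the window is modeled as the half-open range from the starting nonce, and the real window size comes from the account's live limits rather than from this code.

```python
from typing import Callable, List, Tuple

# Reject codes that require re-reading bootstrap state and re-signing.
RESIGN_CODES = {"nonce_below_floor", "nonce_replayed", "nonce_outside_window"}


class ActionNonceAllocator:
    """Hands out unused action nonces for one engine account."""

    def __init__(self, start: int, window: int):
        self.reset(start, window)

    def reset(self, start: int, window: int) -> None:
        # Called at startup and after any nonce reject, with fresh bootstrap values.
        self._next = start
        self._end = start + window  # exclusive upper bound (assumed window model)

    def allocate(self) -> int:
        if self._next >= self._end:
            raise RuntimeError("nonce window exhausted; re-read bootstrap")
        nonce = self._next
        self._next += 1
        return nonce


def prepare_submit(alloc: ActionNonceAllocator, sign: Callable[[int], str],
                   journal: List[Tuple[int, str]]) -> str:
    """Allocate a nonce, sign, and record both before the request is sent."""
    nonce = alloc.allocate()
    payload = sign(nonce)
    journal.append((nonce, payload))  # persist before the request leaves
    return payload


def must_resign(reject_code: str) -> bool:
    return reject_code in RESIGN_CODES
```

In production the journal would be durable storage rather than an in-memory list.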
QuoteReplace guarantees
QuoteReplace is the market-maker primitive. One signed parent action contains
one or more quote legs. Each replacement child order id is deterministic from
the signed parent payload and leg index.
Guarantees to rely on:
| Contract | Guarantee |
|---|---|
| Ordering | A leg's cancelOrderId is processed before that leg's replacement order. |
| Identity | derivedOrderIds[] is deterministic and should match SDK local derivation. |
| Admission | An ack response means the group reached the configured admission boundary. |
| Terminal truth | Final open, filled, canceled, or rejected state comes from drop-copy/private stream/receipts/account reads. |
Do not assume every replacement order is resting because the HTTP response is
200. A risk reject, market halt, post-only cross, cancel race, or terminal
engine reject can still appear after an admission ACK.
Reconciliation
Persist before each submit:
| Key | Why |
|---|---|
| Idempotency-Key | Retry correlation. |
| action nonce | Re-sign/replay diagnosis. |
| clientOrderId | Strategy identity. |
| expected derivedOrderIds[] | Cancel/replace and public-book verification. |
| response seqs[] | Receipt, stream, and replay gap-fill correlation. |
| market, book, side, price, qty | Local inventory/book model. |
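A journal record covering these keys could be written as one JSON line per submit. This is a sketch under assumptions: the record layout and function name are hypothetical, and price and quantity are kept as strings to avoid float rounding.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, TextIO


@dataclass
class SubmitRecord:
    idempotency_key: str
    action_nonce: int
    client_order_id: str
    expected_derived_order_ids: List[str]
    market: str
    book: str
    side: str
    price: str
    qty: str
    seqs: List[int] = field(default_factory=list)  # filled in from the response


def append_record(journal: TextIO, record: SubmitRecord) -> None:
    """Write one JSON line and flush it before the caller submits."""
    journal.write(json.dumps(asdict(record), sort_keys=True) + "\n")
    journal.flush()
```

With a real file, following the flush with `os.fsync(journal.fileno())` pushes the record to disk before the request goes out.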
Reconciliation order:
- Private stream or FIX/BSL drop-copy for terminal execution truth.
- /api/v1/bsl/accounts/:account/executions?fromSeq=... for replay/gap-fill.
- /api/v1/accounts/:engineAccount/orders and /fills for account-state confirmation.
- Public market snapshot/order book only to verify visible resting liquidity.
Public book visibility is not account truth. It can lag, aggregate levels, or omit account ownership. Use it to confirm quote visibility after private state has been reconciled.
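The precedence can be encoded so the public book is never mistaken for account truth. The source labels and function name below are hypothetical; only the ordering comes from the list above.

```python
from typing import Dict, Optional, Tuple

# Private sources in reconciliation order; the public book is deliberately absent.
ACCOUNT_TRUTH_SOURCES = ["drop_copy", "executions_replay", "account_state"]


def resolve_order_state(observations: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (source, state) from the highest-priority private source.

    observations maps a source label to the order state it reported.
    Returns None when only public-book data is available.
    """
    for source in ACCOUNT_TRUTH_SOURCES:
        state = observations.get(source)
        if state is not None:
            return source, state
    return None
```

A `None` result means the order state is still unknown to the strategy, whatever the public book shows.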
Local verification
Use these checks before handing a new endpoint bundle to a market maker:
cargo test -p senticore-sequencing-core bsl_order_entry_returns_contract_envelope_and_mode_headers -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core bsl_limits_route_exposes_business_line_contract -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core mm_action_batch_binary_quote_replace_applies_compact_action -- --nocapture --test-threads=1
cargo test -p senticore-sequencing-core external_gateway_ipc_accepts_atomic_group_as_one_batch_message -- --nocapture --test-threads=1
For latency work, run the dedicated exchange-side ACK gate and label the artifact with the route, host, profile, and result mode:
export RUNTIME_DATABASE_URL=postgres://...
export ENGINE_TRUTH_DATABASE_URL=$RUNTIME_DATABASE_URL
python3 backend/sequencer/scripts/run_exchange_h0_perf_baseline.py \
  --profile institutional_mm_hotpath_ack_smoke \
  --allow-dedicated \
  --artifact-dir backend/sequencer/runtime/perf/institutional-mm-hotpath-smoke
python3 backend/sequencer/scripts/run_exchange_h0_perf_baseline.py \
  --profile institutional_mm_hotpath_ack_50ms \
  --allow-dedicated \
  --artifact-dir backend/sequencer/runtime/perf/institutional-mm-hotpath
The smoke profile validates host wiring and artifact schema only. The 50ms gate
fails when exchange-side HTTP ACK p99 is above 50,000 us (50 ms). Add
--process-timeout-sec <seconds> while debugging a stuck host so the runner
writes a bounded failure artifact. Never compare a full or durable ACK run
against an ack quote-loop target.
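For reference, a p99 gate of this kind can be reproduced over raw samples. This is a sketch, not the runner's implementation: it uses the nearest-rank percentile definition, which is an assumption about how the gate computes p99.

```python
from typing import Sequence


def p99_us(samples: Sequence[int]) -> int:
    """Nearest-rank 99th percentile of ACK latencies in microseconds."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer form of ceil(0.99 * n), avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def gate_passes(samples: Sequence[int], limit_us: int = 50_000) -> bool:
    """The gate fails only when p99 is above the limit."""
    return p99_us(samples) <= limit_us
```

With 100 samples, a single outlier does not fail the gate, but two do, which is why small sample counts make the result noisy.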
In the JSON artifact, use requestAckExclusiveBreakdownUs for the first-pass
"where did the ACK time go?" attribution. It is non-overlapping. The detailed
requestBreakdownUs object includes nested routing and enqueue diagnostics and
must not be summed as if every field were exclusive.
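Because the exclusive breakdown is non-overlapping, its fields can be summed and ranked safely. The sketch below assumes the object is a flat mapping from stage name to microseconds at the top level of the artifact; the stage names in any example input are placeholders.

```python
from typing import List, Tuple


def top_contributors(artifact: dict, n: int = 3) -> List[Tuple[str, int, float]]:
    """Rank exclusive ACK stages as (name, microseconds, share of total)."""
    exclusive = artifact["requestAckExclusiveBreakdownUs"]
    total = sum(exclusive.values())
    ranked = sorted(exclusive.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, us, us / total if total else 0.0)
        for name, us in ranked[:n]
    ]
```

The same arithmetic applied to requestBreakdownUs would double-count nested routing and enqueue diagnostics.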
Read processTotal and processUninstrumented before optimizing the engine.
Those fields isolate the exchange process path. A run with low processTotal
and high handlerUninstrumented is pointing at HTTP/Axum/harness wrapper,
runtime scheduling, socket, or response decode pressure rather than direct
Engine/WAL/order-book work.
The stable-account benchmark profile uses multiple pre-seeded account slots per phase so load generation stays inside the engine action nonce replay window. That is a harness constraint, not a relaxation of production nonce rules: market-maker clients must still allocate action nonces per engine account and recover from nonce rejects by re-reading bootstrap/account state and re-signing.