go: keepalive, exponential backoff, chain_id metadata, durability guarantees
Three related fixes that turn the go template into a client that survives the full matrix of server restart, client restart, network blip, half-open TCP, and long outages (hours to months), without the user writing a line of reconnect logic in process.go.

1. gRPC keepalive: Time=10s, Timeout=3s, PermitWithoutStream=true. Half-open TCP (silent server restart, resumed laptop, NAT drop) is detected within ~13s: a ping goes out after 10s of inactivity and the connection is closed if no ack arrives within 3s. Previously the OS TCP keepalive took ~2h to notice, leaving the client as a ghost stream while Prime logged "no active gRPC connection" for every skipped transaction.

2. Exponential backoff with jitter on reconnect.

     delay = min(max_backoff_seconds, reconnect_delay_seconds * 2^attempts) + random(0, reconnect_delay_seconds)

   The attempts counter resets after any session that runs healthy for 60+ seconds. Jitter desynchronises clients so a server restart doesn't trigger a thundering herd. New max_backoff_seconds config field, default 120.

3. Unified error signalling: the sender goroutine now tears down the stream's context when it hits a Send error. Previously only Recv errors triggered a reconnect, so a stale stream where only Send was broken could sit there indefinitely.

Also: chain_id is now a required config field and is sent in the x-chain-id gRPC metadata header alongside x-api-key and x-smart-contract-id. Prime rejects streams without it with "missing chain ID", which was silently breaking every template-based client until users discovered it the hard way.

README documents the durability contract so contract authors know they don't have to reimplement any of it.
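The error-signalling change in item 3 can be pictured with a small stand-alone sketch. This is not the template's code: `runSession`, `demoSession`, the `send` callback and the `outbox` channel are hypothetical stand-ins for the real gRPC stream, and the blocking wait on `ctx.Done()` stands in for the Recv loop. It only shows the pattern of a sender goroutine cancelling the shared context so the whole session ends on a Send error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// runSession is a hypothetical stand-in for one stream session.
// It returns when the session context is cancelled, either by the
// parent or by the sender goroutine after a failed send.
func runSession(parent context.Context, send func(string) error, outbox <-chan string) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	// Buffered so the sender can record its error without blocking.
	sendErr := make(chan error, 1)

	go func() {
		for {
			select {
			case <-ctx.Done():
				return
			case msg, ok := <-outbox:
				if !ok {
					return
				}
				if err := send(msg); err != nil {
					sendErr <- fmt.Errorf("send failed: %w", err)
					cancel() // tear down the whole session, not just the sender
					return
				}
			}
		}
	}()

	// Stand-in for the Recv loop: a real receive on a cancelled stream
	// context would return an error at this point.
	<-ctx.Done()
	select {
	case err := <-sendErr:
		return err
	default:
		return ctx.Err()
	}
}

// demoSession runs one session whose second send fails.
func demoSession() error {
	outbox := make(chan string, 2)
	outbox <- "result-1"
	outbox <- "result-2"

	calls := 0
	send := func(msg string) error {
		calls++
		if calls == 2 {
			return errors.New("broken pipe")
		}
		return nil
	}
	return runSession(context.Background(), send, outbox)
}

func main() {
	fmt.Println("session ended:", demoSession())
}
```

Because the error is written to the buffered channel before `cancel()` is called, the caller always sees the send error once the context is done, and can treat it exactly like a Recv error when deciding to reconnect.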
16
go/README.md
@@ -43,6 +43,7 @@ make tools
 4. **Configure your connection** by editing `config.yaml`:
 ```yaml
 server_address: "your-dragonchain-server:50051"
+chain_id: "your-chain-public-id"
 smart_contract_id: "your-smart-contract-id"
 api_key: "your-api-key"
 ```
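As a rough illustration of what "required" means for the new `chain_id` field, the sketch below validates the four connection fields and builds the three metadata headers named in the commit message. The `Config` struct, `Validate`, and `StreamHeaders` are hypothetical names chosen for this example and may not match the template's `main.go`; the headers are returned as a plain map rather than real gRPC metadata so the example stays stdlib-only.

```go
package main

import "fmt"

// Config holds the connection fields from config.yaml.
// The struct and field names are assumptions made for this sketch.
type Config struct {
	ServerAddress   string
	ChainID         string
	SmartContractID string
	APIKey          string
}

// Validate reports the first required field that is empty.
func (c Config) Validate() error {
	required := []struct{ key, value string }{
		{"server_address", c.ServerAddress},
		{"chain_id", c.ChainID},
		{"smart_contract_id", c.SmartContractID},
		{"api_key", c.APIKey},
	}
	for _, f := range required {
		if f.value == "" {
			return fmt.Errorf("config: %s is required", f.key)
		}
	}
	return nil
}

// StreamHeaders returns the key/value pairs that would be attached
// to the stream as gRPC metadata.
func (c Config) StreamHeaders() map[string]string {
	return map[string]string{
		"x-chain-id":          c.ChainID,
		"x-api-key":           c.APIKey,
		"x-smart-contract-id": c.SmartContractID,
	}
}

func main() {
	cfg := Config{
		ServerAddress:   "your-dragonchain-server:50051",
		SmartContractID: "your-smart-contract-id",
		APIKey:          "your-api-key",
	}
	// chain_id is missing, so validation fails before any stream is opened.
	if err := cfg.Validate(); err != nil {
		fmt.Println(err)
	}

	cfg.ChainID = "your-chain-public-id"
	if err := cfg.Validate(); err == nil {
		fmt.Println(cfg.StreamHeaders()["x-chain-id"])
	}
}
```

Failing at startup with a named field is easier to diagnose than a stream that the server rejects with "missing chain ID" after connecting.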
@@ -60,13 +61,24 @@ make tools
 | Field | Description | Default |
 |-------|-------------|---------|
 | `server_address` | gRPC server address | Required |
+| `chain_id` | Public chain id the SC is registered on (sent as `x-chain-id` metadata) | Required |
 | `smart_contract_id` | Your smart contract ID | Required |
 | `api_key` | API key for authentication | Required |
 | `use_tls` | Enable TLS encryption | `false` |
 | `tls_cert_path` | Path to TLS certificate | - |
 | `num_workers` | Concurrent transaction processors | `10` |
-| `reconnect_delay_seconds` | Delay between reconnection attempts | `5` |
-| `max_reconnect_attempts` | Max reconnect attempts (0 = infinite) | `0` |
+| `reconnect_delay_seconds` | Base delay for exponential backoff between reconnect attempts | `3` |
+| `max_backoff_seconds` | Ceiling for the exponential backoff | `120` |
+| `max_reconnect_attempts` | Max reconnect attempts (0 = infinite, recommended) | `0` |

+## Durability guarantees (provided by `main.go`, no work for you)
+
+- **Server restart, update, crash, or network blip** → the client auto-reconnects and resumes processing. Transactions observed while the stream was down stay queued on the Dragonchain Prime side and are delivered (oldest first) on reconnect.
+- **Client restart or long outage** → when this process comes back up (minutes, hours, months later), it rejoins the stream and prime re-delivers every still-pending transaction that should have invoked it.
+- **Half-open TCP** (silent peer, resumed laptop, corporate NAT dropping idle flows) is detected within ~13 seconds via gRPC keepalive and triggers a reconnect. No dangling ghost streams.
+- **Reconnect storms** are avoided: exponential backoff with jitter means many clients reconnecting after a server restart don't all slam `accept()` at the same instant. The timer resets after a stream has been healthy for 60 seconds.
+
+These are invariants of the template; you do not add any of this in `process.go`.
+
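The backoff rule from the commit message and the 60-second reset from the durability section can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `main.go`: `backoffDelay`, `nextAttempts`, and `healthyAfter` are hypothetical names, and whether the real client increments the counter before or after computing the delay is not stated in the source.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// healthyAfter is the session length after which the attempts counter
// resets (60 seconds, per the commit message).
const healthyAfter = 60 * time.Second

// backoffDelay computes
//   min(maxSeconds, baseSeconds * 2^attempts) + random(0, baseSeconds)
// Doubling in a loop that stops at the ceiling avoids overflow for
// large attempt counts.
func backoffDelay(baseSeconds, maxSeconds, attempts int) time.Duration {
	base := time.Duration(baseSeconds) * time.Second
	ceiling := time.Duration(maxSeconds) * time.Second
	if base <= 0 {
		return 0
	}
	delay := base
	for i := 0; i < attempts && delay < ceiling; i++ {
		delay *= 2
	}
	if delay > ceiling {
		delay = ceiling
	}
	// Jitter is uniform in [0, base) so clients spread out.
	jitter := time.Duration(rand.Int63n(int64(base)))
	return delay + jitter
}

// nextAttempts returns the counter to use after a session ends:
// zero if the session was healthy long enough, otherwise one more.
func nextAttempts(attempts int, sessionLength time.Duration) int {
	if sessionLength >= healthyAfter {
		return 0
	}
	return attempts + 1
}

func main() {
	// Defaults from the README table: base 3s, ceiling 120s.
	for attempts := 0; attempts <= 7; attempts++ {
		d := backoffDelay(3, 120, attempts).Round(time.Millisecond)
		fmt.Printf("attempts=%d delay=%v\n", attempts, d)
	}
	fmt.Println(nextAttempts(5, 90*time.Second)) // healthy session: counter resets
	fmt.Println(nextAttempts(5, 10*time.Second)) // short session: counter grows
}
```

With the defaults of 3 and 120, the delay before jitter runs 3, 6, 12, 24, 48, 96 seconds and then stays at 120; the jitter adds up to 3 more seconds on top of each value.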
## Implementing Your Smart Contract