Scalable Objects Persistence
This guide covers the operational aspects of running SOP in production, including failover handling, connection management, and backup strategies.
SOP relies on long-lived (pooled) connections to Redis and Cassandra (for incfs).
Note: If running in Standalone Mode (using sop.InMemory cache), Redis is not required, and this section can be ignored.
Note: SOP does not require Redis data persistence (RDB/AOF). Redis is used for ephemeral locking and caching. If Redis restarts, SOP detects the change and recovers safely.
For Redis, the go-redis client's redis.Options includes:
- PoolSize: Set according to your concurrency needs (e.g., 10 * CPU cores).
- MinIdleConns: Keep some connections warm to avoid latency spikes.
- ReadTimeout / WriteTimeout: Tune these to avoid premature timeouts during heavy load.

Cassandra (incfs):
- Use LOCAL_QUORUM for strong consistency.
- Use an appropriate replication strategy (e.g., NetworkTopologyStrategy) for your cluster.

SOP includes sophisticated logic to handle storage failures transparently.
Not all errors trigger a failover. SOP distinguishes between transient errors (retryable) and permanent hardware/filesystem failures.
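To make the distinction concrete, here is a minimal Go sketch of such a classifier. It assumes permanent failures surface as errno values (EIO, EROFS, ENOSPC) wrapped inside ordinary Go errors. The names failoverErrnos, isFailoverError, and samples are hypothetical and not part of SOP's API, and SOP's actual detection logic may differ.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"syscall"
)

// failoverErrnos lists errno values treated as permanent storage failures
// in this sketch. The set is an assumption for illustration only.
var failoverErrnos = []syscall.Errno{
	syscall.EIO,    // input/output error
	syscall.EROFS,  // read-only file system
	syscall.ENOSPC, // no space left on device
}

// isFailoverError reports whether err wraps one of the permanent errnos.
// errors.Is walks *fs.PathError and fmt.Errorf("%w") chains.
func isFailoverError(err error) bool {
	for _, errno := range failoverErrnos {
		if errors.Is(err, errno) {
			return true
		}
	}
	return false
}

// samples are made-up errors used only to exercise the classifier.
var samples = []error{
	&fs.PathError{Op: "write", Path: "/data/segment", Err: syscall.EIO},
	fmt.Errorf("flush failed: %w", syscall.ENOSPC),
	&fs.PathError{Op: "open", Path: "/data/segment", Err: syscall.ENOENT},
	errors.New("lock held by another transaction"),
}

func main() {
	for _, err := range samples {
		fmt.Printf("%v -> failover=%v\n", err, isFailoverError(err))
	}
}
```

Matching on errno values with errors.Is, rather than on error strings, keeps the check working even when the error has been wrapped several times on its way up the call stack.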
Triggers for Failover:
- syscall.EIO (Input/output error)
- syscall.EROFS (Read-only file system)
- syscall.ENOSPC (No space left on device)
- Filesystem corruption errors (e.g., EUCLEAN).

Behavior:
- Look for [Failover] tagged logs.
- Watch the sop_failover_count metric (if you have instrumented your application).

Backing up a hybrid system (incfs) requires coordination.
- Use nodetool snapshot to capture the state of the Cassandra keyspace.

SOP includes a powerful HTTP Server and Web UI that functions as a full database management suite. It allows you to:
- Use IndexSpecification to define and search on compound indexes with multiple fields and custom sort orders, giving you RDBMS-like power.

Note on Architecture: This tool is not a central database server. In SOP’s masterless architecture, this UI is simply another client node. You can run it locally on your laptop to manage a remote production cluster, or deploy it as a sidecar. It connects directly to the storage layer, respecting all ACID guarantees without introducing a central bottleneck. Each user managing data via this app participates in “swarm” computing, where changes are efficiently merged or rejected (if conflicting) with full ACID guarantees.
You can run the tool directly from the source:
# Point to your SOP registry folder
go run ./tools/httpserver -registry /path/to/your/sop/data
Access the UI at http://localhost:8080.
For more details (including IndexSpecification), see the SOP Data Manager Documentation.
When running SOP in Clustered Mode (using Redis + Disk Storage), it is critical to maintain synchronization between the persistent data on disk and the ephemeral locks/cache in Redis.
Recommended Practice:
- Use the provided functions (RemoveBtree, DeleteDatabase, or RemoveStore) to delete data.

Manual Deletion (Development Only): If you must delete the data files manually (e.g., during local development or a hard reset):
- Run FLUSHALL on your Redis instance immediately after deleting the files.
redis-cli flushall
Why this is necessary: Redis maintains locks and cached metadata with a default timeout (typically 15 minutes). If you delete the files but leave the Redis keys active, any new application instance you start will see “ghost” locks or stale metadata, preventing it from recreating the stores or acquiring locks until the timeout expires.
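For local development, the two manual steps can be tied together so that the Redis flush is never forgotten. The following Go sketch is one way to do that, under stated assumptions: resetDevData and flushWithRedisCLI are hypothetical helpers (not part of SOP), redis-cli is assumed to be on PATH and pointed at a development instance, and FLUSHALL wipes every key in that instance, so this is not something to run against shared or production Redis.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// resetDevData deletes the SOP data folder, then clears Redis so that no
// stale locks or cached metadata outlive the files. The flush step is
// passed in so it can be replaced (for example by a no-op in tests).
func resetDevData(dataDir string, flush func() error) error {
	// Guard against obviously dangerous arguments.
	if dataDir == "" || dataDir == "/" {
		return fmt.Errorf("refusing to delete %q", dataDir)
	}
	if err := os.RemoveAll(dataDir); err != nil {
		return fmt.Errorf("remove %s: %w", dataDir, err)
	}
	if err := flush(); err != nil {
		return fmt.Errorf("files removed but Redis flush failed: %w", err)
	}
	return nil
}

// flushWithRedisCLI shells out to `redis-cli flushall` using the CLI's
// default host and port.
func flushWithRedisCLI() error {
	out, err := exec.Command("redis-cli", "flushall").CombinedOutput()
	if err != nil {
		return fmt.Errorf("redis-cli flushall: %w (output: %s)", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: devreset /path/to/your/sop/data")
		return
	}
	if err := resetDevData(os.Args[1], flushWithRedisCLI); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Deleting the files first and flushing second mirrors the order described above. If the flush fails, the returned error says so explicitly, which signals that stale keys may still be present until they expire.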