sop

Scalable Objects Persistence


Project maintained by SharedCode

SOP Scalability & Performance Limits

SOP is designed to be a “Limitless” storage engine, constrained only by the physical hardware it runs on. By decoupling Coordination (Redis) from Storage (Filesystem/B-Trees), SOP lets each layer scale horizontally on its own.

1. Theoretical Capacity Limits

SOP uses a Registry to map Logical IDs (UUIDs) to Physical Locations. This registry is sharded into “Segment Files” on disk.

The Math: Capacity Per Segment

Using the maximum configuration constants and a realistic load factor:

1.1. Virtual IDs (Handles) Per Segment

Each Registry Segment File can address: \(750,000 \text{ blocks} \times 66 \text{ handles/block} = 49,500,000 \text{ Handles}\)

Result: A single 3GB Registry File (750,000 blocks × 4KB per block ≈ 3GB) can track 49.5 Million unique B-Tree Nodes.
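To make the addressing arithmetic concrete, here is a minimal sketch that maps a handle index to a segment, block, and slot. It assumes handles are laid out sequentially, which the text does not confirm; SOP's actual placement of Logical IDs may differ. The names `locate_handle` and `HandleAddress` are hypothetical, and the 4KB block size is inferred from the 3GB figure above.

```python
from typing import NamedTuple

BLOCKS_PER_SEGMENT = 750_000  # from the text
HANDLES_PER_BLOCK = 66        # from the text
BLOCK_SIZE_BYTES = 4096       # inferred: 3GB / 750,000 blocks is about 4KB
HANDLES_PER_SEGMENT = BLOCKS_PER_SEGMENT * HANDLES_PER_BLOCK


class HandleAddress(NamedTuple):
    segment: int      # which Registry Segment File
    block: int        # block index inside that segment
    slot: int         # handle slot inside that block
    byte_offset: int  # offset of the block's first byte in the segment file


def locate_handle(handle_index: int) -> HandleAddress:
    """Map a zero-based handle index to its address.

    Hypothetical sequential layout, used only to illustrate the arithmetic.
    """
    if handle_index < 0:
        raise ValueError("handle_index must be non-negative")
    segment, within_segment = divmod(handle_index, HANDLES_PER_SEGMENT)
    block, slot = divmod(within_segment, HANDLES_PER_BLOCK)
    return HandleAddress(segment, block, slot, block * BLOCK_SIZE_BYTES)
```

Under this assumed layout, the last handle of a segment lands in block 749,999, slot 65, and the next index rolls over to block 0 of the following segment.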

1.2. Total Items Per Segment

If each Handle represents one B-Tree Node, and each Node holds up to 20,000 items @ 68% load: \(49,500,000 \text{ Nodes} \times (20,000 \times 0.68) \text{ Items/Node} = 673,200,000,000 \text{ Items}\)

Result: A single Registry Segment can index 673.2 Billion Items.
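The two results above can be checked with a few lines of integer arithmetic. This is only a verification sketch; the constant and function names are illustrative and are not taken from SOP's source.

```python
BLOCKS_PER_SEGMENT = 750_000  # blocks per Registry Segment File
HANDLES_PER_BLOCK = 66        # handles stored in each block
SLOTS_PER_NODE = 20_000       # maximum items per B-Tree Node
LOAD_FACTOR_PERCENT = 68      # integer percent keeps the arithmetic exact


def handles_per_segment() -> int:
    """Handles (B-Tree Nodes) one segment can address."""
    return BLOCKS_PER_SEGMENT * HANDLES_PER_BLOCK


def items_per_node() -> int:
    """Expected items per node at the assumed load factor."""
    return SLOTS_PER_NODE * LOAD_FACTOR_PERCENT // 100


def items_per_segment() -> int:
    """Items one segment can index if every handle is one node."""
    return handles_per_segment() * items_per_node()


if __name__ == "__main__":
    print(f"{handles_per_segment():,} handles per segment")
    print(f"{items_per_node():,} items per node")
    print(f"{items_per_segment():,} items per segment")
```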

The Math: Horizontal Scaling

SOP automatically allocates new Registry Segments as needed. The design can handle a large number of segments, but the count is currently capped at 1,000 to keep the logic simple.

| Segments | Total Nodes (Handles) | Total Capacity (Items @ 20k Slots, 68% Load) |
|---|---|---|
| 1 | 49.5 Million | 673.2 Billion |
| 10 | 495 Million | 6.73 Trillion |
| 100 | 4.95 Billion | 67.3 Trillion |
| 1,000 | 49.5 Billion | 673.2 Trillion |

Conclusion: With just 1,000 registry segments (a manageable number of files for any modern filesystem), SOP can address over half a quadrillion items (673.2 Trillion). In practice, the limit is your disk space, which at this scale is measured in petabytes.
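The scaling figures are a straight multiplication of the per-segment numbers, so they can also be turned around to estimate how many segments a target item count would need. The sketch below assumes the per-segment figures derived earlier and the 1,000-segment cap; `capacity` and `segments_needed` are hypothetical helper names.

```python
from typing import Tuple

HANDLES_PER_SEGMENT = 49_500_000       # 750,000 blocks x 66 handles
ITEMS_PER_SEGMENT = 673_200_000_000    # 49.5M nodes x 13,600 items
MAX_SEGMENTS = 1_000                   # current cap stated in the text


def capacity(segments: int) -> Tuple[int, int]:
    """Return (total handles, total items) for a given segment count."""
    if not 1 <= segments <= MAX_SEGMENTS:
        raise ValueError(f"segments must be between 1 and {MAX_SEGMENTS}")
    return segments * HANDLES_PER_SEGMENT, segments * ITEMS_PER_SEGMENT


def segments_needed(items: int) -> int:
    """Smallest segment count whose item capacity covers `items`."""
    if items < 1:
        raise ValueError("items must be positive")
    needed = -(-items // ITEMS_PER_SEGMENT)  # ceiling division
    if needed > MAX_SEGMENTS:
        raise ValueError("exceeds the 1,000-segment cap")
    return needed


if __name__ == "__main__":
    for n in (1, 10, 100, 1_000):
        handles, items = capacity(n)
        print(f"{n:>5} segments: {handles:,} handles, {items:,} items")
```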


2. Throughput & IOPS (The “Speed Limit”)

Since SOP stores data on disk (DirectIO) and uses Redis only for lightweight coordination (locking and L2 caching), the throughput is defined by:

  1. Redis Cluster Performance (Coordination)
  2. Network Fabric (Bandwidth)
  3. Storage Backend (IOPS)
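One simple way to reason about these three factors is a bottleneck model, where the slowest layer sets the ceiling. The sketch below is an assumption made for illustration, not SOP's own performance model: it assumes each operation costs one coordination call, moves a fixed number of bytes over the network, and touches a fixed number of storage blocks. All parameter names are hypothetical, and any numbers passed in are placeholders rather than measurements.

```python
from typing import Tuple


def bottleneck(
    redis_ops_per_sec: int,
    network_bytes_per_sec: int,
    storage_iops: int,
    bytes_per_op: int = 4096,
    blocks_per_op: int = 1,
) -> Tuple[str, int]:
    """Return (limiting layer, operations per second) under a min() model."""
    limits = {
        # Assumption: one Redis call (e.g. a lock) per operation.
        "redis": redis_ops_per_sec,
        # Assumption: each operation transfers bytes_per_op over the network.
        "network": network_bytes_per_sec // bytes_per_op,
        # Assumption: each operation costs blocks_per_op storage I/Os.
        "storage": storage_iops // blocks_per_op,
    }
    layer = min(limits, key=limits.get)
    return layer, limits[layer]
```

For example, `bottleneck(1000, 4096 * 500, 2000)` reports the network as the limiting layer at 500 operations per second, because the other two layers could sustain more.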

2.1. Redis Cluster (Coordination Layer)

SOP uses Redis for Locking and L2 Caching. It does not store data in Redis.

2.2. Storage Backend (Data Layer)

SOP uses DirectIO (bypassing the OS page cache) to write 4KB blocks directly to disk.
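The fixed 4KB block size means every read and write targets an offset that is a multiple of 4,096 bytes. The sketch below illustrates only that block addressing, using ordinary buffered file I/O on a temporary file. It does not perform Direct I/O, which is platform-specific and needs aligned buffers, and it is not SOP's implementation; all function names are hypothetical.

```python
import tempfile
from typing import BinaryIO

BLOCK_SIZE = 4096  # 4KB blocks, as stated in the text


def pad_to_block(payload: bytes) -> bytes:
    """Zero-pad a payload so it fills exactly one block."""
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload does not fit in one block")
    return payload.ljust(BLOCK_SIZE, b"\x00")


def write_block(f: BinaryIO, block_index: int, payload: bytes) -> None:
    """Write one full block at the block-aligned offset."""
    f.seek(block_index * BLOCK_SIZE)
    f.write(pad_to_block(payload))


def read_block(f: BinaryIO, block_index: int) -> bytes:
    """Read one full block from the block-aligned offset."""
    f.seek(block_index * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def demo() -> bytes:
    """Round-trip a small payload through block 3 of a temporary file."""
    with tempfile.TemporaryFile() as f:
        write_block(f, 3, b"hello")
        return read_block(f, 3)
```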

3. Storage Architecture: Distribution & Normalization

SOP achieves its “Super Scaling” capabilities through a smart storage design that normalizes the database into manageable pieces rather than a single monolithic file.

3.1. Segmentation & Registry

3.2. Storage-Friendly Hierarchy (S3-Style)

SOP is designed to be “Super Friendly” to storage drives and filesystems.

4. The “SOP Edge”

Why SOP competes with the Giants:

  1. Independent Nodes: Each SOP instance (embedded in your app) talks directly to the storage. There is no “Database Server” middleman to become a bottleneck.
  2. Linear Scalability: To double your throughput, simply add more Application Nodes and more Redis Shards. The architecture has no intrinsic ceiling.
  3. Petabyte Scale: As calculated above, the addressing scheme supports hundreds of trillions of items. You will run out of physical hard drives long before you hit SOP’s architectural limits.

Summary: SOP turns your Hardware Limit into your Only Limit.

5. The Embedded Advantage: Zero-Admin Mode

SOP is not just for the data center; its architecture is uniquely suited for embedded systems and applications where a traditional server-based database is impossible.