sop

Scalable Objects Persistence


Project maintained by SharedCode. Hosted on GitHub Pages.

SOP Workflows: From Local Dev to Enterprise Swarm

SOP is designed to adapt to your project’s lifecycle. Whether you are a solo developer building a knowledge base or an architect designing a financial system, SOP offers flexible workflows that scale with you.

Here are the common implementation patterns.

1. The infs Path: File System Simplicity

The infs package uses the file system as the storage backend. While it works perfectly on a local disk for development, its true power is unlocked when using Network Attached Storage (NAS), S3-mounted drives, or Cloud Volumes. This allows your data to scale far beyond the limits of a single machine’s local disk.

Scenario A: The Seamless Scale-Up

Ideal for: Startups, internal tools, and applications that need to start simple but grow big.

  1. Develop Locally (Standalone Mode)
    • Configure SOP to use Standalone Mode.
    • Target a local folder or a mounted network drive.
    • Benefit: Zero dependencies. No Redis to install. You can code, test, and debug on a plane without internet.
    • Tip: Use the SOP Data Manager (go run ./tools/httpserver) to visually inspect and manage your local data as you build.
  2. Release to Production (Clustered Mode)
    • Mount that same network drive (or share the S3 bucket) to your production servers.
    • Switch the configuration to Clustered Mode and point it to a Redis instance.
    • Benefit: Your application instantly gains distributed coordination, locking, and caching. Multiple nodes can now read and write to the same data safely.
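The two steps above boil down to a single configuration flip. The sketch below illustrates the idea in Go; note that `Config`, `ModeStandalone`, `ModeClustered`, and `Validate` are hypothetical names invented for illustration here, not SOP's actual infs API:

```go
package main

import (
	"errors"
	"fmt"
)

// Mode selects how access to the shared store is coordinated.
// These identifiers are illustrative, not SOP's real ones.
type Mode int

const (
	ModeStandalone Mode = iota // single process, zero external dependencies
	ModeClustered              // multiple nodes, Redis-backed locking and caching
)

// Config is a hypothetical stand-in for an infs configuration.
type Config struct {
	Mode      Mode
	StorePath string // local folder, NAS mount, or S3-mounted drive
	RedisAddr string // required only in Clustered Mode
}

// Validate enforces the one rule the scale-up introduces:
// Clustered Mode needs a Redis instance to coordinate through.
func (c Config) Validate() error {
	if c.StorePath == "" {
		return errors.New("store path is required")
	}
	if c.Mode == ModeClustered && c.RedisAddr == "" {
		return errors.New("clustered mode requires a Redis address")
	}
	return nil
}

func main() {
	// 1. Develop locally: zero dependencies, point at a local folder.
	dev := Config{Mode: ModeStandalone, StorePath: "/data/myapp"}

	// 2. Release to production: same data path (now a shared mount),
	//    switch the mode and add Redis for distributed coordination.
	prod := dev
	prod.Mode = ModeClustered
	prod.RedisAddr = "redis:6379"

	fmt.Println(dev.Validate(), prod.Validate())
}
```

The point of the sketch: the data path itself never changes between development and production, which is what makes the scale-up seamless.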

Scenario B: The “Build-Once, Read-Many” Engine

Ideal for: Knowledge Bases, AI RAG Stores, Static Content Delivery, Configuration Management.

  1. Build Phase
    • A “Builder” process runs in Standalone Mode to generate or update the dataset (e.g., ingesting thousands of documents into a Vector Store).
  2. Serve Phase
    • Deploy the dataset to your production cluster.
    • Configure the reader applications to use NoCheck Transactions.
    • Benefit: Since the data is static (or rarely changes), NoCheck skips the overhead of conflict detection and version tracking. You get raw, unbridled read speeds—perfect for high-traffic read endpoints.
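To make that trade-off concrete, here is a minimal Go sketch of why skipping conflict detection pays off on the read path. `TxMode`, `TxCheck`, and `TxNoCheck` are illustrative names invented for this sketch, not SOP's real transaction options:

```go
package main

import "fmt"

// TxMode is an illustrative stand-in for a transaction option.
type TxMode int

const (
	TxCheck   TxMode = iota // full conflict detection and version tracking
	TxNoCheck               // skip both: safe when the dataset is static
)

type item struct {
	key     string
	version int
}

// read models the read path: a checked transaction records each item's
// version so conflicts can be detected at commit time; NoCheck skips
// that bookkeeping entirely, which is where the read speedup comes from.
func read(mode TxMode, it item, tracked map[string]int) string {
	if mode == TxCheck {
		tracked[it.key] = it.version
	}
	return it.key
}

func main() {
	tracked := map[string]int{}
	read(TxNoCheck, item{key: "doc-1", version: 7}, tracked)
	fmt.Println("versions tracked with NoCheck:", len(tracked))
}
```

Because nothing is tracked, there is also nothing to validate at commit time, so each read is a straight fetch.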

Scenario C: General Purpose Enterprise Data

Ideal for: Financial Systems, Inventory Management, User Data, Transactional Logs.

Scenario D: Swarm Computing

Ideal for: Massive ETL jobs, Scientific Computing, Global-Scale Data Processing.

Scenario E: The Multimedia Library (Smart Blob Store)

Ideal for: Video Streaming Services, Digital Asset Management, Medical Imaging Archives.

Scenario F: Search Engines & AI Pipelines

Ideal for: Text Search, RAG (Retrieval-Augmented Generation), LLM Memory.

Scenario G: Desktop Publishing & Read-Only Distribution

Ideal for: E-Books, Legal Archives, Offline Encyclopedias, Shared Network Libraries.

Scenario H: The Database Engine Construction Kit

Ideal for: Database Developers, Custom Query Languages, Specialized RDBMS Makers.

Scenario I: The Edge-to-Cloud Continuum

Ideal for: IoT Fleets, Connected Cars, Medical Devices, Smart Sensors.

Scenario J: Data Sovereignty & Compliance

Ideal for: GDPR, HIPAA, Government, Multi-Region Clouds.

Scenario K: DevOps, Testing & Release Management

Ideal for: CI/CD Pipelines, Integration Testing, QA, Environment Promotion.

Scenario L: Big Data & Analytics

Ideal for: Log Management, Audit Trails, IoT Telemetry, Large-Scale Document Stores.


2. The incfs Path: Cassandra Supercharged

The incfs package allows you to use Apache Cassandra as the registry backend while keeping data blobs on the file system.

Scenario: The Legacy Upgrade

Ideal for: Teams with existing or planned Cassandra infrastructure who need stronger consistency.


3. Universal Interoperability

SOP is designed to be the universal storage layer for your entire stack, regardless of language or platform.