Core Concepts

BSPump has a composable architecture: data flows through pipelines made of sources, processors, and sinks. The concepts on this page are the building blocks of every pipeline.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                        Application                          │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                      Pipeline                         │  │
│  │                                                       │  │
│  │   ┌────────┐   ┌───────────┐   ┌───────────┐   ┌────┐ │  │
│  │   │ Source │──▶│ Processor │──▶│ Processor │──▶│Sink│ │  │
│  │   └────────┘   └───────────┘   └───────────┘   └────┘ │  │
│  │                                                       │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │ Connection  │  │   Lookup    │  │   Trigger   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘

Key Components

Pipeline

The core abstraction that chains components together. Events flow from source through processors to the sink.

Source

Entry point for data. A source either pulls data by polling or receives data that is pushed to it, for example through a webhook. See Source.

Processor

Transforms, filters, or enriches events. Multiple processors can be chained together. See Processor.

Sink

Exit point for data. Sinks write events to external systems, files, or other destinations. See Sink.

Connection

Shared, reusable connections to external systems (databases, message queues, etc.). See Connection.

Lookup

Data enrichment tables that can be used to add context to events. See Lookup.

Trigger

Controls when sources produce events (cron schedules, pub/sub, etc.). See Trigger.

Event Flow

Events flow through the pipeline in a linear fashion:

  1. Source generates or receives an event

  2. Event passes through each Processor in order

  3. Each processor can transform the event, filter it out, or split it into several events

  4. Sink receives the final event and outputs it
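
The steps above can be sketched in plain Python. This is a conceptual illustration only, not the BSPump API: `run_pipeline`, the processor signature, and the convention that returning `None` drops an event while returning a list splits it are all assumptions made for this example.

```python
from typing import Any, Callable, Iterable, List, Optional

# Hypothetical processor signature for this sketch (not the BSPump API):
# take one event and return None to drop it, a list to split it,
# or a single value to pass it on.
Processor = Callable[[Any], Any]


def run_pipeline(source: Iterable[Any], processors: List[Processor]) -> List[Any]:
    """Push every source event through the processors in order and collect the sink output."""
    sink: List[Any] = []
    for event in source:
        batch = [event]
        for process in processors:
            next_batch = []
            for item in batch:
                result = process(item)
                if result is None:
                    # Filtered out: the event stops here.
                    continue
                if isinstance(result, list):
                    # Split: one event becomes several.
                    next_batch.extend(result)
                else:
                    # Transformed (or passed through unchanged).
                    next_batch.append(result)
            batch = next_batch
        sink.extend(batch)
    return sink


def decode(event: bytes) -> str:
    """Transform: raw bytes to text."""
    return event.decode("utf-8")


def drop_empty(event: str) -> Optional[str]:
    """Filter: discard whitespace-only events."""
    return event if event.strip() else None


def split_words(event: str) -> List[str]:
    """Split: one line becomes one event per word."""
    return event.split()


output = run_pipeline([b"hello world", b"   ", b"bspump"], [decode, drop_empty, split_words])
print(output)  # ['hello', 'world', 'bspump']
```

The order of the processors matters here: `drop_empty` only works because `decode` has already turned the bytes into text.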

Events can be any Python object. Common choices are:

  • Bytes (raw data)

  • Dictionaries (structured data)

  • Dataclasses or typed objects
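
As a rough illustration of the three event shapes listed above, the same record can be carried as bytes, as a dictionary, or as a dataclass instance. The `LoginEvent` class and its fields are hypothetical and exist only for this sketch; only the Python standard library is used.

```python
import json
from dataclasses import dataclass


@dataclass
class LoginEvent:
    """Hypothetical typed event, defined only for this example."""
    user: str
    success: bool


# Raw data, as it might arrive from a socket or a file.
raw: bytes = b'{"user": "alice", "success": true}'

# Structured data: json.loads accepts bytes directly.
record: dict = json.loads(raw)

# Typed object built from the dictionary.
typed = LoginEvent(**record)

print(record)  # {'user': 'alice', 'success': True}
print(typed)   # LoginEvent(user='alice', success=True)
```

A pipeline will often start with bytes at the source and move toward the structured or typed form as processors parse and validate the data.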

Async-First Design

BSPump is built on Python’s asyncio, enabling:

  • Non-blocking I/O operations

  • High concurrency on a single event loop, without a thread per connection

  • Efficient handling of many simultaneous connections

  • Natural integration with async libraries
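
The effect of non-blocking I/O can be seen with plain asyncio, independent of BSPump. In this minimal sketch, the hypothetical `fetch` coroutine uses `asyncio.sleep` as a stand-in for a network or disk call; three such calls run concurrently on one thread, so the total time is close to the longest single wait rather than the sum of all three.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a non-blocking I/O call; while this
    # coroutine waits, the event loop runs the other coroutines.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # gather schedules all three coroutines concurrently and returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(
        fetch("a", 0.1),
        fetch("b", 0.1),
        fetch("c", 0.1),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")  # roughly 0.1s, not 0.3s
```

The same principle lets a single process keep many connections open at once, as long as the code awaits I/O instead of blocking on it.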