BSPump Documentation

BSPump Data Pipeline

bspump is at the heart of bitswan4stream and part of the bitswan4telco family.

BSPump is an async-first stream processing framework for building data pipelines and automations in Python and Jupyter notebooks. It provides a powerful, composable architecture for ingesting, processing, and routing data at scale.

Getting Started

Install BSPump and build your first pipeline in minutes.

Getting Started
Core Concepts

Understand pipelines, sources, processors, sinks, and connections.

Core Concepts
Patterns

Learn common patterns: webhook-to-kafka, kafka processing, scheduled tasks.

Common Patterns
Integrations

Connect to Kafka, HTTP, PostgreSQL, MongoDB, Elasticsearch, and more.

Integrations

Key Features

BSPump Integrations
  • Async-First: Built on asyncio for high-performance, non-blocking I/O

  • Jupyter Integration: Develop and test pipelines interactively in notebooks

  • Composable Architecture: Mix and match sources, processors, and sinks

  • Production Ready: Battle-tested in telco and enterprise environments

  • Rich Integrations: Kafka, HTTP webhooks, databases, message queues

Architecture

BSPump Architecture

BSPump pipelines consist of:

  • Sources: Entry points for data (Kafka, HTTP webhooks, files, databases)

  • Processors: Transform, filter, and enrich events

  • Sinks: Output destinations (Kafka, databases, APIs, files)

  • Connections: Shared, reusable connections to external systems

Quick Example

from bspump.jupyter import *
import bspump.kafka

@register_connection
def connection(app):
    return bspump.kafka.KafkaConnection(app, "KafkaConnection")

auto_pipeline(
    source=lambda app, pipeline: bspump.kafka.KafkaSource(
        app, pipeline, connection="KafkaConnection"
    ),
    sink=lambda app, pipeline: bspump.kafka.KafkaSink(
        app, pipeline, connection="KafkaConnection"
    ),
    name="ProcessingPipeline",
)

# Process events in notebook cells
event = json.loads(event.decode("utf8"))
event["processed"] = True
event = json.dumps(event).encode()

Indices and tables