MOVED
Data Storage Overview
Overview
The Leverege IoT Stack uses multiple specialized databases to optimize for different data access patterns. Device data flows through several storage layers, each designed for specific query characteristics—real-time lookups, short-term historical queries, and long-term analytics.
Database Architecture
A suite of transponders handles database writes. The exact set is determined by the devops team and the cluster setup, but it commonly includes rt, tsdb, and bq. Each transponder listens to the writer topic on Pub/Sub and is responsible for updating its corresponding database. In the common setup, the rt transponder updates Elasticsearch and Firebase, the tsdb transponder keeps TimescaleDB up to date, and the bq transponder keeps BigQuery up to date.
Data Flow Diagram
Hardware Message → Transponder → ┬→ Firebase (realtime state)
                                 ├→ Elasticsearch (searchable state)
                                 ├→ TimescaleDB (short-term history)
                                 └→ BigQuery (long-term history)
Databases by Purpose
Real-Time State
Device current state is stored in two locations that serve different access patterns:
Firebase Realtime Database - JSON document store optimized for real-time sync to client applications. Changes propagate to connected clients instantly. Limited search capabilities.
Elasticsearch - Stores device header and rt state side-by-side. Supports complex queries across all device attributes.
Historical Data
Historical data is stored as diffs of device state over time for efficient storage, and in a dense device state snapshot form (dense history) for full device state reconstruction at any point in time.
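The relationship between the diff form and the dense form can be illustrated with a minimal sketch. It assumes device state is a flat dictionary of attributes and that timestamps are plain integers; the function names `diff` and `reconstruct` are hypothetical and are not part of the platform API.

```python
from typing import Any

State = dict[str, Any]


def diff(prev: State, curr: State) -> State:
    """Return only the attributes that are new or changed (the diff form).

    Assumption: removed attributes are ignored in this sketch.
    """
    return {k: v for k, v in curr.items() if k not in prev or prev[k] != v}


def reconstruct(diffs: list[tuple[int, State]], at: int) -> State:
    """Replay diffs in time order up to `at` to rebuild the full (dense) state."""
    state: State = {}
    for ts, change in sorted(diffs, key=lambda d: d[0]):
        if ts > at:
            break
        state.update(change)
    return state


# Three full device states arriving over time (illustrative values only).
updates = [
    (100, {"lat": 38.9, "lon": -77.0, "battery": 90}),
    (200, {"lat": 38.9, "lon": -77.1, "battery": 90}),
    (300, {"lat": 38.9, "lon": -77.1, "battery": 85}),
]

# Build the diff history: each entry holds only what changed.
sparse_history: list[tuple[int, State]] = []
previous: State = {}
for ts, state in updates:
    sparse_history.append((ts, diff(previous, state)))
    previous = state

# Rebuild the full state as it was at time 250.
state_at_250 = reconstruct(sparse_history, at=250)
```

In this sketch the second and third entries of `sparse_history` each hold a single attribute, while `reconstruct` has to replay every earlier diff to answer a point-in-time question. A dense snapshot avoids that replay at the cost of storing the whole state on every change.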
TimescaleDB (PostgreSQL extension) - Short-term historical storage, typically the last 30 days. Self-managed service optimized for time-series queries with fast response times. No per-query cost.
BigQuery (Google Data Warehouse) - Long-term historical storage for all data indefinitely (supports 10+ years). Optimized for analytics workloads on massive datasets. Pay-per-query model.
Query Routing: The platform automatically routes historical queries to the appropriate database based on the time range requested: recent data comes from TimescaleDB, older data from BigQuery. This can be overridden by specifying a “source” field in the query.
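The routing rule described above can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the 30-day window is taken from "typically the last 30 days", the return values `"timescale"` and `"bigquery"` are hypothetical labels, and routing on the start of the requested range is a guess, since the text does not say how ranges that span both stores are handled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: short-term storage covers the last 30 days.
SHORT_TERM_WINDOW = timedelta(days=30)


def choose_source(start: datetime, now: datetime, source: Optional[str] = None) -> str:
    """Pick the database for a historical query.

    An explicit `source` always wins, mirroring the override described
    in the text. Otherwise the start of the requested range decides.
    """
    if source is not None:
        return source
    if start >= now - SHORT_TERM_WINDOW:
        return "timescale"  # hypothetical label for TimescaleDB
    return "bigquery"       # hypothetical label for BigQuery


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
last_week = choose_source(now - timedelta(days=7), now)
last_quarter = choose_source(now - timedelta(days=90), now)
forced = choose_source(now - timedelta(days=7), now, source="bigquery")
```

Here `last_week` resolves to the short-term store and `last_quarter` to the long-term store, while `forced` shows the override taking precedence over the time range.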
Models
PostgreSQL (Models DB) - Stores platform configuration: blueprints, attributes, systems, devices, users, roles, permissions, rules, templates, and other Architect-configured entities.
Caching & Coordination
Redis - In-memory data store used for:
  Caching - Frequently accessed data (blueprints, devices, API access tokens)
  Rate limiting - Login attempt tracking, API rate limits
  Distributed locks - Coordination between service instances
Key Services
Transponder
Transponder writes device data to storage. It runs as multiple instances, each configured for different destinations:
transponder-rt - Writes to Firebase and Elasticsearch (real-time state)
transponder-bq - Writes to BigQuery (long-term history)
transponder-postgres - Writes to TimescaleDB (short-term history)
Data is written to all configured stores simultaneously, ensuring consistency across databases.
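The fan-out from one message to several stores can be sketched with in-memory stand-ins. The instance names come from the list above; the store keys, the `publish` and `handle_message` functions, and the message shape are hypothetical and exist only for illustration. A real deployment would use actual database clients and a message subscription instead of a loop.

```python
from collections import defaultdict

# Which stores each transponder instance writes to (from the list above).
DESTINATIONS: dict[str, list[str]] = {
    "transponder-rt": ["firebase", "elasticsearch"],
    "transponder-bq": ["bigquery"],
    "transponder-postgres": ["timescaledb"],
}

# In-memory stand-ins for the real databases.
stores: dict[str, list[dict]] = defaultdict(list)


def handle_message(instance: str, message: dict) -> None:
    """One transponder instance writes the message to each of its stores."""
    for destination in DESTINATIONS[instance]:
        stores[destination].append(message)


def publish(message: dict) -> None:
    """Stand-in for the writer topic: every instance receives every message."""
    for instance in DESTINATIONS:
        handle_message(instance, message)


# Hypothetical device message.
publish({"deviceId": "dev-1", "temperature": 21.5})
```

After the single `publish` call, each of the four stand-in stores holds one copy of the message, which is the behavior the list above describes.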
DB Curator
DB Curator is a maintenance service that removes old or stale data after configured retention periods. It runs cleanup loops once per day for each database type (Firebase, PostgreSQL, etc.).
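A single cleanup pass of this kind can be sketched as a filter over timestamped records. The record shape, the `purge_expired` name, and the 30-day retention value are assumptions made for the example; the actual retention periods are whatever the deployment configures.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records: list[dict], retention: timedelta, now: datetime) -> list[dict]:
    """Keep only records whose timestamp is inside the retention period."""
    cutoff = now - retention
    return [r for r in records if r["timestamp"] >= cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
records = [
    {"id": "a", "timestamp": now - timedelta(days=45)},
    {"id": "b", "timestamp": now - timedelta(days=10)},
    {"id": "c", "timestamp": now - timedelta(hours=1)},
]

# Hypothetical 30-day retention period for this store.
kept = purge_expired(records, retention=timedelta(days=30), now=now)
```

With these inputs, record `a` falls outside the retention period and is dropped, while `b` and `c` are kept. A daily loop would run a pass like this once per store.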
Data Forwarding (a specialized part of Transponder)
Data forwarding is a mechanism to automatically copy data from one device to a related device. This is useful when devices are closely linked—for example, a locator inside a vehicle.
Benefits:
Single lookup - Access all related data without making multiple queries
Historical continuity - Vehicle retains location history even when locators are swapped
Auditability - Track the history of constructed objects across hardware changes
Example: When a locator is paired to a vehicle, locator data is “forwarded” to the vehicle. The vehicle’s history includes location data from any locator that was paired to it over time.
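The locator and vehicle example can be sketched with in-memory dictionaries. Everything here is hypothetical: the `pair`, `unpair`, and `record` functions, the `forwardedFrom` field, and the device IDs are illustrative names, not the platform's data model.

```python
from collections import defaultdict

pairings: dict[str, str] = {}                          # locator id -> vehicle id
history: dict[str, list[dict]] = defaultdict(list)     # device id -> stored records


def pair(locator_id: str, vehicle_id: str) -> None:
    pairings[locator_id] = vehicle_id


def unpair(locator_id: str) -> None:
    pairings.pop(locator_id, None)


def record(device_id: str, data: dict) -> None:
    """Store data for the device and forward a copy to its paired device."""
    history[device_id].append(data)
    target = pairings.get(device_id)
    if target is not None:
        # Tag the copy so the vehicle's history shows where it came from.
        history[target].append({**data, "forwardedFrom": device_id})


# Locator A reports while paired, then is swapped for locator B.
pair("locator-A", "vehicle-1")
record("locator-A", {"lat": 38.90, "lon": -77.03})
unpair("locator-A")
pair("locator-B", "vehicle-1")
record("locator-B", {"lat": 38.95, "lon": -77.10})

vehicle_sources = [entry["forwardedFrom"] for entry in history["vehicle-1"]]
```

The vehicle ends up with one continuous location history that spans both locators, and the source tag on each record preserves which hardware produced it.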
Storage Formats
Historical data is stored in two formats, controlled by platform configuration:
Sparse format - Only stores values when they change. Efficient for data that updates infrequently.
Dense format - Stores complete device state on every change. More expensive to query and computationally expensive to maintain, but more complete.
Cost Optimization Notes
The multi-database architecture balances performance against cost:
TimescaleDB has no per-query cost but requires infrastructure management
BigQuery charges per data scanned, so very large queries (e.g., 3 months of data) can be expensive
Most applications query recent data frequently (served by TimescaleDB) and historical data rarely (served by BigQuery)
This document provides a developer-focused overview of the Leverege data storage architecture. For Architect configuration options, see the Stack Documentation.