CredBit.com

Goldsky’s Streaming-First Architecture for Blockchain Data with Flink, Redpanda and Kubernetes

October 30, 2023 (3 Mins Read)

Goldsky has built a platform for real-time processing of blockchain data. It lets clients extract data from blockchains into their own databases to support product features without running the data pipeline infrastructure themselves. Goldsky’s event-driven architecture (EDA) is built on Apache Flink, Redpanda, Kubernetes, and cloud provider services.

Goldsky’s platform provides blockchain indexing, subgraphs, and data streaming pipelines for developers building dApps (decentralized applications), who may not be versed in data engineering or familiar with key technologies such as Apache Kafka or Apache Flink.

Yaroslav Tkachenko, principal software engineer at Goldsky, talks about data engineering for blockchain applications:

There has been a paradigm shift in the industry recently where people are realizing that you can use data platform technology that was previously used by internal analytics teams to power customer-facing features. The sort of data pipelines that previously only supported reporting and dashboards are now supporting web application functionality.

Goldsky’s platform is split into control plane and data plane components. The control plane exposes configuration management APIs for defining data processing pipelines, including blockchain data sources, client database sinks, access credential secrets, and other configuration options. UI and CLI applications use these APIs to let clients configure pipelines. The data plane executes the configured pipelines: it pulls raw data from source blockchains, transforms it, and inserts it into client data-store sinks.
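
To make the control plane's role concrete, here is a minimal Python sketch of a pipeline definition and the kind of validation a configuration API might run before handing the pipeline to the data plane. The `PipelineSpec` class, its field names, and the `validate` helper are hypothetical and are not Goldsky's actual API; only the list of sink types is taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical pipeline definition a control plane might store."""
    name: str
    source: str              # illustrative blockchain dataset identifier
    sink_type: str           # e.g. "postgresql"
    secret_name: str         # reference to stored credentials, never the secret itself
    transform_sql: str = ""  # optional SQL transformation


# Sink types named in the article, written as lowercase identifiers (an assumption).
SUPPORTED_SINKS = {"postgresql", "s3", "elasticsearch", "clickhouse", "rockset", "kafka"}


def validate(spec: PipelineSpec) -> list[str]:
    """Return a list of validation errors; an empty list means the spec is acceptable."""
    errors = []
    if not spec.name:
        errors.append("name is required")
    if not spec.source:
        errors.append("source is required")
    if spec.sink_type not in SUPPORTED_SINKS:
        errors.append(f"unsupported sink type: {spec.sink_type}")
    if not spec.secret_name:
        errors.append("secret_name is required")
    return errors


spec = PipelineSpec(
    name="erc20-transfers",
    source="ethereum.logs",
    sink_type="postgresql",
    secret_name="customer-db-credentials",
)
print(validate(spec))  # prints []
```

Storing only a secret name in the spec, rather than the credential itself, mirrors the separation the article describes between pipeline configuration and access credential secrets.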

Goldsky’s Streaming Data Architecture (Source: Redpanda Technology Blog)

Goldsky supports two ways of extracting data from blockchain sources. Direct indexing is based on the Ethereum ETL (extract, transform, load) project and works by connecting directly to blockchain nodes and extracting low-level data such as logs and transactions. Subgraphs, on the other hand, rely on processing smart contract event telemetry using simple TypeScript applications.
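
As an illustration of what extracting logs from a node can look like, the Python sketch below builds a request for the standard Ethereum JSON-RPC method `eth_getLogs` and flattens one returned log entry into a record. It assumes a node that exposes the standard JSON-RPC API; the helper names, the output field names, and the sample values are made up for the example, and this is not necessarily how Ethereum ETL or Goldsky implement extraction. No network call is made.

```python
import json


def build_get_logs_request(from_block: int, to_block: int, address: str, request_id: int = 1) -> str:
    """Build an eth_getLogs JSON-RPC request body for a block range."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "fromBlock": hex(from_block),  # JSON-RPC encodes quantities as hex strings
            "toBlock": hex(to_block),
            "address": address,
        }],
    }
    return json.dumps(payload)


def flatten_log(raw: dict) -> dict:
    """Convert a raw JSON-RPC log entry into a flat record with decoded integers."""
    return {
        "block_number": int(raw["blockNumber"], 16),
        "block_hash": raw["blockHash"],
        "transaction_hash": raw["transactionHash"],
        "log_index": int(raw["logIndex"], 16),
        "address": raw["address"].lower(),
        "topics": list(raw["topics"]),
        "data": raw["data"],
    }


# Made-up sample entry shaped like an eth_getLogs result item.
sample_log = {
    "blockNumber": "0x64",
    "blockHash": "0xaaa",
    "transactionHash": "0xbbb",
    "logIndex": "0x2",
    "address": "0xABCDEF",
    "topics": ["0xddf2"],
    "data": "0x01",
}

body = build_get_logs_request(18_000_000, 18_000_010, "0xabcdef")
record = flatten_log(sample_log)
print(record["block_number"], record["log_index"], record["address"])
```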

For direct indexing, Goldsky uses Redpanda, a Kafka-compatible message broker written in C++, with blockchain data serialized using Avro. Redpanda serves as both the messaging layer and data storage; its S3-compatible tiered storage allows much longer data retention in a cost-effective way.
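
Avro schemas are JSON documents, so a schema for a log record can be sketched in a few lines of Python. The schema below is hypothetical: the record name, namespace, and fields are illustrative and are not Goldsky's actual schema. The `missing_fields` helper is a deliberately small stand-in for the checks a real Avro library performs during serialization.

```python
import json

# Hypothetical Avro schema for a blockchain log record (illustrative field names).
LOG_SCHEMA = {
    "type": "record",
    "name": "EthereumLog",
    "namespace": "example.blockchain",
    "fields": [
        {"name": "block_number", "type": "long"},
        {"name": "block_hash", "type": "string"},
        {"name": "transaction_hash", "type": "string"},
        {"name": "log_index", "type": "int"},
        {"name": "address", "type": "string"},
        {"name": "topics", "type": {"type": "array", "items": "string"}},
        {"name": "data", "type": "string"},
    ],
}


def missing_fields(record: dict, schema: dict) -> list[str]:
    """Return the schema field names that are absent from the record."""
    return [f["name"] for f in schema["fields"] if f["name"] not in record]


# The JSON text is what would typically be stored in an .avsc schema file.
schema_json = json.dumps(LOG_SCHEMA, indent=2)

partial_record = {"block_number": 100, "block_hash": "0xaaa", "address": "0xabcdef"}
print(missing_fields(partial_record, LOG_SCHEMA))
```

In practice the encoding and validation would be delegated to an Avro library; the point here is only that producers and consumers of a topic agree on one explicit record shape.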

The transformation layer leverages Flink SQL and enables customers to define custom SQL transformations to perform filtering, projections, complex joins, or aggregations. Flink jobs run on Kubernetes using the Flink Kubernetes Operator. Customers can choose from many pipeline sink types, including PostgreSQL, S3, Elasticsearch, ClickHouse, Rockset, and Apache Kafka.
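
The kinds of transformation mentioned above can be illustrated without a Flink cluster. The Python sketch below mimics, over an in-memory list, what a SQL filter (`WHERE`), projection (`SELECT`), and aggregation (`GROUP BY`) would express. It is plain Python rather than Flink SQL, and the function names, record fields, and sample data are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Iterator


def filter_and_project(logs: Iterable[dict], contract: str) -> Iterator[dict]:
    """Keep logs emitted by one contract and keep two columns (WHERE + SELECT)."""
    for log in logs:
        if log["address"] == contract:
            yield {
                "block_number": log["block_number"],
                "transaction_hash": log["transaction_hash"],
            }


def count_per_block(rows: Iterable[dict]) -> dict:
    """Count rows per block number (GROUP BY block_number)."""
    return dict(Counter(row["block_number"] for row in rows))


# Made-up sample records.
logs = [
    {"address": "0xabc", "block_number": 100, "transaction_hash": "0x01"},
    {"address": "0xdef", "block_number": 100, "transaction_hash": "0x02"},
    {"address": "0xabc", "block_number": 100, "transaction_hash": "0x03"},
    {"address": "0xabc", "block_number": 101, "transaction_hash": "0x04"},
]

counts = count_per_block(filter_and_project(logs, "0xabc"))
print(counts)  # prints {100: 2, 101: 1}
```

A streaming engine evaluates the same logic continuously over unbounded input and keeps the aggregate state up to date as new records arrive, which a batch loop like this one does not do.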

Tkachenko summarises the benefits of data streaming for blockchain:

Data streaming concepts work extremely well for blockchain data—we’re able to solve challenging problems, such as blockchain reorgs, by modeling them as well-known stream-processing problems like retractions. Building on this further lets us support even more advanced use cases, like enriching on-chain data with off-chain data, reliably calculating Top-N aggregations, and combining data from multiple blockchains together.
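
The idea of treating a reorg as a retraction can be sketched in Python as follows. The `TransferTotals` class is hypothetical: it keeps running per-address totals, remembers which records each block contributed, and subtracts them again if that block is later dropped from the chain. This is a simplified illustration of the retraction concept under those assumptions, not Goldsky's or Flink's implementation.

```python
from collections import defaultdict


class TransferTotals:
    """Running per-address totals that can retract the records of a reorged block."""

    def __init__(self):
        self.totals = defaultdict(int)
        self._by_block = {}  # block hash -> list of (address, amount) applied from that block

    def apply_block(self, block_hash, transfers):
        """Insert: add each (address, amount) pair and remember it for possible retraction."""
        applied = list(transfers)
        self._by_block[block_hash] = applied
        for address, amount in applied:
            self.totals[address] += amount

    def retract_block(self, block_hash):
        """Retract: undo the contribution of a block that was dropped by a reorg."""
        for address, amount in self._by_block.pop(block_hash, []):
            self.totals[address] -= amount
            if self.totals[address] == 0:
                del self.totals[address]

    def top_n(self, n):
        """Return the n largest totals, ties broken by address for determinism."""
        return sorted(self.totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


state = TransferTotals()
state.apply_block("0xaaa", [("alice", 50), ("bob", 20)])
state.apply_block("0xbbb", [("bob", 40)])      # this block is later orphaned
state.retract_block("0xbbb")                   # the reorg removes its effect
state.apply_block("0xccc", [("carol", 70)])    # replacement block on the new chain
print(state.top_n(2))  # prints [('carol', 70), ('alice', 50)]
```

Because the Top-N result is derived from totals that already account for retractions, it stays correct after the reorg without reprocessing earlier blocks.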


