Building Scalable Web3 Applications on Google Cloud
Sri Lanka
What is Web3?
● A quick recap: Decentralized, user-owned, built on blockchains (e.g.,
Ethereum).
● Core components: Smart Contracts, On-Chain Events, Transactions.
The Problem with On-Chain Data
● Blockchains are a slow "database." Querying a node directly for complex dApp
front-ends is inefficient and often impractical.
● dApps need fast, reliable, and indexed access to both real-time and historical
on-chain data.
Why Google Cloud for Decentralized Apps?
● We're not putting the blockchain on GCP. We're building a highly performant
off-chain infrastructure layer to support our on-chain logic.
● Gain scalability, reliability, advanced data processing, and security that are
difficult to achieve otherwise.
The Web3 Challenge & The Cloud Opportunity
1. The Core Architecture: An overview of our data pipeline.
2. Listening to the Chain: Connecting to the blockchain with Infura.
3. Ingesting Events: Decoupling services with Pub/Sub.
4. Processing & Caching: The role of workers and Memorystore.
5. Choosing Your Database: Storing data in Cloud SQL, Firestore, &
BigQuery.
6. Serving Your dApp: Building a scalable API on GKE.
7. Putting It All Together: A complete, scalable Kubernetes architecture.
Workshop Agenda
System Architecture
What is Infura?
● A Blockchain Node Provider. It gives us a reliable API (HTTP & WebSocket) endpoint to
communicate with a blockchain network without running our own node.
Listening for Events
● We use a WebSocket connection to an Infura gateway.
● This allows us to subscribe to specific smart contract events in real-time.
● Example: Subscribing to a Transfer event on an ERC-20 token contract.
The "Listener" Service
● A simple, lightweight microservice whose only job is to maintain this WebSocket
connection and receive events.
● Once an event is received, it immediately passes it on for processing.
Step 1 - Listening to the Chain with Infura
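To make the subscription step concrete, here is a minimal sketch of the JSON-RPC messages involved. It only builds and parses the messages, so it runs without a network connection; a real listener would send the request over a WebSocket client of your choice. The endpoint is a placeholder and the function names are assumptions for illustration, not part of the original slides.

```python
import json

# keccak256("Transfer(address,address,uint256)"): the event signature hash
# that appears as topics[0] on every ERC-20 Transfer log.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

# Placeholder endpoint: substitute your own Infura project ID.
INFURA_WS_URL = "wss://mainnet.infura.io/ws/v3/<YOUR_PROJECT_ID>"


def build_transfer_subscription(token_address: str, request_id: int = 1) -> str:
    """Return the JSON-RPC request that subscribes to Transfer logs of one contract."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_subscribe",
        "params": ["logs", {"address": token_address, "topics": [TRANSFER_TOPIC]}],
    }
    return json.dumps(request)


def extract_log(raw_message: str):
    """Return the log from a subscription notification, or None for other messages."""
    message = json.loads(raw_message)
    if message.get("method") != "eth_subscription":
        return None  # e.g. the first reply, which only carries the subscription ID
    return message["params"]["result"]
```

The listener sends the string from `build_transfer_subscription(...)` once after connecting, then feeds every incoming frame through `extract_log` and forwards anything that is not `None`.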
The Problem:
What if your listener service crashes? What if event processing is slow? You'll lose events and create a
bottleneck.
Solution: Google Cloud Pub/Sub
● A fully-managed, real-time messaging service.
● Our Listener becomes a Publisher: It receives an event from Infura and immediately publishes it to a
Pub/Sub "topic". Its job is done.
● This decouples event ingestion from event processing: Pub/Sub holds each published event until a subscriber acknowledges it, so a slow or crashed worker doesn't lose data.
● It acts as a buffer, smoothing out traffic spikes.
Step 2 - Decoupling with Event Handlers (Pub/Sub)
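A rough sketch of the "Listener becomes a Publisher" idea follows. The publish function is passed in as a parameter so the example runs without Google Cloud credentials; the commented wiring at the bottom shows how it could be connected to the `google-cloud-pubsub` client. The project name, topic name, and attribute names are hypothetical.

```python
import json
from typing import Callable, Dict, Tuple


def to_pubsub_message(log: Dict) -> Tuple[bytes, Dict[str, str]]:
    """Serialize a log into the (data, attributes) pair of a Pub/Sub message."""
    data = json.dumps(log, sort_keys=True).encode("utf-8")  # payloads must be bytes
    attributes = {
        # Attributes are strings; these two identify the event for later deduplication.
        "tx_hash": log["transactionHash"],
        "log_index": log["logIndex"],
    }
    return data, attributes


def publish_log(publish: Callable, topic_path: str, log: Dict):
    """Publish one log and return whatever the publish callable returns."""
    data, attributes = to_pubsub_message(log)
    return publish(topic_path, data, **attributes)


# Possible wiring with the real client (not imported here so the sketch stays standalone):
#   from google.cloud import pubsub_v1
#   publisher = pubsub_v1.PublisherClient()
#   topic_path = publisher.topic_path("my-project", "chain-events")  # hypothetical names
#   future = publish_log(publisher.publish, topic_path, log)
#   future.result()  # blocks until Pub/Sub has accepted the message
```

Keeping the listener this thin is the point of the pattern: it does no decoding and no database work, so it can hand events off as fast as they arrive.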
The "Worker" Service
● A separate microservice that subscribes to the
Pub/Sub topic.
● Its job is to pull events from the queue and
perform the heavy lifting:
○ Decode event data.
○ Enrich data by calling other APIs.
○ Format data for storage in a database.
Step 3 - Processing Events with Workers & Caching
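As an illustration of the "decode event data" step, the sketch below decodes a raw ERC-20 Transfer log using only the standard library. The output field names are assumptions chosen for this example; a production worker would more likely rely on a library such as web3.py and the contract ABI.

```python
from typing import Dict

# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic: str) -> str:
    """Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def decode_transfer(log: Dict) -> Dict:
    """Turn a raw ERC-20 Transfer log into a flat dict ready for storage."""
    topics = log["topics"]
    # ERC-20 Transfer has exactly three topics: signature, from, to.
    # (ERC-721 shares the signature but indexes the token ID as a fourth topic.)
    if len(topics) != 3 or topics[0] != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        "token": log["address"],
        "from": topic_to_address(topics[1]),
        "to": topic_to_address(topics[2]),
        "value": int(log["data"], 16),  # raw base units, token decimals not applied
        "block_number": int(log["blockNumber"], 16),
        "tx_hash": log["transactionHash"],
        "log_index": int(log["logIndex"], 16),
    }
```

Enrichment (for example, looking up token decimals or a price) would happen after this step, on the decoded dict.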
Caching with Memorystore for Redis
● Before hitting a database, we can use an in-memory
cache for ultra-fast operations.
● Why Cache?
○ Deduplication: Block reorgs and Pub/Sub's
at-least-once delivery can both cause duplicate
events. Use Redis to check if an event has
already been processed.
○ Session Data: Store temporary user data.
○ Rate Limiting: Control access to
resources.
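The deduplication check can be sketched with a single atomic Redis command, `SET key value NX EX ttl`. The function below expects a client with a redis-py style `set(name, value, nx=..., ex=...)` method; the small in-memory class is only a stand-in so the example runs without a Redis server. The key layout and the one-day TTL are assumptions.

```python
def event_key(chain_id: int, tx_hash: str, log_index: int) -> str:
    """Build a cache key that identifies one on-chain event."""
    return f"seen:{chain_id}:{tx_hash}:{log_index}"


def first_time_seen(redis_client, key: str, ttl_seconds: int = 86400) -> bool:
    """Atomically claim the key; True only for the first caller within the TTL."""
    # With nx=True, redis-py returns True if the key was set and None if it existed.
    return bool(redis_client.set(key, "1", nx=True, ex=ttl_seconds))


class InMemoryRedis:
    """Stand-in that mimics only set(..., nx=True); it ignores expiry."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True


# With Memorystore you would pass a real client instead, for example:
#   import redis
#   client = redis.Redis(host="10.0.0.3", port=6379)  # hypothetical private IP
```

A worker calls `first_time_seen(client, event_key(...))` before processing and simply acknowledges the message without further work when it returns `False`.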
There's no single best database; use the right tool for the job.
Cloud SQL (Managed PostgreSQL/MySQL)
● Use for: Structured, relational data.
● Web3 Example: Storing user profiles, transaction histories with clear relationships,
financial ledgers.
Firestore (NoSQL Document DB)
● Use for: Flexible, semi-structured data; real-time front-end updates.
● Web3 Example: Storing NFT metadata (attributes can vary), user-specific settings, activity
feeds.
BigQuery (Serverless Data Warehouse)
● Use for: Large-scale analytics on all your blockchain data.
● Web3 Example: Analyzing token velocity, finding top NFT holders, tracking DeFi protocol
health. Workers can stream all processed events directly into BigQuery.
Step 4 - Choosing the Right Google Cloud Database
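For the relational case, here is a small sketch of how processed transfers might be stored. It uses Python's built-in `sqlite3` purely as a local stand-in for Cloud SQL so that it runs anywhere; the table layout and the keys of the transfer dict are hypothetical. The `ON CONFLICT ... DO NOTHING` clause is also valid PostgreSQL, although a PostgreSQL driver would use different placeholders than `?`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS transfers (
    tx_hash      TEXT    NOT NULL,
    log_index    INTEGER NOT NULL,
    block_number INTEGER NOT NULL,
    token        TEXT    NOT NULL,
    sender       TEXT    NOT NULL,
    recipient    TEXT    NOT NULL,
    value        TEXT    NOT NULL,  -- uint256 can exceed 64-bit integers
    PRIMARY KEY (tx_hash, log_index)
)
"""

INSERT = """
INSERT INTO transfers (tx_hash, log_index, block_number, token, sender, recipient, value)
VALUES (?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (tx_hash, log_index) DO NOTHING
"""


def save_transfer(conn: sqlite3.Connection, t: dict) -> bool:
    """Insert a transfer idempotently; return True if a new row was written."""
    before = conn.total_changes
    conn.execute(INSERT, (t["tx_hash"], t["log_index"], t["block_number"],
                          t["token"], t["from"], t["to"], str(t["value"])))
    conn.commit()
    return conn.total_changes > before


def transfers_received(conn: sqlite3.Connection, address: str):
    """Return (tx_hash, value) pairs received by an address, oldest first."""
    rows = conn.execute(
        "SELECT tx_hash, value FROM transfers WHERE recipient = ? "
        "ORDER BY block_number, log_index", (address,))
    return [(tx_hash, int(value)) for tx_hash, value in rows]
```

The primary key on `(tx_hash, log_index)` makes the write idempotent, so a message that is delivered twice cannot produce two rows.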
Why Kubernetes?
● We have multiple microservices (listeners, workers, API). Kubernetes is perfect for
orchestrating them.
● Google Kubernetes Engine (GKE) is Google's managed Kubernetes service.
Key Benefits for Web3:
● Auto-scaling: Automatically add or remove service replicas (pods) based on load. If your
Pub/Sub backlog grows, a Horizontal Pod Autoscaler fed with that metric can spin up more worker pods.
● Resilience: If a pod crashes, GKE automatically restarts it.
● Resource Management: Efficiently pack your services onto Compute Engine VMs (your
GKE nodes). You can choose different VM types (e.g., e2-standard for APIs, n2-highcpu
for compute-heavy workers).
A Scalable Kubernetes-Based Architecture (GKE)
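The queue-based auto-scaling above boils down to simple arithmetic. When a Horizontal Pod Autoscaler targets an average value per pod for an external metric (for Pub/Sub, the subscription's `num_undelivered_messages`), the desired replica count is the backlog divided by the per-pod target, rounded up and clamped to the configured bounds. The sketch below reproduces only that calculation; it ignores the autoscaler's tolerance and stabilization behaviour, and the default bounds are made-up values.

```python
import math


def desired_worker_replicas(backlog_messages: int, target_per_pod: int,
                            min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Approximate the replica count an average-value HPA target would ask for."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(backlog_messages / target_per_pod)
    # Clamp to the autoscaler's configured minimum and maximum.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, a backlog of 250 messages with a target of 100 per pod asks for 3 workers, while an empty queue falls back to the minimum.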
This is how we organize our services inside a GKE cluster for maximum scalability and
security.
1. The Event Listener Deployment
● A set of pods running the listener code.
● Only needs outbound internet access to connect to Infura.
2. The Worker Deployment
● A set of pods running the processing code.
● Doesn't need to be reachable from the public internet. It just talks to Pub/Sub and your internal
databases (plus any enrichment APIs it calls outbound). This is more secure.
3. The API Layer Deployment
● A set of pods running your API server (e.g., Express.js, FastAPI).
● Exposed to the internet via a Google Cloud Load Balancer to serve requests from your
dApp's front-end.
Splitting Your Backend on GKE
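To show the shape of the API layer, here is a deliberately small read-only service written as a plain WSGI app with the standard library. The slides suggest Express.js or FastAPI for the real thing; this stand-in is used only so the example is self-contained. The routes, the `/healthz` path, and the in-memory data are hypothetical.

```python
import json

# Hypothetical in-memory data; a real API would query Cloud SQL or Firestore here.
TRANSFERS = {
    "0x" + "bb" * 20: [{"tx_hash": "0xabc", "value": "1000"}],
}


def _json(start_response, status: str, body: dict):
    """Send a JSON response with the given status line."""
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def app(environ, start_response):
    """Route GET /healthz and GET /transfers/<address>."""
    path = environ.get("PATH_INFO", "")
    if path == "/healthz":
        # Simple endpoint for load balancer and Kubernetes health checks.
        return _json(start_response, "200 OK", {"status": "ok"})
    if path.startswith("/transfers/"):
        address = path[len("/transfers/"):].lower()
        return _json(start_response, "200 OK",
                     {"address": address, "transfers": TRANSFERS.get(address, [])})
    return _json(start_response, "404 Not Found", {"error": "not found"})


# To try it locally:
#   from wsgiref.simple_server import make_server
#   make_server("", 8080, app).serve_forever()
```

Because the API pods only read from the databases, they can scale independently of the listener and worker deployments.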
● Cloud Run: A serverless alternative for your API or simple workers. Pay-per-use
and scales to zero.
● API Gateway / Apigee: Secure your public API with authentication, rate limiting,
and monitoring.
● Cloud Armor: Protect your API layer from DDoS and other web-based attacks.
● Cloud Monitoring & Logging: Get full observability into your entire system. Create
dashboards and alerts to monitor the health of your pipeline.
● Identity Platform: Manage user identity with both traditional (email/social) and
Web3 (Connect Wallet) sign-in methods, the latter typically wired in through custom tokens.
Other Powerful GCP Products
Recap: We combine the decentralized world of blockchain events with the scalable,
reliable infrastructure of Google Cloud.
Key Architecture Pattern:
1. Ingest events reliably using an external gateway (Infura).
2. Decouple services with a message queue (Pub/Sub).
3. Process data with scalable workers (GKE / Cloud Run).
4. Store data in the right database for the job (Cloud SQL, Firestore, BigQuery).
5. Serve data to your users via a managed API layer (GKE / Cloud Run).
Questions?
Summary & Q&A
- Suresh Peiris
@sureshmichael
Q&A

Building Web3 Applications with Google Cloud
