In today’s fast-evolving cryptocurrency landscape, access to timely and structured price data is essential for traders, analysts, and developers building financial applications. This article explores a fully automated, serverless cloud pipeline that fetches hourly cryptocurrency prices — including Bitcoin (BTC), Ethereum (ETH), Litecoin (LTC), and XRP — from the CoinGecko API, stores them in Amazon S3, and manages infrastructure through Terraform and GitHub Actions.
Designed for scalability, reliability, and ease of deployment, this solution eliminates the need for manual intervention while ensuring consistent data collection. Whether you're building historical datasets, powering analytics dashboards, or training machine learning models, this architecture offers a solid foundation.
Core Features of the Pipeline
The pipeline delivers several powerful capabilities out of the box:
- Hourly price updates for major cryptocurrencies in USD
- Serverless execution using AWS Lambda, triggered by Amazon EventBridge
- Persistent storage of timestamped JSON files in Amazon S3
- Infrastructure as Code (IaC) via Terraform for repeatable, version-controlled deployments
- Continuous Integration/Continuous Deployment (CI/CD) powered by GitHub Actions
These components work together seamlessly to create a robust, low-maintenance system ideal for developers focused on data-driven applications.
How the Architecture Works
The system follows a clean, event-driven design:
CoinGecko API → AWS Lambda (triggered hourly by EventBridge) → JSON data stored in S3

Here's a breakdown of each component:
1. Data Source: CoinGecko API
The pipeline pulls real-time cryptocurrency prices from the public CoinGecko API, which provides reliable, up-to-date market data without requiring authentication for basic endpoints.
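As a rough sketch, the request URL for CoinGecko's public /simple/price endpoint can be built like this (the coin ID list mirrors the four currencies above; note that XRP's CoinGecko ID is "ripple", and the helper name is illustrative):

```python
# Build the CoinGecko /simple/price request URL for a list of coin IDs.
# The endpoint is public and needs no API key for basic queries.
import urllib.parse

COINS = ["bitcoin", "ethereum", "litecoin", "ripple"]  # BTC, ETH, LTC, XRP

def build_price_url(coins, vs_currency="usd"):
    """Return the request URL asking for each coin's price in vs_currency."""
    query = urllib.parse.urlencode(
        {"ids": ",".join(coins), "vs_currencies": vs_currency}
    )
    return f"https://api.coingecko.com/api/v3/simple/price?{query}"
```

Fetching that URL returns JSON shaped like `{"bitcoin": {"usd": ...}, ...}`, one entry per requested coin.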
2. Execution Layer: AWS Lambda + EventBridge
An AWS Lambda function runs every hour, initiated by an Amazon EventBridge rule. This serverless approach ensures cost efficiency (you only pay when the function runs) and automatic scaling.
3. Storage: Amazon S3
Each execution generates a JSON file named with a UTC timestamp (e.g., crypto-prices/2025-04-05T10:00:00Z.json) and uploads it to a designated S3 bucket. This structure enables easy querying, backup, and integration with downstream tools like AWS Athena or data lakes.
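A minimal sketch of that key-naming scheme (the helper name is illustrative, not taken from the repository):

```python
# Generate an S3 object key of the form crypto-prices/<UTC timestamp>.json.
from datetime import datetime, timezone

def object_key(now=None):
    """Build a timestamped key, e.g. crypto-prices/2025-04-05T10:00:00Z.json."""
    now = now or datetime.now(timezone.utc)
    return f"crypto-prices/{now:%Y-%m-%dT%H:%M:%SZ}.json"
```

Because ISO 8601 timestamps sort lexicographically, listing the `crypto-prices/` prefix returns objects in chronological order, which keeps downstream querying simple.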
4. Infrastructure Management: Terraform
All AWS resources — including the Lambda function, S3 bucket, IAM roles, and EventBridge schedule — are defined in main.tf using Terraform. This allows full version control, reproducibility across environments, and safe rollbacks if needed.
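For illustration, the hourly trigger in main.tf might resemble the following sketch (the resource and function names here are assumptions, not the repository's actual identifiers):

```hcl
resource "aws_cloudwatch_event_rule" "hourly" {
  name                = "crypto-price-hourly"
  schedule_expression = "rate(1 hour)"
}

resource "aws_cloudwatch_event_target" "invoke_lambda" {
  rule = aws_cloudwatch_event_rule.hourly.name
  arn  = aws_lambda_function.crypto_prices.arn
}

resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.crypto_prices.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.hourly.arn
}
```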
5. Deployment Automation: GitHub Actions
Every push to the main branch triggers a GitHub Actions workflow (defined in .github/workflows/main.yml) that automatically applies Terraform changes and deploys the latest Lambda code — enabling true CI/CD with zero manual steps.
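A workflow along these lines would implement that behavior (a hedged sketch; the repository's actual main.yml may differ in action versions and step names):

```yaml
name: deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Apply infrastructure and deploy Lambda code
        run: |
          terraform init
          terraform apply -auto-approve
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```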
Getting Started: Setup Guide
To deploy this pipeline, follow these steps:
Prerequisites
Before beginning, ensure you have:
- An active AWS account with permissions to create Lambda functions, S3 buckets, and EventBridge rules
- Terraform installed locally (version 1.0 or higher recommended)
- AWS CLI configured with valid credentials (aws configure)
- A GitHub account and a cloned copy of the repository
Step-by-Step Deployment
Clone the Repository

git clone https://github.com/emaseku/E-Crypto-Prices.git
cd E-Crypto-Prices

Configure Terraform Variables

Create a terraform.tfvars file to specify your AWS region and preferred bucket name:

aws_region     = "us-east-1"
s3_bucket_name = "your-unique-bucket-name-crypto-prices"

Initialize and Apply Terraform

Initialize the backend and apply the configuration:

terraform init
terraform apply

Confirm the plan to provision all necessary AWS resources.
Enable GitHub Actions (Optional but Recommended)
Push your code to a GitHub repository. The included workflow will automatically detect changes and deploy updates on every commit to the main branch.
Project Structure Overview
Understanding the file layout helps with customization and maintenance:
- handle.py: The core Lambda function, written in Python. It calls the CoinGecko API, processes the response, and uploads formatted JSON to S3.
- main.tf: Terraform resource definitions for the AWS Lambda function, IAM roles, S3 bucket, and EventBridge scheduler.
- .github/workflows/main.yml: YAML configuration for the GitHub Actions CI/CD pipeline.
You can extend handle.py to include additional coins, convert prices into other fiat currencies, or add logging and error alerts.
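Put together, a handle.py along these lines would cover the fetch-and-store flow (a sketch under assumptions: the real script's function names, the BUCKET_NAME environment variable, and its error handling may differ):

```python
# Sketch of the Lambda handler: fetch prices, then upload a timestamped
# JSON file to S3. BUCKET_NAME is an assumed environment variable.
import json
import os
import urllib.request
from datetime import datetime, timezone

COINGECKO_URL = (
    "https://api.coingecko.com/api/v3/simple/price"
    "?ids=bitcoin,ethereum,litecoin,ripple&vs_currencies=usd"
)

def fetch_prices(url=COINGECKO_URL):
    """Call the public CoinGecko endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def handler(event, context, s3=None, fetch=fetch_prices):
    """Lambda entry point; s3 and fetch are injectable for local testing."""
    if s3 is None:
        import boto3  # deferred import so the sketch also runs without AWS
        s3 = boto3.client("s3")
    prices = fetch()
    key = f"crypto-prices/{datetime.now(timezone.utc):%Y-%m-%dT%H:%M:%SZ}.json"
    s3.put_object(
        Bucket=os.environ["BUCKET_NAME"],
        Key=key,
        Body=json.dumps(prices).encode("utf-8"),
        ContentType="application/json",
    )
    return {"statusCode": 200, "key": key}
```

Keeping the S3 client and fetch function injectable makes the handler easy to unit-test locally with fakes before deploying.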
Use Cases and Data Applications
Once deployed, the collected data opens doors to various applications:
- Historical price analysis for trading strategy backtesting
- Time-series dashboards using tools like Grafana or Amazon QuickSight
- Machine learning models predicting price trends based on hourly patterns
- Audit trails for compliance or internal reporting
Because each record includes a precise timestamp and standardized format, integrating this data into larger systems becomes straightforward.
Frequently Asked Questions (FAQ)
Q: Can I add more cryptocurrencies beyond BTC, ETH, LTC, and XRP?
A: Yes! The handle.py script can be modified to include any coin supported by the CoinGecko API. Simply update the list of coin IDs in the request URL.
Q: Is there a cost associated with running this pipeline?
A: While most services have free tiers (e.g., 1M Lambda requests/month, 5GB S3 storage), long-term usage may incur minimal charges based on AWS pricing. Always monitor usage in the AWS console.
Q: How is data organized in the S3 bucket?
A: Files are stored under the crypto-prices/ prefix with filenames matching ISO 8601 timestamps (e.g., crypto-prices/2025-04-05T12:00:00Z.json), making them easy to sort and query.
Q: What happens if the API call fails?
A: Currently, failures result in missing entries. For production use, consider adding retry logic or CloudWatch alarms to notify of execution errors.
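One lightweight way to add that retry logic (illustrative; not shipped with the pipeline) is a helper that retries a callable with a growing delay:

```python
# Retry a callable a few times before giving up, sleeping between attempts.
import time

def with_retry(call, attempts=3, backoff=2.0):
    """Invoke call(); on exception, retry up to attempts times in total."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the error to CloudWatch
            time.sleep(backoff * attempt)
```

In the Lambda, the API request would become with_retry on the fetch call; pairing this with a CloudWatch alarm on the function's Errors metric covers the notification side.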
Q: Can I change the frequency from hourly to every 15 minutes?
A: Yes — adjust the EventBridge schedule expression in main.tf from rate(1 hour) to rate(15 minutes).
Q: Is sensitive data involved in this pipeline?
A: No personal or private data is processed. The CoinGecko API returns public market data, and no credentials are stored in the codebase when using proper IAM roles.
Final Thoughts
This serverless crypto price ingestion pipeline demonstrates how modern DevOps practices — combining cloud computing, infrastructure automation, and CI/CD — can simplify data engineering tasks. By leveraging AWS Lambda, S3, Terraform, and GitHub Actions, developers gain a maintainable, scalable way to collect valuable cryptocurrency insights without managing servers.
Whether you're exploring blockchain analytics or building next-generation fintech tools, automating data collection is the first step toward smarter decision-making.