AWS Snowball
AWS Snowball is a data transport solution that addresses challenges associated with large-scale data transfers, including high network costs, long transfer times, and security concerns. This service is part of the AWS Snow Family, which helps physically transport up to exabytes of data into and out of AWS.
Use cases: Large-scale Data Migration, Data Center Decommissioning, Disaster Recovery, Content Distribution, Data Collection in Remote or Edge Locations, Secure Data Transfer
Steps: You start by ordering a Snowball device through the AWS Management Console, specifying the Amazon S3 bucket to which your data will be transferred. AWS prepares and ships a Snowball device to your location. Once the Snowball arrives, you connect it to your local network, copy your data onto the device, and ship it back to AWS, where the data is imported into your S3 bucket.
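The same order can also be placed programmatically. A minimal boto3 sketch, assuming the shipping address and IAM role already exist (every identifier below is a placeholder):

```python
import boto3

snowball = boto3.client("snowball", region_name="us-east-1")

# All ARNs and IDs below are placeholders for illustration only.
job = snowball.create_job(
    JobType="IMPORT",  # import data from the device into S3
    Resources={"S3Resources": [{"BucketArn": "arn:aws:s3:::example-migration-bucket"}]},
    Description="Data center migration, batch 1",
    AddressId="ADID-00000000-0000-0000-0000-000000000000",  # from create_address()
    RoleARN="arn:aws:iam::123456789012:role/SnowballImportRole",
    SnowballType="EDGE",
    SnowballCapacityPreference="T100",
    ShippingOption="SECOND_DAY",
)
print(job["JobId"])
```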
Reference: https://docs.aws.amazon.com/snowball/latest/developer-guide/whatisedge.html
Amazon Athena
Serverless, interactive SQL queries over data stored in Amazon S3. Use cases: Log Analysis, ETL/Data Transformation, Security Auditing queries.
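As a sketch of the log-analysis use case, Athena queries can be driven from boto3; the database, table, and results bucket below are assumptions:

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Hypothetical database/table and results bucket.
qid = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```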
Amazon Redshift
Use cases: Data Warehousing, Business Intelligence and Reporting, Data Analytics, Log and Event Data Analysis
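One way to run such analytics programmatically, sketched with the Redshift Data API so no JDBC connection management is needed (cluster, database, and user names are placeholders):

```python
import boto3

rsd = boto3.client("redshift-data", region_name="us-east-1")

# Submit SQL asynchronously; poll describe_statement() and fetch rows with
# get_statement_result() once it finishes.
stmt = rsd.execute_statement(
    ClusterIdentifier="analytics-cluster",  # placeholder cluster name
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT event_type, COUNT(*) FROM events GROUP BY event_type;",
)
print(stmt["Id"])
```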
Choosing Between Redshift and Athena
- Performance and Scale: Choose Redshift for high-performance, complex querying needs, and large-scale data warehousing. Opt for Athena for serverless querying that’s scalable and easy to use for ad-hoc analysis.
- Data Storage and Management: If you already have a significant investment in Amazon S3 and need to query data where it resides without moving it into a separate storage system, Athena is the right choice. If your use case involves complex data aggregation, transformation, and the need for a persistent data store, Redshift is more suitable.
- Operational Overhead and Cost: Athena offers a pay-per-query model that might be more cost-effective for irregular query patterns, with minimal operational overhead. Redshift requires upfront provisioning and has ongoing costs associated with cluster management, but it offers better performance for complex queries and large datasets.
Control access to AWS resources by using the AWS organization of IAM principals (aws:PrincipalOrgID)
https://aws.amazon.com/blogs/security/control-access-to-aws-resources-by-using-the-aws-organization-of-iam-principals/
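The pattern from that post hinges on the aws:PrincipalOrgID condition key. A minimal sketch, assuming a shared S3 bucket; the bucket name and organization ID are placeholders:

```python
import json
import boto3

# Deny access to any principal outside the o-exampleorgid organization.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessOutsideMyOrg",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::example-shared-bucket",
            "arn:aws:s3:::example-shared-bucket/*",
        ],
        "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}},
    }],
}

boto3.client("s3").put_bucket_policy(
    Bucket="example-shared-bucket",
    Policy=json.dumps(policy),
)
```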
AWS Endpoints
https://docs.aws.amazon.com/general/latest/gr/rande.html#view-service-endpoints
Public Service Endpoints
VPC Endpoints
- Interface Endpoints: An elastic network interface (ENI) with a private IP address that serves as an entry point for traffic destined for a supported service. Interface endpoints are powered by AWS PrivateLink.
- Gateway Endpoints: Available for Amazon S3 and DynamoDB, gateway endpoints allow direct, private connections to these services from within your VPC. They are implemented as a route target in your VPC's route table (see the sketch after this list).
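A gateway endpoint for S3 can be created in a single call; the VPC and route table IDs below are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Creates a gateway endpoint and adds the S3 prefix-list route to the given
# route table, so S3 traffic stays on the AWS network.
resp = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)
print(resp["VpcEndpoint"]["VpcEndpointId"])
```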
Global Endpoints (S3, CloudFront)
S3 Transfer Acceleration Endpoints
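Acceleration is enabled per bucket and then opted into per client; the bucket and file names below are placeholders:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket (one-time setting).
s3.put_bucket_accelerate_configuration(
    Bucket="example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Route this client's requests through the accelerate endpoint.
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("large-file.bin", "example-bucket", "uploads/large-file.bin")
```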
EBS – Block storage attached to a single EC2 instance; data is segregated per instance
EFS – Shared file system mounted concurrently across many instances
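To make the contrast concrete, one EFS file system can expose a mount target in each subnet so many instances share the same data. A sketch with placeholder IDs:

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")

# One file system, reachable from two subnets (instances in two AZs).
fs = efs.create_file_system(CreationToken="shared-app-data")

# In a real script, wait until the file system's LifeCycleState is
# "available" before creating mount targets.
for subnet_id in ["subnet-0aaa0000000000001", "subnet-0bbb0000000000002"]:
    efs.create_mount_target(
        FileSystemId=fs["FileSystemId"],
        SubnetId=subnet_id,
        SecurityGroups=["sg-0ccc0000000000003"],
    )
```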
Decoupling components solution – Amazon SQS
- Loose Coupling: SQS allows different components of a system to communicate and operate independently. If one process fails or becomes overloaded, it does not directly impact the ability of other services to operate. This resilience is crucial for maintaining system reliability.
- Scalability: By decoupling components, you can scale them independently based on demand. For example, if the process that handles incoming orders needs more resources, it can be scaled without having to scale the entire system, leading to more efficient resource use.
- Reliability: SQS ensures message delivery with an at-least-once delivery guarantee and supports message durability, keeping messages available until they are processed and deleted. This reliability is essential for critical applications where data loss is unacceptable.
- Flexibility: With SQS, you can integrate different types of producer and consumer services that operate at varying speeds. For instance, a fast producer can continue to enqueue messages while a slower consumer processes them at its own pace.
- Simplified Architecture: Using SQS, you can simplify the architecture of your application. Producers only need to send messages to the queue, and consumers only need to poll the queue for new messages (see the sketch after this list). This separation simplifies the development and maintenance of both producers and consumers.
- Cost-Efficiency: SQS’s pay-as-you-go pricing model means you pay only for what you use, with no upfront costs or minimum fees. This can be more cost-effective than provisioning dedicated resources for inter-component communication.
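A minimal producer/consumer sketch of this decoupling (queue name and message body are made up):

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.create_queue(QueueName="orders")["QueueUrl"]

# Producer: enqueue work and return immediately; it never calls the consumer.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

# Consumer: long-poll for work, process it, then delete each message.
messages = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
).get("Messages", [])

for msg in messages:
    print("processing", msg["Body"])  # stand-in for real business logic
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```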
Configure an Amazon Simple Queue Service (Amazon SQS) queue as a destination for the jobs. Implement the compute nodes with Amazon EC2 instances that are managed in an Auto Scaling group. Configure EC2 Auto Scaling based on the size of the queue.
Note: Configuring EC2 Auto Scaling based on the size of the SQS queue is a smart way to dynamically adjust the number of EC2 instances in the compute layer according to the volume of jobs waiting to be processed. When the queue size grows, indicating a higher workload, the Auto Scaling group can automatically launch additional EC2 instances to handle the increased load. Conversely, when the queue size decreases, it can terminate instances to reduce costs. This approach ensures that the application scales dynamically in response to actual demand, improving both resiliency and scalability.
This design allows the company to modernize their application in a way that is both cost-effective and capable of handling variable workloads with minimal manual intervention.
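A target-tracking sketch of that scaling policy, keyed directly off the queue's visible-message count (group and queue names are placeholders; AWS's refinement of this pattern tracks a custom "backlog per instance" metric instead):

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Keep roughly 100 visible messages in the queue; the Auto Scaling group adds
# instances when the backlog grows and removes them when it shrinks.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="job-workers",
    PolicyName="scale-on-queue-depth",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateNumberOfMessagesVisible",
            "Namespace": "AWS/SQS",
            "Dimensions": [{"Name": "QueueName", "Value": "jobs"}],
            "Statistic": "Average",
        },
        "TargetValue": 100.0,
    },
)
```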
Amazon SQS FIFO queues
- First-in-first-out delivery: the order in which messages are sent and received is strictly preserved within a message group
- Exactly-once processing: a message is delivered once and remains available until a consumer processes and deletes it; duplicates are not introduced into the queue
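A short FIFO sketch (queue and group names are placeholders); messages sharing a MessageGroupId are delivered in order, and content-based deduplication suppresses retries with identical bodies:

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# FIFO queue names must end in ".fifo".
queue_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "true"},
)["QueueUrl"]

# All three messages share a group, so they arrive in send order.
for step in range(3):
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=f'{{"order_id": 42, "step": {step}}}',
        MessageGroupId="order-42",
    )
```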