This article is intended for readers who are new to the AWS Cloud and offers beginner-level guidance on securing AWS infrastructure. The content is based on what I have learned while researching AWS.
Introduction to Cloud Security
Cloud infrastructure provides organizations with a variety of benefits compared with on-premises infrastructure, including ease of use, flexibility, cost-effectiveness, reliability, scalability, high performance, and security. These benefits drive many customers to set up their infrastructure and services on the Cloud. For example, setting up an SFTP or HTTPS server on-premises requires many steps, including the configuration of hardware, OS, storage, networking, firewall, and so on, while the same setup on the Cloud can be hosted as a simple service without the customer worrying about the hardware, OS, or networking configuration.
While any Cloud service provider treats the Information Security of customers as a top priority, it still remains the responsibility of the customer to make sure that they are secure. Security is a core functional requirement that protects mission-critical information from accidental or deliberate theft, leakage, integrity compromise, and deletion. Security on Cloud is shared between both the customer and the Cloud service provider.
Amazon Web Services (AWS) delivers a scalable cloud computing platform with high availability and dependability, providing the tools that enable customers to run a wide range of applications. AWS places the utmost importance on helping to protect the confidentiality, integrity, and availability of customers’ systems and data.
Differences Between Cloud and On-premises Environments
The main difference between cloud and on-premises environments is where the environment resides. When the customer sets up an environment on-premises, the entire underlying infrastructure is hosted and maintained by the customer. When the customer sets up an environment on the cloud, a third-party provider hosts the underlying infrastructure on behalf of the customer.
Whether to use a cloud, on-premises, or hybrid architecture depends on the customer's business model and use cases. For example, a customer might prefer to host a static web service on the cloud because it is simple to set up and easy to maintain.
AWS Shared Responsibility Model
Under the AWS shared responsibility model, AWS provides a global secure infrastructure and foundation for compute, storage, networking and database services, as well as higher level services. AWS provides a range of security services and features that AWS customers can use to secure their assets. AWS customers are responsible for protecting the confidentiality, integrity, and availability of their data in the cloud, and for meeting specific business requirements for information protection.
Before covering the details of how AWS secures its resources, it is important to understand how security in the cloud differs slightly from security in on-premises data centers. When we move computer systems and data to the cloud, security responsibilities become shared between us and the cloud service provider. In this case, AWS is responsible for securing the underlying infrastructure that supports the cloud, and we are responsible for anything that we put on the cloud or connect to the cloud. This shared security responsibility model can reduce the operational burden in many ways, and in some cases may even improve the default security posture without additional action on our part.
The amount of security configuration work that we have to do varies depending on which services we select and how sensitive the data is. However, there are certain security features—such as individual user accounts and credentials, SSL/TLS for data transmissions, and user activity logging—that we should configure no matter which AWS service we use.
Security Responsibility Model Based on Different Services
As AWS offers a variety of different infrastructure and platform services, the shared responsibility is categorized into three types – infrastructure, container and abstracted services. Each category comes with a slightly different security ownership model based on how we interact and access the functionality.
- Infrastructure Services: This category includes compute services, such as Amazon EC2, and related services, such as Amazon EBS, Auto Scaling, and Amazon VPC. With these services, we can architect and build a cloud infrastructure using technologies similar to and largely compatible with on-premises solutions. We control the operating system, and also configure and operate any identity management system that provides access to the user layer of the virtualization stack.
- Container Services: Services in this category typically run on separate Amazon EC2 or other infrastructure instances, but sometimes we don’t manage the operating system or the platform layer. AWS provides a managed service for these application “containers”. We are responsible for setting up and managing network controls, such as firewall rules, and for managing platform-level identity and access management separately from IAM. Examples of container services include Amazon RDS, Amazon Elastic MapReduce (Amazon EMR), and AWS Elastic Beanstalk.
- Abstracted Services: This category includes high-level storage, database, and messaging services, such as Amazon S3, Amazon S3 Glacier, Amazon DynamoDB, Amazon SQS, and Amazon SES. These services abstract the platform or management layer on which we can build and operate cloud applications. We access the endpoints of these abstracted services using AWS APIs, and AWS manages the underlying service components or the operating system on which they reside. We share the underlying infrastructure, and abstracted services provide a multi-tenant platform which isolates our data in a secure fashion and provides for powerful integration with IAM.
Refer to the AWS documentation to learn more about the shared responsibility model for each service type.
Understand the relevant security responsibility model that is suitable for the service that you are using on AWS so that the necessary security controls can be enabled or configured.
Security Concepts that Beginners Should be Aware of
AWS Account Security Features
AWS provides a variety of tools and features that we can use to keep our AWS Account and resources safe from unauthorized use. This includes credentials for access control, HTTPS endpoints for encrypted data transmission, the creation of separate IAM user accounts, user activity logging for security monitoring, and Trusted Advisor security checks. We can take advantage of all of these security tools no matter which AWS services we select.
To help ensure that only authorized users and processes access the AWS Account and resources, AWS uses several types of credentials for authentication. These include passwords, cryptographic keys, digital signatures, and certificates. AWS also provides the option of requiring multi-factor authentication (MFA) to log into the AWS Account or IAM user accounts.
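The six-digit MFA codes produced by virtual authenticator apps are typically time-based one-time passwords (TOTP, RFC 6238). To illustrate how such a code can be derived from a shared secret and the current time, here is a minimal sketch using only the Python standard library. The function name and the demo secret (the RFC test value) are assumptions for illustration, not anything specific to an AWS account.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, timestamp: float, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1 variant)."""
    # Number of whole time steps since the Unix epoch.
    counter = int(timestamp // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = digest[-1] & 0x0F
    binary = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10 ** digits).zfill(digits)


# Hypothetical demo secret; a real device secret is provisioned when MFA is enabled.
demo_secret = b"12345678901234567890"
print(totp(demo_secret, time.time()))
```

Because the code depends on the current 30-second window, an intercepted code is only useful for a short time, which is what makes it a helpful second factor alongside the password.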
The following table highlights the various AWS credentials and their uses.
|Credential Type|Use|Description|
|---|---|---|
|Passwords|AWS root account or IAM user account login to the AWS Management Console|A string of characters used to log into your AWS account or IAM account. AWS passwords must be a minimum of 6 characters and may be up to 128 characters.|
|Multi-Factor Authentication (MFA)|AWS root account or IAM user account login to the AWS Management Console|A six-digit single-use code that is required in addition to your password to log in to your AWS Account or IAM user account.|
|Access Keys|Digitally signed requests to AWS APIs (using the AWS SDK, CLI, or REST/Query APIs)|Includes an access key ID and a secret access key. You use access keys to digitally sign programmatic requests that you make to AWS.|
|Key Pairs|SSH login to EC2 instances; CloudFront signed URLs|A key pair is required to connect to an EC2 instance launched from a public AMI. The supported lengths are 1024, 2048, and 4096. If you connect using SSH while using the EC2 Instance Connect API, the supported lengths are 2048 and 4096. You can have a key pair generated automatically for you when you launch the instance or you can upload your own.|
|X.509 Certificates|Digitally signed SOAP requests to AWS APIs; SSL server certificates for HTTPS|X.509 certificates are only used to sign SOAP-based requests (currently used only with Amazon S3). You can have AWS create an X.509 certificate and private key that you can download, or you can upload your own certificate by using the Security Credentials page.|
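To make the idea of "digitally signed requests" with access keys a little more concrete: under Signature Version 4, the secret access key is never sent with the request. Instead, a signing key is derived from it through a chain of HMAC-SHA256 operations. The sketch below shows only that derivation step, not a complete request signer; the secret value and function names are placeholders chosen for illustration.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive a Signature Version 4 signing key (date_stamp is YYYYMMDD)."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature for an already-built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder secret for illustration only; never hard-code real keys.
key = derive_signing_key("EXAMPLE-SECRET", "20240101", "us-east-1", "s3")
print(key.hex())
```

In practice the AWS SDKs and CLI perform this signing automatically, so the sketch is mainly useful for understanding why the derived key is scoped to one date, region, and service.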
Individual User Accounts
AWS provides a centralized mechanism called AWS Identity and Access Management (IAM) for creating and managing individual users within an AWS Account. A user can be any individual, system, or application that interacts with AWS resources, either programmatically or through the AWS Management Console or AWS Command Line Interface (CLI). Each user has a unique name within the AWS Account, and a unique set of security credentials not shared with other users. AWS IAM eliminates the need to share passwords or keys, and enables us to minimize the use of the AWS Account credentials.
With IAM, we define policies that control which AWS services the users can access and what they can do with them. We can grant users only the minimum permissions they need to perform their jobs.
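As an illustration of granting only the minimum permissions, the following sketch builds an IAM policy document that allows read-only access to a single S3 bucket. The bucket name, statement IDs, and function name are hypothetical; the document structure (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follows the standard IAM JSON policy format.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build a least-privilege policy: list one bucket and read its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # Bucket-level actions target the bucket ARN itself.
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object-level actions target the objects inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# "example-reports-bucket" is a made-up name used only for this sketch.
print(json.dumps(read_only_bucket_policy("example-reports-bucket"), indent=2))
```

A user with only this policy attached cannot write or delete objects, and cannot touch any other bucket, because IAM denies anything that is not explicitly allowed.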
When you open an AWS account, the identity that you begin with has access to all AWS services and resources in that account. You use this identity to establish less-privileged users and role-based access in the AWS Identity and Access Management (IAM) service. However, this initial account (known as the root user) isn’t intended for everyday tasks, and these credentials should be carefully protected using multi-factor authentication (MFA) and by deleting any access keys upon completion of the initial account setup.
For the root user, you should follow the best practice of only using this login to create another, initial set of IAM users and groups for longer-term identity management operations. These privileged IAM users – carefully monitored and constrained – can be used to assume roles in one or many accounts you own.
For all IAM users, you should apply appropriate policies enforcing the use of strong authentication. You should set a password policy on the AWS account that requires a minimum length and complexity for passwords associated with IAM users. You should also set a mandatory rotation policy requiring IAM users to change their passwords at regular intervals. For all IAM users with passwords permitting access to the AWS Management Console, you should also require the use of MFA.
Network ACLs and Security Groups
The AWS network has been architected to permit us to select the level of security and resiliency appropriate for the workload. To enable us to build geographically dispersed, fault-tolerant web architectures with cloud resources, AWS implements a world-class network infrastructure that is carefully monitored and managed.
Secure Network Architecture
Network devices, including firewall and other boundary devices, are in place to monitor and control communications at the external boundary of the network and at key internal boundaries within the network. These boundary devices employ rule sets, access control lists (ACL), and configurations to enforce the flow of information to specific information system services.
ACLs, or traffic flow policies, are established on each managed interface, which manage and enforce the flow of traffic. ACL policies are approved by Amazon Information Security. These policies are automatically pushed using AWS’s ACL-Manage tool, to help ensure these managed interfaces enforce the most up-to-date ACLs.
A network access control list is an optional layer of security for the VPC that acts as a firewall for controlling traffic in and out of one or more subnets. We could set up network ACLs with rules similar to the security groups in order to add an additional layer of security to the VPC.
AWS Security Groups
A security group acts as a virtual firewall for the EC2 instances to control incoming and outgoing traffic. Inbound rules control the incoming traffic to the instance, and outbound rules control the outgoing traffic from the instance. When we launch an instance, we can specify one or more security groups. If we don’t specify a security group, Amazon EC2 uses the default security group. We can add rules to each security group that allow traffic to or from its associated instances. We can modify the rules for a security group at any time. New and modified rules are automatically applied to all instances that are associated with the security group. When Amazon EC2 decides whether to allow traffic to reach an instance, it evaluates all of the rules from all of the security groups that are associated with the instance.
When we launch an instance in a VPC, we must specify a security group that’s created for that VPC. After we launch an instance, we can change its security groups. Security groups are associated with network interfaces. Changing an instance’s security groups changes the security groups associated with the primary network interface (eth0).
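The evaluation behaviour described above (all rules from all associated security groups are considered, and a single matching allow rule is enough) can be sketched as a small local model. This is a simplified illustration, not how EC2 is implemented; the `IngressRule` type and the example addresses are assumptions made for the sketch.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    protocol: str   # e.g. "tcp"
    from_port: int
    to_port: int
    cidr: str       # allowed source range


def inbound_allowed(security_groups, protocol, port, source_ip) -> bool:
    """Return True if any rule in any associated group allows the traffic.

    Security groups contain allow rules only, so traffic that matches no
    rule is implicitly denied.
    """
    source = ip_address(source_ip)
    for rules in security_groups:
        for rule in rules:
            if (rule.protocol == protocol
                    and rule.from_port <= port <= rule.to_port
                    and source in ip_network(rule.cidr)):
                return True
    return False


# Two hypothetical groups attached to the same instance.
web_sg = [IngressRule("tcp", 443, 443, "0.0.0.0/0")]
admin_sg = [IngressRule("tcp", 22, 22, "203.0.113.0/24")]
print(inbound_allowed([web_sg, admin_sg], "tcp", 22, "198.51.100.7"))
```

In this example HTTPS is open to everyone, while SSH is reachable only from the hypothetical administrative range.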
ACL vs Security Groups
- Network ACLs operate at the subnet level and support both allow and deny rules, configured separately for inbound and outbound traffic, whereas security groups operate at the instance level and support allow rules only, for both inbound and outbound traffic.
- Network ACLs are stateless, and their rules are evaluated in ascending numerical order. Security groups are stateful, and all the rules within a security group are evaluated before deciding whether to allow the traffic.
- Network ACL rules apply to all instances within the associated subnet, whereas security group rules apply only to the instances associated with that security group.
- Configure proper ACLs to prevent unwanted traffic flow to critical resources.
- Enforce ACLs to block direct access from Public IP addresses to the protected subnets.
- When allowing public access to a subnet or instance, avoid allowing inbound access to a wide range of ports unless necessary.
- When outbound traffic from a subnet needs to be blocked, configure the outbound security group or network ACL rules accordingly.
- Enforce strict rules for security groups and subnets that are associated with ELBs or public-facing instances.
- Periodically review the network ACL and security group rules so that any misconfigurations can be identified and corrected.
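The ordered evaluation of network ACL rules, and the kind of periodic review suggested above, can be sketched with a small local model. This is an illustration under simplifying assumptions (a single direction, exact protocol match, IPv4 only); the `AclRule` type, the `max_ports` threshold, and the rule values are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AclRule:
    number: int
    action: str     # "allow" or "deny"
    protocol: str
    from_port: int
    to_port: int
    cidr: str


def evaluate_acl(rules, protocol, port, source_ip) -> str:
    """Apply rules in ascending rule-number order; the first match decides."""
    source = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if (rule.protocol == protocol
                and rule.from_port <= port <= rule.to_port
                and source in ip_network(rule.cidr)):
            return rule.action
    return "deny"  # nothing matched: traffic is denied


def overly_wide_rules(rules, max_ports=100):
    """Flag allow rules open to the whole Internet across a wide port range."""
    return [r for r in rules
            if r.action == "allow"
            and r.cidr == "0.0.0.0/0"
            and (r.to_port - r.from_port + 1) > max_ports]


rules = [
    AclRule(200, "allow", "tcp", 0, 65535, "0.0.0.0/0"),
    AclRule(100, "deny", "tcp", 22, 22, "0.0.0.0/0"),
]
print(evaluate_acl(rules, "tcp", 22, "198.51.100.7"))
print([r.number for r in overly_wide_rules(rules)])
```

Rule 100 is evaluated before rule 200, so SSH is denied even though the later rule would allow it; the review helper flags rule 200 as a candidate for tightening.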
Securing the Infrastructure
Amazon Virtual Private Cloud (VPC)
With Amazon VPC we can create private clouds within the AWS public cloud.
Each customer Amazon VPC uses IP address space, allocated by the customer. We can use private IP addresses (as recommended by RFC 1918) for the Amazon VPCs, building private clouds and associated networks in the cloud that are not directly routable to the Internet.
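When planning VPC address space, it can be useful to check that a candidate CIDR block falls inside the RFC 1918 private ranges. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and sample CIDR blocks are illustrative.

```python
from ipaddress import ip_network

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_BLOCKS = [
    ip_network("10.0.0.0/8"),
    ip_network("172.16.0.0/12"),
    ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """Return True if the whole CIDR block lies within an RFC 1918 range."""
    network = ip_network(cidr)
    if network.version != 4:
        return False
    return any(network.subnet_of(block) for block in RFC1918_BLOCKS)


for candidate in ["10.0.0.0/16", "172.32.0.0/16", "192.168.1.0/24"]:
    print(candidate, is_rfc1918(candidate))
```

Note that `172.32.0.0/16` is just outside the `172.16.0.0/12` block, which is an easy mistake to make when choosing ranges by hand.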
Amazon VPC provides not only isolation from other customers in the private cloud, but also layer 3 (Network Layer IP routing) isolation from the Internet.
We can leverage Amazon VPC-IPSec or VPC-AWS Direct Connect to seamlessly integrate on-premises or other hosted infrastructure with Amazon VPC resources in a secure fashion. With either approach, IPSec connections protect data in transit, while BGP on IPSec or AWS Direct Connect links integrate the Amazon VPC and on-premises routing domains for transparent integration for any application, even applications that don’t support native network security mechanisms.
Security Zoning and Network Segmentation
Different security requirements mandate different security controls. It is a security best practice to segment infrastructure into zones that impose similar security controls.
While most of the AWS underlying infrastructure is managed by AWS operations and security teams, we can build our own overlay infrastructure components. Amazon VPCs, subnets, routing tables, segmented/zoned applications and custom service instances such as user repositories, DNS, and time servers supplement the AWS managed cloud infrastructure.
Usually, network engineering teams interpret segmentation as another infrastructure design component and apply network-centric access control and firewall rules to manage access. Security zoning and network segmentation are two different concepts, however: a network segment simply isolates one network from another, whereas a security zone creates a group of system components with similar security levels and common controls.
Network segments can be built using access control methods such as Amazon VPC, security groups, network ACLs, host-based firewalls, a threat protection layer, and access control at other layers such as applications and services.
Traditional environments require separate network segments representing separate broadcast entities to route traffic via a central security enforcement system such as a firewall. The concept of security groups in the AWS cloud makes this requirement obsolete. Security groups are a logical grouping of instances, and they also allow the enforcement of inbound and outbound traffic rules on these instances regardless of the subnet where these instances reside.
Access Key Pair
Amazon EC2 uses public key cryptography to encrypt and decrypt login information. Public key cryptography uses a public key to encrypt a piece of data, and then the recipient uses the private key to decrypt the data. The public and private keys are known as a key pair. Public key cryptography enables us to securely access the instances using a private key instead of a password.
AWS provides different ways to manage access to resources that require authentication to various AWS services. However, in order to access the operating system on EC2 instances we need a different set of credentials.
In the shared responsibility model, we own the operating system credentials but AWS helps us bootstrap the initial access to the operating system. When we launch a new Amazon EC2 instance from a standard AMI, we can access that instance using secure remote system access protocols, such as Secure Shell (SSH), or Windows Remote Desktop Protocol (RDP). We must successfully authenticate at the operating-system level before we can access and configure an Amazon EC2 instance to our requirements. After we are authenticated and have remote access into the Amazon EC2 instance, we can set up the operating system authentication mechanisms that we want, which might include X.509 certificate authentication, Microsoft Active Directory, or local operating system accounts.
To enable authentication to the EC2 instance, AWS provides asymmetric key pairs, known as Amazon EC2 key pairs. These are industry-standard RSA key pairs. Each user can have multiple Amazon EC2 key pairs, and can launch new instances using different key pairs. EC2 key pairs are not related to the AWS account or IAM user credentials. Those credentials control access to other AWS services; EC2 key pairs control access only to a specific instance.
- Use multiple Availability Zone deployments so you have high availability.
- Use security groups and network ACLs.
- Use IAM policies to control access.
- Use Amazon CloudWatch to monitor your VPC components and VPN connections.
- Use flow logs to capture information about IP traffic going to and from network interfaces in your VPC.
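To give a feel for what flow log data looks like, the sketch below parses records in the default VPC flow log format (space-separated fields, version 2) and picks out rejected traffic. The sample records use made-up account and interface IDs, and the function names are illustrative; custom flow log formats would need a different field list.

```python
# Field order of the default (version 2) flow log record format.
FLOW_LOG_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_log_record(line: str) -> dict:
    """Split one default-format flow log line into a field dictionary."""
    values = line.split()
    if len(values) != len(FLOW_LOG_FIELDS):
        raise ValueError(f"expected {len(FLOW_LOG_FIELDS)} fields, got {len(values)}")
    return dict(zip(FLOW_LOG_FIELDS, values))


def rejected_records(lines):
    """Return the parsed records whose action is REJECT."""
    records = (parse_flow_log_record(line) for line in lines)
    return [r for r in records if r["action"] == "REJECT"]


# Sample records with placeholder IDs and addresses.
sample = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]
for record in rejected_records(sample):
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```

Rejected connection attempts to ports such as 22 or 3389 are a common starting point when reviewing whether security group and network ACL rules behave as intended.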
Assess your network segmentation and security zoning with the following questions:
- Do I control inter-zone communication? Can I use network segmentation tools to manage communications between security zones A and B? Usually access control elements such as security groups, ACLs, and network firewalls should build the walls between security zones. By default, Amazon VPCs build inter-zone isolation walls.
- Can I monitor inter-zone communication using an IDS/IPS/DLP/SIEM/NBAD system, depending on business requirements? Blocking access and managing access are different terms. The porous communication between security zones mandates sophisticated security monitoring tools between zones. The horizontal scalability of AWS instances makes it possible to zone each instance at the operating systems level and leverage host-based security monitoring agents.
- Can I apply per zone access control rights? One of the benefits of zoning is controlling egress access. It is technically possible to control access through resource policies, such as those for Amazon S3 and Amazon SMS.
- Can I manage each zone using dedicated management channel/roles? Role-Based Access Control for privileged access is a common requirement. You can use IAM to create groups and roles on AWS to create different privilege levels. You can also mimic the same approach with application and system users. One of the new key features of Amazon VPC–based networks is support for multiple elastic network interfaces. Security engineers can create a management overlay network using dual homed instances.
- Can I apply per zone confidentiality and integrity rules? Per zone encryption, data classification, and DRM simply increase the overall security posture. If the security requirements are different per security zone, then the data security requirements must be different as well. And it is always a good policy to use different encryption options with rotating keys on each security zone.
- Use separate SSH keys for each user, and consider generating your own key pair instead of using an AWS-generated key pair. This way, only your public key is shared with AWS.
- Do not share the private key, and prevent unauthorized access to it by encrypting it.
- Restrict SSH sudo permissions to specific users.
- Change the default SSH port.
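One simple, complementary check for protecting a private key is to verify that the key file is not readable by other local users. The sketch below inspects POSIX file permissions with the standard library; it assumes a Linux or macOS system, and the key path shown is hypothetical.

```python
import os
import stat


def private_key_permissions_ok(path: str) -> bool:
    """Return True if the file grants no permissions to group or others.

    This mirrors the usual expectation that a private key is readable
    only by its owner (for example, mode 0600 or 0400).
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Hypothetical key location; adjust to wherever the key is actually stored.
key_path = os.path.expanduser("~/.ssh/example-ec2-key.pem")
if os.path.exists(key_path):
    print(key_path, "ok" if private_key_permissions_ok(key_path) else "too permissive")
```

File permissions do not replace encrypting the key with a passphrase; they only reduce the chance of accidental exposure on a shared machine.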
We can use detective controls to identify a potential security threat or incident. They are an essential part of governance frameworks and can be used to support a quality process, a legal or compliance obligation, and for threat identification and response efforts. There are different types of detective controls. For example, conducting an inventory of assets and their detailed attributes promotes more effective decision making (and lifecycle controls) to help establish operational baselines. We can also use internal auditing, an examination of controls related to information systems, to ensure that practices meet policies and requirements and that we have set the correct automated alerting notifications based on defined conditions. These controls are important reactive factors that can help the organization identify and understand the scope of anomalous activity.
In AWS, there are a number of approaches to consider when addressing detective controls. The following sections describe how to use these approaches:
- Capture and analyze logs
- Integrate auditing controls with notification and workflow
Capture and Analyze Logs
In traditional data center architectures, aggregating and analyzing logs typically requires installing agents on servers, carefully configuring network appliances to direct log messages at collection points, and forwarding application logs to search and rules engines. Aggregation in the cloud is much easier due to two capabilities.
First, asset management is easier because assets and instances can be described programmatically without depending on agent health. For example, instead of manually updating an asset database and reconciling it with the real install base, we can reliably gather asset metadata with just a few API calls. This data is far more accurate and timelier than using discovery scans, manual entries into a configuration management database (CMDB), or relying on agents that might stop reporting on their state.
Second, we can use native, API-driven services to collect, filter, and analyze logs instead of maintaining and scaling the logging backend ourselves. Pointing our logs at a bucket in an object store, or directing events to a real-time log processing service, means that we can spend less time on capacity planning and availability of the logging and search architecture.
In AWS, a best practice is to customize the delivery of AWS CloudTrail and other service-specific logging to capture API activity globally and centralize the data for storage and analysis. We can direct CloudTrail logs to Amazon CloudWatch Logs or other endpoints so that we can obtain events in a consistent format across compute, storage, and applications.
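As a small example of analyzing centralized CloudTrail data, the following sketch scans the JSON content of a CloudTrail log file for API activity performed by the root user. It relies on the `Records` list and the `userIdentity.type`, `eventName`, `eventTime`, and `sourceIPAddress` fields of CloudTrail records; the sample event values are fabricated placeholders, and the function name is illustrative.

```python
import json


def root_user_events(log_text: str) -> list:
    """Return a summary of events in a CloudTrail log made by the root user."""
    records = json.loads(log_text).get("Records", [])
    return [
        {
            "eventTime": record.get("eventTime"),
            "eventName": record.get("eventName"),
            "sourceIPAddress": record.get("sourceIPAddress"),
        }
        for record in records
        if record.get("userIdentity", {}).get("type") == "Root"
    ]


# Placeholder log content shaped like a CloudTrail log file.
sample_log = json.dumps({
    "Records": [
        {"eventTime": "2024-01-01T10:00:00Z", "eventName": "ConsoleLogin",
         "sourceIPAddress": "198.51.100.7", "userIdentity": {"type": "Root"}},
        {"eventTime": "2024-01-01T10:05:00Z", "eventName": "DescribeInstances",
         "sourceIPAddress": "203.0.113.10", "userIdentity": {"type": "IAMUser"}},
    ]
})
for event in root_user_events(sample_log):
    print(event)
```

Because the root user is not intended for everyday tasks, any hit from a check like this is usually worth a closer look.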
For instance-based and application-based logging that doesn’t originate from AWS services, we can still use agent-based tools to collect and route events. We can use Amazon CloudWatch Logs to monitor, store, and access the log files from Amazon EC2 instances, AWS CloudTrail, Amazon Route 53, and other sources. Using services and features such as AWS CloudFormation, AWS Systems Manager, or Amazon EC2 user data, system administrators can ensure that instances always have agents installed.
Integrate Auditing Controls with Notification and Workflow
Security operations teams rely on the collection of logs and the use of search tools to discover potential events of interest, which may indicate unauthorized activity or unintentional change. However, simply analyzing collected data and manually processing information is insufficient to keep up with the volume of information flowing from modern, complex architectures. Analysis and reporting alone don’t facilitate the assignment of the right resources to work an event in a timely fashion. A best practice for building a mature security operations team is to deeply integrate the flow of security events and findings into a notification and workflow system such as a ticketing system, a bug/issue system, or other security information and event management (SIEM) system. This takes the workflow out of email and static reports, allowing us to route, escalate, and manage events or findings. Many organizations are now integrating security alerts into their chat/collaboration and developer productivity platforms.
This best practice applies not only to security events generated from log messages depicting user activity or network events, but from changes detected in the infrastructure itself. The ability to detect change, determine whether a change was appropriate, and then route this information to the correct remediation workflow is essential in maintaining and validating a secure architecture.
In AWS, routing events of interest and information reflecting potentially unwanted changes into a proper workflow is done using Amazon CloudWatch Events. This service provides a scalable rules engine designed to broker both native AWS event formats (such as CloudTrail events) and custom events that we generate ourselves. We build rules that parse events, transform them if necessary, and then route such events to targets such as an AWS Lambda function, Amazon Simple Notification Service (Amazon SNS) notification, or other targets.
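A CloudWatch Events rule selects events with a JSON event pattern. The sketch below shows a pattern that would match security group ingress changes recorded by CloudTrail, together with a deliberately simplified local matcher for reasoning about what such a pattern selects. The matcher handles only exact-value lists and nested objects; it is an assumption-laden approximation and does not cover the full pattern syntax supported by the service.

```python
# Pattern for security group ingress changes delivered via CloudTrail.
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["AuthorizeSecurityGroupIngress"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must be present in the event,
    and the event value must be one of the listed values (or, for nested
    objects, match recursively)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Placeholder event shaped like a CloudTrail-sourced event.
event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "ec2.amazonaws.com",
               "eventName": "AuthorizeSecurityGroupIngress"},
}
print(matches(PATTERN, event))
```

A rule with a pattern like this could then route matching events to a Lambda function or an SNS topic for review.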
Detecting change and routing this information to the correct workflow can be accomplished using AWS Config rules. AWS Config detects changes to in-scope services and generates events that can be parsed using AWS Config rules for rollback, enforcement of compliance policy, and forwarding of information to systems, such as change management platforms and operational ticketing systems.
Reducing the number of security misconfigurations introduced into a production environment is critical, so the more quality control and reduction of defects that we perform in the build process, the better it is. Modern continuous integration and continuous deployment (CI/CD) pipelines should be designed to test for security issues whenever possible. Using Amazon Inspector, we can perform configuration assessments for known common vulnerabilities and exposures (CVEs), assess the instances against security benchmarks, and fully automate the notification of defects. Amazon Inspector runs on production instances or in a build pipeline, and it notifies developers and engineers when findings are present. We can access findings programmatically and direct the team to backlogs and bug-tracking systems.
Because logging and change management play an important role in detecting abnormal patterns, threats, and misconfigurations, logging must be enabled for network activity, endpoints, and configuration changes for detective controls to function as expected.
We can use AWS services for logging, event management, and change management, or use a third-party SIEM such as Splunk.
Even with extremely mature preventive and detective controls, any organization should still put processes in place to respond to and mitigate the potential impact of security incidents. The architecture of the workload strongly affects the ability of the teams to operate effectively during an incident, to isolate or contain systems, and to restore operations to a known good state. Putting in place the tools and access ahead of a security incident, then routinely practicing incident response through game days, will help us ensure that the architecture can accommodate timely investigation and recovery.
In AWS, there are a number of different approaches to consider when addressing incident response. The Clean Room section describes how to use these approaches.
In every incident, maintaining situational awareness is one of the most important principles. By using tags to properly describe the AWS resources, incident responders can quickly determine the potential impact of an incident. For example, tagging instances and other resources with an owner or work queue in a ticketing system allows the team to engage the right people more quickly. By tagging systems with a data classification or a criticality attribute, the impact of an incident can be estimated more accurately.
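A tagging convention is only helpful during an incident if it is applied consistently. The sketch below checks resource descriptions for missing tags. The required tag keys and the `ResourceId` field are assumptions chosen for this example; the `Tags` list of `Key`/`Value` pairs follows the shape commonly returned by AWS APIs.

```python
# Hypothetical tagging standard for this sketch.
REQUIRED_TAG_KEYS = {"Owner", "DataClassification", "Criticality"}


def missing_tags(resource: dict) -> set:
    """Return the required tag keys that are absent from a resource."""
    present = {tag["Key"] for tag in resource.get("Tags", [])}
    return REQUIRED_TAG_KEYS - present


def untagged_report(resources: list) -> dict:
    """Map each non-compliant resource ID to its sorted missing tag keys."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            report[resource["ResourceId"]] = sorted(missing)
    return report


# Placeholder resource descriptions.
resources = [
    {"ResourceId": "i-0123456789abcdef0",
     "Tags": [{"Key": "Owner", "Value": "payments-team"},
              {"Key": "DataClassification", "Value": "confidential"},
              {"Key": "Criticality", "Value": "high"}]},
    {"ResourceId": "i-0fedcba9876543210",
     "Tags": [{"Key": "Owner", "Value": "web-team"}]},
]
print(untagged_report(resources))
```

Running a check like this regularly means responders are less likely to discover missing ownership or classification information in the middle of an incident.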
During an incident, the right people require access to isolate and contain the incident, and then perform forensic investigation to identify the root cause quickly. In some cases, the incident response team is actively involved in remediation and recovery as well. Working out how to grant access to the right people in the middle of an incident delays the response, and can introduce other security weaknesses if access is shared or not properly provisioned while under pressure. Determine the access that the team members need ahead of time, and then regularly verify that the access is functional, or easily triggered, when needed.
In AWS we can use the power of the APIs to automate many of the routine tasks that need to be performed during an incident and subsequent investigations. For example, we can isolate an instance by changing the security groups associated with an instance or removing it from a load balancer. Architecting the workload using Auto Scaling potentially allows the instance under investigation to be removed from production without affecting the availability of the applications.
Forensics often requires capturing the disk image or “as-is” configuration of an operating system. We can use EBS snapshots and the Amazon EC2 APIs to capture the data and state of systems under investigation. Storing snapshots and related incident artifacts in Amazon S3 ensures that the data will be available and retained as appropriate.
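The isolation and capture steps described above could be automated along the following lines. This is a sketch, not a complete runbook: it assumes `ec2_client` is a boto3 EC2 client (for example, created with `boto3.client("ec2")`), that a restrictive "quarantine" security group already exists, and that the volume IDs are known. The function name, parameter names, and `case_id` are hypothetical. boto3 is not imported here so that the function can be exercised with a stand-in client.

```python
def isolate_and_snapshot(ec2_client, instance_id, quarantine_sg_id, volume_ids, case_id):
    """Isolate an instance and snapshot its volumes for later analysis.

    ec2_client is assumed to behave like a boto3 EC2 client.
    """
    # Replace all security groups with a single restrictive group so the
    # instance can no longer talk to the rest of the environment.
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        Groups=[quarantine_sg_id],
    )

    # Capture each attached volume for forensic analysis.
    snapshot_ids = []
    for volume_id in volume_ids:
        response = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"Incident {case_id}: {instance_id} {volume_id}",
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids


# Example call (requires boto3 and suitable IAM permissions):
#   import boto3
#   isolate_and_snapshot(boto3.client("ec2"), "i-0123456789abcdef0",
#                        "sg-0123456789abcdef0", ["vol-0123456789abcdef0"], "IR-42")
```

Preparing and testing a function like this ahead of time, with the required permissions already granted to the response team, avoids improvising API calls under pressure.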
During an incident, before the root cause has been identified and the incident has been contained, it can be difficult to conduct investigations in an untrusted environment. Unique to AWS, security practitioners can use CloudFormation to quickly create a new, trusted environment in which to conduct deeper investigation. The CloudFormation template can pre-configure instances in an isolated environment that contains all the necessary tools forensic teams need to determine the cause of the incident. This cuts down on the time it takes to gather necessary tools, isolate systems under examination, and ensures that the team is operating in a clean room.
Several key AWS services and features are critical to a mature incident response process:
- IAM should be used to grant appropriate authorization to incident response teams in advance.
- AWS CloudFormation to automate the creation of trusted environments for conducting deeper investigations.
- AWS CloudTrail provides a history of AWS API calls that can assist in response, and trigger automated detection and response systems.
- Amazon CloudWatch Events to trigger different automated actions from changes in AWS resources including CloudTrail.
- AWS Step Functions to coordinate a sequence of steps to automate an incident response process.
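To illustrate how a sequence of response steps might be coordinated with Step Functions, the sketch below builds a state machine definition in the Amazon States Language as a Python dictionary and performs a basic sanity check on its transitions. The state names and the Lambda function ARNs are hypothetical placeholders; a real workflow would likely add error handling and approval steps.

```python
import json


def incident_response_definition(lambda_arns: dict) -> dict:
    """Build a three-step Amazon States Language definition."""
    return {
        "Comment": "Sketch of an automated incident response sequence",
        "StartAt": "IsolateInstance",
        "States": {
            "IsolateInstance": {
                "Type": "Task",
                "Resource": lambda_arns["isolate"],
                "Next": "SnapshotVolumes",
            },
            "SnapshotVolumes": {
                "Type": "Task",
                "Resource": lambda_arns["snapshot"],
                "Next": "NotifyResponders",
            },
            "NotifyResponders": {
                "Type": "Task",
                "Resource": lambda_arns["notify"],
                "End": True,
            },
        },
    }


def dangling_transitions(definition: dict) -> list:
    """Return StartAt/Next targets that do not name a defined state."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    targets += [state["Next"] for state in states.values() if "Next" in state]
    return [target for target in targets if target not in states]


# Placeholder ARNs for illustration only.
arns = {
    "isolate": "arn:aws:lambda:us-east-1:123456789012:function:example-isolate",
    "snapshot": "arn:aws:lambda:us-east-1:123456789012:function:example-snapshot",
    "notify": "arn:aws:lambda:us-east-1:123456789012:function:example-notify",
}
definition = incident_response_definition(arns)
print(json.dumps(definition, indent=2))
print(dangling_transitions(definition))
```

Keeping the definition in code makes it easy to review and version alongside the rest of the incident response tooling.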
AWS has built-in tools for security management and response, and it also allows integration of third-party security tools. The topics introduced in this document do not cover every procedure or step needed to stay secure, so always remember that security is not a one-time task.
Please refer to the AWS documentation for more information regarding all the available services and controls.