Part 3: Data protection in AWS
This is the third in a five-part blog series that provides a checklist for proactive security and forensic readiness in the AWS cloud environment. This post relates to protecting data within AWS.
Data protection has become all the rage for organisations that are processing personal data of individuals in the EU, because the EU General Data Protection Regulation (GDPR) deadline is fast approaching.
AWS is no exception. The company is providing customers with services and resources to help them comply with GDPR requirements that may apply to their operations. These include granular data access controls, monitoring and logging tools, encryption, key management, audit capability and adherence to IT security standards (for more information, see the AWS General Data Protection Regulation (GDPR) Center and the Navigating GDPR Compliance on AWS whitepaper). In addition, AWS has published several privacy-related whitepapers, including country-specific ones. The whitepaper Using AWS in the Context of Common Privacy & Data Protection Considerations focuses on typical questions asked by AWS customers when considering privacy and data protection requirements relevant to their use of AWS services to store or process content containing personal data.
This blog, however, is not just about protecting personal data. The following list provides guidance on protecting any information stored in AWS that is valuable to your organisation. The checklist mainly focuses on protection of data (at rest and in transit), protection of encryption keys, removal of sensitive data from AMIs, and understanding data access requests in AWS.
The checklist provides best practice for the following:
- How are you protecting data at rest?
- How are you protecting data at rest on Amazon S3?
- How are you protecting data at rest on Amazon EBS?
- How are you protecting data at rest on Amazon RDS?
- How are you protecting data at rest on Amazon Glacier?
- How are you protecting data at rest on Amazon DynamoDB?
- How are you protecting data at rest on Amazon EMR?
- How are you protecting data in transit?
- How are you managing and protecting your encryption keys?
- How are you ensuring custom Amazon Machine Images (AMIs) are secure and free of sensitive data before publishing for internal (private) or external (public) use?
- Do you understand who has the right to access your data stored in AWS?
IMPORTANT NOTE: Identity and access management is an integral part of protecting data; however, you’ll notice that the following checklist does not focus on AWS IAM. I have created a separate checklist on IAM best practices here.
1. How are you protecting data at rest?
- Define policies for data classification, access control, retention and deletion
- Tag information assets stored in AWS based on adopted classification scheme
- Determine where your data will be located by selecting a suitable AWS region
- Use geo restriction (or geoblocking) to prevent users in specific geographic locations from accessing content that you are distributing through a CloudFront web distribution
- Control the format, structure and security of your data by masking, anonymising or encrypting it in accordance with its classification
- Encrypt data at rest using server-side or client-side encryption
- Manage other access controls, such as identity, access management, permissions and security credentials
- Restrict access to data using IAM policies, resource policies and capability policies
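As a rough illustration of tying access control to a classification scheme, the sketch below builds an IAM policy document that allows reads only on S3 objects carrying a particular tag. The bucket name, the `classification` tag key and the `internal` label are assumptions made for the example and should be replaced with whatever scheme you have adopted.

```python
import json


def classification_read_policy(bucket: str, allowed_label: str) -> dict:
    """Build an IAM policy allowing reads only on objects tagged with allowed_label."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnlyTaggedObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    # Matches the tag already attached to the stored object.
                    "StringEquals": {
                        "s3:ExistingObjectTag/classification": allowed_label
                    }
                },
            }
        ],
    }


# Hypothetical bucket name and label, for illustration only.
policy = classification_read_policy("example-bucket", "internal")
print(json.dumps(policy, indent=2))
```

The code only produces the JSON document; attaching it to a user, group or role is left to whichever tooling you already use.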
2. How are you protecting data at rest on Amazon S3?
- Use bucket-level or object-level permissions alongside IAM policies
- Don’t create any publicly accessible S3 buckets. Instead, create pre-signed URLs to grant time-limited permission to download the objects
- Protect sensitive data by encrypting data at rest in S3. Amazon S3 supports both server-side encryption and client-side encryption of user data; with client-side encryption, you create and manage your own encryption keys
- Encrypt inbound and outbound S3 data traffic
- Amazon S3 supports data replication and versioning instead of automatic backups. Implement S3 Versioning and S3 Lifecycle Policies
- Automate the lifecycle of your S3 objects with rule-based actions
- Enable MFA Delete on S3 bucket
- Be familiar with the durability and availability options for different S3 storage types – S3, S3-IA and S3-RR.
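Two of the points above, encrypting S3 traffic and encrypting data at rest, can be backed up with a bucket policy. The following is a minimal sketch that builds such a policy as a Python dictionary; the bucket name is a placeholder, and the second statement assumes uploaders send the server-side encryption header explicitly, which may not suit buckets that rely on default encryption.

```python
import json


def hardened_bucket_policy(bucket: str) -> dict:
    """Build a bucket policy that rejects plain-HTTP requests and unencrypted uploads."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Deny any request that did not arrive over SSL/TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Deny uploads that omit the server-side encryption header.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            },
        ],
    }


# Hypothetical bucket name, for illustration only.
bucket_policy = hardened_bucket_policy("example-bucket")
print(json.dumps(bucket_policy, indent=2))
```

Test a policy like this on a non-production bucket first, since a Deny statement applies to every principal, including administrators.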
3. How are you protecting data at rest on Amazon EBS?
- AWS creates two copies of your EBS volume for redundancy. However, since both copies are in the same Availability Zone, replicate data at the application level, and/or create backups using EBS snapshots
- On Windows Server 2008 and later, use BitLocker encryption to protect sensitive data stored on system or data partitions (this needs to be configured with a password as Amazon EC2 does not support Trusted Platform Module (TPM) to store keys)
- On Windows Server, implement Encrypting File System (EFS) to further protect sensitive data stored on system or data partitions
- On Linux instances running kernel versions 2.6 and later, you can use dm-crypt and Linux Unified Key Setup (LUKS) for key management
- Use third-party encryption tools
4. How are you protecting data at rest on Amazon RDS?
(Note: Amazon RDS leverages the same secure infrastructure as Amazon EC2. You can use the Amazon RDS service without additional protection, but encrypting data at the application layer is recommended)
- Use a built-in encryption function that encrypts all sensitive database fields, using an application key, before storing them in the database
- Use platform level encryption
- Use MySQL cryptographic functions – encryption, hashing, and compression
- Use Microsoft Transact-SQL cryptographic functions – encryption, signing, and hashing
- Use Oracle Transparent Data Encryption on Amazon RDS for Oracle Enterprise Edition under the Bring Your Own License (BYOL) model
5. How are you protecting data at rest on Amazon Glacier?
(Note: Data stored on Amazon Glacier is protected using server-side encryption. AWS generates separate unique encryption keys for each Amazon Glacier archive, and encrypts it using AES-256)
- Encrypt data prior to uploading it to Amazon Glacier for added protection
6. How are you protecting data at rest on Amazon DynamoDB?
(Note: DynamoDB is a shared service from AWS and can be used without added protection, but you can implement a data encryption layer over the standard DynamoDB service)
- Use raw binary fields or Base64-encoded string fields when storing encrypted fields in DynamoDB
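To show what the Base64 option might look like in practice, here is a small sketch that wraps already-encrypted bytes in a string attribute and recovers them again. The attribute names `record_id` and `payload_b64` are invented for the example, and `os.urandom` merely stands in for ciphertext produced by whichever encryption library you use.

```python
import base64
import os


def to_dynamodb_item(record_id: str, ciphertext: bytes) -> dict:
    """Wrap ciphertext as a Base64 string attribute in DynamoDB's attribute-value form."""
    encoded = base64.b64encode(ciphertext).decode("ascii")
    return {
        "record_id": {"S": record_id},
        "payload_b64": {"S": encoded},
    }


def ciphertext_from_item(item: dict) -> bytes:
    """Recover the original ciphertext bytes from a stored item."""
    return base64.b64decode(item["payload_b64"]["S"])


# Random bytes stand in for real ciphertext; no encryption happens in this sketch.
ciphertext = os.urandom(32)
item = to_dynamodb_item("order-1001", ciphertext)
restored = ciphertext_from_item(item)
print(restored == ciphertext)
```

Base64 makes the stored value roughly a third larger than the raw bytes, which is worth bearing in mind against DynamoDB's item size limit; a binary attribute avoids that overhead.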
7. How are you protecting data at rest on Amazon EMR?
- Store data permanently on Amazon S3 only, and do not copy to HDFS at all. Apply server-side or client-side encryption to data in Amazon S3
- Protect the integrity of individual fields or entire files (for example, by using HMAC-SHA1) at the application level while you store data in Amazon S3 or DynamoDB
- Or, employ a combination of Amazon S3 server-side encryption and client-side encryption, as well as application-level encryption
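The integrity check mentioned above can be sketched with Python's standard `hmac` module. Note that this example swaps in HMAC-SHA256 where the checklist gives HMAC-SHA1 as its example; the function names are illustrative, and the randomly generated key is a stand-in for one obtained from your key management system.

```python
import hashlib
import hmac
import os


def sign(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag to store alongside the field or file."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(data, key), expected_tag)


# Demo key only; in practice the key would come from a key management system.
demo_key = os.urandom(32)
record = b"customer_id=42;balance=100.00"
tag = sign(record, demo_key)

print(verify(record, demo_key, tag))
print(verify(b"customer_id=42;balance=999.00", demo_key, tag))
```

Store the tag next to the data (for example, as S3 object metadata or an extra DynamoDB attribute) and verify it on every read.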
8. How are you protecting data in transit?
- Encrypt data in transit using IPSec ESP and/or SSL/TLS
- Encrypt all non-console administrative access with strong cryptographic mechanisms such as SSH, user and site-to-site IPSec VPNs, or SSL/TLS to further secure remote system management
- Authenticate data integrity using IPSec ESP/AH, and/or SSL/TLS
- Authenticate remote end using IPSec with IKE with pre-shared keys or X.509 certificates
- Authenticate remote end using SSL/TLS with server certificate authentication based on the server Common Name (CN) or Subject Alternative Name (SAN)
- Offload HTTPS processing on Elastic Load Balancing to minimise impact on web servers
- Protect the backend connection to instances using an application protocol such as HTTPS
- On Windows servers, use X.509 certificates for authentication
- On Linux servers, use SSH version 2 and use non-privileged user accounts for authentication
- Use HTTP over SSL/TLS (HTTPS) for connecting to RDS and DynamoDB over the internet
- Use SSH for access to the Amazon EMR master node
- Use SSH for clients or applications to access Amazon EMR clusters across the internet using scripts
- Use SSL/TLS for Thrift, REST, or Avro
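On the client side, server certificate authentication over SSL/TLS can be sketched with Python's standard `ssl` module. The TLS 1.2 floor is an assumption made for this example, not a requirement from the checklist, so adjust it to your own policy.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Create a client context that validates the server certificate and name."""
    # The default context verifies the certificate chain and checks the hostname.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2 (assumed policy for this sketch).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = strict_tls_context()
print(context.check_hostname, context.verify_mode, context.minimum_version)
```

The resulting context can be passed to standard-library clients, for example `http.client.HTTPSConnection(host, context=context)`.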
9. How are you managing and protecting your encryption keys?
- Define a key rotation policy
- Do not hard code keys in scripts and applications
- Securely manage keys on the server side (SSE-S3, SSE-KMS), supply your own keys to server-side encryption (SSE-C), or manage keys on the client side using client-side encryption
- Use tamper-proof storage, such as Hardware Security Modules (AWS CloudHSM)
- Use a key management solution from the AWS Marketplace or from an APN Partner (e.g., SafeNet, Trend Micro)
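Two of the points above, avoiding hard-coded keys and enforcing a rotation policy, lend themselves to a short sketch. The environment variable name `APP_DATA_KEY`, the hex encoding and the 365-day limit are all assumptions made for illustration; a managed key service would normally take the place of the environment variable.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Optional


def load_key_from_env(var_name: str = "APP_DATA_KEY") -> bytes:
    """Read a hex-encoded key from the environment instead of the source code."""
    value = os.environ.get(var_name)
    if not value:
        # Fail loudly rather than fall back to a default baked into the code.
        raise RuntimeError(f"{var_name} is not set; refusing to use a built-in key")
    return bytes.fromhex(value)


def needs_rotation(created_at: datetime, max_age_days: int = 365,
                   now: Optional[datetime] = None) -> bool:
    """Return True when a key is older than the rotation policy allows.

    created_at must be timezone-aware so it can be compared with UTC time.
    """
    now = now or datetime.now(timezone.utc)
    return now - created_at > timedelta(days=max_age_days)


# Example check with fixed dates so the result does not depend on today's date.
created = datetime(2020, 1, 1, tzinfo=timezone.utc)
checked = datetime(2021, 6, 1, tzinfo=timezone.utc)
print(needs_rotation(created, max_age_days=365, now=checked))
```

A check like `needs_rotation` could run on a schedule and raise an alert, leaving the rotation itself to your key management tooling.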
10. How are you ensuring custom Amazon Machine Images (AMIs) are secure and free of sensitive data before publishing for internal (private) or external (public) use?
- Securely delete all sensitive data including AWS credentials, third-party credentials and certificates or keys from disk and configuration files
- Delete log files containing sensitive information
- Delete all shell history on Linux
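Before creating an image, it can help to list likely leftovers so nothing is missed. The sketch below walks a directory tree and reports files whose names match a handful of patterns. The pattern list is illustrative and far from complete, and the script only reports candidates; it does not delete anything, securely or otherwise.

```python
import fnmatch
import os
import tempfile
from typing import List

# Illustrative patterns only; extend them for your own images.
SENSITIVE_PATTERNS = [
    ".bash_history", ".zsh_history",                      # shell history
    "authorized_keys", "id_rsa", "id_ed25519", "*.pem",   # SSH keys and certificates
    "credentials",                                        # e.g. ~/.aws/credentials
]


def find_sensitive_files(root: str) -> List[str]:
    """Return paths under root whose file name matches a sensitive pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in SENSITIVE_PATTERNS):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Demo against a throwaway tree standing in for an instance's filesystem.
with tempfile.TemporaryDirectory() as root:
    home = os.path.join(root, "home", "user")
    os.makedirs(os.path.join(home, ".ssh"))
    for relative in (".bash_history", os.path.join(".ssh", "authorized_keys"), "notes.txt"):
        with open(os.path.join(home, relative), "w") as handle:
            handle.write("placeholder")
    found = [os.path.relpath(path, root) for path in find_sensitive_files(root)]

print(found)
```

On a real instance you would point `find_sensitive_files` at `/` (or at `/home`, `/root` and `/etc`) and review the results before removing anything.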
11. Do you understand who has the right to access your data stored in AWS?
- Understand the laws applicable to your business and operations, and consider whether laws in other jurisdictions may apply
- Understand that relevant government bodies may have rights to issue requests for content. Each relevant law will contain criteria that must be satisfied for the relevant law enforcement body to make a valid request
- Understand that AWS notifies customers where practicable before disclosing their data so they can seek protection from disclosure, unless AWS is legally prohibited from doing so or there is clear indication of illegal conduct regarding the use of AWS services. For additional information, visit the Amazon Information Requests Portal.
For more details, refer to the following AWS resources:
Go back to the introduction AWS Cloud: Proactive Security & Forensic Readiness five-part best practice
Read Part 1 – Identity and Access management in AWS: best-practice checklist
Read Part 2 – Infrastructure level protection in AWS: best-practice checklist
Read Part 3 – Data protection in AWS: best-practice checklist
Read Part 4 – Detective Controls in AWS: best-practice checklist
Read Part 5 – Incident Response in AWS: best-practice checklist
Let us know in the comments below if we have missed anything in our checklist.
DISCLAIMER: Please be mindful that this is not an exhaustive list. Given the pace of innovation and development within AWS, there may be features being rolled out as these blogs were being written. Also, please note that this checklist is for guidance purposes only. For more information, or to request an in-depth security review of your cloud environment, please contact us.
Neha Thethi is a senior information security analyst at BH Consulting. She is an AWS Certified Solutions Architect – Associate and holder of the SANS GIAC Certified Incident Handler (GCIH). Neha has published papers, spoken at conferences, written blogs and delivered webinars about challenges of conducting forensics in the cloud environment. She has helped clients develop incident response plans and conducted several digital forensic investigations for cloud environments including AWS and Microsoft Azure.
Editor: Gordon Smith