RubyGems - oak - Versions diffs - 0.4.1 → 0.4.2 - Mend

oak 0.4.1 → 0.4.2


Files changed (13)

data/ENCRYPTION.md ADDED

@@ -0,0 +1,893 @@
+# OAK: Encryption-in-OAK
+OAK is a serialization and envelope format which encodes simple Ruby
+objects as strings.  It bundles together a variety of well-understood
+encoding libraries into a succinct self-describing package.
+OAK v3 was first described in [OAK: The Object ArKive](DESIGN.md).
+Since 2017-09-13, OAK has been used by ALI for volatile caches in
+Redis, and for durable Correspondence bodies in S3.
+In Q4 2017 I evaluated, then set aside, the possibility of adding
+encryption features to OAK.  The motive then was to encrypt our
+volatile caches in Redis, for which our hosting provider offers no
+encryption-at-rest.  This plan was eventually scrapped because we
+decided we didn't need it and because I learned enough about modern
+encryption to see that my plan was off track.
+In Q3 2018 I am updating and simplifying that plan to support
+encryption of secrets.
+Author: [JHW](https://github.com/jhwillett)
+Advisors: Isaac, Rishav, Chris L.
+Things get tricky with symmetric encryption. The _identity_ of our
+encryption keys must be communicated from `OAK.encode` to
+`OAK.decode`, but they cannot be explicitly present in the OAK
+string itself.
+Absent encryption, `OAK.decode` is a nice unary pure function on OAK
+strings.  But to support decryption, `OAK.decode` cannot _only_ look
+at the OAK string to effect a decode.  It must also have a
+side-channel for secrets.  It degenerates to a binary function which
+also must be passed a table of available encryption keys.
+In anticipation of key migration, OAK works with a dictionary of
+multiple named keys.  `OAK.encode` records the name of the encryption
+key (if any) in the OAK string, and `OAK.decode` uses that name to
+select the proper secret from the keychain table.
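As a rough illustration of that lookup (not OAK's actual implementation), the sketch below pulls the key name out of an OAK_4 header and selects the secret from an in-memory keychain. The names `KEYCHAIN`, `key_name_of`, and `select_key` are hypothetical, and the regex assumes key names never contain an underscore.

```ruby
require 'securerandom'

# Hypothetical keychain: key name => 32-byte secret. Real secrets would come
# from configuration, never from the OAK string itself.
KEYCHAIN = {
  'foo' => SecureRandom.random_bytes(32),
  'bar' => SecureRandom.random_bytes(32),
}.freeze

# Extract the key name from an OAK_4 header, or nil when no key is named.
# Assumes the name sits between "oak_4" and the next underscore.
def key_name_of(oak_string)
  match = /\Aoak_4([^_]*)_/.match(oak_string)
  raise ArgumentError, 'not an OAK_4 string' unless match
  match[1].empty? ? nil : match[1]
end

# Decoding needs both the string and the keychain: a binary function.
def select_key(keychain, oak_string)
  name = key_name_of(oak_string)
  name && keychain.fetch(name) # raises KeyError if the named key is unknown
end
```

Because only the name travels in the string, rotating keys amounts to adding a new entry to the keychain while old entries stay available for decoding.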
+Furthermore, sound encryption practice with streaming modes demands we
+include random noise at the start of each encrypted stream.  Hence,
+`OAK.encode` degrades from a pure function to be nondeterministic.
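To make the nondeterminism concrete, here is a minimal sketch using Ruby's `OpenSSL::Cipher` with AES-256-GCM. The helper `encrypt_once` and its IV + tag + ciphertext output layout are assumptions for illustration only, not the OAK_4 wire format.

```ruby
require 'openssl'

# Encrypt with AES-256-GCM. With no IV given, a fresh random 12-byte IV is
# drawn per call; passing an IV is only for demonstrating determinism.
def encrypt_once(key, plaintext, iv = nil)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  if iv
    cipher.iv = iv
  else
    iv = cipher.random_iv
  end
  ciphertext = cipher.update(plaintext) + cipher.final
  iv + cipher.auth_tag + ciphertext # illustrative layout only
end

key = OpenSSL::Random.random_bytes(32)

# Same key, same plaintext, fresh IVs: the outputs differ.
puts encrypt_once(key, 'Hello!') == encrypt_once(key, 'Hello!')

# Same key, same plaintext, same IV: the outputs are identical, which is
# exactly the IV reuse that must never happen in production.
fixed_iv = OpenSSL::Random.random_bytes(12)
puts encrypt_once(key, 'Hello!', fixed_iv) == encrypt_once(key, 'Hello!', fixed_iv)
```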
+Here's a sneak preview of some OAK encryption:
+```
+$ export TOE_KEYS=foo,bar                                      # set up a key chain with 2 keys
+$ export TOE_KEY_foo=oak_3CNB_3725491808_52_RjFTQTMyX0qAlJNbIK4fwYY0kh5vNKF5mMpHK-ZBZkfFarRjVPxS_ok
+$ export TOE_KEY_bar=oak_3CNB_201101230_52_RjFTQTMyXxbYlRcFH8JgiFNZMbnlFTAfUyvJCnXgCESpBmav_Etp_ok
+$ echo 'Hello!' | bin/oak.rb --format none                     # OAK_3 with naked interior
+oak_3CNN_2640238464_12_F1SU6_Hello!_ok
+$ echo 'Hello!' | bin/oak.rb --format none --force-oak-4       # OAK_4 with naked interior sneak preview
+oak_4_N25_CN2640238464_F1SU6_Hello!_ok
+$ echo 'Hello!' | bin/oak.rb                                   # OAK_3 defaults to base64
+oak_3CNB_2640238464_16_RjFTVTZfSGVsbG8h_ok
+$ echo 'Hello!' | bin/oak.rb               --force-oak-4       # OAK_4 defaults to base64
+oak_4_B34_Q04yNjQwMjM4NDY0X0YxU1U2X0hlbGxvIQ_ok
+$ echo 'Hello!' | bin/oak.rb --key-chain TOE --key foo         # OAK_4 encrypted
+oak_4foo_B71_HlcPvmphFuA2gj1GsMBFzZuaHT1YMvq7EOcsBIO7DNtxwszsD4M4p-ZuYc5Z7oq2tl12SA0_ok
+$ echo 'Hello!' | bin/oak.rb --key-chain TOE --key foo         # OAK_4 encryption is nondeterministic
+oak_4foo_B71_TcLpBTydPhfImx7Uorg_EQPPn2q01AHjHZaXCiGimEoJA2nJZB9nhJP9Bt8_Itv7Kvn0kKs_ok
+$ echo 'Hello!' | bin/oak.rb --key-chain TOE --key foo | bin/oak.rb --key-chain TOE --mode decode-lines
+Hello!
+```
+Here is a quick parse of some OAK strings:
+```
+$ echo 'Hello!' | bin/oak.rb --format none                  # OAK_3 with naked interior
+oak_3CNN_2640238464_12_F1SU6_Hello!_ok
+oak_3                                                       # OAK ver 3
+     C                                                      # checksum Crc32
+      N                                                     # compression None
+       N                                                    # format None
+         2640238464                                         # checksum value (F1SU6_Hello!)
+                    12                                      # 12 data bytes  (F1SU6_Hello!)
+                       F1SU6_Hello!                         # data FRIZZY, 1 UTF-8 str, 6 chars, "Hello!"
+                                    ok                      # end of sequence
+$ echo 'Hello!' | bin/oak.rb --format none --force-oak-4    # OAK_4 with naked interior
+oak_4_N25_CN2640238464_F1SU6_Hello!_ok
+oak_4                                                       # OAK ver 4 w/ no encryption key
+      N                                                     # format None
+       25                                                   # 25 data bytes (CN2640238464_F1SU6_Hello!)
+          C                                                 # checksum Crc32
+           N                                                # compression None
+            2640238464                                      # checksum value
+                       F1SU6_Hello!                         # data FRIZZY, 1 UTF-8 str, 6 chars, "Hello!"
+                                    ok                      # end of sequence
+$ echo 'Hello!' | bin/oak.rb                                # OAK_3 defaults to base64
+oak_3CNB_2640238464_16_RjFTVTZfSGVsbG8h_ok
+oak_3                                                       # OAK ver 3
+     C                                                      # checksum Crc32
+      N                                                     # compression None
+       B                                                    # format Base64
+         2640238464                                         # checksum value (F1SU6_Hello!)
+                    16                                      # 16 data bytes
+                       RjFTVTZfSGVsbG8h                     # data: base64("F1SU6_Hello!")
+                                        ok                  # end of sequence
+$ echo 'Hello!' | bin/oak.rb               --force-oak-4    # OAK_4 defaults to base64
+oak_4_B34_Q04yNjQwMjM4NDY0X0YxU1U2X0hlbGxvIQ_ok
+oak_4                                                       # OAK ver 4 w/ no encryption key
+      B                                                     # format Base64
+       34                                                   # 34 data bytes Q04y...xvIQ
+          Q04yNjQwMjM4NDY0X0YxU1U2X0hlbGxvIQ                # data: base64("CN2640238464_F1SU6_Hello!")
+                                             ok             # end of sequence
+$ echo 'Hello!' | bin/oak.rb --key-chain TOE --key foo      # OAK_4 encrypted
+oak_4foo_B71_HlcPvmphFuA2gj1GsMBFzZuaHT1YMvq7EOcsBIO7DNtxwszsD4M4p-ZuYc5Z7oq2tl12SA0_ok
+oak_4foo                                                    # OAK ver 4 encrypted with key "foo"
+         B                                                  # format Base64
+          71                                                # 71 data bytes HlcP...2SA0
+             HlcPvmphFuA2g...                               # base64 of encrypted data
+```
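The OAK_3 layout parsed above can be approximated in a few lines of Ruby. This is a sketch inferred only from the annotated examples: `OAK3_PATTERN` and `parse_oak3` are hypothetical names, only the `N` and `B` format flags are handled, compression is ignored, and the assumption that the checksum is a plain CRC-32 of the interior bytes is reported as a boolean rather than enforced.

```ruby
require 'base64'
require 'zlib'

# oak_3 + checksum/compression/format flags + checksum + length + data + ok
OAK3_PATTERN = /\Aoak_3([A-Z])([A-Z])([A-Z])_(\d+)_(\d+)_(.*)_ok\z/m

def parse_oak3(oak_string)
  match = OAK3_PATTERN.match(oak_string)
  raise ArgumentError, 'not an OAK_3 string' unless match
  checksum_flag, compression_flag, format_flag = match[1], match[2], match[3]
  data = match[6]
  raise ArgumentError, 'length mismatch' unless data.bytesize == match[5].to_i
  interior =
    case format_flag
    when 'N' then data
    when 'B' then Base64.urlsafe_decode64(data)
    else raise ArgumentError, "unhandled format #{format_flag}"
    end
  {
    checksum: checksum_flag,
    compression: compression_flag,
    format: format_flag,
    interior: interior,
    # Assumption: the checksum field is CRC-32 over the interior bytes.
    crc32_matches: Zlib.crc32(interior) == match[4].to_i,
  }
end

p parse_oak3('oak_3CNN_2640238464_12_F1SU6_Hello!_ok')
p parse_oak3('oak_3CNB_2640238464_16_RjFTVTZfSGVsbG8h_ok')
```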
+The header fields are authenticated, even the ones which are presented
+in plaintext:
+```
+oak_4foo_B                                                  # authenticated-but-plaintext part
+          71                                                # in-between part
+             HlcPvmphFuA2g...                               # authenticated-and-encrypted part
+```
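GCM's associated-data feature is what makes a plaintext header tamper-evident. The sketch below, with hypothetical helpers `seal` and `unseal` and an assumed IV + tag + ciphertext layout, binds the header `oak_4foo_B` to the ciphertext through `auth_data`. It illustrates the mechanism and does not reproduce OAK's encoder.

```ruby
require 'openssl'

IV_BYTES = 12
TAG_BYTES = 16

# Encrypt plaintext and authenticate the (unencrypted) header alongside it.
def seal(key, header, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = header
  ciphertext = cipher.update(plaintext) + cipher.final
  iv + cipher.auth_tag + ciphertext
end

# Decrypt; raises OpenSSL::Cipher::CipherError if header, tag, or
# ciphertext were altered.
def unseal(key, header, blob)
  iv = blob[0, IV_BYTES]
  tag = blob[IV_BYTES, TAG_BYTES]
  ciphertext = blob[(IV_BYTES + TAG_BYTES)..-1]
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = tag
  cipher.auth_data = header
  cipher.update(ciphertext) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)
blob = seal(key, 'oak_4foo_B', 'Hello!')
puts unseal(key, 'oak_4foo_B', blob)

begin
  unseal(key, 'oak_4bar_B', blob) # header changed after sealing
rescue OpenSSL::Cipher::CipherError
  puts 'tampered header rejected'
end
```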
+## OAK Encryption History
+* Proposed Q4 2017: [oak-openssl-ciphers](https://github.com/ProsperWorks/ALI/pull/5434)
+    * Initial support for symmetric key encryption. Not integrated or active.
+    * Introduced OAK_4 for encryption but preserves read+write for OAK_3.
+    * Sought to expose all algorithms supported by OpenSSL::Cipher.
+      * By failing to curate algorithms, less-educated users are put
+        in a position of making expert decisions.
+      * Neglected differences between modes of operation.
+    * Sought to be deterministic.
+      * Neglected risks reusing initialization vectors.
+    * Did not merge.
+* Proposed Q3 2018: [oak-openssl-ciphers-redux](https://github.com/ProsperWorks/ALI/pull/9335/files)
+    * Still introduces OAK_4 for encryption but preserves read+write for OAK_3.
+    * Narrows choices to just AES-256-GCM with random IV.
+    * Authenticates all headers.
+    * Encrypts all headers not required for decryption.
+    * Split out into smaller PRs:
+        * [oak-openssl-ciphers-redux-part-i](https://github.com/ProsperWorks/ALI/pull/9560): plan docs
+        * [oak-openssl-ciphers-redux-part-ii](https://github.com/ProsperWorks/ALI/pull/9561): api syntax
+        * [oak-openssl-ciphers-redux-part-iii](https://github.com/ProsperWorks/ALI/pull/9562): corruption tests
+        * [oak-openssl-ciphers-redux-part-iv](https://github.com/ProsperWorks/ALI/pull/9563): main implementation
+        * [oak-openssl-ciphers-redux-part-v](https://github.com/ProsperWorks/ALI/pull/9572): bin/oak.rb cli
+## JHW Revisits Encryption-in-OAK 2018-07-15
+[oak-openssl-ciphers](https://github.com/ProsperWorks/ALI/pull/5434)
+was originally prepared against
+[minor_2017_10_mystic](https://github.com/ProsperWorks/ALI/pull/5930)
+and presented in Arch Review 2017-09-18.  It never merged because
+feedback and further research raised many questions.  In particular,
+the IV is much more delicate than I originally understood.
+Per expert recommendations, GCM and CBC are the two most viable stream
+modes.  Of them, GCM is much more sensitive to accidental IV reuse.
+So much so that GCM is not recommended in the absence of fully
+automated IV management.
+_**At ProsperWorks' current level of organization I believe the only
+credible option is CBC or GCM with a random IV selected for every
+message.**_
+Therefore _**encrypted OAK will be nondeterministic in the plaintext**_.
+This is a bummer but I see no way to avoid it without compromising
+security.
+Also, today I see no point in supporting anything other than AES.  All
+of AES-128, AES-192, and AES-256 are probably adequate for our needs,
+but if we support just AES-256 then we don't have to answer any thorny
+questions.  There is too much security downside in letting the caller
+pick any old block cipher or mode of operation which is supported by
+OpenSSL.  This outweighs any ambition to future-proof OAK by offering
+open-ended support.
+GCM not only encrypts, but authenticates.  It is an
+[AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption) and we
+can authenticate all the headers, including those which are
+transmitted in plaintext.
+Therefore, _**OAK_4 will support only AES-256-GCM with a random IV
+selected each time a message is encrypted**_.  OAK_4 keys will be 32
+byte random binary strings.  OAK_4 IVs will be 12 byte binary strings
+which are encoded into each OAK string.  OAK_4 will use no salt other
+than the random IV for each encryption.
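For a sense of the sizes involved, this small sketch asks Ruby's OpenSSL binding for a fresh AES-256-GCM key and IV. `fresh_key_and_iv` is a hypothetical helper, and how OAK itself generates or stores keys is not shown here.

```ruby
require 'openssl'

# Generate a random key and IV sized for AES-256-GCM.
def fresh_key_and_iv
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  key = cipher.random_key # 32 random bytes for AES-256
  iv = cipher.random_iv   # 12 random bytes, the default GCM IV length
  [key, iv]
end

key, iv = fresh_key_and_iv
puts "key bytes: #{key.bytesize}, iv bytes: #{iv.bytesize}"
```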
+_**OAK_4 will allow compression within encryption.**_
+_**OAK_4 will encrypt all OAK header fields except those which are
+necessary to support decryption.**_ Yes, [Kerckhoffs's
+Principle](https://en.wikipedia.org/wiki/Kerckhoffs%27s_principle),
+but also [Precautionary
+Principle](https://en.wikipedia.org/wiki/Precautionary_principle).  To
+be transmitted plain: the format and version identifiers "oak_4", the
+format code (base64 or none), the name of the key used, and the
+redundancy check for the _encrypted_ message.
+OAK_4 will also support authentication via GCM.  "oak_4", the
+encryption key name, and the format flag will be authenticated but
+transmitted plain.  All the encrypted fields are also authenticated.
+We can save space by skipping redundancy flags in encrypted OAK_4
+sequences.
+## Appendix: Excerpts from Best Practices Research
+[https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation) (sentences rearranged some here to group subtopics better)
+An initialization vector (IV) or starting variable
+(SV)[[5]](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#cite_note-ISO-10116-5)
+is a block of bits that is used by several modes to randomize the
+encryption and hence to produce distinct ciphertexts even if the same
+plaintext is encrypted multiple times, without the need for a slower
+re-keying
+process.[[6]](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#cite_note-HUANG-6)
+An initialization vector has different security requirements than a
+key, so the IV usually does not need to be secret. However, in most
+cases, it is important that an initialization vector is never reused
+under the same key.
+For CBC and CFB, reusing an IV leaks some information about the first
+block of plaintext, and about any common prefix shared by the two
+messages. ...
+In CBC mode, the IV must, in addition, be unpredictable at encryption
+time; in particular, the (previously) common practice of re-using the
+last ciphertext block of a message as the IV for the next message is
+insecure (for example, this method was used by SSL 2.0). If an
+attacker knows the IV (or the previous block of ciphertext) before he
+specifies the next plaintext, he can check his guess about plaintext
+of some block that was encrypted with the same key before (this is
+known as the TLS CBC IV attack).
+For OFB and CTR, reusing an IV completely destroys security.  This can
+be seen because both modes effectively create a bitstream that is
+XORed with the plaintext, and this bitstream is dependent on the
+password and IV only. Reusing a bitstream destroys security.
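The keystream-reuse failure is easy to demonstrate. In this sketch (helper names `ctr_encrypt` and `xor_bytes` are illustrative), two messages are encrypted with AES-256-CTR under the same key and IV. XORing the two ciphertexts cancels the keystream and yields the XOR of the two plaintexts, with no key required.

```ruby
require 'openssl'

def ctr_encrypt(key, iv, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-ctr')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  cipher.update(plaintext) + cipher.final
end

# Byte-wise XOR of two equal-length strings.
def xor_bytes(a, b)
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack('C*')
end

key = OpenSSL::Random.random_bytes(32)
iv = OpenSSL::Random.random_bytes(16) # deliberately reused below

p1 = 'attack at dawn!!'
p2 = 'retreat at noon!'
c1 = ctr_encrypt(key, iv, p1)
c2 = ctr_encrypt(key, iv, p2)

# The attacker sees only c1 and c2, yet learns p1 XOR p2.
puts xor_bytes(c1, c2) == xor_bytes(p1, p2)
```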
+[https://esj.com/Articles/2008/07/01/8-Best-Practices-for-Encryption-Key-Management-and-Data-Security.aspx?Page=2](https://esj.com/Articles/2008/07/01/8-Best-Practices-for-Encryption-Key-Management-and-Data-Security.aspx?Page=2)
+* Step 1: Eliminate as much collection and storage of sensitive data
+  as possible - if you don't really need it, get rid of it (or never
+  collect it in the first place);
+* Step 2: Encrypt, hash, or mask the remaining sensitive data at rest
+  and in transit.
+* Best Practice #1: Decentralize encryption and decryption
+* Best Practice #2: Centralize key management with distributed execution
+* Best Practice #3: Support multiple encryption standards
+* Best Practice #4: Centralize user profiles for authentication and access to keys
+* Best Practice #5: Do not require decryption/re-encryption for key rotation or expiration
+* Best Practice #6: Keep comprehensive logs and audit trails
+* Best Practice #7: Use one solution to support fields, files, and databases
+* Best Practice #8: Support third-party integration
+[https://cloud.google.com/security/encryption-at-rest/default-encryption/](https://cloud.google.com/security/encryption-at-rest/default-encryption/)
+April 2017
+* Google uses several layers of encryption to protect customer data at
+  rest in Google Cloud Platform products.
+* Google Cloud Platform encrypts customer content stored at rest,
+  without any action required from the customer, using one or more
+  encryption mechanisms. There are some minor exceptions.
+* Data for storage is split into chunks, and each chunk is encrypted
+  with a unique data encryption key. These data encryption keys are
+  stored with the data, encrypted with ("wrapped" by) key encryption
+  keys that are exclusively stored and used inside Google's central
+  Key Management Service. Google's Key Management Service is redundant
+  and globally distributed.
+* Data stored in Google Cloud Platform is encrypted at the storage
+  level using either AES256 or AES128.
+* Google uses a common cryptographic library, Keyczar, to implement
+  encryption consistently across almost all Google Cloud Platform
+  products. (The open-sourced version of Keyczar has known security
+  issues, and is NOT the version used internally at Google.) Because
+  this common library is widely accessible, only a small team of
+  cryptographers needs to properly implement and maintain this tightly
+  controlled and reviewed code.
+* Google uses the Advanced Encryption Standard (AES) algorithm to
+  encrypt data at rest. AES is widely used because (1) [both AES256
+  and AES128 are recommended by the National Institute of Standards
+  and Technology (NIST) for long-term storage
+  use](http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar1.pdf)
+  (as of November 2015), and (2) AES is often included as part of
+  customer compliance requirements.
+* Data stored across Google Cloud Storage is encrypted at the storage
+  level using AES, in [Galois/Counter Mode
+  (GCM)](http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/gcm/gcm-spec.pdf)
+  in almost all cases. This is implemented in the [BoringSSL
+  library](https://boringssl.googlesource.com/boringssl/) that Google
+  maintains. This library was forked from OpenSSL for internal use,
+  after [many flaws were exposed in
+  OpenSSL](https://www.openssl.org/news/vulnerabilities.html). In
+  select cases, AES is used in Cipher Block Chaining (CBC) mode with a
+  hashed message authentication code (HMAC) for authentication; and
+  for some replicated files, AES is used in Counter (CTR) mode with
+  HMAC. (Further details on algorithms are provided [later in this
+  document](https://cloud.google.com/security/encryption-at-rest/default-encryption/#googles_common_cryptographic_library).)
+  In other Google Cloud Platform products, AES is used in a variety of
+  modes.
+* In addition to the storage system level encryption described above,
+  in most cases data is also encrypted at the storage device level,
+  with at least AES128 for hard disks (HDD) and AES256 for new solid
+  state drives (SSD), using a separate device-level key (which is
+  different than the key used to encrypt the data at the storage
+  level). As older devices are replaced, solely AES256 will be used
+  for device-level encryption.
+* At the time of this document's publication, Google uses the
+  following encryption algorithms for encryption at rest for DEKs and
+  KEKs. These are subject to change as we continue to improve our
+  capabilities and security.
+    * Symmetric Encryption
+        * **AES-GCM (256 bits) (preferred)**
+        * AES-CBC
+        * AES-CTR (128 and 256 bits)
+        * AES-EAX (128 and 256 bits)
+    * Symmetric Signatures
+        * HMAC-SHA256 (preferred)
+        * HMAC-SHA512
+        * HMAC-SHA1
+[http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar1.pdf](http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar1.pdf)
+NIST Special Publication 800-131A Revision 1
+Transitions: Recommendation for Transitioning the Use of Cryptographic Algorithms and Key Lengths
+November 2015
+* **The use of AES-128, AES-192, AES-256 and three-key TDEA is acceptable.**
+[http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf](http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf)
+NIST Special Publication 800-38D
+Recommendation for Block Cipher Modes of Operation: Galois/Counter
+Mode (GCM) and GMAC
+November, 2007
+* This Recommendation specifies the Galois/Counter Mode (GCM), an
+  algorithm for authenticated encryption with associated data, and its
+  specialization, GMAC, for generating a message authentication code
+  (MAC) on data that is not encrypted. **GCM and GMAC are modes of
+  operation for an underlying approved symmetric key block cipher.**
+* GCM is constructed from an approved symmetric key block cipher with
+  a block size of 128 bits, such as the Advanced Encryption Standard
+  (AES) algorithm that is specified in Federal Information Processing
+  Standard (FIPS) Pub. 197 [2]. **Thus, GCM is a mode of operation of
+  the AES algorithm.**
+* ...If the GCM input is restricted to data that is not to be
+  encrypted, the resulting specialization of GCM, called GMAC, is
+  simply an authentication mode on the input data. In the rest of this
+  document, statements about GCM also apply to GMAC.
+* **GCM provides stronger authentication assurance than a
+   (non-cryptographic) checksum or error detecting code;** in
+   particular, GCM can detect both 1) accidental modifications of the
+   data and 2) intentional, unauthorized modifications.
+* ...The underlying block cipher shall be approved, the block size
+  shall be 128 bits, and **the key size shall be at least 128 bits**.
+* ...For IVs, it is recommended that implementations restrict support
+  to the length of **96 bits, to promote interoperability, efficiency,
+  and simplicity of design**.
+    * JHW checked 96 == OpenSSL::Cipher.new('aes-256-gcm').random_iv.size * 8
+    * JHW checked 96 == OpenSSL::Cipher.new('aes-128-gcm').random_iv.size * 8
+* ... **The IVs in GCM must fulfill the following "uniqueness"
+  requirement**: The probability that the authenticated encryption
+  function ever will be invoked with the same IV and the same key on
+  two (or more) distinct sets of input data shall be no greater than
+  2^-32.
+* Compliance with this requirement is crucial to the security of
+  GCM. Across all instances of the authenticated encryption function
+  with a given key, if even one IV is ever repeated, then the
+  implementation may be vulnerable to the forgery attacks that are
+  described in Ref [5] and summarized in Appendix A. **In practice,
+  this requirement is almost as important as the secrecy of the key.**
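To relate the uniqueness requirement to random 96-bit IVs, a back-of-envelope check helps. The sketch below uses the standard birthday union bound, n(n-1)/2 pairs each colliding with probability 2^-96, and compares it with the 2^-32 ceiling stated in SP 800-38D. `iv_collision_bound` and `NIST_CEILING` are illustrative names, and this is arithmetic only, not a compliance statement.

```ruby
# Upper bound on the chance that any two of n random IVs collide.
def iv_collision_bound(n, iv_bits = 96)
  Rational(n * (n - 1), 2) / (2**iv_bits)
end

NIST_CEILING = Rational(1, 2**32)

# 2^32 messages under one key stays within the ceiling...
puts iv_collision_bound(2**32) <= NIST_CEILING
# ...but 2^33 messages under one key does not.
puts iv_collision_bound(2**33) <= NIST_CEILING
```

In practical terms, a single key with random IVs is good for on the order of four billion messages before it should be rotated.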
+[http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38c.pdf](http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38c.pdf)
+NIST Special Publication 800-38C
+Recommendation for Block Cipher Modes of Operation: The CCM Mode for
+Authentication and Confidentiality
+May 2004
+[http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html)
+Amazon EBS encryption handles key management for you. Each newly
+created volume is encrypted with a unique 256-bit key. Any snapshots
+of this volume and any subsequent volumes created from those snapshots
+also share that key. These keys are protected by AWS key management
+infrastructure, which implements strong logical and physical security
+controls to prevent unauthorized access. Your data and associated keys
+are encrypted using the industry standard AES-256 algorithm.
+You cannot change the CMK that is associated with an existing snapshot
+or encrypted volume. However, you can associate a different CMK during
+a snapshot copy operation (including encrypting a copy of an
+unencrypted snapshot) and the resulting copied snapshot uses the new
+CMK.
+The AWS overall key management infrastructure is consistent with
+National Institute of Standards and Technology (NIST) 800-57
+recommendations and uses cryptographic algorithms approved by Federal
+Information Processing Standards (FIPS) 140-2.
+Each AWS account has a unique master key that is stored separately
+from your data, on a system that is surrounded with strong physical
+and logical security controls. Each encrypted volume (and its
+subsequent snapshots) is encrypted with a unique volume encryption key
+that is then encrypted with a region-specific secure master key. The
+volume encryption keys are used in memory on the server that hosts
+your EC2 instance; they are never stored on disk in plaintext.
+[http://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html](http://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html)
+You have three mutually exclusive options depending on how you choose
+to manage the encryption keys:
+* Use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) -
+  Each object is encrypted with a unique key employing strong
+  multi-factor encryption. As an additional safeguard, it encrypts the
+  key itself with a master key that it regularly rotates. Amazon S3
+  server-side encryption uses one of the strongest block ciphers
+  available, 256-bit Advanced Encryption Standard (AES-256), to
+  encrypt your data. For more information, see [Protecting Data Using
+  Server-Side Encryption with Amazon S3-Managed Encryption Keys
+  (SSE-S3)](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html).
+* Use Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) -
+  Similar to SSE-S3, but with some additional benefits along with some
+  additional charges for using this service. There are separate
+  permissions for the use of an envelope key (that is, a key that
+  protects your data's encryption key) that provides added protection
+  against unauthorized access of your objects in S3. SSE-KMS also
+  provides you with an audit trail of when your key was used and by
+  whom. Additionally, you have the option to create and manage
+  encryption keys yourself, or use a default key that is unique to
+  you, the service you're using, and the region you're working in. For
+  more information, see [Protecting Data Using Server-Side Encryption
+  with AWS KMS-Managed Keys
+  (SSE-KMS)](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html).
+* Use Server-Side Encryption with Customer-Provided Keys (SSE-C) - You
+  manage the encryption keys and Amazon S3 manages the encryption, as
+  it writes to disks, and decryption, when you access your
+  objects. For more information, see [Protecting Data Using
+  Server-Side Encryption with Customer-Provided Encryption Keys
+  (SSE-C)](http://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html).
+[https://tools.ietf.org/html/rfc5084#section-2](https://tools.ietf.org/html/rfc5084#section-2)
+Using AES-CCM and AES-GCM Authenticated Encryption in the Cryptographic Message Syntax (CMS)
+November 2007
+Status: Proposed Standard
+The reuse of an **AES-CCM** or **AES-GCM** nonce/key combination destroys
+the security guarantees.  As a result, it can be extremely difficult
+to use AES-CCM or AES-GCM securely when using statically configured
+keys.  **For safety's sake, implementations MUST use an automated key
+management system**.
+JHW Note: If we want to avoid building or buying or renting a KMS, we
+should hold back from AES-CCM or AES-GCM for now.  AES-CBC is still a
+credible choice and even recommended by some.
+[https://tools.ietf.org/html/rfc4107](https://tools.ietf.org/html/rfc4107)
+Guidelines for Cryptographic Key Management
+June 2005
+Status: BEST CURRENT PRACTICE
+* When symmetric cryptographic mechanisms are used in a protocol, the
+  presumption is that automated key management is generally but not
+  always needed.  If manual keying is proposed, the burden of proving
+  that automated key management is not required falls to the proposer.
+* There is not one answer to that question; circumstances differ.  **In
+  general, automated key management SHOULD be used.** Occasionally,
+  relying on manual key management is reasonable; we propose some
+  guidelines for making that judgment.
+* Automated key management and manual key management provide very
+  different features.
+    * In particular, the protocol associated with an automated key
+      management technique will confirm the liveness of the peer,
+      protect against replay, authenticate the source of the
+      short-term session key, associate protocol state information
+      with the short-term session key, and ensure that a fresh
+      short-term session key is generated.
+    * For some symmetric cryptographic algorithms, implementations
+      must prevent overuse of a given key.  An implementation of such
+      algorithms can make use of automated key management when the
+      usage limits are nearly exhausted, in order to establish
+      replacement keys before the limits are reached, thereby
+      maintaining secure communications.
+    * Examples of automated key management systems include IPsec IKE
+      and Kerberos. S/MIME and TLS also include automated key
+      management functions.
+* Key management schemes should not be designed by amateurs; it is
+  almost certainly inappropriate for working groups to design their
+  own.
+* In general, automated key management SHOULD be used to establish
+  session keys.
+* Automated key management MUST be used if any of these conditions hold:
+    * A party will have to manage n^2 static keys, where n may become large.
+    * Any stream cipher (such as RC4
+      [[TK](https://tools.ietf.org/html/rfc4107#ref-TK)], AES-CTR
+      [[NIST](https://tools.ietf.org/html/rfc4107#ref-NIST)], or
+      AES-CCM [[WHF](https://tools.ietf.org/html/rfc4107#ref-WHF)]) is
+      used.
+    * An initialization vector (IV) might be reused, especially an
+      implicit IV.  Note that random or pseudo-random explicit IVs are
+      not a problem unless the probability of repetition is high.
+    * Large amounts of data might need to be encrypted in a short
+      time, causing frequent change of the short-term session key.
+    * Long-term session keys are used by more than two
+      parties. Multicast is a necessary exception, but multicast key
+      management standards are emerging in order to avoid this in the
+      future. Sharing long-term session keys should generally be
+      discouraged.
+    * The likely operational environment is one where personnel (or
+      device) turnover is frequent, causing frequent change of the
+      short-term session key.
+* Manual key management may be a reasonable approach in any of these
+  situations:
+    * The environment has very limited available bandwidth or very
+      high round-trip times.  Public key systems tend to require long
+      messages and lots of computation; symmetric key alternatives,
+      such as Kerberos, often require several round trips and
+      interaction with third parties.
+    * The information being protected has low value.
+    * The total volume of traffic over the entire lifetime of the
+      long-term session key will be very low.
+    * The scale of each deployment is very limited.
+* Note that assertions about such things should often be viewed with
+  skepticism. The burden of demonstrating that manual key management
+  is appropriate falls to the proponents -- and it is a fairly high
+  hurdle.
+* Systems that employ manual key management need provisions for key
+  changes.  There MUST be some way to indicate which key is in use to
+  avoid problems during transition.  Designs SHOULD sketch plausible
+  mechanisms for deploying new keys and replacing old ones that might
+  have been compromised.  If done well, such mechanisms can later be
+  used by an add-on key management scheme.
+* Lack of clarity about the parties involved in authentication is not
+  a valid reason for avoiding key management.  Rather, it tends to
+  indicate a deeper problem with the underlying security model.
+* When manual key management is used, long-term shared secrets MUST be
+  unpredictable "random" values, ensuring that an adversary will have
+  no greater expectation than 50% of finding the value after searching
+  half the key search space.
+JHW Note: RFC-4107 talks a lot about session keys.  I don't know how
+the session concept applies to our encryption at rest use cases, so I
+am not sure how to interpret some of this.
+[https://tools.ietf.org/html/bcp106](https://tools.ietf.org/html/bcp106)
+Randomness Requirements for Security
+June 2005
+* /dev/random returns bytes from the pool but blocks when the
+  estimated entropy drops to zero. As entropy is added to the pool
+  from events, more data becomes available via /dev/random.  Random
+  data obtained from such a /dev/random device is suitable for key
+  generation for long term keys, if enough random bits are in the pool
+  or are added in a reasonable amount of time.
+    * **Random data obtained from ... /dev/random ... is suitable for
+       key generation for long term keys**
+* /dev/urandom works like /dev/random; however, it provides data even
+  when the entropy estimate for the random pool drops to zero.  This
+  may be adequate for session keys or for other key generation tasks
+  for which blocking to await more random bits is not acceptable.  The
+  risk of continuing to take data even when the pool's entropy
+  estimate is small in that past output may be computable from current
+  output, provided that an attacker can reverse SHA-1.  Given that
+  SHA-1 is designed to be non-invertible, this is a reasonable risk.
+    * **/dev/urandom ... may be adequate for session keys or for other
+       key generation tasks for which blocking to await more random
+       bits is not acceptable.**
+* To obtain random numbers under Linux, Solaris, or other UNIX systems
+  equipped with code as described above, all an application has to do
+  is open either /dev/random or /dev/urandom and read the desired
+  number of bytes.
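A minimal Ruby sketch of the key-generation advice above. The 32-byte key size and the helper names `random_key` and `urandom_key` are assumptions made for illustration, not part of OAK. `SecureRandom` is the standard-library route; the second helper is the literal "open the device and read" recipe from BCP 106 and only works where `/dev/urandom` exists.

```ruby
require 'securerandom'

KEY_BYTES = 32 # 256-bit key; the size is an assumption for this sketch

# Standard-library route: SecureRandom is intended for generating
# secrets such as keys and session tokens.
def random_key(nbytes = KEY_BYTES)
  SecureRandom.random_bytes(nbytes)
end

# The literal BCP 106 recipe: open the device and read the desired
# number of bytes.  Only meaningful on systems that provide the device.
def urandom_key(nbytes = KEY_BYTES, device = '/dev/urandom')
  File.open(device, 'rb') { |f| f.read(nbytes) }
end

key = random_key
hex = key.unpack1('H*') # hex form, convenient for config files
```

Either helper returns a binary string. Which device is appropriate for long-term keys is the judgment call discussed in the notes above.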
+[https://www.feistyduck.com/library/openssl-cookbook/online/ch-openssl.html#openssl-recommended-configuration](https://www.feistyduck.com/library/openssl-cookbook/online/ch-openssl.html#openssl-recommended-configuration)
+The design principles for all configurations here are essentially the
+same as those from the previous section, but I am going to make two
+changes to achieve better performance. First, I am going to put
+128-bit suites on top of the list. Although 256-bit suites provide
+some increase in security, for most sites the increase is not
+meaningful and yet still comes with the performance penalty. Second, I
+am going to prefer HMAC-SHA over HMAC-SHA256 and HMAC-SHA384
+suites. The latter two are much slower but also don't provide a
+meaningful increase in security.
+...
+The following is my default starting configuration, designed to offer
+strong security as well as good performance:
+- ECDHE-ECDSA-AES128-GCM-SHA256
+- ECDHE-ECDSA-AES256-GCM-SHA384
+- ECDHE-ECDSA-AES128-SHA
+- ECDHE-ECDSA-AES256-SHA
+- ECDHE-ECDSA-AES128-SHA256
+- ECDHE-ECDSA-AES256-SHA384
+- ECDHE-RSA-AES128-GCM-SHA256
+- ECDHE-RSA-AES256-GCM-SHA384
+- ECDHE-RSA-AES128-SHA
+- ECDHE-RSA-AES256-SHA
+- ECDHE-RSA-AES128-SHA256
+- ECDHE-RSA-AES256-SHA384
+- DHE-RSA-AES128-GCM-SHA256
+- DHE-RSA-AES256-GCM-SHA384
+- DHE-RSA-AES128-SHA
+- DHE-RSA-AES256-SHA
+- DHE-RSA-AES128-SHA256
+- DHE-RSA-AES256-SHA256
+JHW Note: ^^^ Of course that recommendation is for a web site, not for
+cold storage.
+[http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL/Cipher.html](http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL/Cipher.html)
+Ruby docs for OpenSSL::Cipher
+* You should never use ECB mode unless you are absolutely sure that
+  you absolutely need it
+* Always create a secure random IV for every encryption of your
+  [Cipher](http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL/Cipher/Cipher.html)
+* If the
+  [OpenSSL](http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL.html)
+  version used supports it, an Authenticated Encryption mode (such as
+  GCM or CCM) should always be preferred over any unauthenticated
+  mode.
+* Currently,
+  [OpenSSL](http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL.html)
+  supports AE only in combination with Associated Data (AEAD) where
+  additional associated data is included in the encryption process to
+  compute a tag at the end of the encryption. This tag will also be
+  used in the decryption process and by verifying its validity, the
+  authenticity of a given ciphertext is established.
+* This is superior to unauthenticated modes in that it allows to
+  detect if somebody effectively changed the ciphertext after it had
+  been encrypted. This prevents malicious modifications of the
+  ciphertext that could otherwise be exploited to modify ciphertexts
+  in ways beneficial to potential attackers.
+* If no associated data is needed for encryption and later decryption,
+  the
+  [OpenSSL](http://ruby-doc.org/stdlib-2.0.0/libdoc/openssl/rdoc/OpenSSL.html)
+  library still requires a value to be set - "" may be used in case
+  none is available. An example using the GCM (Galois Counter Mode)...
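The points above can be sketched with Ruby's `OpenSSL::Cipher` roughly as follows. The helper names `gcm_encrypt` and `gcm_decrypt`, the choice of `aes-256-gcm`, and the hash used to carry the IV, ciphertext, and tag are assumptions for illustration. This is not how OAK itself is implemented.

```ruby
require 'openssl'

# Encrypt with AES-256-GCM.  A fresh random IV is drawn for every call,
# and auth_data is set to "" when no associated data is available.
def gcm_encrypt(key, plaintext, auth_data = '')
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = auth_data
  ciphertext = cipher.update(plaintext) + cipher.final
  { iv: iv, ciphertext: ciphertext, tag: cipher.auth_tag }
end

# Decrypt and verify.  OpenSSL::Cipher::CipherError is raised from
# #final if the ciphertext, tag, or associated data was altered.
def gcm_decrypt(key, box, auth_data = '')
  decipher = OpenSSL::Cipher.new('aes-256-gcm')
  decipher.decrypt
  decipher.key = key
  decipher.iv = box[:iv]
  decipher.auth_tag = box[:tag]
  decipher.auth_data = auth_data
  decipher.update(box[:ciphertext]) + decipher.final
end
```

Note that `OpenSSL::Cipher#update` rejects an empty string, so a caller who may encrypt empty payloads needs to handle that case separately.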
+[https://tools.ietf.org/html/rfc4880#page-6](https://tools.ietf.org/html/rfc4880#page-6)
+OpenPGP Message Format
+November 2007
+Status: PROPOSED STANDARD
+* OpenPGP wraps it all in Radix-64 aka ASCII Armor.
+    * header (e.g. "-----BEGIN PGP MESSAGE-----")
+    * armor headers
+    * armored data
+    * checksum
+    * footer
+* To encrypt, OpenPGP generates a new "session key" for each message,
+  which is encrypted with the recipient's public key.
+    * The encrypted session key is sent with the message.
+    * The (unencrypted) session key is used to symmetrically encrypt
+      the (usually compressed) message.
+* To authenticate, OpenPGP generates a hash of the message, encrypts
+  it with the sender's private key.
+    * The encrypted message hash is sent with the message.
+* OpenPGP implements, and recommends, compress-then-encrypt!
+    * "OpenPGP implementations SHOULD compress the message after
+      applying the signature but before encryption."
+    * "... Furthermore, compression has the added side effect that
+      some types of attacks can be thwarted by the fact that slightly
+      altered, compressed data rarely uncompresses without severe
+      errors.  This is hardly rigorous, but it is operationally
+      useful. ..."
+* Asymmetric algorithms include RSA, Elgamal, DSA.
+* Symmetric algorithms include plaintext (unencrypted), IDEA,
+  TripleDES, CAST5, Blowfish, AES-128, -192, -256, Twofish.
+* No IV but super-salty.
+    * "OpenPGP CFB mode uses an initialization vector (IV) of all
+      zeros, and prefixes the plaintext with BS+2 octets of random
+      data, such that octets BS+1 and BS+2 match octets BS-1 and BS.
+      It does a CFB resynchronization after encrypting those BS+2
+      octets."
+* Compression algorithms include none, ZIP, ZLIB, BZIP2.
+* Hashes include MD5, SHA-1, SHA256, SHA512, others
+* GnuPG is a compliant implementation.
+* OpenPGP is slammed in
+  [https://blog.cryptographyengineering.com/2014/08/13/whats-matter-with-pgp/](https://blog.cryptographyengineering.com/2014/08/13/whats-matter-with-pgp/)
+  on:
+    * Key management
+    * Format
+    * Defaults
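The compress-then-encrypt ordering that OpenPGP recommends can be sketched in Ruby as below. This is only an illustration of the ordering: it swaps in AES-256-GCM for OpenPGP's CFB mode, and the helper names and the `IV + tag + ciphertext` layout are assumptions, not the OpenPGP packet format.

```ruby
require 'openssl'
require 'zlib'

IV_BYTES  = 12 # default GCM IV length
TAG_BYTES = 16 # default GCM tag length

# Compress first, then encrypt the compressed bytes.
def compress_then_encrypt(key, message)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = ''
  ciphertext = cipher.update(Zlib::Deflate.deflate(message)) + cipher.final
  iv + cipher.auth_tag + ciphertext
end

# Reverse the steps: authenticate and decrypt, then decompress.
def decrypt_then_decompress(key, blob)
  iv         = blob[0, IV_BYTES]
  tag        = blob[IV_BYTES, TAG_BYTES]
  ciphertext = blob[(IV_BYTES + TAG_BYTES)..-1]
  decipher = OpenSSL::Cipher.new('aes-256-gcm')
  decipher.decrypt
  decipher.key = key
  decipher.iv = iv
  decipher.auth_tag = tag
  decipher.auth_data = ''
  Zlib::Inflate.inflate(decipher.update(ciphertext) + decipher.final)
end
```

The order matters because well-encrypted output looks random and does not compress, so compression is only useful before encryption.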
+[https://www.apple.com/business/docs/iOS_Security_Guide.pdf](https://www.apple.com/business/docs/iOS_Security_Guide.pdf)
+iOS Security
+10 March 2017
+* When an iOS device is turned on, its application processor
+  immediately executes code from read-only memory known as the Boot
+  ROM. This immutable code, known as the hardware root of trust, is
+  laid down during chip fabrication, and is implicitly trusted. The
+  Boot ROM code contains the Apple Root CA public key, which is used
+  to verify that the iBoot bootloader is signed by Apple before
+  allowing it to load.
+* Every iOS device has a dedicated AES 256 crypto engine built into
+  the DMA path between the flash storage and main system memory,
+  making file encryption highly efficient.
+* The device's unique ID (UID) and a device group ID (GID) are AES
+  256-bit keys fused (UID) or compiled (GID) into the application
+  processor and Secure Enclave during manufacturing. No software or
+  firmware can read them directly; ...
+* Additionally, the Secure Enclave's UID and GID can only be used by
+  the AES engine dedicated to the Secure Enclave. The UIDs are unique
+  to each device and aren't recorded by Apple or any of its suppliers.
+* The UID allows data to be cryptographically tied to a particular device.
+* Apart from the UID and GID, all other cryptographic keys are created
+  by the system's random number generator (RNG) using an algorithm
+  based on CTR_DRBG. System entropy is generated from timing
+  variations during boot, and additionally from interrupt timing once
+  the device has booted. Keys generated inside the Secure Enclave use
+  its true hardware random number generator.
+* **Every time a file on the data partition is created,** Data
+  Protection creates **a new 256-bit key** (the "per-file" key) and
+  gives it to the hardware AES engine, which uses the key to encrypt
+  the file as it **is written to flash memory using AES CBC
+  mode**. ... **The initialization vector (IV) is calculated with the
+  block offset into the file, encrypted with the SHA-1 hash of the
+  per-file key.** The **per-file key is wrapped** with one of several
+  class keys, depending on the circumstances under which the file
+  should be accessible. Like all other wrappings, this is performed
+  **using NIST AES key wrapping, per RFC 3394**. The wrapped per-file
+  key is stored in the file's metadata.
+* The metadata of all files in the file system is encrypted with a
+  random key, which is created when iOS is first installed or when the
+  device is wiped by a user.
+* ...this key isn't used to maintain the confidentiality of data;
+  instead, it's designed to be quickly erased on demand
+[https://tools.ietf.org/html/rfc3394](https://tools.ietf.org/html/rfc3394)
+Advanced Encryption Standard (AES) Key Wrap Algorithm
+September 2002
+* The AES Key Wrap algorithm will probably be adopted by the USA for
+  encryption of AES keys.
+* NIST has assigned the following object identifiers to identify the
+  key wrap algorithm...
+    * id-aes128-wrap
+    * id-aes192-wrap
+    * id-aes256-wrap
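For reference, here is a sketch of the RFC 3394 wrap and unwrap procedure (the index-based variant), built on single-block AES-ECB calls from Ruby's `OpenSSL::Cipher`. The helper names `aes_block`, `key_wrap`, and `key_unwrap` are made up for this sketch. It is meant to show the shape of the algorithm, such as wrapping a per-file key under a class key as in the iOS notes above, and is not a vetted implementation.

```ruby
require 'openssl'

KEY_WRAP_IV = 0xA6A6A6A6A6A6A6A6 # default initial value from RFC 3394

# One raw AES block operation (ECB, no padding) under the KEK.
def aes_block(kek, block, mode)
  cipher = OpenSSL::Cipher.new("aes-#{kek.bytesize * 8}-ecb")
  mode == :encrypt ? cipher.encrypt : cipher.decrypt
  cipher.key = kek
  cipher.padding = 0
  cipher.update(block) + cipher.final
end

def split_words(data)
  data.b.bytes.each_slice(8).map { |s| s.pack('C*') }
end

# Wrap key material (a multiple of 8 bytes, at least 16) under the KEK.
def key_wrap(kek, key_data)
  unless key_data.bytesize >= 16 && (key_data.bytesize % 8).zero?
    raise ArgumentError, 'key data must be a multiple of 8 bytes, >= 16'
  end
  r = split_words(key_data)
  n = r.length
  a = KEY_WRAP_IV
  6.times do |j|
    n.times do |i|
      b = aes_block(kek, [a].pack('Q>') + r[i], :encrypt)
      a = b[0, 8].unpack1('Q>') ^ (n * j + i + 1) # t = n*j + i, i from 1
      r[i] = b[8, 8]
    end
  end
  [a].pack('Q>') + r.join
end

# Unwrap and check integrity; raises if the wrapped key was altered
# or the wrong KEK is supplied.
def key_unwrap(kek, wrapped)
  r = split_words(wrapped)
  a = r.shift.unpack1('Q>')
  n = r.length
  5.downto(0) do |j|
    (n - 1).downto(0) do |i|
      b = aes_block(kek, [a ^ (n * j + i + 1)].pack('Q>') + r[i], :decrypt)
      a = b[0, 8].unpack1('Q>')
      r[i] = b[8, 8]
    end
  end
  raise ArgumentError, 'key unwrap integrity check failed' unless a == KEY_WRAP_IV
  r.join
end
```

The wrapped output is 8 bytes longer than the input, which is the integrity check value carried in the first 64-bit word.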
+[https://tools.ietf.org/html/rfc5297](https://tools.ietf.org/html/rfc5297)
+Synthetic Initialization Vector (SIV) Authenticated Encryption
+Using the Advanced Encryption Standard (AES)
+October 2008
+* The nonce-based authenticated encryption schemes described above are
+  susceptible to reuse and/or misuse of the nonce.  Depending on the
+  specific scheme there are subtle and critical requirements placed on
+  the nonce.
+* ... many applications obtain access to cryptographic functions via
+  an application program interface to a cryptographic library.
+* These libraries are typically not stateful and any nonce,
+  initialization vector, or counter required by the cipher mode is
+  passed to the cryptographic library by the application.
+* Putting the construction of a security-critical datum outside the
+  control of the encryption engine places an onerous burden on the
+  application writer who may not provide the necessary cryptographic
+  hygiene.
+* Perhaps his random number generator is not very good or maybe an
+  application fault causes a counter to be reset.  The fragility of
+  the cipher mode may result in its inadvertent misuse.  Also, if
+  one's environment is (knowingly or unknowingly) a virtual machine,
+  it may be possible to roll back a virtual state machine and cause
+  nonce reuse thereby gutting the security of the authenticated
+  encryption scheme.
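To make the nonce-misuse hazard concrete, here is a small Ruby demonstration. It uses plain AES-256-CTR as a stand-in (not SIV, and not any mode OAK uses) because CTR shows the failure most directly: with the same key and nonce, the XOR of two ciphertexts equals the XOR of the two plaintexts. The messages and helper names are invented for the example.

```ruby
require 'openssl'

def ctr_encrypt(key, nonce, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-ctr')
  cipher.encrypt
  cipher.key = key
  cipher.iv = nonce
  cipher.update(plaintext) + cipher.final
end

# Byte-wise XOR of two equal-length strings.
def xor_bytes(a, b)
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack('C*')
end

key   = OpenSSL::Random.random_bytes(32)
nonce = OpenSSL::Random.random_bytes(16)

p1 = 'transfer $100 to alice'
p2 = 'transfer $999 to carol'

# The application bug: the same nonce is used for both messages.
c1 = ctr_encrypt(key, nonce, p1)
c2 = ctr_encrypt(key, nonce, p2)

# The keystream cancels out, so an eavesdropper learns p1 XOR p2
# without ever touching the key.
leaked = xor_bytes(c1, c2)
puts leaked == xor_bytes(p1, p2)
```

Anyone who knows or can guess one of the plaintexts recovers the other from `leaked`, which is the kind of fragility the RFC is describing.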
+[https://www.schneier.com/books/cryptography_engineering/](https://www.schneier.com/books/cryptography_engineering/)
+Cryptography Engineering: Design Principles and Practical Applications
+© 2010 Ferguson, Schneier, Kohno
+I cite here the book, not the website.
+* Chapter 3, Section 3.5.6, p59
+  * "Despite these cryptographic advances, **AES is still what we
+    recommend**. It is fast.  All known attacks are theoretical, not
+    practical... It is also the official standard, sanctioned by the
+    U.S. government."
+* Chapter 3, Section 3.5.7, p60
+  * "Note that **we advocate the use of 256-bit keys for systems with
+    a design strength of 128 bits**."
+  * "To emphasize our desire for 128 bits of security, and thus our
+    quest for a secure block cipher, **we will use AES with 256-bit
+    keys throughout the rest of this book**.  But once there is a
+    clear consensus of how to respond to the new cryptanalytic
+    results against AES, we will likely replace AES with another
+    block cipher with 256-bit keys."
+* Chapter 4, Section 4.5, p70
+  * "As with OFB mode, **you must make absolutely sure never to reuse
+    a single key/nonce combination**.  This is a disadvantage that is
+    often mentioned for CTR, but CBC has exactly the same problem.
+    If you use the same IV twice, you start leaking data about the
+    plaintexts.  **CBC is a bit more robust, as it is more likely to ...**"
+  * "The real question is whether you can ensure that the nonce is
+    unique.  If there's any doubt, **you should use a mode like
+    random IV CBC mode, where the IV is generated randomly and
+    outside of the application developer's control**."
+* Chapter 4, Section 4.7, p71
+  * "Nonce generation turns out to be a really hard problem in many
+    systems, so we do not recommend exposing to application
+    developers any mode that uses nonces. ... so **if you're
+    developing an application and need to use an encryption mode,
+    play it safe and use random IV CBC mode**."
+* Chapter 5, Section 5.5, p87
+  * "In the short term, **we recommend using one of the newer SHA
+    hash function family members - SHA-224, SHA-256, SHA-384, or
+    SHA-512**.  Moreover we suggest you choose a hash function from
+    the SHA(sub d) family, or **use SHA-512 and truncate the output
+    to 256 bits**.  In the long run, **we will very likely recommend
+    the winner of the SHA-3 competition**."
+  * JHW Note: [https://en.wikipedia.org/wiki/SHA-3](https://en.wikipedia.org/wiki/SHA-3) SHA-3 released August 5, 2015
+* Chapter 6, Introduction, p89
+  * "**Encryption** prevents Eve from reading the messages but **does
+    not prevent her from manipulating the messages**.  This is where
+    the MAC comes in."
+* Chapter 6, Section 6.6, p95
+  * "As you may have gathered from the previous discussion, **we
+    would choose HMAC-SHA-256**: the HMAC construction using SHA-256
+    as a hash function.  Most systems use 64- or 96-bit MAC values,
+    and even that might seem like a lot of overhead.  As far as we
+    know, there is no collision attack on the MAC value if it is
+    used in the traditional manner, so **truncating the results from
+    HMAC-SHA-256 to 128 bits should be safe**, given current
+    knowledge in the field"
+  * "GMAC is fast, but provides only at most 64 bits of security and
+    isn't suitable when used to produce short tags.  It also
+    requires a nonce, which is a common source of security problem
+    [sic -jhw]."
+* Chapter 6, Section 6.7, p97
+  * "This is where the Horton Principle comes in.  **You should
+    authenticate the meaning, not the message.** This means that **the
+    MAC should authenticate not only _m_, but also all the
+    information that Bob uses in parsing _m_ into its meaning**.
+    This would typically include data like protocol identifier,
+    protocol version number, protocol message identifier, sizes for
+    various fields, etc."
+  * "The Horton Principle is one of the reasons why **authentication
+    at lower protocol levels does not provide adequate
+    authentication for higher-level protocols**.  An authentication
+    system at the IP packet level cannot know how the e-mail program
+    is going to interpret the data.  This precludes it from checking
+    that the context in which the message is interpreted is the same
+    as the context in which the message was sent.  The only solution
+    is to have the e-mail program provide its own authentication of
+    the data exchanged - in addition to the authentication on the
+    lower levels of course."
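Pulling the book's recommendations together, here is a hedged Ruby sketch of random-IV CBC encryption followed by HMAC-SHA-256, with the MAC covering the header fields as well as the ciphertext (the Horton Principle). The names `seal`, `unseal`, `FORMAT_VERSION`, the `version:key_name` header, and the use of separate encryption and MAC keys are all assumptions made for this example. None of it is taken from OAK's actual wire format.

```ruby
require 'openssl'

FORMAT_VERSION = 'v1' # hypothetical format version for this sketch

# Length-prefix every field so that field boundaries are authenticated
# along with the contents.
def mac_input(header, iv, ciphertext)
  [header, iv, ciphertext].map { |f| [f.bytesize].pack('N') + f.b }.join
end

def hmac_sha256(mac_key, data)
  OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA256'), mac_key, data)
end

# Comparison whose running time does not depend on where the bytes differ.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

def seal(enc_key, mac_key, key_name, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = enc_key
  iv = cipher.random_iv # generated by the library, not by the caller
  ciphertext = cipher.update(plaintext) + cipher.final
  header = "#{FORMAT_VERSION}:#{key_name}"
  mac = hmac_sha256(mac_key, mac_input(header, iv, ciphertext))
  { header: header, iv: iv, ciphertext: ciphertext, mac: mac }
end

def unseal(enc_key, mac_key, box)
  expected = hmac_sha256(mac_key, mac_input(box[:header], box[:iv], box[:ciphertext]))
  raise ArgumentError, 'MAC mismatch' unless secure_equal?(expected, box[:mac])
  decipher = OpenSSL::Cipher.new('aes-256-cbc')
  decipher.decrypt
  decipher.key = enc_key
  decipher.iv = box[:iv]
  decipher.update(box[:ciphertext]) + decipher.final
end
```

Because the header is inside the MAC, changing the key name or version on a stored message is rejected before any decryption is attempted.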