Author: Ruchir Tewari

Bitcoin ransomware attacks – malware

Two ransomware attacks happened in March where the program encrypted files on computer systems in hospitals, in Kentucky and California, then demanded payment in bitcoin for access to the encryption key.

This month a similar attack happened on a hospital in Calgary.

Citrix recently reported a large percentage of enterprises are now buying bitcoin to protect against such an attack.

Backing up systems is important to recovery. Another type of countermeasure is VDI where the desktop is rendered from a remote VM where the information is protected.

Lexus bad-update details

A bad update on Lexus cars crashed its car entertainment system, affecting cars from California to Massachusetts. Details were reported  by SecurityLedger this week.

Users need to bring in their car to a Toyota/Lexus dealer to solve the problem via a forced reset and clear the bad data from the system. The bad behavior is due to incorrect handling of error data returned from third party web services.

“As more automakers embrace over-the-air software updates as a way to push out necessary fixes to vehicle owners, the prospect of unreliable and malicious updates causing real world disruptions has grown. In a March report to Congress (PDF), the U.S. Government Accountability Office (GAO) noted that modern vehicles feature many communications interfaces that are vulnerable to attack, but that measures to address those threats are likely years away, as automakers work to design more secure in-vehicle systems.” – SecurityLedger quote.

Updates and secure updates have not been a well-solved problem in the software world. A backup is usually recommended, but not always possible. Bringing the same solutions to cars and IoT seems like a bad idea.  The need for secure OTA auto updates to work has been noted, for e.g. at the “Five Star Automotive Cyber Safety Program” here and here. Yet it has not been a prominent part of the automotive manufacturers’ lexicon.

As we move towards autonomous cars and a corresponding increase in complexity, these problems would need to be solved in more elegant way.

ICSA Internet of Things Security Certification Requirements

ICSA recently announced an Internet of Things testing and certification program. It has six components (highlights in brackets) –

  1. cryptography (FIPS 140-2 crypto algos by default, secure PRNGs)
  2. communications (PKI auth, all traffic must be authorized)
  3. authentication (secure auth, protect auth data, no privilege escalation)
  4. physical security (tamper detection, defense, disable)
  5. platform security (secure boot, secure remote upgrade, DoS defense)
  6. alert/logging (log upgrades, attacks, tampering, admin access)

Their IoT security requirements framework is found here.

This is a great list. I think another dimension to think about is usability of the security – many products come with security options buried so deep in documentation or UI, that a regular user may not configure the device securely and leave it more open than intended – this has historically been true of a variety of webcams, SCADA systems, wifi routers and other devices.

WebServices Composition with AWS

Some interesting diagrams on composition of a device data processing pipeline with AWS are at –http://aws-de-media.s3.amazonaws.com/images/jmetzner_Hackday_Berlin.pdf

The services listed are:
Amazon Cognito: Identity and Security. Gets token with role for API access by a certain user.
Amazon Kinesis: Massive data ingestion. Uses token auth, but token signing can be a easiest.
AWS Lambda: Serverless Data Compute. Supposed to save on EC2 instance costs ( at the expense of lock-in ).
Amazon S3: Virtually unlimited storage. This is what really makes AWS tick.
Amazon Redshift: Petabyte-scale data analysis
It does not say what data goes to S3 and what data goes to the database.
On Redshift, here’s a comment from Nokia:
http://www.cio.com/article/2860383/data-warehousing/7-amazon-redshift-success-stories.html#slide3” where their volume of data “literally broke the database”, prompting them to look for more scalable solutions.
There is a tension between “servers” and “services”, which goes back to IAAS vs PAAS distinction. PAAS can be faster to develop with reduced focus on server maintenance. However the number of PAAS concepts to deal with is neither small nor particularly inviting, as instead of a single server, one now has to deal with multiple services, each has to be authenticated, priced,  guarded for possible misuse and each has the potential for surprises. A key to simplicity is how composable the services are.

Security Competition Open Sourced

Facebook made a Capture The Flag (CTF) cybersecurity competition open source and available this week  at https://github.com/facebook/fbctf .

There are several other CTF projects on github. I like that this approach to cybersecurity gets one thinking like an attacker. The problem is that the attack surface in highly connected systems is not obvious or easily modeled.

How about CTF competitions for IoT Security? There was one in March – http://www.wamda.com/memakersge/2016/04/challenges-possibilities-iot-big-data

capture_the_flah_competition_on_day_3_at_the_innovation_zone_courtesy_ftw

 

Reactive Microservices

I came across CQRS and attended a talk by James Roper on Lagom for micro services called “Rethinking REST” this week. The idea put forth in the talk was that REST services being synchronous are not ideal for microservices. Microservices should not be blocking , so something that emphasizes this async aspect would be preferable to REST.

How is this notion of async services different from a pub-sub model ? One way it goes beyond that is by proposing polyglot persistence. Different microservices should use their own persistence layer that is optimal – relational, nosql, timeseries, event log.

Lagom architecture is based on the book Reactive Services Architecture and that also suggests CQRS. The book proposes service isolation and that composition of systems with microservices should be done asynchronously via message passing.

A quote from the book on why bulkheads failed to save the Titanic:

“The Titanic did use bulkheads, but the walls that were supposed to isolate the compartments did not reach all the way up to the ceiling. So when 6 out of its 16 compartments were ripped open by the iceberg, the ship started to tilt and water spilled over from one compartment to the next, until all of the compartments were filled with water and the Titanic sank, killing 1500 people.”

The suggestion is that for higher availability there should be stricter isolation. The individual (micro)services may fail but the overall system should not be affected. Looking at it this way I think requires one to examine more closely the system design and its invariants.

Take CQRS as an example. In a query system, data is being read and not being modified; a query is a search operation on accumulated data – potentially a very large set with speed (availability) demands. Whereas in a command system, there is more real-time,  perhaps collaborative aspect which leaves most of the data untouched, but creates some new data which needs to be recorded (upload this image, send this message). Why should these two very different operations be served by the same backend ?

One of the systems that has done separation of concerns remarkably well is  AWS. Let’s take the separation of S3 and the database (say Dynamodb). Neither filesystem vendors nor database vendors came up with simple services to solve the problem of an app trying to upload an object and update a database that the object has been uploaded. The failure modes of each service are exposed to the client and the client is not forced to upload files to S3 via a server (bottleneck). S3 offers a read-write consistency for new uploads and eventually consistency for updates. Here the storage object, its universally addressable name, and its properties (backup, encryption, access, versioning, retention policies) are an invariant -a client does not need to fiddle with them apart from being assured of a certain level of service. One can call it an ingestion system.  More on that here.

The takeaway is that microservices should do one thing and do it very well, in a highly available, non-blocking manner. REST services can certainly be non-blocking, but they can also be blocking as described here, which is a problem.

A cornerstone of asynchronous services is messaging. Here’s a talk on Riak, a masterless database and messaging system, running on CloudFoundry, at GE – https://youtu.be/aYaz17qf7cE?t=2161 . It is followed by talks on microservices.

Nvidia nvGraph and Tesla P100 GPU

“The latest version of NVIDIA’s parallel computing platform gives developers direct access to powerful new Pascal features, including unified memory and NVLink. Also included in this release is a new graph analytics library — nvGRAPH — which can be used for robotic path planning, cyber security and logistics analysis, expanding the application of GPU acceleration into the realm of big data analytics.”

nvGRAPH supports three widely-used algorithms:

Page Rank is most famously used in search engines, and also used in social network analysis, recommendation systems, and for novel uses in natural science when studying the relationship between proteins and in ecological networks.

Single Source Shortest Path is used to identify the fastest path from A to B through a road network, and can also be used for a optimizing a wide range of other logistics problems.

Single Source Widest Path is used in domains like IP traffic routing and traffic-sensitive path planning.

In addition, the nvGRAPH semiring Sparse Matrix Vector Multiplication (SPMV) operations can be used to build a wide range of innovative graph traversal algorithms.

A paper on how to represent cyber attacks as graphs – http://csis.gmu.edu/noel/pubs/2015_IEEE_HST.pdf references the CAPEC, which is a collection of vulnerabilities for such a graph study.

A graphdb represents data using nodes, edges and properties.

graphdatabase_propertygraph

It allows querying with a graph traversal language such as Gremlin. Blazegraph offers a GPU accelerated graphdb. Here’s a graph of machine learning papers from Research Front Maps.

rfmap1

MapD, a GPU accelerated db recently got funding from Nvidia and others.

Elliptic Curve Crypto – an algebraic introduction

A brief review of algebraic concepts leading to Elliptic Curve Cryptography (ECC):

Algebraic Group: A set G with a single binary operator (+) that maps any 2 elements of G to another element of G, where the operator supports closure, identity, inverse and associative properties. If the + operator is commutative, G is called an Abelian group or a Commutative group. [matrix multiplication and quaterion multiplication are not commutative; the symmetric group Sn for n>=3 is non abelian – because transformation order of permutations matters; S5 is not solvable https://youtu.be/ebc8EdBNA-E https://youtu.be/h6JBiYeBfC4 https://m.youtube.com/watch?v=oM1LD80E7XU . I mention these because the notion of a group became useful and well-defined when non-Abelian groups such as S5 were encountered]For ECC the groups involved are commutative or Abelian.

Algebraic Ring: A set R with a two binary operators (+) and (*) that map any 2 elements of G to another element of R. Every Ring is an Abelian group with respect to (+). If it is Abelian with respect to (*) then it is called a Commutative Ring. Note the absence of inverse or division requirement. [In Zp below, if p were not prime, then one would have a Ring instead of a Field because inverses would not always exist.]

Algebraic Field: A set F with a two binary operators (+) and (*) that map any 2 elements of F to another element of F; where (+) and (*) are both commutative; and where each element of F has a unique inverse with respect to (*), which is also in F. The set of Rationals form a Field, the set of Integers do not. The group properties are followed wrt (+) and (*) individually and the respective groups are the additive group and the multiplicative group.

Polynomial: A construct consisting of repeated applications of (+) and (*) on elements of a Field (which by definition of a Field returns another element of the Field ). The set of polynomials forms a Ring.

Closed Field: If the roots of every polynomial on a Field belong to the Field, it is a Closed Field. The set of Complex numbers form a Closed Field, the set of Reals do not. [A Field extension can be used to extend a non-Closed Field to a Closed Field].

Algebraic Finite Field: A Field with finite number of elements. It is not a Closed Field, but satisfies other properties of a Field, specifically that multiplication is invertible. The number of elements of the Finite Field is called its order. An example is the modulus field, obtained by modulo division by a prime number P. Note that exponentiation on a finite field generates a distribution of all the numbers in the sequence less than the order – for large values of an exponent it produces an highly irregular distribution. This makes the inverse operation of discrete logarithm a hard problem. The inverse of a specific element ‘a’ of a finite field of order n is a^(n-2), by Fermat’s little theorem.

Galois Field: Same as above, a finite Field. Finite fields have order pk, where p is a prime and k an integer>=1. All finite fields of a given order are isomorphic. In a field of order pk, adding p copies of any element always results in zero; that is, the characteristic of the field is p. If a finite field had a non-prime order with different factors that would imply that different elements of the field could be added to themselves a different number of times to give 0, but each element of a field must have the same characteristic . We met a  GF (28) in AES S-boxes.

Elliptic-Curve (over reals): A cubic equation, arises in calculation of arclength of an ellipse and naturally has a group property, where two elements on the curve map to a third element also on the curve. See here. The group property is used to define EC-addition (A+B), EC-doubling (A+A) and EC-scalar-multiplication (A+A+..A = kA). This is the group operation on Elliptic curves, a reasonable explanation of why it exists is here. These two operations are defined in more detail- http://www.coindesk.com/math-behind-bitcoin/ . Lets observe that they are distinct from, but dependent on the underlying field properties. The EC operations are maps on tuples or points.  In history of elliptic curves it notes that whereas ellipses are quadratic, elliptic curves are cubic or quartic, attributing the name linkage to Wallis. It also notes the tangent-secant method of  Diophantus in using two rational solutions of a cubic to find a third rational solution (later proved to be an Abelian Algebraic Group: points A and B on a curve map to a third point C, and one defines Group addition with A+B =-C, where the – is reflection around x-axis. https://crypto.stackexchange.com/questions/53974/when-adding-two-points-on-an-elliptic-curve-why-flip-over-the-x-axis makes the argument using symmetry of A,B,C; important thing is this ‘addition’ is not a linear operation). For a tangent, look at Frey Elliptic curves and Tate’s review with its discussion on isogenies.

Elliptic-Curve over a Field: The set of points (tuples) belonging to a field F that satisfy an elliptic curve equation (y*y = x*x*x+a*x+b) with the (+) and (*) defined according to the field.  Again the name elliptic curve arises because this equation comes up when calculating the arclength of the ellipse; the actual plot of this over different fields does not look like a curve let alone an ellipse. What’s interesting is that the Group property noted above is retained over different Fields. This lecture discusses how consideration of whether the area of a right triangle can be 1, over a field of rationals gave rise to elliptic curves – https://www.youtube.com/watch?v=6eZQu120A80.

Elliptic-Curve over a Finite Field: Tuples drawn from a Finite Field F, that satisfy the elliptic curve equation (y^2 = x^3+ax+b).

Elliptic-Curve Crypto: Built on the hardness of solving the discrete logarithm problem: Given an elliptic curve E defined over a finite field Fq of order(=size) n, a point P ∈ E(Fq ), and a point Q ∈ E(Fq ), find the integer k ∈ [0,n −1] such that Q = k P. The integer k is called the discrete logarithm of Q to the base P, denoted by k = logP Q. This is asking for an inverse operation. Introduction to ECC by Christof Paar – https://www.youtube.com/watch?v=vnpZXJL6QCQ .

In the Bitcoin implementation of ECDSA, the finite field is a prime modulus field Zp and the EC equation is y^2 (mod p) = x^3+7 (mod p).  Zp uses the prime modulus p=2^256 – 2^32 – 977 = 115792089237316195423570985008687907853269984665640564039457‌584007908834671663.  A 256 bit integer has (2^10)^25.6 or 3*25.6~77 digit numbers. So two 256 bit numbers are mapped to a third 256 bit number. Sounds good for encryption- but how is this inverted? It is not: one of the first numbers, the scalar multipler k, is the private key.  This 256-bit scalar private key is multiplied with a Generator Point G to arrive at a public key, with a 256bit x and y coordinates for a a 512-bit public key. From the kG, the private key k cannot be derived; the strength depends on this irreversibility. The “curve” on the finite field is highly discontinuous and an example is shown here. sepc256k1 is the name of the curve y^2=x^3+7, a list of other curves is here – https://neuromancer.sk/std/secg/

So how do we generate and verify a signature – an ECDSA ? This math is a layer on top of the preceding maps for generating keypairs. See https://en.m.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm#Correctness_of_the_algorithm for details and some implementation weaknesses. Openssl operations on EC based keypairs are described here – https://wiki.openssl.org/index.php/Command_Line_Elliptic_Curve_Operations. The size of the keys for EC are smaller than the size of RSA keys for equivalent security. Since the size of a signature is the same as the size of the key in bits, this means the EC signatures are shorter in size than RSA signatures. A 256 bit EC key takes 32bytes, compared to 512bytes for a (roughly) comparable 4096 bit RSA key.

An ECDSA signature of a message m is a pair of integers (r,s) where r does not depend on the message and s depends on the hash of the message, e, and on r.

The signature depends on a parameter k which should be random or deterministically generated via (say) RFC6979.

Other elliptic curves on finite fields, including on F(2^m) are described in http://www.secg.org/sec2-v2.pdf.

The cryptol verification software has been used to find bugs in the implementation of a Elliptic Curve Crypto implementations. Implementation errors can dramatically impact security properties of crypto that appears mathematically sound.

Note that a Group sounds like a simpler structure than a Field, but a Galois Group is a more complex structure than a Galois Field. https://www.encyclopediaofmath.org/index.php/Permutation_group https://math.stackexchange.com/questions/45893/how-to-find-the-galois-group-of-a-polynomial

S-boxes

In several crypto systems, there are P-boxes and S-boxes. P-boxes are permutations of input to outputs. What are S-boxes ? I’ve known that they are static maps, used for “confusion” of input to output, have something to do with Galois fields, and that small changes in them affect the crypto properties of a cryptosystem; so they are analyzed by cryptographers in great detail.  Their design is a bit of a black box. Even in cryptol documentation, they are described as a kind of gift from above.

S-box stands for a substitution box.  The goal of the S-box is that for a small change of input there should be a large change in the output – an avalanche.  This property is clearly not true of pure transposition, substitution, vignere pad ciphers  – where a change of an input letter changes exactly one output letter. Because this property is not true, the simple crypto systems can be broken easily by analyzing output/input pairs.

A P-box can be described by a multiplication of the input (symbols, not sequence) with a matrix which is the identity matrix  with its rows transposed (scrambled) according to the permutation order.  So the P-box transform is an invertible linear transform. Can the S-box be another linear transform ? Suppose it were linear. One could give it unit input vectors and determine the rows of the transform, or use Gauss elimination with known input/output pairs.

So one would want this to be non-linear.  In fact, the AES S-box is a sophisticated permutation map – for each byte another unique byte is mapped in a lookup table . Isn’t a permutation linear ?  The key is that the bits of an output byte are not linearly related to the bits of the input byte. Looking at the Substitution-Permutation Network design makes it clear that this property will break Gauss elimination techniques on input/output pairs .

With that, here’s a reference which discusses the finite field math and the transform f_affine( g_finite_field_inverse( input_x ) ) which comprises the S-box design of AES – http://csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf . The design goal is explicitly to maximize the non-linearity and beat differential attacks which analyze differences of output with input.

The design considers bits as coefficients of a polynomial (1111 = x^3+x^2+x^1+x^0), bytes as degree 7 polynomials and defines an invertible multiplication of two bytes as a map to another byte. Irreducible polynomials over a field are not factorizable over the field. A polynomial presupposes multiplication and addition of elements (multiplication is distinct from addition, and specifically is not repeated addition – adding a number to itself gives zero). For a modulo(prime) finite group, multiplication can be defined using the same modulo operator that defines the group (In G7, 3*5 = 15%7 = 1). A polynomial representation of a byte is a polynomial of (max) degree 7 with coefficients 0 or 1. The multiplication that maps (byte * byte -> byte) can be defined by taking the product polynomial (a polynomial of max degree 14)  then taking modulus via a irreducible byte polynomial of degree 7 or higher (in analogy with modulo group above) to give back a polynomial of max degree less than 7, i.e. a byte. Note the modulus used in AES is a specific polynomial of degree 8, m(x), which is not itself a byte. The division is done with polynomial long division. For more on GF(28) see http://www.cs.utsa.edu/~wagner/laws/FFM.html.

The irreducibility of m(x) of degree 8 implies that the GCD of the product of any two bytes p(x)=a(x).b(x) and m(x) is 1, so m(x) does not cleanly divide any p(x), so the division p(x)/m(x) always has a non-zero remainder for non-zero p(x), hence a non-zero product according to above definition of multiplication. If m(x) were degree 7, then x*m(x)%m(x) = x, so x would map to itself, which is not good. The GCD=1 also implies that each element has a multiplicative inverse. The m(x) chosen is a pentanomial with non-zero terms (8,4,3,1,0) or 100011011 or 0x11B. This design yields modulus math on bytes. So you can raise a byte to the power of n and get back another (well-scrambled) byte.

What a cool secret language. Raising to a power is non-linear. But why choose n=-1 ?

The choice of n=-1 is described in the Rijndael paper here with the reference to Nyberg’94 paper here which carries the line “The author’s attention to the mapping x ->x^-1 was drawn by C. Carlet. He observed that the high nonlinearity property (i) was actually proven in the work of Carlitz and Uchiyama.” The non-linearity  is aimed at defeating linear cryptanalysis where a few input-output pairs can be used to guess the key. The design of the scheme in Nyberg paper is aimed at defeating differential cryptanalysis, by flattening out the variation in the output, so it is close to noise.

Yet, if the letter “s” is common, the corresponding S-box letter (from the lookup table) will also be common. This S-box must be only part of an encryption scheme.

The Nyberg paper has the following line: “However,the high nonlinear order of the inversion mapping and .. comes into effect if these mappings are combined with appropriately chosen linear or affine permutations which may vary from round to round and depend on the secret key.”  A counter paper is here –On Exact Algebraic [Non-]Immunity of S-boxes Based on Power Function

A clear explanation of nonlinearity of a function F as the minimum hamming distance between bytes reachable by linear functions and bytes reachable by the non-linear function F is found in The Design Of S-Boxes by Jennifer Miuling Cheung (2010) p.13.  For two input bits, there are 8 linear functions (1, x1, x2, x1+1, x2+1, x1+x2, x1+x2+1) and 4 input sequences of 00,01,10,11; the output of each of the 8 functions on the 4 input sequences gives 8 output vectors of length 4. But there are 16 vectors of length 4; of these the linear functions reach only 8, the remaining 8 unreachable by linear functions are are termed nonlinear.

This paper, discusses constant time implementations of AES S-Boxes. Instead of a lookup table, they rely on doing the actual multiplication in the finite field: “During the offline phase, we precompute values H,X ·H,X2 ·H,…,X127 ·H. Based on this precomputation, multiplication of an element D with H can be computed using a series of xors conditioned on the bits of D”

AES Pipeline

AES divides the input into chunks of 128 bits. 128 bits is 16 bytes. 16 bytes can be arranged in a 4×4 matrix and are represented in this two dimensional form by AES. This matrix is called the “state”.

Using cryptol, the AES algorithm is described with the following functions

AESRound : (RoundKey, State) -> State

AESRound (rk, s) = AddRoundKey (rk, MixColumns (ShiftRows (SubBytes s)))

The final round leaves out the MixColumns transform.

AESFinalRound : (RoundKey, State) -> State
AESFinalRound (rk, s) = AddRoundKey (rk, ShiftRows (SubBytes s))

The inverse transform is as follows

AESInvRound : (RoundKey, State) -> State

AESInvRound (rk, s) =
InvMixColumns (AddRoundKey (rk, InvSubBytes (InvShiftRows s)))

AESFinalInvRound : (RoundKey, State) -> State

AESFinalInvRound (rk, s) = AddRoundKey (rk, InvSubBytes (InvShiftRows s))

ASN1 Types

ASN1 is a data interoperability format that is widely used in directory, security and network management systems. Data is stored in triplets of TLV – type, length, value (in BER, DER encoding rules). TLV allows a format that is efficient, recursive and self-describing.

The type system is the interesting aspect. A “type” is a descriptor for the data, that the TLV triplet holds. The “type” is stored as a sequence of one or more bytes. This sequence can be as small  as a single byte or be as large as needed (unlimited length). In case of a single byte “type”, the bits 7,8 represent the class of the type (4 classes exist), bit 6 represents whether type is single atomic data element or nested, and bits 5-1 encode the tag of the type.  This single byte type can hold tags from 0 to 30. If the type is outside this range 0-30, the 5 to 1 bits are set to 1, and the actual tag starts in the following bytes of the now multi-byte type. In case of multi-byte “types”, the most significant bit of each byte must be 1, except for the last byte, which must be 0.

There are atomic types and component types.

Atomic types include OBJECT IDENTIFIER types and various strings (bits, ascii, octet), integers, null .

Component types include the ordered SEQUENCE and the unordered SET, both of which types can contain one or more occurrences of different types of data. SEQUENCE OF and SET OF are component types which contain zero or more occurrences of the same types of data.

There is potential for ambiguity as to whether an Object Identifier (OID) in the tagged notation is described as a multi-byte type since the OID is itself multi-byte. It is not, it is described as a single byte ASN1 type with tag = 06 as described in the tag table here and clarified by this Microsoft example of an OID encoding. So the OID value sits in the value field of the TLV triplet, not in the type.

An example of the encoding for RSA private key in PKCS#1 is here.

oid-info.com allows lookup of Object Identifiers. Here is a tree display for RSA private key which has OID 1.2.840.113549.1.1.1 – http://oid-info.com/cgi-bin/display?oid=1.2.840.113549.1.1.1+&submit2=Tree+display

What are examples of multi-byte ASN1 types ?  The EMV format used in payments and smartcards use two-byte types.

The “abstract” in the name came from a contrast to “transfer syntax notation” which is the on-wire format. The “abstract syntax” maps to “transfer syntax” via encoding rules.

For comparison, consider the XDR scheme used in SunRPC. Here the types (metadata) are not included within the protocol as tags, but defined externally in a .x file which is an input to an rpcgen compiler. Protobuf and capnproto also use external medata in a .proto file. ASN1 now supports Packed Encoding Rules (PER) which remove the tag information for greater compactness and efficiency.  Finally, while ASN1 continues to be used for highly structured information, the rapid growth of JSON/REST protocols in the identity space has been interesting.

The ‘DROWN’ attack on SSL and its extension to QUIC

The DROWN attack was recently shown to cause HTTPS sites supporting SSLv2 to be vulnerable to MITM attacks. An attack extension to QUIC is discussed in section 7 of their paper, along the same lines referenced earlier, which makes it relevant to current discussions of TLS.

Some background. Consider a simple encryption scheme that encrypts a message M to a ciphertext C. Should it always encrypt M to C ? One might think yes. But if yes, then an attacker could create his own extensive tables of encryptions of messages (Mi->Ci) , and then lookup the encrypted (intercepted) message Ci and infer what Mi was. Second, even if there is not a lookup hit, then the maps so collected could be used to deterministically modify Ci, to match another valid Mi.  This latter property is called malleability of encryption. Malleability may be desirable in some cases (homomorphic encryption), but in general it weakens encryption.

So determinism is bad and we want to make the encryption non-deterministic. We can do so by adding some random bytes to beginning of the message every time we encrypt. After decryption, we remove those random bytes and get back our message. These random bytes are called an initialization vector, or also called ‘padding‘.

The way padding is added, is important. If the attacker is able to infer that certain messages do not decrypt to the expected padding format (via error messages returned from the server), then he can narrow the range of valid messages. A ‘padding oracle’ is a server that freely answers whether an encrypted message is correctly padded or not. This is the basis of the Bleichenbacher attack.

The SSLv2 servers are using an old form of the RSA which is unpadded, deterministic and malleable. This property is the basis of many attacks, that involve a protocol downgrade (e.g. a PFS cipher to RSA cipher, or TLS to SSL) , or attack a known weak protocol. The Breach attack used the fact that compression changed the size of a message in a predictable way to infer bytes of a key. Compression is disabled in TLS1.2.

But why should this attack be possible with newer protocols like QUIC ? QUIC does not even support PKCS#1 v1.5 encryption.

QUIC achieves low latency by caching TLS state on the client. If any part of the protocol is using a deterministic behaviour it can be exploited. The Jager paper makes the following observations. 1. The server authentication is performed only in the connect phase, not in the repeat phase.  and 2. The signed SCFG message is made independent of the client request, which makes it possible for the server or an attacker to precompute it. So, there is exploitable determinism.

Certain fixes have been proposed in QUIC v31, to switch the server signature from a static signature of the server config to a signature of the server config with a hash of the client hello message. This is claimed to eliminate the DROWN vulnerability.

Erlang ‘for’ and ‘if’

When thinking of immutable state, I imagine a large

input->[processingbox]->output

box being broken into a chain of multiple smaller mini-boxes,

input0->[pbox0]->output0 -> [pbox1]->output1 ->[pbox2]->output3

with each individual box accomplishing a change in its input to its output, without keeping any internal mutable state.

Which reminds me of Linear-Time-Invariant systems. As long as each part of the system is linear and time-invariant, a large system can be composed of simpler parts and still be analyzed by deriving a composite transfer function. The composite system can be studied for its stability and loss/gain characteristics and adjustments made in the system design.

I suspect a similar payoff occurs with functional programming. But it is not so clearly stated or visible in the ability to analyze large programs. Erlang is essentially a version of event driven programming. The conciseness of expression is encouraging, but I hope to arrive at good examples of program composition.

Meanwhile the ‘if’ statement in erlang is a curveball. In addition to the odd syntax, and statements like true->false, one finds one cannot place a log statement anywhere inside a set of statements to see what’s going on. Punctuation rules. The trick is to place the log a comma before the last expression before the semicolon.

if
   Val >= Left andalso Val =< Right ->
   io:format("==> ok~p<~p<~p\n\n",[Left,Val,Right]),
   true;

   true ->
   io:format("==>not ok ~p<~p<~p\n\n",[Left,Val,Right]),
   false
end.

The semicolons indicate different phrases which are pattern-matching options. The commas are continued statements within the same phrase. The last statement in the phrase is the return value. The last phrase is typically a catch-all, in this case returning a value of false. The case statement is more commonly used than the if. The -> denotes  ‘lhs condition is mapped to rhs’ .

The ‘for’ statement does not exist at all and below are alternatives.

foreach(Fun, List)->ok, %% just applies a function Fun to each element of a list
foldl(Fun, Accumulator, List)-> Acc1,  %% fold list left to right into an accumulator
foldl(Fun, Accumulator, List)-> Acc1 %% fold list right to left into an accumulator
map(Fun, List1) -> List2  %% maps a list to a new list

The = sign in erlang is used for (a) immutable assignment of a term to a value and (b) for comparing if a previously assigned term or expression of terms is the same as another term or value – this is called pattern matching. One cannot reassign a term to a new value, so the math notion of equality always holds. The = used in C/Java could be called mutable assignment, every variable is hiding within it mutable state. A term in erlang does have an internal state of bound or unbound. On a successful pattern match any unbound variable becomes bound.

1> [H|T] = [1,2,3].

2> io:write(H) .

To generate a list of numbers, like range(n) in python, there is lists:seq(1,n). io:fwrite(“~p”, [lists:seq(1,20)]).

The philosophy behind some of these decisions is discussed here https://news.ycombinator.com/item?id=13499377. An objective is to make it impossible to write runaway loops – to make the runtime system always be able to recover and reallocate its cycles. A related aspect is to let-it-fail and not write a lot of exception handling, defensive code.

My erlang work was done while modifying RabbitMQ (written in erlang) to support features for certificate based authentication for MQTT, while consulting at GE Predix.

Erlang OTP (for Open Telecom Platform) is a layer on top of erlang language and proposes design principles for organizing code including ideas around failure handling with supervision trees. These form the basis of the actor model, where code is structured as isolated processes that communicate by sending and receiving messages. The actor can crash but is supervised by another actor. More in https://www.brianstorti.com/the-actor-model/ .

Tail recursion in erlang with an accumulator, is used as a replacement for loops – https://learnyousomeerlang.com/recursion

‘Tail recursion is a way to transform a linear process (it grows as much as there are elements) to an iterative one (there is not really any growth).’

Whatsapp talk on Engineering for Scale mentions Erlang allows hot-swapping and loose coupling because of message queues everywhere, and after fixing a bug looking for places where contagion spread, meaning there was not enough loose coupling there – https://developers.facebook.com/videos/f8-2016/a-look-at-whatsapp-engineering-for-success-at-scale

Bytecode VMs

I came to be looking at different language VMs recently- Erlang BEAM and the Smalltalk VM. Smalltalk VM was a bytecode based stack VM (just like Java VM), and likely influenced the VM of Erlang. 

Java bytecode verification checks (and their vulnerabilities in bytecode loaders) are well known, but the security characteristics of other VMs and even languages running on JVM is less well known (clojure/scala ?).

BEAM is the Erlang VM (Bogdan/Björn’s Erlang Abstract Machine). It is the successor of JAM (Joe’s Abstract Machine) which was inspired by Prolog WAM. Details on it are found in the Hitchhiker’s Tour of BEAM, with ways to crash the BEAM VM such as creating too many atoms (which never get deleted). BEAM can run on bare-metal on XEN – http://erlangonxen.org . The format is described as based on EA IFF 85 – Standard for Interchange Format Files with a FOR1 starting 4 bytes. Here’s the full 6k lines of beam_load.c .

The code-to-BEAM-bytecode processing pipeline is described here as including a preprocessing step (see -E, -S, -P options to erlc). An interesting problem of peeking and pattern-matching inflight messages is discussed here. I find it interesting to think what would happen if one froze an erlang VM to see all inflight messages – like putting a breakpoint in the kernel. The way to get a stacktrace is erlang:get_stacktrace()

The hot-swapping capability is worrying, coming from the objective-c world that had code-signing to lock down changes. Erlang does have a strong process isolation model.

This article on Dart makes the case for non-bytecode VMs with the following  line “When your VM’s input is program source code, it can rely on the grammar of the language itself to enforce important invariants”. Sounds relevant to byte-code VMs as well.