Category: systems

Traffic limiting with HAProxy stick-tables to protect against DDoS attacks

Websites need a traffic rate-limiting feature to keep their backends safe from abusive or malfunctioning clients. This requires the ability to track user sessions of a particular type and/or from a given IP address. HAProxy is an HTTP proxy which can be configured as a reverse proxy to protect a website. It uses an in-memory table to store state related to incoming HTTP connections, indexed by a key such as the client IP address. This table is called a stick-table and is enabled using the ‘stick-table’ directive in the haproxy config file.

The stick-table directive allows specifying the key, the size of the table, the duration in seconds for which an entry (key) is kept, and various counters such as currently active connections, connection rate, HTTP request rate, HTTP error rate, etc.

Stick tables are very useful for rate-limiting traffic, and for tagging traffic that meets certain criteria (such as a high connection or error rate) with a header that the backend can use to log the traffic. They can thus be used to detect and protect against DDoS attacks.

HAProxy receives client requests in its frontend and sends those requests to servers in its backend.  The config file has corresponding frontend and backend sections.

The origin of this rate-limiting feature request, along with an example, is at https://blog.serverfault.com/2010/08/26/1016491873/ . Server Fault is a high-traffic website, so the feature working for them is a good indication of its effectiveness.

Here’s an example.

frontend http
    bind *:2550

    stick-table type ip size 200k expire 10m store gpc0

    # check the source before tracking counters; that will allow the
    # entry to expire even if there is still activity
    acl whitelist src 192.168.1.154
    acl source_is_abuser src_get_gpc0(http) gt 0
    use_backend ease-up-y0 if source_is_abuser
    tcp-request connection track-sc1 src if !source_is_abuser

    acl is_test1 hdr_sub(host) -i test1.com
    acl is_test2 hdr_sub(host) -i test2.com

    use_backend test1 if is_test1
    use_backend test2 if is_test2

backend test1
    stick-table type ip size 200k expire 30s store conn_rate(100s),bytes_out_rate(60s)
    acl whitelist src 192.168.1.154

    # values below are specific to the backend
    tcp-request content track-sc2 src
    acl conn_rate_abuse sc2_conn_rate gt 3
    acl data_rate_abuse sc2_bytes_out_rate gt 20000000

    # abuse is marked in the frontend so that it's shared between all sites
    acl mark_as_abuser sc1_inc_gpc0 gt 0
    tcp-request content reject if conn_rate_abuse !whitelist mark_as_abuser
    tcp-request content reject if data_rate_abuse mark_as_abuser

    server local_apache localhost:80

Note that the frontend and backend sections each define their own stick-table.

A general strategy would be to allow enough headroom for legitimate traffic to pass, drop abnormally high traffic, and flag intermediate-risk traffic to the backend so it can either drop it or log the request for appropriate action, including potentially adding the IP to an abusers list for correlation, reverse lookup and other analysis. These objectives are achievable with stick-tables; a sketch of the header tagging follows.
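As a hedged sketch (the http-request set-header directive requires a reasonably recent HAProxy version; the header name is made up, and the ACL reuses conn_rate_abuse from the example above), intermediate-risk traffic could be tagged for backend logging rather than rejected:

    # in the backend: tag suspicious-but-not-rejected traffic for the backend to log
    http-request set-header X-Suspect 1 if conn_rate_abuse !whitelist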

An overview of the HAProxy config file, with the sections global, defaults, frontend and backend, is here.

Stick tables are implemented using elastic binary trees (ebtrees):

https://github.com/haproxy/haproxy/blob/master/include/types/stick_table.h

https://github.com/haproxy/haproxy/blob/master/src/stick_table.c

https://wtarreau.blogspot.com/2011/12/elastic-binary-trees-ebtree.html

Related: for analysis of packet captures in a DDoS context, a useful tool is Python dpkt – https://mmishou.wordpress.com/2010/04/13/passive-dns-mining-from-pcap-with-dpkt-python .
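A hedged sketch of using dpkt in this context (the pcap filename is an assumption): tally packets per source IP to surface heavy hitters worth rate-limiting:

import socket
from collections import Counter

import dpkt

counts = Counter()
with open("traffic.pcap", "rb") as f:
    for ts, buf in dpkt.pcap.Reader(f):
        eth = dpkt.ethernet.Ethernet(buf)
        if not isinstance(eth.data, dpkt.ip.IP):
            continue  # skip non-IP frames (ARP, etc.)
        counts[socket.inet_ntoa(eth.data.src)] += 1

# the top talkers are candidates for rate-limiting or blocking
for ip, n in counts.most_common(10):
    print(ip, n)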


Javascript Timing and Meltdown

In response to the Meltdown/Spectre side-channel vulnerabilities, which are based on fine-grained observation of the CPU to infer the cache state of an adjacent process or VM, one mitigation response by browsers was to reduce the time resolution of various time APIs, especially in JavaScript.

The authors responded with alternative sources of fine-grained timing available to browsers. An interpolation method recovers a resolution of 15 μs from a timer that is rounded down to multiples of 100 ms.
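A sketch of the interpolation idea, in Python rather than JavaScript (the 10 ms coarse tick is a stand-in for a degraded browser timer): calibrate how many busy-loop iterations fit in one coarse tick, then use the iteration count as the fine-grained clock:

import time

COARSE = 0.01  # simulate a 10 ms timer by rounding time.monotonic() down

def coarse_now():
    return int(time.monotonic() / COARSE) * COARSE

def iterations_per_tick():
    """Calibrate: count busy-loop iterations in one full coarse tick."""
    start = coarse_now()
    while coarse_now() == start:   # align to a tick edge
        pass
    edge, n = coarse_now(), 0
    while coarse_now() == edge:    # count until the next edge
        n += 1
    return n

cal = iterations_per_tick()
print("loop iterations per %.0f ms tick: %d" % (COARSE * 1e3, cal))
print("interpolated resolution: ~%.4f ms" % (COARSE * 1e3 / cal))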

The JavaScript high-resolution time API is still widely available and is described at https://www.w3.org/TR/hr-time/ , with a reference to previous work on cache attacks in “Practical cache attacks in JS”.

A Meltdown PoC is at https://github.com/gkaindl/meltdown-poc , to test the timing attack in its own process. The RDTSC instruction returns the Time Stamp Counter (TSC), a 64-bit register that counts the number of cycles since reset, and so has a resolution of 0.5 ns on a 2 GHz CPU.

#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc() */

int main() {
    unsigned long long i;
    i = __rdtsc();       /* read the 64-bit time stamp counter */
    printf("%llu\n", i);
    return 0;
}

Ethereum Security and the DAO Solidity Attack

The basics of Ethereum are described in the Gavin Wood paper. The keywords of Solidity are listed in this file from its source, which includes “address”, “contract”, “event”, “mapping” and “wei” (1 ETH = 10^18 Wei). The list does not include “gas”, which is a mechanism described in Wood’s paper to combat abuse. Interestingly, the paper says “The first example of utilising the proof-of-work as a strong economic signal to secure a currency was by Vishnumurthy et al [2003]”, aka the Karma paper.

The Karma paper talks about solving a cryptographic puzzle as enabling one to join the network and be assigned a bank-set: “the node certifies that it completed this computation by encrypting challenges provided by its bank-set nodes with its private key. Thus each node is assigned an id beyond its immediate control, and acquires a public-private key pair that can be used in later stages of the protocol without having to rely on a public-key infrastructure”. Crypto puzzles for diverse problems have been proposed before; a survey and comparison is at https://pdfs.semanticscholar.org/d8b9/a0309cef8c309541876c9c2c5ad5c16c3b7a.pdf

The DAO attack had three components: a hacker, a malicious contract and a vulnerable contract. The malicious contract was used to repeatedly withdraw funds from the vulnerable contract, re-entering it so that execution never reached the code that decrements its balance. Oddly enough, the gas mechanism, which is supposed to limit (runaway) computation, did not kick in to stop this repeated remittance.
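A toy Python model of the double-withdraw (not Solidity; names and amounts are made up). The vulnerable contract performs the external transfer before zeroing the balance, so the attacker’s receive handler can re-enter withdraw():

class Vulnerable:
    def __init__(self):
        self.balances = {}

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount

    def withdraw(self, who):
        amount = self.balances.get(who, 0)
        if amount > 0:
            who.receive(self, amount)   # external call happens first...
            self.balances[who] = 0      # ...balance is zeroed too late

class Attacker:
    def __init__(self):
        self.loot, self.reentered = 0, False

    def receive(self, contract, amount):
        self.loot += amount
        if not self.reentered:          # re-enter once, for the demo
            self.reentered = True
            contract.withdraw(self)     # balance has not been zeroed yet

dao, thief = Vulnerable(), Attacker()
dao.deposit(thief, 100)
dao.withdraw(thief)
print(thief.loot)   # 200: twice the deposit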

A few weeks before the DAO attack, someone had pointed out to me that the security of Solidity was a bit of an open problem. My feeling was that contracts should be layered above the value exchange mechanism, not built into it. Bitcoin-based protocols with the simpler OP_RETURN semantics appeared more solid. Later, around October ’16 at an Ethereum meetup, Fred Ehrsam made the comment that most new projects would be using Ethereum instead of Bitcoin. But Bitcoin meetups had more real-world use cases being discussed. Bitcoin’s technical limitations exist, and are being addressed by forks such as SegWit2x this November. Today saw a number of interesting proposals with Ethereum, including Dharma, DataWallet and BloomIDs. Security would be a continuing concern with the expanding scope of such projects.

Sequence diagram of the attack, showing a double-withdraw –

More on Ethereum –

It uses the Keccak-256 hash function.

It supports the Ethereum Virtual Machine. The EVM is formally specified in section 9 of the Gavin Wood paper. Some properties –

  • computation on the EVM is intrinsically tied to a parameter ‘gas’ which limits the amount of computation
  • stack-based machine, with a max stack depth of 1024
  • 256-bit word size and 256-bit stack item size; 256 bits were chosen to facilitate the Keccak-256 hash and elliptic-curve computations
  • the machine halts on various exceptions including OOG (out of gas)
  • the memory model is a word-addressed byte array
  • program code is stored in a ‘virtual ROM’ accessed via a specialized instruction
  • runs the state transition function

The state transition function takes as input the state σ, the remaining gas g, and a tuple I. The execution agent must provide in the execution environment certain information, contained in the tuple I:

  • Ia, the address of the account which owns the code that is executing
  • Io, the sender address of the transaction that originated this execution
  • Ip, the price of gas in the transaction that originated this execution
  • Id, the byte array that is the input data to this execution; if the execution agent is a transaction, this would be the transaction data
  • Is, the address of the account which caused the code to be executing; if the execution agent is a transaction, this would be the transaction sender
  • Iv, the value, in Wei, passed to this account as part of the same procedure as execution; if the execution agent is a transaction, this would be the transaction value
  • Ib, the byte array that is the machine code to be executed
  • IH, the block header of the present block
  • Ie, the depth of the present message-call or contract-creation (i.e. the number of CALLs or CREATEs being executed at present)

The execution model defines the function Ξ, which computes the resultant state σ′, the remaining gas g′, the suicide list s, the log series l, the refunds r and the resultant output o, given these definitions:

(σ′, g′, s, l, r, o) ≡ Ξ(σ, g, I)

The official code is written in Golang (~640,000 lines). A code walkthrough is at https://www.mslinn.com/blog/2018/06/13/evm-source-walkthrough.html

The table of Opcodes is at https://github.com/ethereum/go-ethereum/blob/master/core/vm/jump_table.go

But what is a formal notion of a contract (specifically on the EVM, which aims to generalize contracts)? Is it any function? Is it a function of a specific template, expecting certain inputs/outputs? Can these functions be composed?

The notion of a contract is delved into at https://nakamotoinstitute.org/formalizing-securing-relationships/

“The contractual phases of search, negotiation, commitment, performance, and adjudication constitute the realm of smart contracts.”

Example – a vending machine. The machine ‘takes in coins and dispenses change and product according to the displayed price’; it is ‘a contract with bearer: anybody with coins can participate in an exchange with the vendor’.

So one can think of a smart contract as a digital vending machine. That’s actually a useful mental model. With a digital vending machine (DVM?), the buyer can make a choice of what product they want (of possibly very many, not limited to a physical machine) and be served that product upon due payment, without interaction with another person or agency. A toy sketch is below.
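A toy sketch of that mental model (products and prices are made up; no blockchain mechanics here, just the contract-as-state-machine idea):

class DigitalVendingMachine:
    PRICES = {"song": 3, "ebook": 10}

    def __init__(self):
        self.escrow = 0   # coins inserted but not yet spent

    def insert_coins(self, amount):
        self.escrow += amount

    def buy(self, product):
        price = self.PRICES[product]
        if self.escrow < price:
            raise ValueError("insufficient payment")
        change, self.escrow = self.escrow - price, 0
        return product, change   # dispense product and change, no third party

vm = DigitalVendingMachine()
vm.insert_coins(5)
print(vm.buy("song"))   # ('song', 2)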

Linux Bridges and Firewalls for Containers

Hubs and repeaters work on bits at Layer 1. Bridges and switches work on frames, the protocol data unit for Layer 2. A hub supports a star configuration: like a splitter, it simply forwards traffic received on one segment to all the others. A repeater amplifies the signal for longer distances. A bridge is a smarter type of hub, usually implemented in software, which looks at the destination MAC address of the frame and forwards it only to the segment with the MAC matching that destination, thereby avoiding unnecessary network traffic on all the other segments. It does not look at IP addresses. A switch is a reincarnation of a bridge in hardware, with several ports. The basic operation of a bridge/switch is L2 forwarding, which includes ARL (Address Resolution Logic) table learning and ageing. An L3 router switches packets based on the IP address instead of the frame address.

A Linux bridge from a VM to the host does not need an IP address, but assigning one is a good idea if you want to talk to the host for management or set firewall rules on it.

To create a bridge br0 between two interfaces eth0 and eth1, one runs:

brctl addbr br0
brctl stp br0 on
brctl addif br0 eth0
brctl addif br0 eth1

Since a bridge is an L2 construct, how can one filter IP addresses with it? A quote: “An Ethernet interface cannot usefully have an IP address if it is also attached to a bridge. However it is possible to achieve the same effect by binding an address to the bridge itself:”

ifconfig br0 192.168.0.1

An example of firewall configuration on a bridge is sketched below.
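A hedged illustration (the addresses and interface names are hypothetical): ebtables can match IP fields on frames crossing the bridge, and the iptables physdev module can match on bridge ports:

ebtables -A FORWARD -p IPv4 --ip-dst 192.168.0.5 -j DROP
iptables -A FORWARD -m physdev --physdev-in eth0 -j ACCEPT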

Troubleshooting bridges:

brctl showmacs br0

Docker creates a few networks by default. A primer on Docker container networking and the bridges it creates is here and here. Here are three common ones.

bridge: the docker0 bridge. This is the default, allowing all containers on the host to talk to each other.

host: the container shares the host’s network stack, without any isolation.

none: the container gets no network interface beyond loopback.

Some problems with host iptables rules reveal themselves in working with docker.

A look at firewalld, ebtables, ipset and nftables is at https://www.slideshare.net/albertspijkers/red-hat-enterprise-linux-next-generation-firewall . These command-line tools talk to the kernel over a netlink socket.

A more extensible approach is the Berkeley Packet Filter (BPF): the classical version is now called cBPF, and its extended successor is eBPF. A very basic problem is filtering HTTP or DNS packets based on the destination hostname rather than the IP address; this is discussed in an example by Cloudflare here.

NVidia Tiny Linux Kernel and TrustZone

The NVidia Tiny Linux Kernel (TLK) is 23K lines of BSD-licensed code, which supports multi-threading, IPC and thread scheduling, and implements TrustZone features for a Trusted Execution Environment (TEE). It is based on the Little Kernel for embedded devices.

The TEE is an isolated environment that runs in parallel with the operating system, providing security. It is more secure than an OS and offers a higher level of functionality than a Secure Element (SE), using a hybrid approach that utilizes both hardware and software to protect data. Trusted applications running in a TEE have access to the full power of a device’s main processor and memory, while hardware isolation protects them from user-installed apps running in the main operating system. Software and cryptographic isolation inside the TEE protect the trusted applications contained within from each other. A paper describing the separation, with alternatives for virtualizing the TEE, appeared at https://ipads.se.sjtu.edu.cn/lib/exe/fetch.php?media=publications:vtz.pdf

The Trusted Foundations TEE software was developed by Trusted Logic, which was acquired by Gemalto. Giesecke & Devrient developed a rival implementation named Mobicore. In April 2012 ARM, Gemalto and Giesecke & Devrient combined their TrustZone portfolios into a joint venture, Trustonic, which was the first to qualify a GlobalPlatform-compliant TEE product in 2013.

A comparison with other hardware-based security technologies is found here. Whereas a TPM is exclusively for security functions and does not have access to the main CPU, the TEE does have such access.

Attacks against TrustZone on Android are described in this Black Hat talk. With a TEE exploit, “avc_has_perm” can be modified to bypass SELinux on Android. Access Vectors in SELinux are described in this wonderful link; “avc_has_perm” is the function that checks whether the Access Vector allows a permission.

Neural Network learning

Notes from Autonomous Driving Udacity course.

Weights


Embedded Neural Nets

A key problem for embedded neural networks is reduction of size and power consumption.

The hardware on which the neural net runs can be a dedicated chip, an FPGA, a GPU or a CPU. Each of these consumes roughly 10x the power of the previous choice, but in terms of upfront cost the dedicated chip costs the most and the CPU the least. An NVidia whitepaper compares GPU with CPU on speed and power consumption. (It discusses key neural networks like AlexNet. AlexNet was a breakthrough in 2012, showing a neural network to be superior to other image-recognition approaches by a wide margin.)

Reducing the size of the neural network also reduces its power consumption. For NN size reduction, pruning of the weak connections in the net was proposed in “Learning both Weights and Connections for Efficient Neural Networks” by Song Han and team at NVidia and Stanford. This achieved roughly a 10x reduction in network size without loss of accuracy. Further work in “Deep Compression” achieved a 35x reduction.
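A hedged sketch of the core idea, magnitude pruning (the threshold and layer size are made up; in the paper, pruning is followed by retraining the surviving weights, and the cycle is iterated):

import numpy as np

def prune(weights, threshold=0.1):
    """Zero out connections whose magnitude falls below the threshold."""
    mask = np.abs(weights) >= threshold
    return weights * mask, 1.0 - mask.mean()

rng = np.random.default_rng(0)
w = rng.normal(0, 0.1, size=(256, 256))   # a toy fully-connected layer
pruned, removed = prune(w, threshold=0.1)
print("removed %.0f%% of connections" % (removed * 100))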

Today I attended a talk on SqueezeNet by Forrest Iandola. His team at Berkeley modified (squeezed) the original architecture, then applied the Deep Compression technique above to achieve a 461x size reduction over the original, to 0.5 MB. This makes it feasible for mobile applications. The paper also references V. Badrinarayanan’s work on SegNet, a different NN architecture, discussed in a talk earlier this year.

The Nervana acquisition by Intel earlier this year was for a low-power, GPU-like SoC chip with very high memory bandwidth.

Spark, Storm, Ayasdi, Hadoop

The huge amount of data that IOT systems will generate will call for analyses of different types. A brief review of some systems and what they are used for.

Apache Spark: Uses distributed memory abstractions for primarily in-memory processing. Written in Scala. Useful for finding data clusters, and for detecting statistical anomalies by looking for distance from the cluster. Comes with a machine learning library on top. Does not come with its own file system (use NFS/HDFS). Useful for complex processing where state needs to be maintained in the event stream for correlations. Described as ‘batch processing with micro-streaming analysis’, but looks headed to cover streaming analyses as well.

Apache Storm: Real-time streaming data analysis. Developed at Twitter, written in Clojure. Unlike Hadoop, which has two layers (map, reduce), Storm can have N layers, with a flexible topology consisting of Spouts (data source units) and Bolts (data processing units). Storm has been superseded by Heron in terms of performance. IBM Streams is a commercial offering, also for stream processing.

Ayasdi: Topological data processing allows one to discover what interesting features of the data are, without knowing what to look for in advance. This is in contrast to most systems where one needs to know what one is looking for. Claims insight discovery.

Hadoop: Used for batch processing of large amounts of data, using map/reduce primitives. Comes with HDFS. Cloudera (and others) have made significant improvements to it, with an interactive SQL interface and usability improvements for BI (Impala).

InfluxDB: Time-series db for events and metrics. Optimized for writes and claims to scale to IOT workloads.

ZooKeeper: A coordination service for distributed applications.

Amazon S2N and OpenSSL

In the last few years a number of OpenSSL vulnerabilities have come to light. Heartbleed was a critical one which was exploited in the field. It allowed a client to send a malicious heartbeat to the server and get back chunks of server memory, which can contain passwords. It was estimated that two-thirds of the servers in the world had the vulnerability. The fix was to upgrade OpenSSL, revoke existing server certs and request new SSL server certs.

Heartbleed prompted OpenBSD to fork OpenSSL into LibreSSL, and Google to fork OpenSSL into BoringSSL.

Amazon S2N is a TLS/SSL implementation in about 6,000 lines of code: small, compact and fast, and its correctness can be more easily verified. It reuses only the crypto functions from OpenSSL and reimplements the SSL layer. This is a healthy direction for IOT and for certification of SSL implementations, for example FIPS. S2N is short for Signal to Noise.

A timing attack was recently identified against it and has since been mitigated.

Note that two-factor auth solutions would mitigate the stolen-password problem presented by Heartbleed. There are several solutions in this area – Authy, Clef, Google Authenticator, Duo, Okta, Oracle Authenticator, ..

Docker Container Security

A block diagram of docker is below and a description of the docker daemon is here. The docker client commands talk to the docker daemon to start one of the containers from the docker registry, or to start a process described on the command line as a new docker container. Docker provides a simple interface to Linux container technology, which is a lightweight alternative to a VM.

[Figure: docker containers vs VMs]

A few problems with this. Who has access to the docker daemon to control the containers? How is the integrity of the containers ensured? How is the host protected from the code running in the containers?

Docker recently announced a few security features at the November DockerCon:

  • locking down container images in a registry, with the image signed with a key from a hardware device (YubiKey); see here for a description of the original issue, where image checksums were not verified by the docker daemon
  • scanning the official container images for vulnerabilities
  • running containers with a user-level namespace instead of one that allows root access to the host. This protects the host OS, as explained here. The user-level namespace feature has been available in LXC for over a year, but not in docker.

For access control to the docker daemon, work is in progress, with a design doc here.

Twistlock is a container security and monitoring tool that attempts a comprehensive approach: access control to the containers, runtime scanning of files for malware signatures, vulnerability scanning, looking at network packets, and so on. A recent meetup on Dec 1 discussed this product. It features integration with Kerberos and LDAP.

In terms of the kernel, processes from all containers share the same kernel and the same networking layer. So what is the level of isolation provided to container processes? It depends on vulnerabilities in the processes themselves: how many ports are open, whether injection attacks are possible, etc. If a process in one container attacks a process in another, for example by memory scraping, then Twistlock can detect it only if it can identify the offending process as malware using signature matching.

A Dockerfile is used to specify a container image, using commands that spec the base OS, packages, utilities and scripts. USER specifies the user id under which the following RUN, CMD or ENTRYPOINT instructions run. EXPOSE specs a port to be opened for external access. A docker image is built from the Dockerfile and contains the actual bits needed for the container to run. The image can be loaded directly or pushed to a docker registry, from which it can be pulled by clients. A minimal example follows.
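A minimal sketch of a Dockerfile using these directives (the base image, package and user name are illustrative choices, not from the original post):

FROM ubuntu:16.04                               # base OS
RUN apt-get update && apt-get install -y curl   # utilities
RUN useradd -m appuser
USER appuser     # subsequent RUN/CMD/ENTRYPOINT instructions run as appuser
EXPOSE 8080      # port to be opened for external access
CMD ["/bin/bash"]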

Commands:

docker build -t <imgnametag> .   # build image from Dockerfile in current directory

docker run -i -t <imgnametag> /bin/bash

docker login   # log in to a registry

docker push

docker pull

docker-compose [down|up]   # reads docker-compose.yaml

docker images

docker export <container>

docker save <image> -o imgtag.tar

“Computer Detective in the Cloud”

Although light on details, this is an application of AI for securing against credit card fraud in real time using cloud computing.

AI has been in the news a few times this month – Google (TensorFlow), Facebook (new milestones in AI), Microsoft releasing Cortana (Nadella welcomes our AI overlords) and mention of an AI spring from IBM and Salesforce.

Machine learning has also been applied to spam detection, intrusion detection, malicious file detection, malicious url detection, insurance claims leakage detection, activity/behaviour based authentication, threat detection and data loss prevention.

Worth noting that these successes are typically in narrow domains with narrow variations of what is being detected. Intrusion detection is a fairly hard problem for machine learning because the number of variations of attacks is high. As someone said, we’ll be using signatures for a long time.

The previous burst of activity around neural networks in the late ’80s and early ’90s had subsided around the same time as the rise of the internet in the mid-to-late ’90s. Around 2009, as GPUs made parallel processing more mainstream, there was a resurgence in activity: deeper, multilayer networks looking at overlapping regions of images (similar to wavelets) led to convolutional neural networks. These have had successes in image and voice recognition. A few resources – GPU Gems for general-purpose computing, visualizing convolutional nets, the Caffe deep learning framework.

Kafka Security

Kafka is a system for continuous, high-throughput messaging of event data, such as logs, to enable near-real-time analytics. It is structured as a distributed message broker, with incoming-event producers sending messages to topics, and outgoing-event consumers. Motivations behind its development include decoupling producers and consumers from each other for flexibility, reducing the time to process events and increasing throughput. A couple of analogies: a sender using sendmail to send an email to an email address (topic); or a message “router” which decides the destination for a particular message – except Kafka persists the messages until the consumer is ready for them. It is an intermediary in the log processing pipeline – there is no processing of the data itself on Kafka, there are no reads for instance. In contrast to JMS, one can send batch messages to Kafka, and individual messages do not have to be acknowledged.
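A hedged sketch of the producer/consumer model using the third-party kafka-python client (the broker address, topic name and the client library itself are assumptions, not from the original post):

# Assumes a Kafka broker on localhost:9092 and `pip install kafka-python`.
from kafka import KafkaConsumer, KafkaProducer

# producer: publish an event to a topic; batching happens under the hood
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("logs", b"event payload")
producer.flush()

# consumer: read the persisted messages from the topic at its own pace
consumer = KafkaConsumer("logs",
                         bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=5000)  # stop if idle for 5 s
for msg in consumer:
    print(msg.topic, msg.offset, msg.value)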

A design thesis of Kafka is that sequential (contiguous) disk access is very fast and can be even faster than random memory access. It uses zero copy and a binary protocol over TCP, not HTTP. A quote from the design link: “This combination of pagecache and sendfile means that on a Kafka cluster where the consumers are mostly caught up you will see no read activity on the disks whatsoever as they will be serving data entirely from cache”. This, along with the distributed design, makes it faster than competing pub-sub systems.

A proposal for adding security to it has been underway, for enterprise use, to control who can publish and subscribe to topics – https://cwiki.apache.org/confluence/display/KAFKA/Security . A talk on Kafka security by HortonWorks on integrating Kerberos authentication, SSL encryption with Kafka was given at a recent meetup. The slides are at – http://www.slideshare.net/harshach/kafka-security.

Of interest was an incident where the SSL patch caused a production cluster to become unstable and increased latencies. The issue was debugged using profiling. Although SSL did increase latencies, this specific issue was narrowed down to a bug in the same patch, unrelated to SSL, having to do with zero copy.

How does IOT affect Identity and Access Management ? 

For the purposes of IOT, an individual device can be abstracted as a specialized service which produces and consumes data. In addition, the device has certain capabilities to act on, or transform, data on a discrete or continuous basis.

Who should have access to these services and capabilities ? It could be

  • other devices in proximity to the device
  • external services
  • certain users

Who gets access is a function of the identity of the devices, the identities of the entities accessing the service and policies governing access (which can include parameters such as location, time, role or more complex rules).

To determine access, a device should be capable of

  • identifying itself, its services and capabilities
  • obtaining authorization for the services and capabilities (before exercising them), and presenting it when requested. This authorization includes a signed access policy
  • updating or invalidating the access policy as time goes on

The access policies need to be applied to the data flows based on the identities, and need to be rich enough to capture the use cases of interest. A minimal sketch of a signed access policy is below.
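A minimal sketch (the key handling, field names and HMAC choice are illustrative assumptions; a real deployment would use asymmetric signatures and provisioned credentials) of a signed, expiring access policy:

import hashlib
import hmac
import json
import time

SECRET = b"shared-secret"  # hypothetical key provisioned to device and verifier

def sign_policy(policy):
    """Attach an HMAC so a verifier can check the policy's integrity."""
    body = json.dumps(policy, sort_keys=True).encode()
    return {"policy": policy,
            "sig": hmac.new(SECRET, body, hashlib.sha256).hexdigest()}

def authorize(token, requester, action):
    """Verify the signature and expiry, then check the requested action."""
    body = json.dumps(token["policy"], sort_keys=True).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False                     # tampered policy
    policy = token["policy"]
    if policy["expires"] < time.time():
        return False                     # expired; must be re-issued
    return requester in policy["allowed"] and action in policy["actions"]

token = sign_policy({"device": "sensor-42",
                     "allowed": ["gateway-1"],
                     "actions": ["read"],
                     "expires": time.time() + 3600})
print(authorize(token, "gateway-1", "read"))   # True
print(authorize(token, "stranger", "read"))    # False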

Identity of ‘Things’ in IOT

What’s the identity of the device ? There can be multiple identities based on whether the device is identifying itself to a user, to another device of the same type, or to other devices in the ecosystem that it is a part of (say a component of a car).

Having a unique device id and leveraging it for the services that are built on the device is a design choice. Consider the choices for iPhone and Android. In the iPhone, the device id permeates the application layer; the application developer can target his application for specific devices and must register the device for developing on it. This design choice allows the device to check that the applications run on it are valid and that their associated developer is registered with Apple. It strengthens the associations in the ecosystem of devices, developers, applications and users.

In Android, the security certificates were at the JVM layer, which allows self-signed certificates. Here the device id is not used as a strong identifier known to applications and developers. This is one reason the more open system is more prone to malware.

A unique hardware identity is something to look for in IOT designs. Here’s an article from Intel/McAfee discussing EPID, an immutable device ID that can be used for identification and also for anonymization: https://blogs.mcafee.com/business/intels-iot-gateway-enhancements/

Update: On Nov 25, news came of a number of IOT devices using the same HTTPS certificates and SSH keys. See here. Large clusters of devices are exposed on the internet this way.