
Lacework Intrusion Detection System – Cloud IDS

Lacework Polygraph is a host-based IDS for cloud workloads. It provides a graphical view of who did what on which system, reducing the time needed for root-cause analysis of anomalous system behavior. It can analyze workloads on AWS, Azure and GCP.

It installs a lightweight agent on each target system, which aggregates information from processes running on the system into a centralized, customer-specific (MT) data warehouse (Snowflake on AWS). The information is then analyzed using machine learning to generate behavioral profiles, and deviations from the baseline profile are flagged as anomalies. The design allows automated analysis of common attack scenarios involving ssh, privilege changes, and unauthorized access to files.

The host-based model gives detailed process information, such as which process talked to which other process and over what API. This information is not available to a network IDS. The behavior profiles reduce false-positive rates, and the graphical view is useful for drilling down into incidents.
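As a rough illustration of the baseline-and-flag approach (my own sketch in Python, not Lacework's implementation), a behavioral profile can be as simple as the set of (host, process, destination) tuples seen during a learning window; anything outside that set is surfaced as an anomaly:

from collections import defaultdict

def build_baseline(events):
    """events: iterable of (host, process, destination) tuples reported by agents."""
    baseline = defaultdict(set)
    for host, proc, dest in events:
        baseline[(host, proc)].add(dest)
    return baseline

def anomalies(baseline, new_events):
    # flag connections that were never seen for this (host, process) pair
    return [e for e in new_events if e[2] not in baseline.get((e[0], e[1]), set())]

base = build_baseline([("web1", "nginx", "10.0.0.5:5432"),
                       ("web1", "sshd", "10.0.0.9:22")])
print(anomalies(base, [("web1", "nginx", "198.51.100.7:4444")]))  # flagged as anomalous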

It does not have IPS functionality, as false positives with an IPS could negatively affect the system.

Cloud based network isolation tools like Aviatrix might make IPS scenarios feasible by limiting the effect of an IPS.

Stabilizing Cryptocurrency

Before I forget its name: BaseCoin is a project that attempts to stabilize a cryptocurrency, so it does not have wild fluctuations.

Regular (fiat) currencies are actively managed by central banks to be stable, and are also stabilized by being the default currency for labor negotiations, employment contracts, retirement accounts, etc., all of which change slowly.

More on crypto stabilization in this post – http://www.nadertheory.com/#!/main/articles/post-09-the-intelligent-cryptocurrency-investor .

Decentralized Identity Based on Blockchain

Sovrin project. Uses a permissioned blockchain, which allows it to do away with mining as an incentive and instead directly build a distributed ledger that stores Decentralized Identifiers (DIDs) and maps them to claims. Removing mining frees up resources and increases network throughput. Interesting key-management aspects, including revocation. Contrasts with Ethereum uPort, which is permissionless and public. A neat design, but it will face an adoption problem as it is unhitched from bitcoin/ethereum.
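A toy sketch of the core mapping, under my own simplifying assumptions (this is not the Sovrin data model): permissioned stewards write key events for a DID, so resolution returns the latest verification key and revocation is just a final event:

ledger = {}  # DID -> ordered list of key events (None marks revocation)

def write(did, verkey, steward, stewards):
    if steward not in stewards:
        raise PermissionError("only permissioned stewards may write")  # no mining involved
    ledger.setdefault(did, []).append(verkey)

def resolve(did):
    events = ledger.get(did, [])
    return events[-1] if events else None  # latest key, or None if unknown/revoked

stewards = {"steward-1"}
write("did:sov:1234abcd", "verkey-AAAA", "steward-1", stewards)  # register
write("did:sov:1234abcd", None, "steward-1", stewards)           # revoke
print(resolve("did:sov:1234abcd"))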

References: Sovrin–digital-identities-in-the-blockchain-era.pdf and The-Technical-Foundations-of-Sovrin.pdf

DPKI – Distributed PKI. Attempts to reduce the weakness of a centralized certificate authority, since compromising that authority affects every certificate it has issued. This concept is built out and used in Sovrin. https://danubetech.com/download/dpki.pdf

Remme. "Remember me." An approach to SSL-based logins that modifies SSL.
It used an EmerCoin implementation as an MVP, and the Ethereum blockchain. EmerCoin: https://emercoin.com/content/EMCSSL.pdf . The adoption problem here is that it requires a change in behavior from every browser and mobile app. https://medium.com/@alexandermomot/passwords-are-crap-6ed4e97bffa7

Sidechains. The original proposal was to free up resources once trust is established, to reuse blockchain technology, and to establish a two-way peg between the sidechain and the blockchain. https://www.blockstream.com/sidechains.pdf
https://bitcoin.stackexchange.com/questions/32367/mining-on-pegged-side-chains

Coco Framework. https://github.com/Azure/coco-framework

Hyperledger – a Linux Foundation project for developing blockchain software (Intel + Microsoft + Linux Foundation). Provides a DLT and uses Intel SGX extensions. Uses a replicated state machine model, with each validating peer independently adding to its chain after reaching consensus on the order of transactions with other peers, using Practical Byzantine Fault Tolerance or Proof of Elapsed Time. https://software.intel.com/en-us/blogs/2017/08/10/collaborating-with-microsoft-to-strengthen-enterprise-blockchains . Related – https://stackoverflow.com/questions/36844827/are-blocks-mined-in-hyperledger
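A minimal sketch of that replicated-state-machine model (illustrative Python, not Hyperledger code): assuming consensus has already produced an agreed transaction order, each validating peer applies it deterministically and appends to its own hash-linked chain, so all honest peers end up with identical chains:

import hashlib, json

class Peer:
    def __init__(self):
        self.chain = []   # list of (prev_hash, txns, block_hash)
        self.state = {}   # key-value state built by applying transactions

    def append(self, ordered_txns):
        prev = self.chain[-1][2] if self.chain else "0" * 64
        block_hash = hashlib.sha256((prev + json.dumps(ordered_txns)).encode()).hexdigest()
        for key, value in ordered_txns:
            self.state[key] = value   # deterministic state transition
        self.chain.append((prev, ordered_txns, block_hash))

peers = [Peer(), Peer(), Peer()]
agreed_order = [("asset-7", "alice"), ("asset-9", "bob")]  # output of PBFT/PoET consensus
for p in peers:
    p.append(agreed_order)
print(all(p.chain[-1][2] == peers[0].chain[-1][2] for p in peers))  # True: identical chains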

A comparison link – https://stackoverflow.com/questions/41690983/different-between-various-blockchain-protocols

 

NVidia Volta GPU vs Google TPU

A Graphics Processing Unit (GPU) allows multiple hardware processors to act in parallel on a single array of data, allowing a divide-and-conquer approach to large computational tasks such as video frame rendering, image recognition, and various types of mathematical analysis including convolutional neural networks (CNNs). The GPU is typically placed on a larger chip which includes CPU(s) to direct data to the GPU. This trend is making supercomputing tasks much cheaper than before.

The Tesla V100 is a System on Chip (SoC) which contains the Volta GPU, which in turn contains Tensor Cores designed specifically to accelerate deep learning by speeding up the matrix operation D = A*B + C, where each input is a 4×4 matrix. More on Volta at https://devblogs.nvidia.com/parallelforall/inside-volta/ . It is helpful to first read about the architecture of the previous Pascal P100 chip, which contains the GP100 GPU, described here – http://wccftech.com/nvidia-pascal-specs/ . Background on why NVidia builds chips this way (SIMD < SIMT < SMT) is here – http://yosefk.com/blog/simd-simt-smt-parallelism-in-nvidia-gpus.html .

Volta GV100 GPU = 6 GraphicsProcessingClusters × 7 TextureProcessingClusters/GraphicsProcessingCluster × 2 StreamingMultiprocessors/TextureProcessingCluster × (64 FP32 units + 64 INT32 units + 32 FP64 units + 8 TensorCore units + 4 Texture units)

The FP32 cores are referred to as CUDA cores, which means 84×64 = 5376 CUDA cores per Volta GPU. The Tesla V100, the first product (SoC) to use the Volta GPU, uses only 80 of the 84 SMs, or 80×64 = 5120 cores. The clock frequency is 1.455 GHz. The fused multiply-add (FMA) instruction does a multiplication and addition in a single instruction (a*b+c), giving 2 FP operations per instruction, for 1.455*2*5120 = 14.9 TFLOPS from the CUDA cores alone. The Tensor Cores do a 3D multiply-and-add with 7×4×4 + 4×4 = 128 FP ops/cycle each, for a total of 1.455*80*8*128 = 120 TFLOPS for deep learning apps.
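To make the arithmetic explicit, here is a small sketch reproducing these back-of-the-envelope numbers (the clock, SM count, and per-core figures are those quoted above):

clock_ghz = 1.455
sms = 80                      # SMs enabled on the Tesla V100 (the full GV100 has 84)
cuda_cores = sms * 64         # 5120 FP32 CUDA cores
cuda_tflops = clock_ghz * cuda_cores * 2 / 1000          # FMA = 2 FP ops per clock

# Tensor core: D = A*B + C on 4x4 matrices. Per output element, 4 multiplies + 3 adds,
# over 16 elements, plus 16 adds for "+C": 7*4*4 + 4*4 = 128 FP ops per core per clock.
tensor_ops_per_clock = 7 * 4 * 4 + 4 * 4
tensor_tflops = clock_ghz * sms * 8 * tensor_ops_per_clock / 1000

print(round(cuda_tflops, 1), round(tensor_tflops, 1))    # ~14.9 and ~119 TFLOPS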

[Figure: 3D matrix multiply performed by a Tensor Core]

The Volta GPU uses a 12nm manufacturing process, down from 16nm for Pascal. For comparison, the Jetson TX1 claims 1 TFLOPS and the TX2 twice that (or the same performance at half the power of the TX1). Volta will be available on Azure, AWS and platforms such as Facebook, with several applications at Amazon. The Microsoft Cognitive Toolkit will use it.

For comparison, the Google TPU runs at 700 MHz and is manufactured with a 28nm process. Instead of FP operations, it uses quantization to integers and a systolic-array approach to minimize the watts per matrix multiplication, and it optimizes for neural network calculations instead of more general GPU operations. The TPU uses a design based on an array of 256×256 multiply-accumulate (MAC) units, resulting in 92 tera integer ops/second.
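The same back-of-the-envelope check for the TPU figure (a sketch only, using the numbers quoted above):

clock_hz = 700e6
macs = 256 * 256                         # systolic array of multiply-accumulate units
tera_ops = clock_hz * macs * 2 / 1e12    # each MAC = one multiply + one accumulate
print(round(tera_ops, 1))                # ~91.8 tera integer ops/second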

Given that NVidia is targeting additional use cases such as computer vision and graphics rendering along with neural networks, such a specialized approach would not make sense for it.

Miscellaneous conference notes:

Nvidia DGX-1. A "Personal Supercomputer" for $69,000 was announced. It contains eight Tesla V100 accelerators connected over NVLink.

Tesla. FHHL: Full Height, Half Length. Inferencing: Volta is ideal for inferencing, not just training. Also for data centers, where power and cooling take up 40% of the datacenter.

As AI data floods the data centers, Volta can replace 500 CPUs with 33 GPUs.
Nvidia GPU cloud. Download the container of your choice. First hybrid deep-learning cloud network. Nvidia.com/cloud . Private beta extended to GTC attendees.

Containerization with GPU support. The host has the right NVidia driver; Docker images from the GPU cloud adapt to the host driver version. A single Docker image; the nvidia-docker tool initializes the drivers.

Moore's law comes to an end. Need AI at the edge, far from the data center. Need it to be compact and cheap.

The Jetson board has a Tegra SoC, which has 6 CPUs and a Pascal GPU.

AWS Device Shadows vs GE Digital Twins. Different focus: availability + connectivity vs operational efficiency; a manufacturing perspective vs an operational perspective. A locomotive may be simulated when disconnected.

DeepInstinct analyzed malware data using convolutional neural networks on GPUs, to better detect malware and its variations.

Omni.ai – deep learning for time series data to detect anomalous conditions on sensors on the field such as pressure in a gas pipeline.

GAN applications to various problems – these will be refined in the next few years.

GeForce 960 video card. An older but popular card for gamers; it used the Maxwell GPU, which predates the Pascal GPU.

Cooperative Groups in CUDA 9. More on CUDA 9.

RSA World 2017

I was struck by the large number of vendors offering visibility, into various types of network traffic, as a key selling point: network monitors, industrial network gateways, SSL inspection, traffic in VM farms, between containers, and on mobile.

More expected are the vendors offering reduced visibility: via encryption, TPMs, secure enclaves, mobile remote desktop, one-way links, encrypted storage in the cloud, etc. The variety of these solutions is also remarkable.

Neil deGrasse Tyson's closing talk was so different yet strangely related to these topics: the universe slowly making itself visible to us through light from distant places, with insights from physicists and mathematicians building up to the experiments that recently confirmed gravitational waves – making the invisible visible. This tenuous connection between security and physics left me misty-eyed. What an amazing time to be living in.

Software Defined Networking Security

Software Defined Networking seeks to centralize control of a large network. The abstractions around computer networking evolved from connecting nodes via switches, to the applications that run on top (the OSI model), to the controllers that manage the network. The controller abstraction was relatively weak – this had been the domain of telcos and ISPs, and as the networks of software-intensive companies like Google approached the size of telco networks, they moved to reinvent the controller stack. Traffic engineering and security, which had been done in disparate regions, were centralized in order to better achieve economies of scale. Google adopted OpenFlow for this, developed by Nicira, which was soon after acquired by VMWare; Cisco internal discussions concluded that such a centralization wave would cut Cisco revenues in half, so they spun out Insieme Networks for SDN capabilities and quickly acquired it back. This has morphed into the APIC offering.

The centralization wave is somewhat at odds with the security and resilience of networks, because of their inherently distributed and heterogeneous nature. Distributed systems provide availability, part of the security CIA triad, and for many systems availability trumps security. Centralized controllers become attractive targets for compromise. This is despite the intention of SDN, as envisioned by Nicira founder M. Casado, to have security as its cornerstone, as described here. Casado's problem statement is interesting: “We don’t have a ubiquitous and horizontal security layer that provides both context and isolation. Where do we normally put security controls? We put it in one of two places. We might put it in the physical infrastructure, which is great because you have isolation. If I have ACLs [access control lists] or a firewall or an IDS [intrusion detection system], I put it in a separate box and I put it away from the applications so that the attack surface is pretty small and it’s totally isolated… you have isolation, but you have no context. … Another place we put security is in the end host, an application or operating system. This suffers from the opposite problem. You have all the context that you need — applications, users, the data being accessed — but you have absolutely no isolation.” The centralization imperative comes from the need to isolate and minimize the trusted computing base.

In the short term, there may be some advantage to be gained from complexity reduction through centralized administration. But the recommendation of dumb switches that respond to a tightly controlled central brain goes against the tenets of compartmentalization of risk, and if such networks are put into practice widely, they can result in failures that are catastrophic instead of isolated.

The goal should be a distributed system that is also responsive.

Containers and Privileges

Cgroups limit how much you can do. Namespaces limit how much you can see.

Linux containers are based on cgroups and namespaces, and can be privileged or unprivileged. A privileged container is one without the "User Namespace", implying it has direct visibility into all users of the underlying host. The User Namespace allows remapping the user identities in the container, so even if a process thinks it is running as root, it is not.
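A minimal Linux sketch of the remapping (this assumes a kernel that permits unprivileged user namespaces): the process becomes uid 0 inside its new namespace while remaining the original unprivileged user on the host:

import ctypes, os

libc = ctypes.CDLL(None, use_errno=True)
CLONE_NEWUSER = 0x10000000
uid, gid = os.getuid(), os.getgid()

if libc.unshare(CLONE_NEWUSER) != 0:
    raise OSError(ctypes.get_errno(), "unshare(CLONE_NEWUSER) failed")

with open("/proc/self/setgroups", "w") as f:
    f.write("deny")                 # required before an unprivileged process writes gid_map
with open("/proc/self/uid_map", "w") as f:
    f.write(f"0 {uid} 1")           # map uid 0 in the namespace to our real uid
with open("/proc/self/gid_map", "w") as f:
    f.write(f"0 {gid} 1")

print(os.getuid())  # prints 0, yet the process has no real root privileges on the host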

Using cgroups to limit fork bombs from Jessie’s talk:

$ sudo su
# mkdir -p /sys/fs/cgroup/pids/parent
# echo 2 > /sys/fs/cgroup/pids/parent/pids.max
# echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs   # add the current shell's pid to the cgroup
# cat /sys/fs/cgroup/pids/parent/pids.current
2
# (echo "foobar" | cat )
bash: fork: retry: No child processes

Link to the 122-page paper on Linux container security here. It includes this quote on Linux kernel attacks.

“Kernel vulnerabilities can take various forms, from information leaks and Denial of Service (DoS) risks to privilege escalation and arbitrary code execution. Of the roughly 400 Linux system calls, a number have contained privilege escalation vulnerabilities, as recently as of 2016 with keyctl(2). Over the years this included, but is not limited to: futex(2), vmsplice(2), mremap(2), unmap(2), do_brk(2), splice(2), and modify_ldt(2). In addition to system calls, old or obscure networking code including but not limited to SCTP, IPX, ATM, AppleTalk, X.25, DECNet, CANBUS, Econet and NETLINK has contributed to a great number of privilege escalation vulnerabilities through various use cases or socket options. Finally, the “perf” subsystem, used for performance monitoring, has historically contained a number of issues, such as perf_swevent_init (CVE-2013-2094).”

This makes the case for seccomp, as containers, both privileged and unprivileged, can lead to bad things –

“Containers will always (by design) share the same kernel as the host. Therefore, any vulnerabilities in the kernel interface, unless the container is forbidden the use of that interface (i.e. using seccomp)” – LXC Security Documentation by Serge Hallyn, Canonical
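Seccomp filters are normally installed by the container runtime rather than by application code (Docker, for example, applies a default syscall allow-list unless it is overridden). As a quick sketch, the fields in /proc/self/status show whether the current process is confined (Seccomp: 0 = disabled, 1 = strict, 2 = filter); running this inside and outside a container makes the difference visible:

# Inspect the confinement of the current process (Linux only).
with open("/proc/self/status") as f:
    for line in f:
        if line.startswith(("Seccomp", "NoNewPrivs", "CapEff")):
            print(line.strip())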

The paper has several links on restricting access, including grsecurity, SELinux, AppArmor and Firejail. A brief comparison of the first three is here. SELinux has a powerful access control mechanism – it attaches labels to all files, processes and objects; however it is complex, and people often end up making things too permissive instead of taking advantage of the available controls. AppArmor works by labeling files by pathname and applying policies to the pathname – it is recommended with SUSE/OpenSUSE, not CentOS. Grsecurity policies are described here; its ACLs support process-based resource restrictions, including memory, CPU, open files, etc.

Blockchain ideas

I think of bitcoin as a self-securing system. Value is created by solving the security problem of verifying the last block of bitcoin transactions. This verification serves as a decentralized stamp of approval, and the verification step consists of hashing a nonce and a block of transactions through a one-way hash function and arriving at a checksum with a certain structure (which is hard because the hash output is effectively random, and meeting the structure requirement is a low-probability event).
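A minimal proof-of-work sketch of that verification step (illustrative only; real bitcoin difficulty is vastly higher and the block format differs): find a nonce so that the hash of the block data falls below a target, i.e. "has the required structure":

import hashlib

def mine(block_data, difficulty_bits=16):
    target = 1 << (256 - difficulty_bits)   # hashes below this target meet the structure
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).hexdigest()
        if int(digest, 16) < target:
            return nonce, digest             # low-probability event, found by brute force
        nonce += 1

nonce, digest = mine(b"block of transactions")
print(nonce, digest)                         # others verify with a single hash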

What happens if parties collude to gain greater hashing power and increase their share of mining? This is what happened with GPU mining farms on bitcoin. It was one of the motivations behind Ethereum, which enables code to run as part of transactions, and whose hashing algorithm was designed to be memory-hard so that it resists specialized mining hardware. But it is not economical to mine on desktops, as the motivation seems to suggest.

The important aspect I think is the self-securing idea – how can a set of computational systems be designed so that they are incentivized to cooperate and become more secure as a result of that cooperation.

At a recent blockchain conference, some interesting topics of discussion were zero-knowledge proofs, consensus algorithms, greater network-member ownership in a network with network effects (instead of centralized rent collection), game-theoretic system designs, and various Ethereum blockchain applications.

Update: this self-securing, self-sustaining view is explored further here: https://medium.com/@jordanmmck/cryptocurrencies-as-abstract-lifeforms-9e35138d63ed

Ethical considerations in Autonomous Vehicles

A recent talk discussed ethics for autonomous vehicles as an optimization problem. There can be several imperatives for an AV which are all “correct”, yet conflict with one another when the vehicle relies on hard-coded logic.

For example: Follow Traffic safety rules. Stick to the lane. Avoid obstacles. Save most human lives. Save passengers.

How can a vehicle prioritize these? Instead of a case-by-case design, the proposal is to cast it in an ethics framework based on optimization of various ideals and constraints with weighted coefficients, and then to test the outcomes.

The optimization looks to minimize (path_tracking + steering + traffic_laws) subject to constraints (avoid_obstacles). The equations produce different behaviour when the coefficients are changed.
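A toy version of that formulation (my own weights and variables, not the talk's model; scipy is used for brevity): minimize a weighted sum of path-tracking, steering and traffic-law costs subject to an obstacle-clearance constraint, and observe how changing the weights changes the chosen behaviour:

from scipy.optimize import minimize

w_track, w_steer, w_law = 1.0, 0.5, 2.0   # weighted coefficients; changing these changes behaviour
obstacle_pos, clearance = 0.4, 0.5        # hypothetical obstacle in lane coordinates

def cost(x):
    lateral, steer = x
    path_tracking = lateral ** 2                         # stay near the lane centre
    steering = steer ** 2                                # penalize harsh steering
    traffic_laws = max(0.0, abs(lateral) - 1.0) ** 2     # penalty for leaving the lane
    return w_track * path_tracking + w_steer * steering + w_law * traffic_laws

constraints = [{"type": "ineq", "fun": lambda x: abs(x[0] - obstacle_pos) - clearance}]
result = minimize(cost, x0=[1.0, 0.0], constraints=constraints)
print(result.x)   # different weights trade off the imperatives differently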

Another consideration is vehicle intent: is the vehicle fully in control, or can the human override it? This affects the software assumptions and system design.

The talk was presented by Sarah Thornton, PhD, Stanford. A related discussion on safety is here: Who gets the blame when driverless cars crash?

Somewhat related is the question of whether computer vision itself operates correctly. There can be adversarial inputs, as discussed in the paper Intriguing properties of neural networks, which discusses blind spots. Generative adversarial models are a way to improve the generalization capabilities of a network by pitting generative against discriminative models. The European Conference on Computer Vision starts today: http://www.eccv2016.org/main-conference/

 

Neural Network Training and Inferencing on Nvidia

Nvidia just announced the Tesla P40 and P4 cards for neural-network inferencing applications. A review is at http://www.anandtech.com/show/10675/nvidia-announces-tesla-p40-tesla-p4 . Compared to the Tesla P100 released earlier this year, the P40 is targeted at inferencing applications, whereas the P100 was targeted at the more demanding training phase of neural networks. The P40 comes with the TensorRT (real-time) library for fast inferencing (e.g. real-time detection of objects).

Some of the best solutions to hard problems in machine learning come from neural networks, whether in computer vision, voice recognition, games such as Go, or other domains. Nvidia and other hardware makers are accelerating AI applications with these releases.

What happens if the neural network draws a bad inference in a critical AI application? Bad inferences have been discussed in the literature, for example in the paper Intriguing properties of neural networks.

There are ways to minimize bad inferences in the training phase, but they are not foolproof – in fact the paper above notes that bad inferences are low-probability yet dense.

Level 5 autonomous driving is where the vehicle can handle unknown terrain. Most current systems are targeting Level 2 or 3 autonomy. The Tesla Model S’ Autopilot is Level 2.

One answer is to pair the network with a regular program that checks certain safety constraints. This would make it safer, but this alone is likely insufficient either for achieving Level 5 operation or for providing safety for it.
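A sketch of that pairing (hypothetical names and limits, purely illustrative): the network's proposed command passes through a hard-coded checker that clamps implausible values and applies a conservative fallback when a constraint is violated:

MAX_STEER_DEG = 25.0   # hypothetical actuator/safety limit

def safety_filter(nn_command, state):
    cmd = dict(nn_command)
    # clamp physically implausible steering requests from the network
    cmd["steer_deg"] = max(-MAX_STEER_DEG, min(MAX_STEER_DEG, cmd["steer_deg"]))
    # never accelerate when an obstacle is inside the stopping distance (~6 m/s^2 braking assumed)
    stopping_distance = state["speed_ms"] ** 2 / (2 * 6.0)
    if state["obstacle_distance_m"] < stopping_distance:
        cmd["accel"] = -1.0   # conservative fallback: full braking
    return cmd

print(safety_filter({"steer_deg": 40.0, "accel": 0.3},
                    {"speed_ms": 20.0, "obstacle_distance_m": 15.0}))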

Automotive and Process Safety Standards

ISO 26262 is a standard for the safety of automotive electrical/electronic systems that has been adopted by car manufacturers. Its V-shaped process consists of two legs, the first comprising requirements definition, analysis, architectural design, development and implementation. The second leg consists of verification and validation of the software, starting from unit tests and moving up to functional tests, safety tests and system-wide tests. Model-based design is used to reduce the complexity, and these models are now fairly complex. Help with achieving compliance with this standard, via model-based design, is one of the value-adds that the Mentor Graphics automotive kit provides.

ISO 26262 is derived from its parent, the IEC 61508 standard, which is titled Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems. This parent standard has variants for safety of automotive, railway, nuclear, manufacturing processes (refineries, petrochemical, chemical, pharmaceutical, pulp and paper, and power) and machinery related electrical control systems. An associated, upcoming standard is the SAE J2980.

An excellent talk today by MIT fellow Ricardo Dematos discussed more comprehensive approaches to automotive safety. This builds on his work with safety research at MIT, AWS IoT, and our SyncBrake entry for V2V safety at TechCrunch Disrupt 2015.

MICROS Point-of-Sale (POS) Attacks

Oracle MICROS point-of-sale systems are again reported to have been attacked. The MICROS support site was infected by malware that was able to access usernames and passwords and send them to a remote server. This remote server was identified as one previously known to be used by 'Carbanak', a cybercrime group.

ZDNet reports that in the last year, dozens of machines at Starwood and Hilton hotels were impacted by malware aimed at poaching payment and card data, which can be used or sold on to the highest bidder. The attack on MICROS systems may be behind these.

An advisory by VISA here discusses two previous malware threats attacking POS systems, Carbanak and MalumPOS.

This report by Symantec describes POS attacks as multi-stage attacks, with threats and mitigation strategies.

In August 2014, the Department of Homeland Security issued an advisory for 'Backoff', a POS malware which affected a large number of systems at retailers including Target, PF Chang's, Neiman Marcus, Michaels, Sally Beauty Supply and Goodwill Industries International. PF Chang's had a cyber-insurance policy which covered direct damages, but not the claims filed by Mastercard/Bank of America for credit card fraud reimbursements and reissuance via a "PCI DSS assessment".

These trends are interesting for a few reasons. First, the recurrent attacks are a reason to accelerate the move to EMV. Second, they give rise to new architectures for payments. Third, they draw attention to blockchain technologies.

Verifone has a Secure Commerce Architecture which sends the payment data directly from the terminal that receives the card to the merchant's (acquiring) bank, without touching the POS system. This reduces payment fraud and also makes EMV certification much easier.

[Figure: Verifone Secure Commerce Architecture diagram]

EMV stands for Europay, Mastercard, Visa. After the EMV deadline of Oct 1, 2015, the liability for credit card fraud shifted to whichever party is the least EMV-compliant in a fraudulent transaction. Automated fuel dispensers have until 2017 to make the shift to EMV. ATMs face two fraud liability shift dates: MasterCard’s in October 2016 and Visa’s in October 2017.

Because EMV secures the terminals, a rise in online payment fraud is predicted. While the counterfeit opportunity dwindles, the next top three types of credit card fraud – account takeover, card-not-present fraud, and (fake new) application fraud – are rising. The recent MICROS attack fits the pattern of attackers probing for and finding the weakest links in the payments chain.

ASN.1 buffer overflow puts networks at risk

An ASN.1 vulnerability was disclosed in a security advisory of 7/18, here. It has to do with length bounds checking on the TLV (tag-length-value) triplet. A fix is provided, but updating it on a large number of GSM devices is not practical. “It could be triggered remotely without any authentication in scenarios where the vulnerable code receives and processes ASN.1 encoded data from untrusted sources, these may include communications between mobile devices and telecommunication network infrastructure nodes, communications between nodes in a carrier’s network or across carrier boundaries, or communication between mutually untrusted endpoints in a data network.”
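To illustrate the class of bug (a toy parser, not the code from the advisory): the vulnerable pattern is trusting the attacker-supplied length field of a tag-length-value record without checking it against the remaining buffer:

def parse_tlv(buf):
    i, records = 0, []
    while i + 2 <= len(buf):
        tag, length = buf[i], buf[i + 1]
        i += 2
        # the missing check in vulnerable parsers: does 'length' fit in the remaining buffer?
        if i + length > len(buf):
            raise ValueError("declared length exceeds remaining data")
        records.append((tag, buf[i:i + length]))
        i += length
    return records

print(parse_tlv(bytes([0x01, 0x03, 0xAA, 0xBB, 0xCC])))  # [(1, b'\xaa\xbb\xcc')]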

A discussion at Ars Technica here brings up a real exploit against GSM base station software, which operates below the application layer and so can be exploited against a large number of devices.

A quote from the paper – “When GSM radio stacks were implemented, attacks against end devices were not much of a concern. Hence checks on messages arriving over the air interface were lax as long as the stack passed interoperability tests and certifications. Open-source solutions such as OpenBTS (Base Transceiver Station) allow anyone to run their own GSM network at a fraction of the cost of carrier-grade equipment, using a simple and cheap software-defined radio. This development has made GSM security explorations possible for a significantly larger set of security researchers. Indeed, as the reader will see in the following, insufficient verification of input parameters transmitted over the air interface can lead to remotely exploitable memory corruptions in the baseband stack.”

The cellular baseband stack of most smartphones runs on a separate processor and is significantly less hardened, if at all. GSM does not provide mutual authentication, so there is no protection against fake BTSs.