Month: May 2019

Omnisci GPU based columnar database

Omnisci is a columnar database that reads a column into GPU memory, in compressed form, allowing for interactive queries on the data. A single gpu can load 10million to 50million rows of data and allows interactive querying without indexing. A demo was shown at the GTC keynote this year, by Aaron Williams. He gave a talk on vehicle analytics that I attended last month.

In the vehicle telemetry demo, they obtain vehicle telemetry data from an F1 game that has data output as UDP, 10s of thousands of packets a second – take the binary data off of UDP, and convert it to json and use it as a proxy for real telemetry data. The webserver refreshes every 3-4 seconds. The use case is analysis of increasing amounts of vehicle sensor data as discussed in this video and described in the detailed Omnisci blog post here.

Programming of the blocks is done through the UX with StreamSets which is an open source tool to build data pipelines –  a cool demo of streamsets for creating pipelines including kafka is here.

The vehicle analytics demo pipeline consisted of UDP to Kafka, Kafka to JSON, then JSON to OmniSci via pymapd .  Kafka serves as a message broker and also for playback of data.

Based on the the GPU loaded data, the database allows queries and stats on different vehicles that are running.

The entire system runs in the cloud on a VM supporting Nvidia GPUs, and can also be run on a local GPU box.

Lacework Intrusion Detection System – Cloud IDS

Lacework Polygraph is a Host based IDS for cloud workloads. It provides a graphical view of who did what on which system, reducing the time for root cause analysis for anomalies in system behaviors. It can analyze workloads on AWS, Azure and GCP.

It installs a lightweight agent on each target system which aggregates information from processes running on the system into a centralized customer specific (MT) data warehouse (Snowflake on AWS)  and then analyzes the information using machine learning to generate behavioral profiles and then looks for anomalies from the baseline profile. The design allows automating analysis of common attack scenarios using ssh, privilege changes, unauthorized access to files.

The host based model gives detailed process information such as which process talked to which other and over what api. This info is not available to a network IDS. The behavior profiles reduce the false positive rates. The graphical view is useful to drill down into incidents.

It does not have an IPS functionality – as false positives with an IPS could negatively affect the system.

Cloud based network isolation tools like Aviatrix might make IPS scenarios feasible by limiting the effect of an IPS.