Presented by

  • Arjen Lentz

    Arjen was born in Amsterdam, but now lives near Brisbane ("Meanjin") on Australia's East Coast with his wife Claire (the kids have flown the coop). In another handful of years, he'll have lived in Australia as long as he lived in The Netherlands. Why did he move in the first place, you ask? Well, the subtropical climate, the space, and fewer "tall poppy" problems. Arjen started programming in a heavily air-conditioned basement room of a local youth club, on a PDP-8/VT78 terminal with 8" floppies (BASIC and assembly), but soon he managed to get hold of an Acorn BBC/B, featuring a 6502 CPU running at a whopping 2 MHz, 32 KB RAM, and permanent storage via cassette tape... Arjen has worked in the fields of programming (mostly C), databases (at MySQL he wrote part of the manual, and did some other stuff), training, consulting, and most recently information "cyber!" security. Along the way he has also run his own company, made a lot of mistakes, and learnt many things.

Abstract

Haystack is part of the OpenActa project, which is under development. Haystack is a key-value store with interlinking, so effectively all related fields are indexed as well as timestamped. It is also write-once and immutable. When data is written to disk, it is also encrypted and compressed. All of this makes it ideal for storing log data, which, not by coincidence, is exactly what Haystack was designed for!

In this talk, we will take a fairly detailed look at how Haystack works (concept more than code). It uses a novel approach, so it is important to understand how it differs from other log stores in order to judge when to use it and how to use it efficiently. From a functional perspective, some aspects look a bit like a relational table, but one that is write-once and has every field indexed. Thus it is more structured than a basic key/value store or a data lake, but without the overhead a relational database brings. In short, it's worth having a look at, and perhaps you'll like it!

Since this type of codebase is heavily into "data juggling", and it was a completely new project anyway, a memory-safe language was felt to be the most appropriate choice. The initial implementation of Haystack was done in Go, and while it proved pretty fast, it is not as memory-efficient as we had hoped. The intent is to do a rewrite in Rust, using the lessons learnt from the initial implementation.

The broader, long-term scope of OpenActa is to provide a fully open source codebase for central log collection, storage, search and analysis. Haystack is merely a first step. Making the IT world more secure cannot be achieved just through expensive corporate offerings, as inevitably many small companies and other organisations find themselves unable to afford those tools. Therefore we must, and can, do better.

Others are also developing initiatives, for instance in the field of log analysis queries (Sigma rules), and naturally we would not dream of duplicating any of that excellent work. In the best open source toolsets, it is often a combination of components from different groups of people that come together and produce something much greater than the sum of its parts.
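
To make the idea in the abstract more concrete, here is a minimal sketch of a write-once store in which every field of every record is indexed and each record carries a timestamp. It is not Haystack's code, API or on-disk format: the class `WriteOnceStore` and its methods `append`, `get` and `find` are hypothetical names invented for illustration, and it is written in Python purely for brevity (Haystack itself is implemented in Go). Encryption and compression are deliberately left out.

```python
import time
from collections import defaultdict


class WriteOnceStore:
    """Toy append-only record store in which every field value is indexed.

    Hypothetical illustration only; not Haystack's actual design.
    """

    def __init__(self):
        self._records = []               # append-only list of (timestamp, fields)
        self._index = defaultdict(list)  # (field name, value) -> list of record ids

    def append(self, fields, timestamp=None):
        """Store a record once and return its id. There is no update or delete."""
        ts = time.time() if timestamp is None else timestamp
        record_id = len(self._records)
        # Copy the fields so later changes to the caller's dict cannot alter the record.
        self._records.append((ts, dict(fields)))
        # Index every field, so any field value can be used to find related records.
        for name, value in fields.items():
            self._index[(name, value)].append(record_id)
        return record_id

    def get(self, record_id):
        """Return (timestamp, copy of fields) for a record id."""
        ts, fields = self._records[record_id]
        return ts, dict(fields)

    def find(self, name, value):
        """Return the ids of all records whose field `name` equals `value`."""
        return list(self._index.get((name, value), []))


store = WriteOnceStore()
store.append({"host": "web1", "user": "alice", "event": "login"}, timestamp=1.0)
store.append({"host": "web2", "user": "alice", "event": "logout"}, timestamp=2.0)
print(store.find("user", "alice"))  # ids of both records that mention alice
print(store.find("host", "web2"))   # id of the single record from web2
```

Because records are only ever appended, the index never has to be updated for changed or deleted rows, which is one reason a write-once model suits log data. Field values must be hashable in this sketch.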