Chris Wensel · @cwensel
161 followers · 1145 posts · Server fosstodon.org

So Tessellate inherits lots of support for various data formats from Cascading
github.com/cwensel/cascading

Even though Parquet dropped Cascading support, we were able to port it over.

Now that Parquet is native to Cascading, it should be easier to add Iceberg support.

This would make it possible to continuously convert data into Iceberg as it arrives, for use in Athena or other data front-ends.

Anyone interested in a challenge?

#apacheparquet #ApacheIceberg #clusterless #aws #java

Last updated 1 year ago

Samrose · @samrose
40 followers · 12 posts · Server infosec.exchange

🌐 Announcing Matano + Suricata!

Suricata is a popular open source NIDS/NIPS engine used for network analysis and threat detection.

We just shipped out a new integration that allows you to easily push Suricata logs & alerts into a Matano Security Lake in your AWS account for realtime detection-as-code with Python and analysis using AWS Athena + SQL! 🚀

Interested in how to build your own Security Data Lake using Suricata logs?

Check out our blog post: matano.dev/blog/2023/01/12/sur 🔎

#opensource #infosec #networksecurity #suricata #OISF #intrustiondetection #intrusionprevention #ids #ips #nids #nips #cloudnative #cloudsecurity #rust #datalake #aws #awssecurity #ApacheIceberg #secops #security #siem #threatdetection #threathunting #DetectionAndResponse

Last updated 2 years ago

Samrose · @samrose
22 followers · 6 posts · Server infosec.exchange

I'm excited to announce that Matano is joining Y Combinator's W23 Batch! 🚀

SIEM today is broken -- it's too expensive, doesn't scale, has poor support for correlation, causes vendor lock-in, is inflexible for detection engineering, the list goes on...

My brother Shaeq and I quit our jobs at AWS to solve this problem and build a better solution for security operations and analytics that fully utilizes the power of cloud and big data tech available today.

While the cybersecurity industry has been held back by legacy architectures tied to age-old vendor products, the data analytics industry has seen a ton of innovation through open source initiatives such as Apache Iceberg, Parquet, and Arrow delivering massive cost savings and performance breakthroughs.

We started Matano to close the gap between these two worlds by building an OSS platform to help security teams leverage the modern data stack (e.g. Spark, Athena, Snowflake) to efficiently analyze security data from all the disparate sources across an organization (Cloud/SaaS, Endpoint, Network, etc.).

Matano helps Detection & Response teams break free from their SIEM by deploying a vendor-agnostic Security Data Lake into their AWS account and giving them a platform to build detection-as-code using Python and SQL!

This is just the beginning in our mission to build the first open platform for threat hunting, detection & response, and cybersecurity analytics at petabyte scale.

I am super grateful to all of our early supporters for the help & joining in on this journey to reinvent SIEM. Let's goo!

ycombinator.com/launches/Hl0-m

#startup #ycombinator #opensource #cybersecurity #cloudsecurity #awssecurity #siem #threatdetection #secops #devsecops #aws #infosec #dfir #DetectionAndResponse #soc #ApacheIceberg #security #datalake #blueteam

Last updated 2 years ago

AdiPolak · @adipolak
179 followers · 28 posts · Server mastodon.online

Excited for my upcoming course with O'Reilly on implementing CI/CD concepts in data lakes. 3 hours of , hudi, , and adopting best practices. It's amazing how much more reliable and resilient our production systems can become once we invest in the right tools and processes.

#dataengineering #lakefs #DeltaLake #apache #ApacheIceberg

Last updated 2 years ago

Joe Wood · @Joewood
34 followers · 57 posts · Server fosstodon.org

I've been reading a bit more deeply about the implementations of ActivityPub. First impressions are that (by necessity) it's an OLTP (service) based approach to a data architecture problem. That is necessary because it's impossible to specify an interoperable standard on a distributed network with a technology-dependent data architecture solution. That said, I wonder if more work could be done with relays that build on top of S3 table specs like Apache Iceberg, providing a kind of cheap pub-sub model.

#activitypub #ApacheIceberg

Last updated 2 years ago