Danilo Poccia · @danilop
787 followers · 946 posts · Server awscommunity.social

More info on this new capability 👉 Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog aws.amazon.com/blogs/big-data/

#aws #analytics #datalake

Last updated 1 year ago

Danilo Poccia · @danilop
782 followers · 932 posts · Server awscommunity.social

Amazon Redshift announces automatic mounting of AWS Glue Data Catalog 👉 You no longer have to create an external schema to use data lake tables in AWS Glue Data Catalog aws.amazon.com/about-aws/whats

#aws #analytics #datalake

Last updated 1 year ago

Phil · @psFried
0 followers · 5 posts · Server techhub.social

There are lots of fancy file formats to choose from when building a data lake, but we still went with gzipped JSON. Why? Because we prioritize moving data into purpose-built systems rather than querying it directly. This basic shift in approach has made a world of difference! Here's a thing I wrote about that:

opendatascience.com/choosing-a
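
As a rough illustration of the format mentioned in the post above, here is a minimal Python sketch of writing and reading gzipped JSON with only the standard library. The function names and the one-record-per-line layout are assumptions made for this example; they are not taken from the linked article.

```python
import gzip
import json


def write_gzipped_json_lines(path, records):
    """Write each record as one JSON document per line into a gzip file."""
    # "wt" opens the gzip stream in text mode so we can write str directly.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_gzipped_json_lines(path):
    """Read a gzip file of newline-delimited JSON back into a list."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in f if line.strip()]
```

Files written this way can be streamed line by line into a downstream system, which fits the "move it, don't query it" approach the post describes.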

#datalake

Last updated 1 year ago

Jérôme Darmont 🇺🇦 · @darmont
64 followers · 344 posts · Server social.sciences.re

New project

Very happy to co-lead, with Genoveva Vargas-Solar, the multidisciplinary and international LETITIA project (Lac de donnéEs, expérimenTation, vIe, Terre, curatIon, explorAtion; in English: data lake, experimentation, life, Earth, curation, exploration), funded by the FIL.

eric.univ-lyon2.fr/jdarmont/?p

#datalake

Last updated 1 year ago

mauvehed 🐿️ · @mauvehed
506 followers · 179 posts · Server defcon.social

This just arrived in the mail and I am very excited to dive in! Unlike previous books on the topic, this digs into the real meat of building a contextual score using a risk based approach and leveraging vulnerability and exploit data sets.

I am looking forward to further exploring the topic and using everything I can to improve my own working models and approach to this ever growing topic.

#vulnerabilitymanagement #riskmodeling #datalake

Last updated 2 years ago

CraigMullins · @craigmullins
32 followers · 863 posts · Server mas.to

RT @ventanaresearch
Data lakes are supporting multiple data sources, formats, analytics workloads and business functions, @ventanaresearch’s Data Lakes Dynamics Insights research shows. bit.ly/3SWquSQ 

#datalake #data

Last updated 2 years ago

Corey Smith 👨🏻 · @coreysmith
76 followers · 162 posts · Server mas.to

Green today! Discussed:

🏦 Bank stability
👰 Marital advice
👁️ Eye drop recall
🍀 Green pinchers
🖥️
👓 Headache glasses
⚡ Internet bandwidth
🏖️ plans
🏊 Go jump in a
🦸 is not dead
⏹️
🏆 certification
🔍 support discontinued
🌐 licensing

See you same time next week! 🥪🥗

#experienceedge #sitecore #azuresearch #sitecoremvp #gartnermagicquadrant #sitecorexp #datalake #springbreak #microservices #sitecorelunch

Last updated 2 years ago

Danilo Poccia · @danilop
672 followers · 329 posts · Server awscommunity.social

Automate schema evolution at scale with Apache Hudi in AWS Glue 👉 In this post, we show how to build a transactional data lake using Apache Hudi support for ACID transactions and CRUD operations aws.amazon.com/blogs/big-data/

#aws #analytics #opensource #datalake

Last updated 2 years ago

Samrose · @samrose
40 followers · 12 posts · Server infosec.exchange

Matano is live on the front page of HackerNews!! 🔥

Come join the discussion on OSS, SIEM, and why we are helping orgs build on top of vendor-agnostic Security Data Lakes instead 🙂

news.ycombinator.com

#cybersecurity #security #oss #hackernews #cloudsecurity #DetectionAndResponse #threathunting #threatdetection #datalake #awssecurity #aws #siem #securitydatalake

Last updated 2 years ago

Samrose · @samrose
40 followers · 12 posts · Server infosec.exchange

🌐 Announcing Matano + Suricata!

Suricata is a popular open source NIDS/NIPS engine used for network analysis and threat detection.

We just shipped out a new integration that allows you to easily push Suricata logs & alerts into a Matano Security Lake in your AWS account for realtime detection-as-code with Python and analysis using AWS Athena + SQL! 🚀

Interested in how to build your own Security Data Lake using Suricata logs?

Check out our blog post: matano.dev/blog/2023/01/12/sur 🔎

#opensource #infosec #networksecurity #suricata #OISF #intrustiondetection #intrusionprevention #ids #ips #nids #nips #cloudnative #cloudsecurity #rust #datalake #aws #awssecurity #ApacheIceberg #secops #security #siem #threatdetection #threathunting #DetectionAndResponse

Last updated 2 years ago

Laboratoire ERIC · @labo_eric
9 followers · 7 posts · Server social.sciences.re

03-04/23 – Internship offer: Design and implementation of an agricultural robotics data lake

To support the agroecological transition, robots have an essential role to play in smart agriculture. They are capable of performing agricultural operations

eric.msh-lse.fr/03-04-23-offre

#datalake #agriculture #Offresd'emploi/thèse/stage

Last updated 2 years ago

Martin_English · @martin_english
93 followers · 271 posts · Server mastodon.au

The latest The SAP Mentors Daily! paper.li/sapmentors/sapmentors Thanks to @janmusil@twitter.com @blackbox_europe@twitter.com @cichuck@twitter.com

#techstuff #datalake

Last updated 2 years ago

Samrose · @samrose
22 followers · 6 posts · Server infosec.exchange

I'm excited to announce that Matano is joining YCombinator's W23 Batch! 🚀

SIEM today is broken -- it's too expensive, doesn't scale, has poor support for correlation, causes vendor lock-in, is inflexible for detection engineering, the list goes on...

My brother Shaeq and I quit our jobs at AWS to solve this problem and build a better solution for security operations and analytics that fully utilizes the power of cloud and big data tech available today.

While the cybersecurity industry has been held back by legacy architectures tied to age-old vendor products, the data analytics industry has seen a ton of innovation through open source initiatives such as Apache Iceberg, Parquet, and Arrow delivering massive cost savings and performance breakthroughs.

We started Matano to close the gap between these two worlds by building an OSS platform to help security teams leverage the modern data stack (e.g. Spark, Athena, Snowflake) to efficiently analyze security data from all the disparate sources across an organization (Cloud/SaaS, Endpoint, Network, etc.).

Matano helps Detection & Response teams break free from their SIEM by deploying a vendor-agnostic Security Data Lake into their AWS account and giving them a platform to build detection-as-code using Python and SQL!

This is just the beginning in our mission to build the first open platform for threat hunting, detection & response, and cybersecurity analytics at petabyte scale.

I am super grateful to all of our early supporters for the help & joining in on this journey to reinvent SIEM. Let's goo!

ycombinator.com/launches/Hl0-m

#startup #ycombinator #opensource #cybersecurity #cloudsecurity #awssecurity #siem #threatdetection #secops #devsecops #aws #infosec #dfir #DetectionAndResponse #soc #ApacheIceberg #security #datalake #blueteam

Last updated 2 years ago

ankurkumar · @ankurkumar
43 followers · 69 posts · Server techhub.social

The Shape of Modern Data Architecture 👇
🔹 shows the end-to-end data process, including the data analytics lifecycle
🔹 shows the data catalog at the center of the architecture, connected to every other component

alation.com/blog/a-data-archit

#DataEngineering #datalake

Last updated 2 years ago

Brett Flippin · @bflipp
85 followers · 399 posts · Server vmst.io

Woof, file compaction with 1.x is the only way to make it usable. 50-100x performance improvements depending on data and partition sizes. The default merge and write operations are incredibly inefficient. I understand it's been greatly improved in 2.x. We're a couple of months from upgrading the platform though.

#deltalake #datawarehouse #datalake #spark #pyspark #aws #awsglue

Last updated 2 years ago