❓❓❓HOW LAKEHOUSE TABLE FORMAT WORKS❓❓❓

1. Engine reads table format metadata
2. Builds list of files with relevant data based on metadata
3. Scans those files and executes query
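
The three steps above can be sketched in a few lines of Python. This is a toy illustration only, not how any particular engine or table format implements it: the `DataFile` record, the `TABLE_METADATA` list, the file names, and the per-file min/max `id` statistics are all hypothetical stand-ins for what a real format keeps in its metadata layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical metadata entry: one data file plus min/max stats on `id`."""
    path: str
    min_id: int
    max_id: int
    rows: tuple  # stand-in for the file's contents: (id, value) pairs


# Step 1: the engine reads the table format metadata (assumed, in-memory here).
TABLE_METADATA = [
    DataFile("part-0.parquet", 1, 100, ((5, "a"), (42, "b"), (100, "c"))),
    DataFile("part-1.parquet", 101, 200, ((150, "d"), (199, "e"))),
    DataFile("part-2.parquet", 201, 300, ((250, "f"),)),
]


def plan_files(metadata, lo, hi):
    """Step 2: keep only files whose [min_id, max_id] range overlaps [lo, hi]."""
    return [f for f in metadata if f.max_id >= lo and f.min_id <= hi]


def run_query(metadata, lo, hi):
    """Step 3: scan the selected files and apply the filter row by row."""
    files = plan_files(metadata, lo, hi)
    rows = [row for f in files for row in f.rows if lo <= row[0] <= hi]
    return [f.path for f in files], rows


scanned, result = run_query(TABLE_METADATA, 90, 160)
print(scanned)  # files that had to be opened
print(result)   # rows matching 90 <= id <= 160
```

In this sketch the third file is never opened, because its statistics show it cannot contain matching rows. Skipping files this way is the point of consulting metadata before scanning.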

#dataengineering #dataanalytics #bigdata #datalakehouse #apacheiceberg #apachehudi #deltalake

Last updated 1 year ago

Delta Lake · @deltalakeoss
56 followers · 72 posts · Server social.lfx.dev

🔎 Discover how Delta Lake simplifies the process of building data lakehouses and data pipelines at scale. With this practical guide, data engineers, data scientists, and data analysts will explore key data reliability challenges and learn to apply modern data engineering and management techniques. You'll also understand how ACID transactions bring reliability to data lakehouses at scale!

Check out Delta Lake: The Definitive Guide ➡️ lnkd.in/g3-RBeUz

#deltalake #dataengineers #datascientists #dataanalysts #opensource #oss #datalakes #lakehouse

Last updated 1 year ago

Delta Lake · @deltalakeoss
53 followers · 71 posts · Server social.lfx.dev

Liquid Clustering dynamically clusters data based on data patterns, which helps to avoid the over- or under-partitioning problems that can occur with Hive partitioning.

Liquid Clustering resulted in 2.5x faster clustering relative to Z-order. In the same trial, traditional Hive-style partitioning was an order of magnitude slower due to the expensive shuffle required for writing out many partitions.

Learn more 👉 lnkd.in/gZ9AvE8X

#deltalake #opensource #oss #dataengineering

Last updated 1 year ago

Delta Lake · @deltalakeoss
53 followers · 69 posts · Server social.lfx.dev

Delta Lake 3.0 features automatic support for the competing Apache Iceberg and Hudi table formats, allowing enterprise users to eliminate complicated integration work and focus on building truly open data lakehouses. 🚀

We are excited to announce the *preview* release of Delta Lake 3.0.0. Check it out today 👉 lnkd.in/eeYs44H4

#deltalake #opensource #dataaisummit #data

Last updated 1 year ago

Delta Lake · @deltalakeoss
50 followers · 67 posts · Server social.lfx.dev

In this blog post, Shingo OKAWA delves into the Rust ecosystem and examines its characteristics. Additionally, Shingo explores how the PyO3 crate offers a straightforward example of managing Python/Rust FFI functionality and demonstrates how to examine the generated “glue code” produced by the PyO3 crate.

Read delta-rs as a Python/Rust FFI example Part 2 👉 lnkd.in/eH5w6493

#rust #python #deltalake #opensource #rustlang

Last updated 1 year ago

Delta Lake · @deltalakeoss
50 followers · 66 posts · Server social.lfx.dev

Session Spotlight ⭐ Tune into DoorDash's journey to migrate from a flaky system with 24-hour data delays, to standardizing a CDC streaming pattern across more than 150 databases to produce near real-time data in a scalable, configurable, and reliable manner.

Register for DAIS ➡️ dbricks.co/3lvO1hz
View the session catalog ➡️ bit.ly/3MLJ7Xa

Use code ETLINUX400 to save $400 off the regular price of the full conference pass!

#dataaisummit #etl #doordash #deltalake #opensource

Last updated 1 year ago

Delta Lake · @deltalakeoss
49 followers · 65 posts · Server social.lfx.dev

Online Meetup ⭐ TOMORROW, June 13 at 9:00 AM PDT
Learn more about Databricks Connect and Spark Connect so you can use Spark from anywhere! 🌐

Come join the awesome Simon Whiteley, CTO of Advancing Analytics, for a discussion with a panel including Martin Grund, Stefania Leone, and his partner in crime Denny Lee.

RSVP ➡️ meetup.com/data-ai-online/even

#deltalake #spark #dataengineering

Last updated 1 year ago

Delta Lake · @deltalakeoss
49 followers · 64 posts · Server social.lfx.dev

We are excited to share that delta-rs Python bindings v0.10.0 is here! This release includes optimize Z-order, DataFusion storage catalog, concurrent file compaction, and so much more. 🦀🐍

Check it out today! ➡ lnkd.in/e-BhV5qa

#deltars #python #zorder #datafusion #deltalake #rust #dataengineering #opensource #linuxfoundation #oss

Last updated 1 year ago

Ale Segura · @alesegura
469 followers · 206 posts · Server masto.ai

I attended AWS Summit London this week. I started with AWS ~6 months ago. It was gratifying to realise that I have learned a lot since then. I talked to a few experts and they told me I was on the right path and that the struggles I have with AWS Glue are not only mine (they simply don’t support Delta Lake well). My perfectionist self was relieved 😌 I didn’t solve any of my problems but sometimes it helps to realise you are not as stupid as your programming struggles make you feel sometimes… 😅

#awssummitlondon #awsglue #deltalake

Last updated 1 year ago

Delta Lake · @deltalakeoss
48 followers · 63 posts · Server social.lfx.dev

ONE WEEK from today! ⭐ Join Robert Pack, Sr. Digital Expert Cloud Native Machine Learning Platform and Technology Principal at BASF as he discusses the relationship between process engineering and data engineering alongside D3L2 host, Denny Lee.

🦀 D3L2: How BASF achieves global sustainability with Delta Lake w/ Robert Pack
🗓️ Thursday, June 15
🕗 9:00 - 10:00 AM PDT

RSVP: community.linuxfoundation.org/

#deltalake

Last updated 1 year ago

Delta Lake · @deltalakeoss
47 followers · 61 posts · Server social.lfx.dev

In this edition of Last Week in a Byte...

✅ Hear how Kubit uses Delta Sharing to power their product analytics platform
✅ Learn about an exciting new contribution to the Dask community
✅ Plus, 2 new releases from the Delta Lake and Delta Sharing projects!

Read along or watch us on YT! lnkd.in/e-UuBFPa

#deltalake #opensource #oss #linuxfoundation

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 60 posts · Server social.lfx.dev

There are a variety of ways to create Delta Lake tables. You can create a Delta table by writing out a DataFrame in the Delta format, you can create an empty Delta table with SQL, or you can convert an existing Parquet table to the Delta format. It is very easy to jump in and start using Delta Lake.

cc Matthew Powers, CFA

#deltalake #sql #opensource #oss #linuxfoundation

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 59 posts · Server social.lfx.dev

Databeans has a handbook written by their engineers that includes advice and recipes on data, particularly Delta Lake. For a while, this handbook was kept a secret, but they've recently chosen to share certain pages with you!

cc Houssem Eddine Dalhoumi, DataBeans

#deltalake #opensource #dataengineering #datalakes #databeans #linuxfoundation

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 58 posts · Server social.lfx.dev

🦀 Watch D3L2: Discussing Rust, Ballista, Ray SQL, DataFusion with Andy Grove on YouTube: youtube.com/watch?v=NEL6DluUxg

#datafusion #raysql #ballista #opensource #deltalake

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 57 posts · Server social.lfx.dev

📣 We are excited to announce the release of Delta Lake 2.4.0 on Apache Spark 3.4. Similar to Apache Spark™, we have released Maven artifacts for both Scala 2.12 and Scala 2.13! 🎉

Documentation: lnkd.in/eTD9ua_6
Python artifacts: lnkd.in/e65AeChW

⭐ View the release notes: lnkd.in/er2PDhjJ

#deltalake #opensource #oss #data #apachespark

Last updated 1 year ago

Delta Lake · @deltalakeoss
46 followers · 56 posts · Server social.lfx.dev

Join us on Thursday, May 25th for D3L2: Discussing Rust, Ballista, Ray SQL, DataFusion with Andy Grove! 🦀

Andy Grove specializes in query engines and distributed systems. Among his many accolades, he started the DataFusion and Ballista query engine projects and donated both to the Apache Software Foundation as part of the Apache Arrow project. He also donated the initial Rust implementation of Apache Arrow and recently created Ray-SQL.

RSVP: community.linuxfoundation.org/

#deltalake

Last updated 1 year ago

Delta Lake · @deltalakeoss
45 followers · 55 posts · Server social.lfx.dev

We are excited to announce the *preview release* of Delta Lake 2.4.0 on Apache Spark 3.4! 🎊 Similar to Apache Spark™, we have released Maven artifacts for both Scala 2.12 and Scala 2.13.

🌟Documentation: lnkd.in/eBwUGV-2
🌟Maven artifacts: lnkd.in/eex2VxMm
🌟Python artifacts: lnkd.in/eQf_B4eM

View the key features in this release: lnkd.in/eYkMGpTd

#deltalake #opensource #spark #oss #dataengineering

Last updated 1 year ago

Delta Lake · @deltalakeoss
45 followers · 54 posts · Server social.lfx.dev

Wondering if you should hop on the Rust bandwagon? This video covers why Rust is exciting, especially for Python (data) developers! 😁

lnkd.in/ewVq84r2

#rust #opensource #python #dataengineering #deltalake #oss #developers

Last updated 1 year ago

Delta Lake · @deltalakeoss
45 followers · 53 posts · Server social.lfx.dev

Learn about the latest innovations with LLMs like Dolly and other open source Data + AI technologies such as Apache Spark™, Delta Lake, MLflow & Delta Sharing at Data + AI Summit!

📍 San Francisco, CA
🗓️ June 26 - 29, 2023

🌟 Save $400 off the regular price of the full conference pass using code ETLINUX400 (expires 6/2).

Register here: dbricks.co/3lvO1hz

#llms #dolly #deltalake #mlflow #dataaisummit #data #oss #ai

Last updated 1 year ago

Nicolas Fränkel · @frankel
762 followers · 698 posts · Server mastodon.top

Get a detailed overview of Delta Lake, Apache Hudi, and Apache Iceberg as we discuss their data storage, processing capabilities, and deployment options dzone.com/articles/delta-hudi-

#deltalake #apachehudi #apacheiceberg #analytics #spark

Last updated 2 years ago