Bill Fellows · @billinkc
163 followers · 640 posts · Server dataplatform.social

I like that with Databricks I can define cluster timeout periods so that when I space off and leave it running, it shuts the cluster down so I don't burn through resources too quickly.

*starts cluster*
Looks away. Pets the dog, gets a glass of water, checks the garden, etc., sits back down to see the cluster terminating due to idle settings.

Ha, I'm such a dumb bitch. Let's try this again.

*starts cluster*
Oh, out of water....

#databricks

Last updated 1 year ago
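
The idle timeout described in the post above corresponds to the `autotermination_minutes` field of a Databricks cluster definition. Below is a minimal sketch that only builds such a definition as JSON; the cluster name, the placeholder runtime and node type, and the helper `build_cluster_spec` are illustrative assumptions, not values from the post.

```python
import json


def build_cluster_spec(name: str, idle_minutes: int) -> dict:
    """Return a cluster definition that shuts down after `idle_minutes` of inactivity."""
    return {
        "cluster_name": name,
        # Placeholders (assumptions): substitute a runtime version and node type
        # that exist in your own workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        "num_workers": 1,
        # Idle time, in minutes, after which the cluster terminates itself.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("scratch-cluster", idle_minutes=30)
print(json.dumps(spec, indent=2))
```

A longer timeout trades a bit of idle cost for not having to restart the cluster after a short break.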

Tim Kellogg · @kellogh
940 followers · 3502 posts · Server hachyderm.io

The Databricks assistant is so bad. Seems like they're using a worthless LLM and also doing a really bad job of stuffing the context.

#databricks #ai #llm

Last updated 1 year ago

Tim Kellogg · @kellogh
934 followers · 3427 posts · Server hachyderm.io

Spark vs SQL: When should you choose Spark?

Spark: Big jobs that need to do table scans

SQL: Lots of small single-record reads and writes, like a CRUD app

Spark saturates compute and memory resources, so it can be very cheap for processing large amounts of data, whereas SQL optimizes for the noisy neighbor problem and tries to ensure that queries don't impact the performance of other concurrently running queries. Spark only allows one query at a time.

#databricks #data #dataviz #datascience

Last updated 1 year ago
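
The two access patterns contrasted in the post above can be made concrete with a small sketch. It uses Python's built-in `sqlite3` purely as a stand-in for "a SQL database" (an assumption; the post names no engine, and this is not Spark): a primary-key lookup is planned as a search, while an aggregate over the whole table is planned as a scan. The `orders` table and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 1001)],
)


def plan(sql: str, params: tuple = ()) -> str:
    """Return the planner's description of how SQLite would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


# CRUD-style access: one record located through the primary key.
point_lookup = plan("SELECT amount FROM orders WHERE id = ?", (42,))

# Analytics-style access: every row is read to compute one aggregate.
full_scan = plan("SELECT SUM(amount) FROM orders")

print("point lookup:", point_lookup)
print("aggregate:   ", full_scan)
```

The first pattern is the kind of work the post assigns to SQL databases; the second is the table-scan workload it assigns to Spark.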

Kedro · @kedro
62 followers · 56 posts · Server social.lfx.dev

New blog post: How to integrate Kedro and Databricks Connect 🔶

In this blog post, our colleague Diego Lira explains how to use Databricks Connect with Kedro for a development experience that works completely inside an IDE.

kedro.org/blog/how-to-integrat

Install it with

```
pip install databricks-connect
```

#kedro #python #pydata #datascience #databricks #dbx #spark #pyspark

Last updated 1 year ago

PipeRider · @piperider
162 followers · 43 posts · Server fosstodon.org

Databricks is now supported by PipeRider for dbt projects!

Now databricks users can get:

- Data Impact Reports
- Data profile comparisons
- Pull Request Impact summaries
- Lineage Diff

Check the PipeRider docs for a full list of supported data sources:

docs.piperider.io/reference/su

#dbt #piperider #databricks #dataquality #dataimpact #dataengineering #analyticsengineering #opensource

Last updated 1 year ago

Kevin Chant · @kevchant
251 followers · 442 posts · Server techhub.social

Published yesterday. Covers big UI updates for two Azure Data Platform services that have been introduced recently.

Contains a couple of observations about Repos and .

kevinrchant.com/2023/08/04/big

#azure #databricks #Purview

Last updated 1 year ago

Nick Rankovic · @kibernick
101 followers · 403 posts · Server fosstodon.org

Remember kids, your job won't execute unless you give it an `if __name__ == "__main__":`

#python #databricks

Last updated 1 year ago
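
For readers unfamiliar with the guard mentioned in the post above, here is a minimal sketch of a script entry point. The function names `run_job` and `main` are hypothetical, and whether a particular Databricks job type depends on the guard is not something the post spells out.

```python
def run_job() -> list[int]:
    """Stand-in for the real work a job would do."""
    return [n * n for n in range(5)]


def main() -> None:
    print("job result:", run_job())


# True only when this file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

Without the last two lines the file only defines functions, so running it does nothing visible.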

Kevin Chant · @kevchant
250 followers · 441 posts · Server techhub.social

Covers big UI updates for two Azure Data Platform services that have been introduced recently.

kevinrchant.com/2023/08/04/big

#azure #databricks #Purview

Last updated 1 year ago

DB-Engines · @DBEngines
21 followers · 12 posts · Server techhub.social

DB-Engines Ranking climbers of the month:
1.
2.
3.
db-engines.com/en/ranking

#dynamodb #snowflakedb #databricks

Last updated 1 year ago

Multiverse Mike · @multiverseofbadness
898 followers · 2513 posts · Server toot.wales

I guess the promising thing about work today is that our department is suddenly having some weird initiation script issue after exactly zero environmental or package changes, so all the work I'm supposed to be doing isn't getting done... 😂

#databricks

Last updated 1 year ago

dazfuller :rickwhoah: · @dazfuller
102 followers · 1206 posts · Server mstdn.social

Took some time today to rewrite a local build script I have for my Excel data source. I use it for running tests locally against multiple Spark versions and building jar files to deploy to a test instance.

The current script is all in PowerShell but I spend very little time in there anymore, so I rewrote it as a Nushell script, and it’s so much cleaner and nicer to read.

#nushell #powershell #databricks #Excel #ApacheSpark

Last updated 1 year ago

Databricks Assistant is now available in the Japan region too!
qiita.com/taka_yayoi/items/cb9

#qiita #databricks

Last updated 1 year ago

VentureBeat :press: · @VentureBeat
138 followers · 82 posts · Server press.coop

Hear from Naveen Rao of @MosaicML, a leading provider of LLMs for enterprises that was recently acquired by Databricks for $1.3 billion, as he shares his success on the stage at 2:55 pm PST.

#vbtransform #mosaicml #databricks #venturebeat #llm #enterprise #ml #machinelearning #press

Last updated 1 year ago

Kedro · @kedro
56 followers · 45 posts · Server social.lfx.dev

New blog post: How to use Databricks managed Delta tables in a Kedro project 🔶

In this post our colleague Jannic Holzer explains how to use a newly-released dataset for managed Delta tables in Databricks within your Kedro project.

kedro.org/blog/managed-delta-t

Install it with

```
pip install "kedro-datasets[databricks.ManagedTableDataSet]"
```

#kedro #machinelearning #datascience #databricks #spark #python #pydata

Last updated 1 year ago

Shibaprasad Bhattacharya · @shibaprasad
110 followers · 927 posts · Server qoto.org

@josiah @orizuru Tbh, it is not a total presumption. I primarily work in R and have faced the same issues.

The DevOps team told the same to one of my teammates. And the Plumber library had some authentication issues related to UMS2 IIRC.

We adopted Databricks quickly and found a way to call the R scripts from a Python notebook.

Even on Databricks, the support of R is so poor!

#rstats #databricks

Last updated 1 year ago
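
The post above does not say how the R scripts were called from Python. One common approach, sketched here as an assumption rather than as the author's method, is to shell out to the `Rscript` command-line tool with `subprocess`. The helper names and the file `analysis.R` are hypothetical, and `Rscript` must be installed and on `PATH` for `run_r_script` to work.

```python
import shutil
import subprocess


def build_rscript_command(script_path: str, *args: str) -> list[str]:
    """Build the argument list for running an R script non-interactively."""
    # --vanilla skips saved workspaces and user profile files.
    return ["Rscript", "--vanilla", script_path, *args]


def run_r_script(script_path: str, *args: str) -> str:
    """Run the script and return its standard output; raise if it fails."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH")
    completed = subprocess.run(
        build_rscript_command(script_path, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Only the command is built here, so the sketch runs without R installed.
print(build_rscript_command("analysis.R", "2023-08-01"))
```

Passing results back through standard output or files keeps the Python and R sides loosely coupled.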