Osunderdog · @Osunderdog
15 followers · 115 posts · Server allthingstech.social

Well, turns out I needed to upgrade from v0.7.1 to v0.8.1. Also had to tell duckdb to install the mysql extension.

Badda Bing 5.5m records streaming at me. Wonderful.

#duckdb

Last updated 1 year ago

Osunderdog · @Osunderdog
15 followers · 113 posts · Server allthingstech.social

I like to work through language training books. Currently reading a lot about #rust. Anyway, at some point I jump off the curated path and try something on my own.

I end up stuck in the mud reading a bunch of technical documents. Today it is "How to get linked in to a rust example?"

crates.io/crates/duckdb

#rust #duckdb

Last updated 1 year ago

Andrea Borruso · @aborruso
237 followers · 224 posts · Server mastodon.uno

To always have one or more extensions preloaded in #duckdb, just create or edit the file ~/.duckdbrc
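A minimal sketch of what such a startup file could contain, assuming the extensions are already installed and that one `LOAD` statement per extension is all that is needed. The helper name `write_duckdbrc` and the extensions `httpfs` and `spatial` are illustrative choices, not taken from the post. The demo writes to a temporary directory so the real ~/.duckdbrc is left untouched.

```python
from pathlib import Path
import tempfile


def write_duckdbrc(path, extensions):
    """Write one LOAD statement per extension to a DuckDB CLI startup file.

    Hypothetical helper: the DuckDB CLI runs the statements in ~/.duckdbrc
    at startup, so each LOAD line preloads one extension.
    """
    lines = [f"LOAD {name};" for name in extensions]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Demo against a throwaway directory instead of the real home directory.
demo_dir = Path(tempfile.mkdtemp())
rc_path = demo_dir / ".duckdbrc"
written = write_duckdbrc(rc_path, ["httpfs", "spatial"])
print(rc_path.read_text(encoding="utf-8"))
```

Pointing `path` at `Path.home() / ".duckdbrc"` would apply the same content to the real startup file.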

#duckdb

Last updated 1 year ago

Thomas Sandmann · @thomas_sandmann
366 followers · 1147 posts · Server genomic.social

Today I learned how to use Pagès's awesome {DelayedArray} @bioconductor package to handle gene expression data stored in parquet files. tomsing1.github.io/blog/posts/ This way, I can leverage familiar R tools and take advantage of the language-agnostic parquet file format when querying very large gene expression datasets. If you have experience with parquet files for biological data, please share the lessons you have learned!

#til #parquet #RStats #duckdb #Bioconductor

Last updated 1 year ago

Steve Purcell · @sanityinc
693 followers · 1059 posts · Server hachyderm.io

I figured out how to abuse #duckdb to convert gzipped web server log files into compressed Parquet columnar storage files that can be queried efficiently later: gist.github.com/purcell/6e7240

#duckdb

Last updated 1 year ago

Steve Purcell · @sanityinc
693 followers · 1059 posts · Server hachyderm.io

@xenodium Yeah, you can also use duckdb to directly query sqlite DBs, compressed (or plain) CSV files, JSON, and more, even over http. It's a pretty great set of powertools, and their extensions to regular SQL are very interesting. There are even spatial extensions for it now. Worth checking out the hashtag.

#duckdb

Last updated 1 year ago

Steve Purcell · @sanityinc
693 followers · 1059 posts · Server hachyderm.io

@xenodium And just in case you're curious how these get counted, I wrote up something after recently blowing some time tinkering with the machinery as an excuse to test-drive #duckdb. github.com/melpa/melpa/tree/ma

#duckdb

Last updated 1 year ago

Thomas Sandmann · @thomas_sandmann
366 followers · 1147 posts · Server genomic.social

Today I learned how to store gene expression data in (multiple) parquet files, and query them as a single dataset from R with the {arrow}, {duckdb} or {sparklyr} packages. I am amazed by {duckdb}'s speed 🚀 - even on my laptop! Here's a blog post with what I learned: tomsing1.github.io/blog/posts/

#til #RStats #duckdb #parquet #spark #compBio #rnaseq

Last updated 1 year ago

Steve Purcell · @sanityinc
668 followers · 989 posts · Server hachyderm.io

I have that "using delightful free software" feeling again today -- #duckdb is hilariously good. The stats on MELPA.org (#emacs packages) are based on processed web logs, which we've kept forever. We used to whittle the relevant data into a normalised sqlite DB file, which has passed 7GB (330M downloads). I'm switching this over to DuckDB and it's super fast and space-efficient for this use case, while being a simple code change. I'll share numbers soon. duckdb.org/

#duckdb #eMacs

Last updated 1 year ago

maurizio napolitano · @napo
144 followers · 77 posts · Server mastodon.uno

The documentation I used to generate the files for every region of Italy is now available on GitHub. The files cover buildings and places distributed by the Overture Maps Foundation, enriched with ISTAT codes so they can be extracted by municipality or province.
Updated download links and the SQL code used are here:
github.com/napo/overturemaps_i

#geopackage #duckdb #opendata #osm

Last updated 1 year ago

OK... Two days after loading my data into a #duckdb file (144m+ CSV lines), I started to operate on it. Jesus Christ on a bike! It's super performant!!!

I'll work more on DuckDB in the future. Woot!

#duckdb

Last updated 1 year ago

We can attest that the #quarto #visualeditor is much more aesthetically pleasing for viewing and editing markdown docs interleaved with code blocks, and provides a unique document view experience.

Brief demo of our 📓 converted to .qmd doc using it. ⬇

Quarto Visual Editing docs:

📰 quarto.org/docs/visual-editor/

Our Observable 🛠️ and 📚 repository:

📥 github.com/RandomFractals/obse

DuckDB Quarto doc:

📝 github.com/RandomFractals/obse

🧙‍♂️ ...

#QuartoPub #duckdbtools #datatables #JSNotebooks #datatools #ObservableJS #duckdb #visualeditor #quarto

Last updated 1 year ago

Andrea Borruso · @aborruso
227 followers · 203 posts · Server mastodon.uno

The CSV format can be "big", ugly, and mean.

There are ways to publish and describe it, and tools to process it, that make it "beautiful" and, above all, "ready".

Tips & tricks, inspired by #duckdb, Apache #parquet, and OpenCoesione.

aborruso.github.io/posts/duckd

#duckdb #parquet

Last updated 1 year ago

Probably the most complete extension created for devs & data scientists using that includes all the DuckDB tree view objects, including main DB instance, system and temp database views with the corresponding main, information schema & pg_catalog schemas display, settings, extensions, data types, functions, and keywords display you will not find in or @motherduck or any other related DuckDB Tool/driver out there. 🤗

📰 github.com/RandomFractals/pro-

🧙‍♂️ ...

#prodatatools #dbeaver #vscode #sqltools #duckdb

Last updated 1 year ago

One of the main reasons to try our 🛠️ in @code IDE are the 30+ custom view & metadata shortcut commands you can invoke from the standard VS Code Commands Palette while exploring those embedded DB files in your apps for your next EDA ...

📰 github.com/RandomFractals/pro-

🧙‍♂️ ...

#prodatatools #dataproject #duckdb #duckdbpro

Last updated 1 year ago

Solving my own doubt...

As the DuckDB Python API doesn't have an `affected_rows()` method, you can use `len()` either on a plain list via `resultset.fetchall()` or on a Pandas dataframe via `resultset.df()`
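A small sketch of the list-based variant. Note that the fetch method is spelled `fetchall()`. The query is a made-up three-row example, and the `sqlite3` fallback is only there so the snippet still runs where the `duckdb` package is not installed; both objects expose `execute()` and `fetchall()`.

```python
try:
    import duckdb

    con = duckdb.connect()  # in-memory DuckDB database
except ImportError:
    # Assumption for portability only: sqlite3 offers the same
    # execute()/fetchall() calls used below.
    import sqlite3

    con = sqlite3.connect(":memory:")

# Illustrative query returning three rows.
resultset = con.execute("SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3")

# fetchall() returns a list of tuples, so len() gives the row count.
rows = resultset.fetchall()
row_count = len(rows)
print(row_count)
```

With DuckDB and pandas installed, `len(resultset.df())` is the dataframe equivalent. Either way the full result is materialized in memory, so for very large results a `SELECT count(*)` query may be the cheaper option.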

#duckdb

Last updated 1 year ago

How can I get the number of rows returned by a #duckdb query in Python?

#duckdb

Last updated 1 year ago

Ploomber · @ploomber
32 followers · 37 posts · Server fosstodon.org

🔧 🚀 Performance Boost: Users converting DuckDB results to pandas DataFrames can now enjoy an optimized experience with our newly rolled-out fix.

JupySQL is now using native methods to convert to data frames from DuckDB when using native connections and SQLAlchemy to maximize performance

Your data manipulation tasks just got a whole lot faster!

Check it out now:
jupysql.ploomber.io/en/latest/

#duckdb #pandas

Last updated 1 year ago

Meanwhile...

38 million rows of data, out of 133 million in CSV, loading into #duckdb. The only way I found to do this without freezing the machine was to write a Python script that throttles the load and feeds the data in more slowly, instead of trying to shove it all down its throat.
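One way such a throttled loader might look, as a sketch only: the post does not show the actual script. The table name `readings`, the two-column layout, the batch size, and the pause length are all assumptions. The `sqlite3` fallback exists only so the example runs without the `duckdb` package.

```python
import csv
import itertools
import os
import tempfile
import time

try:
    import duckdb

    con = duckdb.connect()  # in-memory DuckDB database
except ImportError:
    # Portability fallback only; same execute()/executemany() interface.
    import sqlite3

    con = sqlite3.connect(":memory:")


def load_csv_in_batches(con, csv_path, batch_size=2, pause_seconds=0.0):
    """Insert CSV rows batch by batch, sleeping between batches."""
    loaded = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        while True:
            # Pull at most batch_size rows so memory use stays bounded.
            batch = [(int(r[0]), r[1]) for r in itertools.islice(reader, batch_size)]
            if not batch:
                break
            con.executemany("INSERT INTO readings VALUES (?, ?)", batch)
            loaded += len(batch)
            time.sleep(pause_seconds)  # give the machine room to breathe
    return loaded


# Build a tiny sample CSV so the sketch is self-contained.
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    writer.writerow(["id", "label"])
    writer.writerows((i, f"row-{i}") for i in range(5))

con.execute("CREATE TABLE readings (id INTEGER, label VARCHAR)")
loaded = load_csv_in_batches(con, csv_path)
total = con.execute("SELECT count(*) FROM readings").fetchone()[0]
print(loaded, total)
```

For a real 133-million-row file, `batch_size` would be far larger (tens of thousands of rows) and `pause_seconds` tuned to what the machine tolerates.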

#duckdb

Last updated 1 year ago

We took the latest #azuredatastudio for a spin today and can confirm that all of our old custom extensions work in that IDE, including new , , new , and soon to be released .

We'll post some walkthroughs on how to install and use some of our new in Azure Data Studio IDE soon.

Let us know if any of these would be of interest to your and your users.

🛠️ randomfractals.github.io/pro-d

#datateams #datatools #prodatatools #datanotebookprotools #markdownsqltools #prqlcodelens #sqltools #duckdb #dataviz #vscode #azuredatastudio

Last updated 1 year ago