DuckDB Resources

A curated collection of valuable DuckDB resources to help you get the most out of this analytical database. Most of the links are sourced from the great awesome-duckdb project. Thanks a lot!

Official Resources

Official documentation - Comprehensive DuckDB documentation
Official blog - Latest articles, news and updates
DuckDB clients - Client APIs for DuckDB
DuckDB documentation PDF - The documentation as a single PDF file
DuckDB documentation in Markdown - The documentation as a single Markdown file

Client APIs

DuckDB offers client APIs (also known as "drivers") for several languages, categorized by support tier.

Primary Support Tier

These clients are the first to receive new features and are covered by community support.

C - The foundational API maintained by the DuckDB team
Command Line Interface (CLI) - Interactive shell for DuckDB
Java (JDBC) - JDBC driver for Java applications
Go - Go SQL driver for DuckDB
Node.js (node-neo) - Modern Node.js driver
Python - Python client for DuckDB
R - R interface for DuckDB
WebAssembly (Wasm) - Run DuckDB in the browser

Secondary Support Tier

These clients receive new features but are not covered by community support.

ADBC (Arrow) - Arrow Database Connectivity
C# (.NET) - .NET driver for DuckDB
C++ - C++ API for DuckDB
Dart - DuckDB for Dart applications
Julia - DuckDB for the Julia language
Node.js (deprecated) - Original Node.js API
ODBC - Open Database Connectivity driver
Rust - Rust client for DuckDB
Swift - Swift client for DuckDB

Tertiary Support Tier

These clients are maintained by third parties with no feature or support guarantees.

Common Lisp - DuckDB for Common Lisp
Crystal - Crystal language interface for DuckDB
Elixir - Elixir client for DuckDB
Erlang - Erlang binding to DuckDB
Pyodide - DuckDB in Python in the browser
Ruby - Ruby interface for DuckDB
Zig - Zig bindings for DuckDB

Extensions

DuckDB can be extended with extensions, which fall into two groups: Core Extensions (maintained by the DuckDB team) and Community Extensions (contributed by the community).

Core Extensions

These extensions are maintained by the DuckDB team and can be installed via INSTALL <extension_name>.

arrow - Zero-copy data integration with Apache Arrow
autocomplete - Adds support for autocomplete in the shell
aws - Provides features that depend on the AWS SDK
azure - Adds filesystem abstraction for Azure blob storage
delta - Adds support for Delta Lake
excel - Adds support for Excel-like format strings
fts - Adds support for Full-Text Search Indexes
httpfs - Support for HTTP(S) or S3 connections
iceberg - Adds support for Apache Iceberg
icu - Support for time zones and collations using ICU
inet - Support for IP-related data types and functions
jemalloc - Overwrites system allocator with jemalloc
json - Adds support for JSON operations
mysql - Support for MySQL database connections
parquet - Support for reading and writing Parquet files
postgres - Support for PostgreSQL connections
spatial - Geospatial functionality and processing
sqlite - Support for SQLite database files
tpcds - TPC-DS data generation and query support
tpch - TPC-H data generation and query support
vss - Support for vector similarity search queries
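
As a minimal sketch of the INSTALL workflow mentioned at the top of this section, the Python snippet below builds the SQL statements for a core extension. The helper names (`core_extension_statements`, `run_statements`) are hypothetical. The `LOAD` step is not mentioned in this list; it is the usual follow-up statement that makes an installed extension available in the current session. Actually running the statements assumes the third-party `duckdb` Python package is installed and that the machine can reach the extension repository.

```python
import re

# Conservative pattern for extension names (letters, digits, underscores).
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def core_extension_statements(name: str) -> list[str]:
    """Return the SQL statements that install and load a core extension."""
    if not _NAME.match(name):
        raise ValueError(f"unexpected extension name: {name!r}")
    return [f"INSTALL {name};", f"LOAD {name};"]


def run_statements(statements: list[str]) -> None:
    """Execute the statements in an in-memory DuckDB database."""
    import duckdb  # third-party package, assumed installed: pip install duckdb

    con = duckdb.connect()  # no path given, so the database lives in memory
    for statement in statements:
        con.execute(statement)
    con.close()
```

For example, `run_statements(core_extension_statements("httpfs"))` would install and load the httpfs extension. The name check is only a guard against pasting arbitrary text into the SQL string, not a list of valid extensions.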

Community Extensions

These extensions are contributed by the community and can be installed via INSTALL <extension_name> FROM community.

avro - Read Apache Avro files
bigquery - Google BigQuery integration
blockduck - Live SQL queries on blockchain
cache_httpfs - Read-cached filesystem for httpfs
capi_quack - Hello world example from C/C++ C API template
chsql - ClickHouse SQL dialect macros for DuckDB
chsql_native - ClickHouse native client & file reader
cronjob - DuckDB HTTP cronjob extension
crypto - Cryptographic hash functions and HMAC
datasketches - Apache DataSketches for approximate analytics
duckpgq - Graph workloads supporting SQL/PGQ standard
evalexpr_rhai - Evaluates Rhai scripting language in SQL
flockmtl - LLM & RAG extension for analytics and semantic analysis
fuzzycomplete - Fuzzy string matching for autocompletion
geography - Global spatial data processing on the sphere
gsheets - Read and write Google Sheets using SQL
h3 - Hierarchical hexagonal indexing for geospatial data
hdf5 - Read HDF5 files from DuckDB
hostfs - Navigate and explore the filesystem using SQL
http_client - DuckDB HTTP client extension
httpserver - DuckDB HTTP API server and query interface
lindel - Linearization/Delinearization, Z-Order, Hilbert curves
magic - libmagic/file utilities ported to DuckDB
netquack - Parse, extract, and analyze domains, URIs, and paths
open_prompt - Interact with LLMs with a simple extension
pcap_reader - Read PCAP files from DuckDB
pivot_table - Provides a spreadsheet-style pivot_table function
prql - Support for PRQL, the Pipelined Relational Query Language
psql - Support for PSQL, a piped SQL dialect for DuckDB
pyroscope - DuckDB Pyroscope extension for continuous profiling
quack - Provides a hello world example demo
rusty_quack - Hello world demo from Rust-based extension template
scrooge - Financial data aggregation and scanners
shellfs - Use shell commands for input and output
sheetreader - Fast XLSX file importer
substrait - Allows conversion and execution of Substrait query plans
tsid - DuckDB Time-Sortable ID generator
ulid - ULID data type for DuckDB (timestamped UUID-like identifiers)
webmacro - Load DuckDB Macros from the web
zipfs - Read files within zip archives
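
The community variant of the install statement differs only in its `FROM community` clause. The sketch below is one way to express it from Python, again under assumptions: the helper names (`community_extension_statements`, `install_and_list`) are hypothetical, the `LOAD` step and the `duckdb_extensions()` check are not part of this list, and executing anything requires the third-party `duckdb` package plus network access.

```python
def community_extension_statements(name: str) -> list[str]:
    """Return the SQL that installs a community extension and loads it."""
    if not name.isidentifier():
        raise ValueError(f"unexpected extension name: {name!r}")
    return [f"INSTALL {name} FROM community;", f"LOAD {name};"]


# Query that reports which extensions are installed and loaded in a session.
LIST_EXTENSIONS_SQL = (
    "SELECT extension_name, installed, loaded "
    "FROM duckdb_extensions() ORDER BY extension_name;"
)


def install_and_list(name: str) -> list[tuple]:
    """Install and load one community extension, then list extension status."""
    import duckdb  # third-party package, assumed installed: pip install duckdb

    con = duckdb.connect()  # in-memory database
    for statement in community_extension_statements(name):
        con.execute(statement)
    rows = con.execute(LIST_EXTENSIONS_SQL).fetchall()
    con.close()
    return rows
```

Calling `install_and_list("h3")`, for instance, would return one row per known extension, which makes it easy to confirm that the new extension shows up as installed and loaded.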

Learning Resources

Links to talks, videos, books and podcasts

Talks & Videos

DuckCon #6 playlist
DuckCon #5 playlist
DuckCon #4 playlist
DuckCon #3 playlist
DuckCon #2 playlist
DuckDB: Crunching data anywhere from laptops to servers @ GOTO Amsterdam 2024 - Gábor Szárnyas
In-Process Analytical Data Management with DuckDB @ PyData Amsterdam - Hannes Mühleisen
DuckDB: The Power of a Data Warehouse in your Python Process @ PyData Yerevan - Gábor Szárnyas
DuckDB: Bringing analytical SQL directly to your Python shell @ EuroPython - Pedro Holanda
DuckDB keynote @ Data + AI Summit 2023 - Hannes Mühleisen
DuckDB: Bringing Analytical SQL Directly To Your Python Shell @ FOSDEM - Pedro Holanda
DuckDB Extensions @ DuckCon - Pedro Holanda & Sam Ansmink
Developing Systems in Academia: The Good, the Bad, and the not-so-Ugly Duckling @ CIDR - Hannes Mühleisen
DuckDB: An Embeddable Analytical Database @ FOSDEM - Hannes Mühleisen
DuckDB tutorials playlist by Learn Data with Mark - Mark Needham
DuckDB tutorials playlist by MotherDuck - Mehdi Ouazza
Nextflow and database uses: powering data engineering, exploring DuckDB, and beyond - Edmund Miller
Why should you care about DuckDB? @ Dublin DuckDB meetup - Mihai Bojin
Exploring Monte Carlo Simulations With DuckDB @ Dublin DuckDB meetup - James McNeill
DuckDB and recommenders: a lightning fast synergy @ Dublin DuckDB meetup - Khalil Muhammad

Podcasts

Developer Voices: Implementing Hardware-Friendly Databases - Hannes Mühleisen
The Geek Narrator: DuckDB Internals - Mark Raasveldt
Software Engineering Daily: DuckDB - Hannes Mühleisen
Data Engineering Podcast: Move Your Database To The Data - Hannes Mühleisen
The Analytics Engineering Podcast: The Personal Data Warehouse - Jordan Tigani

Books

DuckDB in Action - Book by Manning Publications
Getting Started with DuckDB - Practical guide for data workflows

Cloud & Serverless

AWS Lambda Layers for DuckDB - Run DuckDB in AWS Lambda functions
Serverless DuckDB - Use DuckDB as API with Amazon API Gateway and AWS Lambda
Serverless Parquet Repartitioner - Use DuckDB to repartition data in S3-based Data Lakes
DuckDB as API in Docker - A TypeScript-based Docker image containing DuckDB and a Hono framework REST API with JSON or streaming Arrow responses

Tools based on DuckDB

SQL Workbench - Run queries on local or remote data, visualize the results, and share queries via URLs
Rill Data - Tool for transforming data sets into powerful, opinionated dashboards using SQL
Ibis Project - A DataFrame API for interacting with DuckDB and other compute engines
Boiling Data - Serverless data analytics overlay on top of S3 Data Lakes
Hex Dataframe SQL - Hex's Dataframe SQL cells powered by DuckDB
Mode - Uses DuckDB for their in-memory data engine
VulcanSQL - Data API framework for creating REST APIs by writing SQL templates
Tad - A fast, free, cross-platform tabular data viewer application
Honeycomb Maps - A browser-based geospatial analysis tool leveraging DuckDB-Wasm
Malloy - Experimental language for describing data relationships and transformations
Evidence - Generate reports using SQL and markdown
Huey - Blazing-fast & intuitive pivot tables on Parquet, CSV, JSON files
DatalakeStudio - Load, explore, transform datasets and expose them via API
Spice.ai - A unified SQL query interface and portable runtime
Definite - Analytics platform with managed DuckDB, ELT, and BI
Amphi ETL - Low-code data pipelines for structured and unstructured data
Quackpipe - Serverless OLAP API/UI with ClickHouse API compatibility
UniverSQL - Implementation of Snowflake API for running queries locally
Whereabouts - Fast, accurate, open-source geocoding in Python
sqlglot - SQL transpiler written in Python, covering 23 different SQL dialects
yato - The smallest DuckDB SQL orchestrator on Earth
SQLMesh - Next-generation data transformation and modeling framework
Duck-UI - Web-based interface for interacting with DuckDB

SQL Clients

Harlequin - The DuckDB IDE for your terminal
qStudio - A free SQL tool specialized for data analysts
DBeaver - Universal database access and development tool
DataGrip - Paid SQL IDE by JetBrains
Duckling - A fast viewer for CSV/Parquet files and DuckDB/SQLite
SQL DATA LENS - A lightweight, commercial SQL IDE
Dataflare - Simple easy-to-use database manager