Using an ODBC driver

The odbc package provides a DBI-compliant interface to Open Database Connectivity (ODBC) drivers. It allows for an efficient, easy way to setup connection to any database using an ODBC driver, including SQL Server, Oracle, MySQL, PostgreSQL, SQLite and others. The implementation builds on the nanodbc C++ library.

ODBC drivers can typically be downloaded from your database vendor, or they can be downloaded from RStudio when used with RStudio professional products. The odbc package works with the DBI


Using

All of the following examples assume you have already created a connection called con. To find out how to connect to your specific database type, please visit the Databases page.

Database information

The odbc package gives you tools to explore objects and columns in the database.

# Top level objects
odbcListObjects(con)

# Tables in a schema
odbcListObjects(con, catalog="mydb", schema="dbo")

# Columns in a table
odbcListColumns(con, catalog="mydb", schema="dbo", table="cars")

# Database structure
odbcListObjectTypes(con)

You can also see other data sources and drivers on the system.

# All data sources
odbcListDataSources()

# All drivers
odbcListDrivers()

Reading and writing tables

The DBI package has functions for reading and writing tables. dbWriteTable() will write an R data frame to a SQL table. dbReadTable() will read a SQL table into an R data frame.

dbWriteTable(con, "cars", cars)
dbReadTable(con, "cars")

You can specify tables outside the database with the Id() command.

table_id <- Id(catalog = "mydb", schema = "dbo", table = "cars")
dbReadTable(con, table_id)

Queries and statements

For interactive queries, use dbGetQuery() to submit a query and fetch the results. To fetch the results separately, use dbSendQuery() and dbFetch(). The n= argument in dbFetch() can be used to fetch partial results.

# Return the results for an arbitrary query
dbGetQuery(con, "SELECT speed, dist FROM cars")

# Fetch the first 100 records
query <- dbSendQuery(con, "SELECT speed, dist FROM cars")
dbFetch(query, n = 10)
dbClearResult(query)

You can execute arbitrary SQL statements with dbExecute(). Note: many database API’s distinguish between direct and prepared statements. If you want to force a direct statement (for example, if you want to create a local temp table in Microsoft SQL Server), then pass immdediate=TRUE.

dbExecute(con, "INSERT INTO cars (speed, dist) VALUES (88, 30)")
dbExecute(con, "CREATE TABLE #cars_tmp (speed int, dist int)", immediate = TRUE)


odbc Performance Benchmarks

The odbc package is often much faster than the existing RODBC and DBI compatible RODBCDBI packages. The tests below were carried out on PostgreSQL and Microsoft SQL Server using the nycflights13::flights dataset (336,776 rows, 19 columns).

PostgreSQL Results

Package Function User System Elapsed
odbc Reading 5.119 0.290 6.771
RODBCDBI Reading 19.203 1.356 21.724
odbc Writing 7.802 3.703 26.016
RODBCDBI Writing 6.693 3.786 48.423
library(DBI)
library(RODBCDBI)
library(tibble)

odbc <- dbConnect(odbc::odbc(), dsn = "PostgreSQL")
rodbc <- dbConnect(RODBCDBI::ODBC(), dsn = "PostgreSQL")

# odbc Reading
system.time(odbc_result <- dbReadTable(odbc, "flights"))

# RODBCDBI Reading
system.time(rodbc_result <- dbReadTable(rodbc, "flights"))

# odbc Reading
system.time(dbWriteTable(odbc, "flights3", as.data.frame(flights)))

# RODBCDBI Writing (note: rodbc does not support writing timestamps natively)
system.time(dbWriteTable(rodbc, "flights2", as.data.frame(flights[, names(flights) != "time_hour"])))

Microsoft SQL Server Results

Package Function User System Elapsed
odbc Reading 2.187 0.108 2.298
RSQLServer Reading 5.101 1.289 3.584
odbc Writing 12.336 0.412 21.802
RSQLServer Writing 645.219 12.287 820.806
library("RSQLServer")
rsqlserver <- dbConnect(RSQLServer::SQLServer(), server = "SQLServer")
odbc <- dbConnect(odbc::odbc(), dsn = "PostgreSQL")

# odbc Reading
system.time(dbReadTable(odbc, "flights", as.data.frame(flights)))

# RSQLServer Reading
system.time(dbReadTable(rsqlserver, "flights", as.data.frame(flights)))

# odbc Writing
system.time(dbWriteTable(odbc, "flights3", as.data.frame(flights)))

# RSQLServer Writing
system.time(dbWriteTable(rsqlserver, "flights2", as.data.frame(flights)))