Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: April 16, 2024
In this article, we’ll explore TimescaleDB, an open-source time-series database built on top of PostgreSQL. We’ll delve into its features, examine its capabilities, and discuss how to effectively interact with this database.
TimescaleDB is an open-source database extension for PostgreSQL, designed to handle time-series data effectively. It extends PostgreSQL’s capabilities to provide dedicated features for time-series data including automated time partitioning, optimized indexing, and compression. Let’s have a brief look at some of its key features:
To get started with TimescaleDB, we first need to make sure to have PostgreSQL installed on our machine. We can install PostgreSQL using the package manager of our operating system, or by downloading it from PostgreSQL’s website. After successfully installing PostgreSQL, we can proceed to install TimescaleDB. We’ll also need to install Homebrew before initiating the installation:
brew tap timescale/tap
brew install timescaledb
Upon successful installation, let’s connect to our PostgreSQL instance and run the query to initialize the TimescaleDB instance:
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
This query sets up the necessary extension and prepares our PostgreSQL environment for efficient time-series data handling.
TimescaleDB introduces the concept of Hypertable, which is a partitioned table optimized for time-series storage. A hypertable is always partitioned on time, since it’s intended for time-series data, and possesses the flexibility to partition on additional columns as well. We can use one of the supported data types for partitioning: date, smallint, int, bigint, timestamp, and timestamptz. Creating a hypertable involves creating a table first and subsequently converting it to a hypertable. Let’s first create a regular table:
CREATE TABLE employee(
id int8 NOT NULL,
name varchar(255) NOT NULL,
login_time timestamp,
logout_time timestamp);
Now, let’s convert it into a hypertable:
SELECT create_hypertable('employee', by_range('login_time'));
The required parameters of create_hypertable() are the table name employee and the dimension builder by_range(‘login_time’). The dimension builder defines which column we want to use to partition the table. In this case, we could skip the dimension builder because create_hypertable() automatically assumes that a single specified column is range-partitioned by time by default. For existing data in the table, we can include the migrate_data option for migration:
SELECT create_hypertable('employee', by_range('login_time'), migrate_data => true);
With the hypertable in place, we can start adding data. TimescaleDB provides effective mechanisms for handling large volumes of timestamped information. Let’s perform the insertion:
INSERT INTO employee values('1', 'John', '2023-01-01 09:00:00', '2023-01-01 18:30:00');
INSERT INTO employee values('2', 'Sarah', '2023-01-02 08:45:12', '2023-01-02 18:10:10');
One of the compelling features of TimescaleDB is its capability to perform efficient queries on time-series data using standard SQL. With our data successfully populated, let’s proceed to query it:
SELECT * FROM employee WHERE login_time BETWEEN '2023-01-01 00:00:00' AND '2023-01-01 12:00:00';
This query retrieves data for a specific time range, showcasing the ease of analyzing and visualizing time-series data with TimescaleDB.
In this article, we presented a quick overview of TimescaleDB, delving into the fundamental steps for setting it up and querying data. TimescaleDB efficiently handles large time-series datasets, making it invaluable for robust data management in various applications.