HomeAssistant ships with a default 10 days retention period. The recorder core integration is responsible of this behaviour. By default the recorder stores the data in a SQLite database. It grows depending on the number of sensors.
The recorder is built into the HA core, and thus, it uses Python. In fact, it uses a SQLAlchemy ORM to allow interacting with different database engines in a seamless way.
You can substitute the SQLite recorder database with your own SQL database. You will need to:
- Create a user on the DB for HA.
- Create a DB in a server that is reachable from the HA server.
- Update the recorder integration default configuration in HA
configuration.yamlfile.
I chose to use TimeScaleDB for this, since it will allow me to continue to use the HA history integration to show the data. And in addition, it will allow integrating with a Long Term Storage integration to compact data in the long term.
I previously considered using InfluxDB or VictoriaMetrics (or any other non-SQL flavour), but I discarded it since it will not allow me to keep using the HA visualizations on the history integration. For InfluxDB, you will need to use Grafana or a similar app for data visualization, and it is not tightly integrated on HA.
In the long term, TimeScaleDB will not be a good solution in terms on storage, (see article here), but it will allow using HA history integration and the available cards for HA dashboards.
I am using a custom TimescaleDB image that is includes PostGIS and other relevant components.
In case you use the official TimescaleDB image, I suggest using the timescaledb-ha flavour, since it includes PostGIS. This image is a derivative of the official PostgreSQL image, so you can fine-tune it.
Configuring the TimescaleDB docker image
Make sure to add the right PGDATA path to the docker-compose.yaml:
...
timescaledb:
image: timescale/timescaledb-ha:pg17
container_name: timescaledb
ports:
- "127.0.0.1:54320:5432"
environment:
POSTGRES_PASSWORD: YOUR_POSTGRES_PASSWORD
PGDATA: /pgdata
volumes:
- /opt/timescaledb/pgdata:/pgdata
...
Using bind mounts to connect external data to your Postgres container can solve data accessibility issues. However, you’re responsible for creating the directory and setting up permissions or security:
# 1. Create the directory
sudo mkdir -p /opt/timescaledb/pgdata
# 2. Set ownership to the expected Postgres UID/GID (commonly 1000:1000)
sudo chown 1000:1000 /opt/timescaledb/pgdata
# 3. (Optional) Set permissions
sudo chmod 700 /opt/timescaledb/pgdata
Add LTSS integration
Follow the instructions at the LTSS integration page.
Test new data is on LTSS database
Run this SQL query:
SELECT COUNT(*) AS row_count, MAX(created) AS last_entry
FROM statistics
WHERE created > NOW() - INTERVAL '1 day';
Update recorder configuration
Use the postgres connection string, but store the secrets on the secrets.yaml file:
psql_string: postgresql://homeassistant:PASSWORD@timescaledb/homeassistant
Then add the secret string in the configuration.yaml:
recorder:
db_url: !secret psql_string
A suitable recorder configuration will exclude some entities that are not relevant. This is an example that will include all relevant domains (a domain is an integration):
recorder:
...
include:
domains:
- automation
- button
- binary_sensor
- scene
- script
- sensor
- switch
For tutorial purposes, here is a list of entities that you probably do not want or need recorded in any database:
- System Monitor: RAM, CPU, NETWORK and DISK usage
- Mobile Device: Battery Level, Public IP address, Internal Storage
- Batteries Percentages for different sensors
- Battery Low Binary Sensors
- Zones
- Most Scripts
- Some Automations