---
date: 2025-11-19
title: How to restore a replica after storage failure
tags: ['Deployments and Scaling']
keywords: ['restore', 'replica', 'storage failure', 'atomic database']
description: 'This article explains how to recover data when using replicated tables in atomic databases in ClickHouse and the disk/storage on one of the replicas is lost or corrupted.'
---

{frontMatter.description}
{/* truncate */}

<br/>
<br/>

<VerticalStepper headerLevel="h2">

:::note
This guide assumes that the `<path>` parameter in your config.xml file is set to:

```text
<path>/var/lib/clickhouse/</path>
```

If you have configured a different data path, replace all instances of `/var/lib/clickhouse` in the commands below with the actual value of your `<path>` setting.
:::

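If you are not sure which path is configured, here is a minimal sketch of how to check it, assuming the default package-install config location and a running server:

```bash
# Default config location for package installs; adjust if your layout differs.
grep -m1 '<path>' /etc/clickhouse-server/config.xml

# Alternatively, ask the running server: the 'default' disk points at <path>.
clickhouse-client --query "SELECT path FROM system.disks WHERE name = 'default'"
```
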
## Copy access configuration from the healthy replica {#copy-access-config}

Copy the contents of the `access` folder, which contains the local users, from the healthy replica:

```text
/var/lib/clickhouse/access
```

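One possible way to copy the folder, as a sketch only: it assumes SSH access from the faulty node to the healthy one and uses a placeholder hostname `healthy-replica`. Any copy tool works, as long as the files end up owned by the `clickhouse` service user.

```bash
# Run on the faulty replica. 'healthy-replica' is a placeholder hostname.
rsync -a healthy-replica:/var/lib/clickhouse/access/ /var/lib/clickhouse/access/

# Ensure the copied files are owned by the ClickHouse service user.
sudo chown -R clickhouse:clickhouse /var/lib/clickhouse/access
```
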
## Back up the metadata folder from the healthy replica {#back-up-metadata-from-healthy-replica}

1. Navigate to the ClickHouse data directory:

```text
cd /var/lib/clickhouse
```

2. Create a backup of the metadata folder, including symbolic links. The `metadata` directory contains the DDLs for databases and tables:
   each database directory holds symlinks into `/var/lib/clickhouse/store/..`, which contains all the table DDLs.

```bash
{ find metadata -type f; find metadata -type l; find metadata -type l | xargs readlink -f; } | tar -cPf backup.tar --files-from=-
```

:::note
This command ensures that both the **metadata files** and the symlink structure are preserved in the backup.
:::

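Optionally, you can sanity-check the archive before copying it over; the listing should show both the `metadata/...` entries and the resolved `store/...` targets (a quick check, not required by the procedure):

```bash
# List the archived paths (absolute paths are expected because of -P).
tar -tPf backup.tar | head -n 20
```
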
## Restore the metadata on the faulty replica {#restore-the-metadata-on-the-faulty-replica}

1. Copy the generated `backup.tar` file to the faulty replica.
2. Extract it to the ClickHouse data directory:

```text
cd /var/lib/clickhouse/
tar -xvPf backup.tar
```

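If you extract the archive as root, the restored `metadata` and `store` entries may end up owned by root. A small follow-up sketch, assuming the server runs as the default `clickhouse` user:

```bash
# Make the restored DDL files readable and writable by the server process.
sudo chown -R clickhouse:clickhouse /var/lib/clickhouse/metadata /var/lib/clickhouse/store
```
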
## Create the force restore flag {#create-force-restore-flag}

To trigger automatic data synchronization from other replicas, create the following flag:

```text
sudo -u clickhouse touch /var/lib/clickhouse/flags/force_restore_data
```

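To double-check before restarting, confirm the flag file exists and is owned by the service user (the `flags` directory lives directly under the data path):

```bash
# The file only needs to exist; the server consumes it during startup.
ls -l /var/lib/clickhouse/flags/force_restore_data
```
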
## Restart the faulty replica {#restart-faulty-replica}

1. Restart the ClickHouse server on the faulty node (a restart and monitoring sketch follows the log excerpt below).
2. Check the server logs; you should see parts being downloaded from the healthy replicas:

```text
2025.11.02 00:00:04.047097 [ 682 ] {} <Debug> analytics.events_local (...) (Fetcher): Downloading files 23
2025.11.02 00:00:04.055542 [ 682 ] {} <Debug> analytics.events_local (...) (Fetcher): Download of part 202511_0_0_0 onto disk disk2 finished.
2025.11.02 00:00:04.101888 [ 687 ] {} <Debug> warehouse.customers_local (...) (Fetcher): Downloading part 2025_0_0_1 onto disk default.
2025.11.02 00:00:04.102005 [ 687 ] {} <Debug> warehouse.customers_local (...) (Fetcher): Downloading files 11
2025.11.02 00:00:04.102210 [ 690 ] {} <Debug> warehouse.customers_local (...) (Fetcher): Downloading part 2022_0_0_1 onto disk disk1.
2025.11.02 00:00:04.102247 [ 688 ] {} <Debug> warehouse.customers_local (...) (Fetcher): Downloading part 2021_0_0_1 onto disk disk2.
2025.11.02 00:00:04.102331 [ 690 ] {} <Debug> warehouse.customers_local (...) (Fetcher): Downloading files 11
```

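A sketch of the restart and a follow-up check, assuming a systemd-managed installation; the `system.replicas` query shows whether any tables are still read-only or have parts queued for fetching:

```bash
# Restart the server on the faulty node (systemd-based installs).
sudo systemctl restart clickhouse-server

# Watch replication catch up: queue_size and absolute_delay should drop to 0.
clickhouse-client --query "
    SELECT database, table, is_readonly, queue_size, absolute_delay
    FROM system.replicas
    ORDER BY queue_size DESC"
```
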
</VerticalStepper>