Status

Current state: "Finished"

Discussion thread: here

JIRA: SOLR-15086

Released: no

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast). Confluence supports inline comments that can also be used.

Motivation

Solr's current backup/restore functionality has several frustrating limitations.

Current index backups are based on full snapshots.  Snapshot-based backups are slow and expensive because they copy all the files of the index regardless of how little the index may have changed since the last backup.  Some, much, or all of the backup may be spent transferring data already present in the backup repository - a needless inefficiency.

...

  1. Incremental i.e. only the changed data is saved instead of a full copy. This allows users to save on storage costs as well as significantly speed up the backup process if only small changes have happened since the last backup.
  2. Cloud friendly i.e. supports one or more blob storage systems available in major public clouds such as Amazon S3, Google Cloud Storage, Azure Blob Storage etc.
  3. Safe against index corruption i.e. backups should succeed only if the backed up index is not corrupt
  4. Restorable to existing collections i.e. it should be possible to restore to the source collection (or any existing collection, assuming it is compatible with the source)

Observant readers might recognize (1) and (3) from code that Cao Manh Dat proposed in SOLR-13608.  In that sense this SIP is a superset of that ticket, created to cover a broader swath of functionality and to generate more discussion and review of the design.  Both Dat's ticket and this SIP are informed by code written by Dat, Shalin Mangar, and others, which is available in rough form here.

Public Interfaces

The proposed changes involve changes to several different levels of public interfaces.

  • At the HTTP API level it proposes slight changes to the Backup and Restore APIs (at both the Collection and Core Admin layers).  It also proposes the introduction of two wholly new backup APIs, a "list backups" API, and a "delete backup" API.
  • At the Java API level it proposes changes to the interfaces used to define backup repositories.  Notably: `BackupRepository`.
  • At the file-format level this SIP proposes a new format for storing backups on disk, which is a public interface in a very limited sense.

...

Solr can support restoring to existing collections by making use of the "read only" mode that was introduced in SOLR-13271.  The restore API can put the target collection in read-only mode, restore a backup for each shard, and then toggle off "read only" mode.
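
The ordering described above can be sketched as follows. This is a minimal illustration, not Solr code: `FakeCluster`, `set_read_only`, and `restore_shard` are hypothetical stand-ins that only record calls. Wrapping the per-shard restores in `try`/`finally` is an assumption of this sketch, chosen so the collection does not stay read-only if a shard restore fails.

```python
class FakeCluster:
    """Hypothetical stand-in that records the calls made against a collection."""

    def __init__(self):
        self.calls = []

    def set_read_only(self, collection, value):
        self.calls.append(("readOnly", collection, value))

    def restore_shard(self, collection, shard, backup_id):
        self.calls.append(("restore", shard, backup_id))


def restore_into_existing(cluster, collection, shards, backup_id):
    """Enable read-only mode, restore every shard, then disable read-only mode."""
    cluster.set_read_only(collection, True)
    try:
        for shard in shards:
            cluster.restore_shard(collection, shard, backup_id)
    finally:
        # Runs even if a shard restore raises (an assumption of this sketch).
        cluster.set_read_only(collection, False)


cluster = FakeCluster()
restore_into_existing(cluster, "collection1", ["shard1", "shard2"], 2)
for call in cluster.calls:
    print(call)
```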

...

Listing files in a directory is a common operation in backup and restore. However, the list of files is usually well known at write time. Therefore, we write a manifest file per backup, per directory (if needed), once all files in the directory have been written. This manifest lists the files that are part of the backup (or directory). The list operation of the backup repository for blob stores can use the manifest file to return the list of files consistently. This is similar to how Lucene writes segment files at the end.
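
A minimal sketch of the manifest idea, using a plain dictionary in place of a blob store. The manifest file name, its JSON encoding, and the function names are assumptions made for illustration and are not taken from this SIP.

```python
import json

MANIFEST_NAME = "manifest.json"  # hypothetical name, not defined by the SIP


def write_directory(store, directory, files):
    """Write every file first, then write the manifest last."""
    for name, data in files.items():
        store[f"{directory}/{name}"] = data
    # The manifest is written only after all files are in place, so its
    # presence signals that the directory is complete.
    manifest = json.dumps(sorted(files)).encode("utf-8")
    store[f"{directory}/{MANIFEST_NAME}"] = manifest


def list_directory(store, directory):
    """List files by reading the manifest instead of scanning the store."""
    manifest = store.get(f"{directory}/{MANIFEST_NAME}")
    if manifest is None:
        return []  # no manifest: directory is missing or still being written
    return json.loads(manifest.decode("utf-8"))


store = {}
write_directory(store, "backup_0/index", {"segments_1": b"xyz", "_0.cfs": b"abc"})
listing = list_directory(store, "backup_0/index")
print(listing)
```

Because the listing comes from a single object read, it does not depend on the store's own list operation returning every recently written key.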

...

Regardless of the BackupRepository in use, this SIP proposes that backups be taken in an incremental manner, so that only those index files not stored by previous backups will be stored for the given backup.  This will result in changes to the format of each backup.  The general thrust of these changes is that a given backup "location" can (and should) be used to store multiple backups, and that each backup includes a metadata file indicating which Lucene index files are part of the backup and the path to each of them within the umbrella backup "location".
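
The following sketch illustrates the general idea under stated assumptions; it is not the proposed file format, which is covered on the SIP sub-page. Here a dictionary stands in for the backup "location", files are compared by a SHA-256 checksum, and the `index/<checksum>` path layout is hypothetical.

```python
import hashlib


def take_backup(index_files, location, previous_metadata):
    """Store only files absent from the previous backup and return new metadata.

    index_files: {file name: bytes} for the current index.
    location: {stored path: bytes}, standing in for the shared backup location.
    previous_metadata: {file name: {"path": ..., "checksum": ...}}, or {}.
    """
    metadata = {}
    uploaded = []
    for name, data in index_files.items():
        checksum = hashlib.sha256(data).hexdigest()
        previous = previous_metadata.get(name)
        if previous is not None and previous["checksum"] == checksum:
            # Already present in the location: reference the existing copy.
            metadata[name] = previous
            continue
        path = f"index/{checksum}"  # hypothetical layout
        location[path] = data
        uploaded.append(name)
        metadata[name] = {"path": path, "checksum": checksum}
    return metadata, uploaded


location = {}
first, uploaded_first = take_backup(
    {"_0.cfs": b"a", "segments_1": b"s1"}, location, {}
)
second, uploaded_second = take_backup(
    {"_0.cfs": b"a", "_1.cfs": b"b", "segments_2": b"s2"}, location, first
)
print(uploaded_first)
print(uploaded_second)
```

In this sketch the second backup transfers only the two new files, while its metadata still lists all three files and points at the copy of `_0.cfs` stored by the first backup.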

Proposed Changes:

...

Backup File Format

For details on the specific backup file format being proposed and how it enables incremental backups to be done accurately, see the SIP sub-page dedicated to this topic here.

Proposed Changes: HTTP API

As mentioned above, this SIP proposes small tweaks to the existing backup and restore APIs.  These are described in more detail below.

...

Only one of the backupId, maxNumBackup, and purge parameters should be specified in a given request.

Code Block
titleV1 Delete Backup Request

/admin/collections?action=DELETE_BACKUP&
  name=myBackupName&
  location=/path/to/my/shared/drive&
  backupId=<number of backupId>
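
A client could enforce the rule that only one of backupId, maxNumBackup, and purge is specified before sending the request. The helper below is a hypothetical sketch; only the parameter names and the request path come from the example above. Whether a request with none of the three parameters is valid is not stated here, so the sketch rejects only requests that set more than one.

```python
from urllib.parse import urlencode

EXCLUSIVE_PARAMS = ("backupId", "maxNumBackup", "purge")


def delete_backup_query(name, location, backup_id=None, max_num_backup=None, purge=None):
    """Build the V1 request path, rejecting more than one selector parameter."""
    selectors = {"backupId": backup_id, "maxNumBackup": max_num_backup, "purge": purge}
    given = {key: value for key, value in selectors.items() if value is not None}
    if len(given) > 1:
        raise ValueError("specify only one of: " + ", ".join(EXCLUSIVE_PARAMS))
    params = {"action": "DELETE_BACKUP", "name": name, "location": location}
    for key, value in given.items():
        # Render booleans as true/false rather than Python's True/False.
        params[key] = str(value).lower() if isinstance(value, bool) else value
    # Keep "/" unescaped so the location reads like the example above.
    return "/admin/collections?" + urlencode(params, safe="/")


query = delete_backup_query("myBackupName", "/path/to/my/shared/drive", backup_id=2)
print(query)
```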

...


Code Block
languagejs
titleV2 Delete Backup Request
POST /v2/collections/backups
{
  "delete-backup" : {
    "name" : "myBackupName",
    "location" : "/path/to/my/shared/drive",
    "backupId" : 2
  }
}

Example response when deleting a particular backupId:

Code Block
languagejs
{
  "responseHeader" : {..},
  "collection" : "collection1",
  "deleted" : [
    {
      "backupId" : 2,
      "startTime" : "2019-08-27T09:11:17.230673Z",
      "size" : 9581,
      "numFiles" : 52
    }
  ]
}

Example response for purge:

Code Block
{
  "responseHeader" : {..},
  "collection" : "collection1",
  "purged" : {
    "numIndexFiles" : 2
  }
}

...

  1. name - A string name of the backup (usually the collection name)
  2. location - A string location of the backup. This is resolved against the repository.
  3. repository - An optional string to identify the repository. If none is provided, then the default repository configured in solr.xml is used.
Code Block
titleV1 List Backups API
/admin/collections?action=LISTBACKUP&
  name=myBackupName&
  location=/path/to/my/shared/drive

Code Block
titleV2 List Backups API
POST /v2/collections/backups
{
  "list-backups": {
    "name": "myBackupName",
    "location": "/path/to/my/shared/drive"
  }
}

Example response:

Code Block
{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "collection":"backuprestore_testbackupinc",
  "backups":[
    {
      "indexFileCount":26,
      "indexSizeMB":0.004,
      "shardBackupIds":{
        "shard2":"md_shard2_id_2",
        "shard1":"md_shard1_id_2"},
      "collection.configName":"conf1",
      "backupId":2,
      "collectionAlias":"backuprestore_testbackupinc",
      "startTime":"2019-08-28T16:02:11.485Z",
      "indexVersion":"8.2.1"},
    {
      "indexFileCount":2,
      "indexSizeMB":0.0,
      "shardBackupIds":{
        "shard2":"md_shard2_id_3",
        "shard1":"md_shard1_id_3"},
      "collection.configName":"conf1",
      "backupId":3,
      "collectionAlias":"backuprestore_testbackupinc",
      "startTime":"2019-08-28T16:02:14.375Z",
      "indexVersion":"8.2.1"},
    {
      "indexFileCount":2,
      "indexSizeMB":0.0,
      "shardBackupIds":{
        "shard2":"md_shard2_id_4",
        "shard1":"md_shard1_id_4"},
      "collection.configName":"conf1",
      "backupId":4,
      "collectionAlias":"backuprestore_testbackupinc",
      "startTime":"2019-08-28T16:02:14.406Z",
      "indexVersion":"8.2.1"}]}

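A client might use this response to pick the most recent backup before issuing a restore or delete. The sketch below works on a trimmed copy of the example response and assumes that a higher backupId means a newer backup, which matches the example's startTime values but is not otherwise guaranteed here. The function name is hypothetical.

```python
import json

# Trimmed copy of the example response (only the fields used here).
response_text = """
{
  "collection": "backuprestore_testbackupinc",
  "backups": [
    {"backupId": 2, "startTime": "2019-08-28T16:02:11.485Z", "indexFileCount": 26},
    {"backupId": 3, "startTime": "2019-08-28T16:02:14.375Z", "indexFileCount": 2},
    {"backupId": 4, "startTime": "2019-08-28T16:02:14.406Z", "indexFileCount": 2}
  ]
}
"""


def latest_backup(response):
    """Return the entry with the highest backupId, or None if there are none."""
    backups = response.get("backups", [])
    return max(backups, key=lambda backup: backup["backupId"], default=None)


response = json.loads(response_text)
latest = latest_backup(response)
print(latest["backupId"], latest["startTime"])
```
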
...

  1. shardBackupId - (Required) The shard backup ID assigned by the Backup Collection API for the current backup.
  2. prevShardBackupId - The previous shard backup ID against which the incremental backup is to be made. The previous shard backup is used as the base to find changed data.


Code Block
titleV1 Backup Core API
admin/cores?action=BACKUPCORE&
  core=core-node1&
  location=/path/to/my/shared/drive/myBackupName&
  prevShardBackupId=md_shard1_id_0&
  shardBackupId=md_shard1_id_1


Code Block
titleV2 Backup Core API
POST /v2/cores/someCoreName

{
  "backup-core": {
    "location": "/path/to/my/shared/drive/with/backupName",
    "shardBackupId": "md_shard1_id_1",
    "prevShardBackupId": "md_shard1_id_0"
  }
}

Restore Core API

This is also an internal API to be called by the Restore Collection API. It supports two new parameters:

  1. incremental – An optional boolean that signals whether the data being restored is in the "incremental" format or not. Defaults to false.
  2. shardBackupId - The shard backup ID to be restored. This is a required parameter if incremental=true is specified.


Code Block
titleV1 Restore Core API
admin/cores?action=RESTORECORE&
  core=core-node1&
  incremental=true&
  location=/path/to/my/shared/drive/myBackupName&
  shardBackupId=md_shard1_id_1


Code Block
titleV2 Restore Core API
POST /v2/cores/someRestoreCoreName

{
  "restore-core": {
    "incremental": true,
    "location": "/path/to/my/shared/drive/with/backupName",
    "shardBackupId": "md_shard1_id_1"
  }
}
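
The dependency between the two Restore Core parameters (shardBackupId is required when incremental=true) can be checked before a request body is built. The helper below is a hypothetical sketch of such a check; only the parameter names and the "restore-core" command name come from the examples above.

```python
import json


def restore_core_command(location, incremental=False, shard_backup_id=None):
    """Build a V2 restore-core body, enforcing the shardBackupId requirement."""
    if incremental and shard_backup_id is None:
        raise ValueError("shardBackupId is required when incremental=true")
    command = {"incremental": incremental, "location": location}
    if shard_backup_id is not None:
        command["shardBackupId"] = shard_backup_id
    return {"restore-core": command}


body = restore_core_command(
    "/path/to/my/shared/drive/myBackupName",
    incremental=True,
    shard_backup_id="md_shard1_id_1",
)
print(json.dumps(body, indent=2))
```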

Compatibility, Deprecation, and Migration Plan

...