Backup

From Free net encyclopedia

Backup in computer engineering refers to the copying of data for the purpose of having an additional copy of an original source. If the original data is damaged or lost, the data may be copied back from the backup, a process known as data recovery or restore. The "data" may be either data as such or stored program code; backup software treats both the same way. A backup differs from an archive in that the data is necessarily duplicated rather than simply moved.

The word may be used as a noun, e.g., "have you remembered to move the backup to a safe place?", or as a verb, "he didn't back up the data, so we lost last week's work". Also common are various combinations, such as backup copy, backup software (the applications used to perform the backing up of data, i.e., the systematic generation of backup copies), and backup policy (an organisation's procedures and rules for ensuring that adequate amounts and types of backups are made, including suitably frequent testing of the process for restoring the original (production) system from the backup copies).

Backup strategies

A backup should be planned carefully, and the following points should be considered:

  • Cyclical backups improve data recovery reliability.
  • Automated backup should be considered, as manual backups can be affected by human error.
  • Keeping two copies of a backup can make data recovery more dependable by guarding against accidents such as fire and random media failure.
  • Using standard formats makes backups easier to recover, since interoperability is the purpose of a standard. Established standards are usually safer for recovery.
  • Newer standards are generally faster and more powerful.
  • Data compression might be important if there is more data than media space.
  • Uncompressed data are usually easier to recover if the backup media are damaged or corrupted, unless individual objects (files, folders, etc.) are compressed separately. Many backup programs allow an administrator to widen or narrow the scope of solid compression (compressing many objects together as one stream) to trade off compression performance against durability.
  • Backups might take a lot of time to accomplish, which can potentially become a problem in a work environment.
  • When a single data set is backed up across multiple media, each medium can carry its own independent index so that individual items can be recovered more efficiently.
  • Backups depend on both software and hardware, so they are exposed to obsolescence as formats and devices age.
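
The trade-off between solid and per-object compression can be illustrated with a minimal Python sketch. It assumes the standard `zlib` module, hypothetical file names, and simulated damage (one flipped bit in the checksum trailer). It also uses all-or-nothing decompression, which is a simplification: real backup tools may salvage part of a damaged stream.

```python
import zlib

def flip_last_bit(blob: bytes) -> bytes:
    """Simulate media damage by flipping one bit in the final byte."""
    return blob[:-1] + bytes([blob[-1] ^ 0x01])

def recover_per_file(archive: dict[str, bytes]) -> dict[str, bytes]:
    """Decompress each member independently, skipping damaged members."""
    recovered = {}
    for name, blob in archive.items():
        try:
            recovered[name] = zlib.decompress(blob)
        except zlib.error:
            pass  # only this member is lost
    return recovered

def recover_solid(blob: bytes, index: list[tuple[str, int]]) -> dict[str, bytes]:
    """Decompress one solid stream, then split it using (name, size) pairs."""
    try:
        data = zlib.decompress(blob)
    except zlib.error:
        return {}  # the whole stream is unusable in this simplified model
    recovered, offset = {}, 0
    for name, size in index:
        recovered[name] = data[offset:offset + size]
        offset += size
    return recovered

# Hypothetical file contents, for illustration only.
files = {
    "a.txt": b"alpha " * 100,
    "b.txt": b"beta " * 100,
    "c.txt": b"gamma " * 100,
}

# Per-file compression: damage hits only the stream of b.txt.
per_file = {name: zlib.compress(data) for name, data in files.items()}
per_file["b.txt"] = flip_last_bit(per_file["b.txt"])

# Solid compression: all files share one stream, which is damaged.
index = [(name, len(data)) for name, data in files.items()]
solid = flip_last_bit(zlib.compress(b"".join(files.values())))

per_file_result = recover_per_file(per_file)
solid_result = recover_solid(solid, index)
print(sorted(per_file_result), sorted(solid_result))
```

A solid stream usually compresses better because redundancy across files is shared, which is the performance side of the same trade-off.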

There are always options when buying storage equipment, and usually three points to consider: capacity (measured in bytes), rotational speed (measured in hertz or, more commonly, revolutions per minute, RPM), and warranty (measured in years or, occasionally, months).

  • Each of the different media has benefits and drawbacks. Also consider the cost per gigabyte when comparing different solutions.

A good strategy also plans for a major accident or disaster. It is important to keep calm in such an event, but it is predictable that most people will be unable to think clearly and act accordingly. Checklists prepared in advance, on the assumption that accidents can happen, help avoid panic.

The more important the content, the greater the concern for properly trained staff, as can be observed in large companies.

Backup media types

As of 2005, backups are most often made from hard-disk-based production systems to large-capacity magnetic tape, hard disk storage, or optical write-once (WORM) media such as CD-R, DVD-R and similar formats. As broadband access becomes more widespread, network and remote (online) backups are gaining in popularity, and quite a few companies offer Internet-based backup. During the period 1975–95, most personal and home computer users associated backup with copying floppy disks. However, the recent drop in hard disk prices, together with the hard disk's position as the most reliable rewritable medium, makes it one of the most practical backup media.

A CD-R can be used as an alternative backup device. One advantage of CDs is that they can hold 650 MiB of data on a 12 cm (4.75") reflective optical disc. (This is roughly equivalent to 12,000 images or 200,000 pages of text.) The data can also be restored on any machine with a CD-ROM drive. CDs may all look the same, but there are different file formats for different applications.

Special cases

Backing up active databases requires highly-specialized software that must be integrated with the database system in order to prevent data corruption. For example, a user accesses the website of his bank and transfers money from one of his accounts to another while a backup is running. Such a transaction will affect multiple places on the hard disks of the bank's systems.

At minimum, the amount of the transfer will be subtracted from the balance of one account, and added to the balance of the other account. If there is then a disk crash and restore, it is important to ensure that the database holding the user's account balances gets restored correctly. If the subtraction part is restored correctly but the addition part isn't, then the user is unhappy. If the addition part is restored correctly but the subtraction part isn't, then the bank is unhappy.
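
The bank-transfer scenario can be sketched with SQLite, whose online backup API is exposed in Python as `sqlite3.Connection.backup`. This is only an illustration of a database-integrated backup, not a description of how any bank's systems work; the table, account names and amounts are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move money between accounts atomically: both updates commit or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )

def total(conn):
    """Sum of all balances; a transfer must never change this value."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

# Hypothetical production database, held in memory for the example.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
live.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("checking", 500), ("savings", 1000)]
)
live.commit()

transfer(live, "checking", "savings", 200)

# The backup is taken through the database engine, so it reflects only
# committed transactions and never half of a transfer.
backup = sqlite3.connect(":memory:")
live.backup(backup)

balances = dict(backup.execute("SELECT name, balance FROM accounts"))
print(balances, total(backup))
```

Copying the database file byte by byte while a transfer is in flight gives no such guarantee, which is why the backup has to go through the database system itself.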

Metrics

There are six primary metrics relating to data backup:

  • Recovery Point Objective (RPO) is the point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. Reducing RPO requires increasing synchronicity or frequency of copying the data to be protected.
  • Backup Window is the amount of time that is taken to copy a given data set to the backup device. Most traditional backup systems require a data set to be frozen for hours while the entire content of a filesystem is copied to magnetic tape. Newer techniques, such as incremental-forever backup, mirroring and snapshots, effectively reduce the required backup window.
  • Restore Time is the amount of time required to bring a desired data set back from the backup media.
  • Retention Time is the amount of time in which a given set of data will remain available for restore. Some backup products rely on daily copies of data and measure retention in terms of days. Others retain a number of copies of data changes regardless of the amount of time.
  • Backup Validation, also known as "Backup Success Validation", is the process by which owners of data can get information about how their data was backed up. The same process is also used to prove compliance to regulatory bodies outside the organization; for example, a biotech company might be required to show "proof" to the FDA that its test result data are backed up properly. Terrorism, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around, and dependence upon, successful backups. For that reason, many organizations rely on third-party or "independent" solutions to test, validate, optimize and charge for their backup operations (backup reporting). See also Backup validation. Some modern backup-to-disk software has built-in validation capabilities.
  • Open File Backup is the ability to back up a file while it is in use by another application.
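
Two of these metrics lend themselves to rough arithmetic. The sketch below is a simplified model and not a standard formula: it assumes a constant throughput, a backup that captures only changes made before it starts, and a backup that is usable only once it has completed. The data size, throughput and interval are hypothetical.

```python
from datetime import timedelta

def backup_window(data_bytes: int, throughput_bytes_per_s: int) -> timedelta:
    """Time to copy the data set, assuming a constant transfer rate."""
    return timedelta(seconds=data_bytes / throughput_bytes_per_s)

def worst_case_rollback(interval: timedelta, window: timedelta) -> timedelta:
    """Upper bound on lost work under the stated model.

    If a failure occurs just before the next backup completes, the newest
    usable backup reflects the state one interval plus one window ago.
    """
    return interval + window

# Hypothetical figures: 2 TB of data copied at 100 MB/s, once every 24 hours.
window = backup_window(2 * 10**12, 10**8)
rollback = worst_case_rollback(timedelta(hours=24), window)

print("backup window:", window)
print("worst-case rollback:", rollback)
```

Running backups more often or shortening the window both reduce the bound, which matches the point that a lower RPO requires more frequent or more synchronous copying.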

Different roles of data backups

Computer backups are useful primarily for two purposes. The first and most obvious is to restore a computer to an operational state following a disaster, which is also called disaster recovery. This includes the loss of a hard disk or a file system becoming so badly corrupted that it cannot be read. The second use, often overlooked but probably more common, is to recover a single file or set of files that has been accidentally deleted or corrupted by the user or a program.

Backup procedures

Proper backup procedures require a redundant copy of the backup at a remote location and an effective backup rotation scheme such as the GFS method (Grandfather-Father-Son backup). Storing the copy near the original is unwise, since many disasters such as fire, flood and electrical surges are likely to damage the backup at the same time.
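
A Grandfather-Father-Son rotation can be sketched as a rule that assigns each day's backup to a tier, plus a retention period per tier. The calendar rules and retention periods below are assumptions chosen for illustration; real schemes vary between sites.

```python
from datetime import date

# Assumed retention periods in days; actual policies differ between sites.
RETENTION_DAYS = {"son": 7, "father": 35, "grandfather": 365}

def gfs_tier(day: date) -> str:
    """Classify a backup by the day it was taken.

    Assumed rules: first of the month = grandfather (monthly),
    Sunday = father (weekly), any other day = son (daily).
    """
    if day.day == 1:
        return "grandfather"
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return "father"
    return "son"

def should_keep(backup_day: date, today: date) -> bool:
    """True if the backup is still within the retention period of its tier."""
    age = (today - backup_day).days
    return 0 <= age <= RETENTION_DAYS[gfs_tier(backup_day)]

today = date(2005, 6, 30)
for day in (date(2005, 6, 1), date(2005, 6, 5), date(2005, 6, 7), date(2005, 6, 28)):
    print(day, gfs_tier(day), "keep" if should_keep(day, today) else "recycle")
```

Media whose backups fall outside the retention period can be reused, so a small set of tapes or disks covers daily, weekly and monthly restore points.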

The 2001 attacks on the World Trade Center presented many organizations with unprecedented disaster recovery scenarios, due to their scope. A few years earlier, during a fire at the headquarters of a major bank in Paris, system administrators ran into the burning building to rescue backup tapes because they had no offsite copies.

Recovery strategy

A backup is only as useful as its associated recovery strategy. Having a complete set of backup tapes is of no use if the only copy of the software required to read them is on one of the tapes. It is also possible for backup software to run successfully for several months, only to fail when it is needed most due to read errors on the backup media. Magnetic tapes in particular should be read-tested on a regular basis.

Validation and verification

Many backup programs make use of checksums or hashes. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as stored on the backup medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, to improve backup speed. This is particularly useful when multiple workstations, which may contain duplicates of the same file, are backed up over a network: if the backup software detects several copies of a file having the same size, datestamp, and checksum, it can put one copy of the data onto a backup medium, along with metadata listing all places where copies of this file were found. Also, checksums can improve performance of the verification pass for backups across a network, by computing checksums independently on each computer, then sending only the checksum over the network so that checksums can be compared instead of actual data.
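
Both uses of checksums can be sketched in a few lines of Python. This is a minimal illustration and not how any particular backup product works. It deduplicates on the SHA-256 checksum alone, whereas a real program might also compare size and datestamp as described above. The workstation paths and contents are hypothetical.

```python
import hashlib

def checksum(data: bytes) -> str:
    """SHA-256 digest of the data, as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def back_up(sources: dict[str, bytes]):
    """Store each distinct content once and record where it was found.

    Returns (store, manifest): store maps checksum -> data,
    manifest maps original path -> checksum.
    """
    store, manifest = {}, {}
    for path, data in sources.items():
        digest = checksum(data)
        store.setdefault(digest, data)  # duplicates are not stored again
        manifest[path] = digest
    return store, manifest

def verify(store: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the paths whose stored data no longer matches the saved checksum."""
    return [
        path
        for path, digest in manifest.items()
        if digest not in store or checksum(store[digest]) != digest
    ]

# Hypothetical files from two workstations; the report is identical on both.
sources = {
    "ws1/report.doc": b"quarterly report",
    "ws2/report.doc": b"quarterly report",
    "ws1/notes.txt": b"notes",
}
store, manifest = back_up(sources)
damaged_before = verify(store, manifest)

# Simulate corruption on the backup medium, then verify again.
store[manifest["ws1/notes.txt"]] = b"notes?"
damaged_after = verify(store, manifest)
print(len(store), damaged_before, damaged_after)
```

Verification needs only the stored data and the saved checksums, not the original files, which is the first advantage mentioned above.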

Types of backups

There are primarily three different types of backup: full, incremental, and differential. A full backup is simply the backing up of all the files on the system. An incremental backup backs up only the files modified since the last backup of any type. A differential backup, also referred to as a cumulative incremental backup, is a cumulative backup of all changes made since the last full backup.
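
The difference between the types comes down to which cutoff time is used to select files. The sketch below assumes selection by modification time, with an incremental backup measured from the most recent backup of any type and a differential backup measured from the last full backup. Times are simple day numbers and the file names are hypothetical.

```python
def select_files(mtimes: dict[str, int], kind: str, last_full: int, last_backup: int) -> list[str]:
    """Return the files a backup of the given kind would copy.

    mtimes maps path -> modification time; last_full and last_backup are
    the times of the last full backup and of the most recent backup of any type.
    """
    if kind == "full":
        return sorted(mtimes)  # everything, regardless of modification time
    if kind == "incremental":
        cutoff = last_backup   # changes since the most recent backup
    elif kind == "differential":
        cutoff = last_full     # all changes since the last full backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)

# Full backup on day 0, incremental on day 2, next backup planned for day 4.
mtimes = {"unchanged.txt": -5, "monday.txt": 1, "wednesday.txt": 3}

full = select_files(mtimes, "full", last_full=0, last_backup=2)
incremental = select_files(mtimes, "incremental", last_full=0, last_backup=2)
differential = select_files(mtimes, "differential", last_full=0, last_backup=2)
print(full, incremental, differential)
```

An incremental backup is smaller, but a restore needs the full backup plus every incremental since it. A differential backup grows over time, but a restore needs only the full backup and the latest differential.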
