Data Loss and Hard Drive Failure: Understanding the Causes and Costs
Hard drive failure is an inescapable reality in the modern business world. Whether due to human error, software corruption, or other causes, most firms will face incidents of lost data through hard drive failure. In this paper we analyze the various causes of hard drive failure and estimate the cost of each incident. Our calculations indicate that, on average, a single data loss incident will cost an organization $2,900, the majority of which is lost productivity. Finally, we offer seven suggestions for responding to hard drive failures.
Results of Data Loss
Data loss and computer downtime have serious implications for business. Lost data can lead to costly downtime for sales and marketing and reduced customer service while customer databases are restored or rebuilt. Lost financial data can lead to lost contracts and stock value, or worse. A recent study by Datamonitor found that as many as one-third of IT decision-makers believe that a major data loss incident at their firm could lead to bankruptcy (Datamonitor, 2007). Small businesses may be more vulnerable, according to a recent survey by Verio. Seventy percent of small businesses reported that a single incident of data loss would be considered significant and costly. These concerns are well grounded, as over one-half of the respondents have already experienced some data loss (Verio, 2007). Although hard drive manufacturers claim a failure rate of less than 1%, recent research by computer scientists at Carnegie Mellon University found that a 2%-4% failure rate is more common, and under some conditions the failure rate may reach as high as 13% (Schroeder & Gibson, 2007).
The cost of lost data varies depending upon its application, as well as the potential value that can be captured from use of the data. In addition, there is a cost associated with recovering the data, as well as lost productivity due to computer downtime. Using available data and existing research, this paper attempts to quantify the costs associated with episodes of data loss, considering the costs of recovery, as well as lost sales and reduced productivity.
Growth of Stored Data
The amount of data stored on corporate servers, workstations and users' machines continues to grow. The dramatic growth of stored data is fueled by three trends: the increasing capacity of storage media, the decreasing cost per MB, and the emerging role of information technology within firms.
In 1965 Gordon Moore, a co-founder of Intel, estimated that the pace of technology would support a doubling of the number of transistors on a single piece of silicon every 18-24 months for about the same cost. This estimate has largely held true (see Figure 1).
While Moore's prediction dealt with the number of transistors on a piece of silicon, it has implications for the growth and costs of data storage as well. The reality of "Moore's Law" implies similar outcomes in the size of magnetic disk storage (see Figure 2).
A second trend affecting the growth of data storage is that the cost per megabyte has plummeted over the past twenty years (see Figure 3). This trend is also related to Moore’s law.
A third trend in the rapid growth of stored data is the emerging role of information technology within firms. In addition to the historical responsibility for storing transactions and financial information, many firms are moving their information resources out of the back office and closer to the front line with Customer Relationship Management (CRM) and Supply Chain Management (SCM) systems. While these systems create advantages for firms, they are data intensive and dramatically increase a firm's storage requirements. Some industries, such as hospitals and financial services, are quickly moving towards paperless environments where all corporate information is stored and transmitted electronically (Golinkin, 2007). The trend from back-office automation towards front-office, value-added and ubiquitous computing requires extensive data storage.
These trends cause many firms to invest in additional storage. The research firm IDC reports that the total disk storage capacity sold in the first quarter of 2006 topped 1,030 petabytes, up more than 48 percent over the year before (one petabyte is equal to one quadrillion bytes of data, or 1,000,000 GB). According to IDC's research, the total disk storage market grew by over 6% in 2006 to $24.4 billion. Over the same period, the mid-priced products in this market experienced double-digit growth through sales to small and medium-sized businesses (Regan, 2007). According to Gartner, an IT research and consulting firm, many firms are seeing the costs of their data backup and archiving far outpace growth in their overall budgets (Buchanon, 2007). These trends will likely continue or accelerate in the foreseeable future.
PCs in Use
As noted above, companies are relying more and more on data in a distributed environment. This trend is likely to continue, as evidenced by a 2006 Gartner survey of CIOs that revealed “increased mobility” as a high-priority item for chief information officers. There are over 230 million PCs in use in the United States (Computer Industry Almanac, 2006). In the US, over 50 percent of the workforce, or 77 million individuals, use a computer at work (Bureau of Labor Statistics, 2005). Increasingly, these PCs are likely to be laptops, with research firm IDC estimating that in 2011 laptops will outsell desktops (BBC News, 2007).
Episodes of Data Loss
Survey data from companies that specialize in data recovery can be used to investigate the primary causes of data loss (Harris, 2007). Hard drive failure is the most common cause, accounting for 38 percent of data loss incidents. Drive read instability, in which media corruption or degradation prevents access to the data on a disk, accounts for 30 percent of incidents. Software corruption, which might include damage caused by system software or other programs (e.g., a virus attack), accounts for 13 percent of data loss incidents. Human error accounts for 12 percent of data loss episodes; this includes the accidental deletion of data as well as incorrectly partitioning the hard drive. The relative magnitudes of the different types of data loss are illustrated in Figure 4. (This analysis ignores data loss due to theft, an increasing problem given the growth in the use of laptops.)
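As a quick check on these figures, the following Python sketch tallies the shares reported above. The variable names are illustrative only, and the remainder is simply whatever the text does not itemize; the survey's own category for it is not given here.

```python
# Shares of data loss incidents by cause, in percent, as reported in the text.
cause_share = {
    "hard drive failure": 38,
    "drive read instability": 30,
    "software corruption": 13,
    "human error": 12,
}

accounted = sum(cause_share.values())
not_itemized = 100 - accounted  # remainder the text does not break out
hardware_related = (
    cause_share["hard drive failure"] + cause_share["drive read instability"]
)

print(f"accounted for: {accounted}%")
print(f"not itemized: {not_itemized}%")
print(f"hard drive failure + read instability: {hardware_related}%")
```

Hard drive failure and read instability together come to 68 percent, which is the basis for treating roughly 70 percent of incidents as needing outside recovery in the cost estimates that follow.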
The Cost of a Data Loss Incident
An episode of data loss will result in one of two outcomes: either the data is recoverable or it is permanently lost. In today’s environment, with numerous backup and recovery solutions, businesses need not suffer episodes of irretrievable data, except in cases of careless planning or major disaster. We will assume for the purposes of this paper that all data may be retrieved. However, as demonstrated in prior work (Smith, 2003), the inherent value of data can be significant, and as noted earlier, permanent data loss could bankrupt many organizations. Setting this possibility aside, this study will instead focus on the costs of retrieving data, as well as lost productivity and sales during an episode of computer downtime.
In approximately 30% of cases, those in which there has been no physical damage to the hard drive, data may be retrieved by an in-house technical support person. These cases are often caused by human error, software corruption, or computer viruses. We offer advice for restoring data in these cases below.
In the case of hard drive failures and instability episodes, which make up approximately 70% of severe data loss incidents, internal recovery efforts are not advised, and outside expertise should be sought. The cost of data recovery can vary widely, depending on numerous factors, including the size and type of storage media, severity of damage, parts required, and urgency. The authors conducted an independent survey, polling eight separate data recovery companies on the estimated cost of recovering a 160GB desktop hard drive. Price estimates varied from $300 to $3,900, depending on the vendor and the factors noted above. The highest prices were reserved for highly time-sensitive (1-2 day) data recoveries. For standard recoveries, the majority of estimates fell in the range of $500 to $2,500. Taking the midpoint of this range as a reasonable approximation, the average cost of recovering data from a 160GB hard drive is $1,500.
In cases where the data may be retrieved with in-house expertise, we must consider the internal resources that are encumbered. If there is a computer support specialist employed within the company, both the number of hours needed to recover the data and the cost of employing this individual must be taken into account. The Bureau of Labor Statistics reports that the average computer support specialist currently earns $44.60 an hour, including salary and benefits. Assuming that the average time needed to recover lost data is approximately 8 hours, the cost of using an in-house employee to recover lost data is approximately $350.
Taking into account the expected probability of whether the data can be retrieved in-house or would need to be sent to data recovery specialists, the expected average cost to retrieve data is calculated as $1,150. Figure 5 summarizes the cost incurred in order to recover the lost data.
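The expected recovery cost can be reproduced with a short Python sketch. It assumes the roughly 70% share of incidents that need outside recovery, with the remaining 30% handled in-house; the constant and variable names are illustrative, not part of the original analysis.

```python
# Inputs taken from the text; names are illustrative.
OUTSIDE_SHARE = 0.70                 # incidents needing an outside recovery firm
IN_HOUSE_SHARE = 1 - OUTSIDE_SHARE   # remainder handled by in-house support

OUTSIDE_COST = (500 + 2500) / 2      # midpoint of the standard quote range, $1,500
SUPPORT_HOURLY_COST = 44.60          # support specialist, salary plus benefits
IN_HOUSE_HOURS = 8                   # assumed time to recover the data

in_house_cost = SUPPORT_HOURLY_COST * IN_HOUSE_HOURS
expected_recovery_cost = (
    OUTSIDE_SHARE * OUTSIDE_COST + IN_HOUSE_SHARE * in_house_cost
)

print(f"in-house cost per incident: ${in_house_cost:,.2f}")
print(f"expected recovery cost:     ${expected_recovery_cost:,.2f}")
```

The unrounded result is about $1,157, which matches the $1,150 figure once the in-house cost is rounded to $350 and the total is rounded to the nearest $50.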
In addition to the cost of recovering the data, users and companies suffer lost productivity. Every computer user has experienced the frustration, and the corresponding lost productivity, of a computer that is unavailable for use. When data loss occurs, these episodes can become protracted and quite costly. To estimate the costs, the following factors must be considered: the individual user’s productivity, the length of the downtime, and the extent to which an individual’s data loss episode affects others in the organization.
While the attempt to recover data is underway, an individual is unable to access his or her PC, which reduces productivity and in turn affects company sales and profitability. This opportunity cost (lost productivity due to computer downtime) affects a company’s income statement just as other, more explicit costs do. By what mechanism does this affect the bottom line? Some employees are directly involved in sales and revenue production; others are involved in more supportive or indirect roles. Economic theory says that each employee’s productivity, or contribution to firm revenue, can be approximated using the individual’s compensation. Available data sources suggest that individuals who use computers at work earn an average of $46.48 an hour in wages and benefits. The time needed to recover data may vary greatly, from one hour to several days. In addition, most workers won’t have their productivity reduced to zero, as they can perform other tasks that do not require their computer. We will assume a productivity slowdown of 50 percent.
Costs begin to mount when considering the “contamination effect”: when one individual’s computer downtime affects others within an organization. The IT Department may have to be involved, and in work environments that are collaborative, productivity slowdowns may impact many others within the organization. The slowdowns will depend on the level and nature of collaboration. In a related scenario, when a computer network is down, others have estimated that costs may run into millions of dollars for each hour of downtime (Patterson, 2002).
Precision in estimating the contamination effect will depend on the factors noted above, but a conservative estimate for a typical data loss episode might suggest that an individual’s inability to access key data would impact 3 other co-workers’ productivity, and reduce their productivity by 25 percent each.
The total loss due to productivity slowdown depends critically on the length of downtime, which is determined in large part by whether the computer must be sent to an outside firm or the data can be retrieved in-house. For outside recoveries, the authors’ survey of data recovery firms suggests that a 5-day turnaround, including time for transport, is the minimum needed for a standard hard drive recovery. For in-house recoveries, 8 hours appears to be a reasonable estimate of the time needed.
Taking into account these various factors (whether the data is recovered onsite or not, the length of the recovery period, and the expected “contamination effect”), the average estimated productivity loss due to an episode of data loss is $1,750. Figure 6 summarizes this expected productivity loss.
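The productivity estimate can be approximated with the Python sketch below. Two inputs are assumptions not stated in the text: an 8-hour working day, and co-workers earning the same average hourly compensation as the affected user. The 70/30 split between outside and in-house recoveries follows the share of incidents attributed to hard drive failure and read instability.

```python
# Inputs from the text unless marked as an assumption; names are illustrative.
HOURLY_COMPENSATION = 46.48   # average wages plus benefits of computer users
USER_SLOWDOWN = 0.50          # affected user's productivity loss
COWORKERS_AFFECTED = 3        # "contamination effect"
COWORKER_SLOWDOWN = 0.25      # each co-worker's productivity loss
HOURS_PER_DAY = 8             # assumption: length of a working day

OUTSIDE_SHARE = 0.70                  # incidents sent to an outside firm
OUTSIDE_HOURS = 5 * HOURS_PER_DAY     # 5-day turnaround
IN_HOUSE_HOURS = 8                    # in-house recovery time


def productivity_loss(downtime_hours: float) -> float:
    """Dollar value of productivity lost over the given downtime."""
    # Assumption: co-workers are valued at the same hourly compensation.
    slowdown = USER_SLOWDOWN + COWORKERS_AFFECTED * COWORKER_SLOWDOWN
    return HOURLY_COMPENSATION * downtime_hours * slowdown


outside_loss = productivity_loss(OUTSIDE_HOURS)
in_house_loss = productivity_loss(IN_HOUSE_HOURS)
expected_loss = (
    OUTSIDE_SHARE * outside_loss + (1 - OUTSIDE_SHARE) * in_house_loss
)

print(f"outside recovery:  ${outside_loss:,.2f}")
print(f"in-house recovery: ${in_house_loss:,.2f}")
print(f"expected loss:     ${expected_loss:,.2f}")
```

Under these assumptions the expected loss comes to roughly $1,766, in line with the $1,750 reported above. The result is most sensitive to the downtime for outside recoveries, since that term carries 70% of the weight and five times the hours.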
Adding the expected cost of data recovery ($1,150) to the expected loss of productivity ($1,750), we calculate the average cost of a data loss episode as $2,900. Once again, this assumes that the data is retrievable. If data is permanently lost, this estimate would grow significantly, as shown in Smith (2003). Note that productivity losses, at roughly 60 percent of the total, dominate the cost of an episode of lost data.
Understanding Hard Drive Failure
Understanding hard drive failure is important because it is the largest single cause of data loss. Hard drive failure may be related to mechanical, electronic, or firmware failures. Mechanical failures occur when physical components of the device itself begin to wear or malfunction. Electronic failures occur when the printed circuit board (PCB) begins to produce errors. Finally, many hard drive failures are related to out-of-date, corrupt or buggy firmware. Firmware is the controlling software built into the hardware device itself and stored on the disk platters of the drive. Like most software in use today, firmware may become damaged or corrupt over time. This is a very common failure for modern drives because of the complexity of firmware design.
Is "Lost Data" Really Lost?
Most instances of hard drive failure do not destroy all of the data on the disk, and much of the data on failed drives is often recoverable. Both consumer applications and professional data recovery tools and services are available to recover lost data. Which alternative to choose depends upon the value of the lost data. The more valuable the data on the failed drive, the fewer non-professional recovery attempts should be made. Non-professional tools and system software (e.g., chkdsk) often fix errors by overwriting the file system on the drive. Though this may repair the file system, it can permanently destroy the data. Disks with highly valuable data should be sent to a professional data recovery service. A recent survey of 50 data recovery firms across 14 countries found that 15% of all non-recoverable data loss situations were created by prior non-professional data recovery attempts (DeepSpar Data Recovery Systems, 2007).
The most thorough professional data recovery services are able to retrieve data from drives with mechanical, electronic, and firmware failures. Comprehensive professional efforts include drive restoration, disk imaging, and data retrieval (DeepSpar, 3D Data Recovery Process). First, during the drive restoration phase, any existing damage to the hard drive is repaired. This includes mechanical problems such as failed heads, electronic problems such as failed PCBs, and firmware issues. These repairs are made by replacing individual drive components with donor parts and fixing firmware. The second phase is disk imaging, in which the contents of the drive are read, including bad sectors and areas affected by other read instability issues, and copied to a healthy drive to reduce the probability of further data loss on the original drive. Finally, the data is retrieved from the new healthy drive. During this phase the drive’s file system is restored, and all files are verified for integrity and repaired where necessary. It should be noted that many professional data recovery services focus almost exclusively on data retrieval. However, without adequate attention to drive restoration and disk imaging, any data retrieval effort will likely encounter serious challenges and may lead to further drive degradation and data loss.
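The idea behind the disk imaging phase, copying whatever can be read onto a healthy target before any retrieval work begins, can be illustrated with a minimal Python sketch. This is a conceptual illustration only and not the process professional services use: the function name image_drive and its block size are hypothetical, it works on ordinary file-like objects, and it should not be run against a failing drive that holds valuable data.

```python
import io


def image_drive(src, dst, block_size=4096):
    """Copy src to dst block by block, zero-filling blocks that cannot be read.

    src and dst are binary file-like objects (hypothetical stand-ins for the
    failing drive and the healthy target). Returns the byte offsets of the
    blocks that could not be read in full.
    """
    total = src.seek(0, io.SEEK_END)  # size of the source in bytes
    bad_offsets = []
    offset = 0
    while offset < total:
        size = min(block_size, total - offset)
        try:
            src.seek(offset)
            data = src.read(size)
        except OSError:
            data = b""  # treat an I/O error as an unreadable block
        if len(data) < size:
            bad_offsets.append(offset)
            # Pad with zeros so later data stays at its original offset.
            data = data.ljust(size, b"\x00")
        dst.write(data)
        offset += size
    return bad_offsets
```

The source is only ever read, never written, and the list of bad offsets records where the image is incomplete. Professional imaging equipment goes much further, for example by controlling how the drive itself handles retries and unstable sectors.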
What to Do When You Experience Hard Drive Failure
As evidenced above, hard drive failure is the most common source of data loss, which can lead to negative consequences for any business. Unfortunately, hard drive failure is inevitable. It is not a question of if a firm's hard drives will fail, but when. However, with proper planning and a strategic response, hard drive failure does not have to lead to data loss. Below we offer seven recommended strategies for dealing with hard drive failure.
In this paper we have analyzed the most common causes of data loss, estimated the average cost per incident, and suggested several strategies for responding to a hard drive failure. In summary, every firm will face the problem of hard drive failure. We argue that as data storage costs decrease and the role of information technology in modern firms increases, these problems will become more prevalent. Using the strategies described above, however, a firm should be able to recover data from most hard drive failures through either internal support services or external data restoration services.
Michael L. Williams
DeepSpar Data Recovery Systems
DeepSpar Data Recovery Systems is an Ottawa-based company. They are dedicated to providing the most advanced systems and equipment for the 3D data recovery process, which includes Drive Restoration, Disk Imaging, Data Forensics, and Data Retrieval. DeepSpar also provides the technologically advanced PC-3000 family of products.
DeepSpar is committed to providing the best knowledge and data recovery equipment to recovery professionals. Their knowledge and extensive experience extend beyond the product line to R&D, technology research, hard disk drive theory, and business management and consultation.
With unparalleled training and technical support available, DeepSpar also educates data recovery professionals around the globe, sharing knowledge, theory, and optimal practices.
NOTE: This is not an endorsement.