Business Technology Consulting

What is Data Decay and How Does it Affect Your Business?


Do you use a software system in your business?
Does your software system deal with client contact information?

If you answered ‘yes’ to both questions, chances are extremely high that you face the inevitable issue of data decay.
(If you answered ‘no’, you probably deal with data decay anyway, but should definitely implement a software solution to digitize your customer information first.)

 

What is data decay?

Data decay refers to the gradual loss of data within a system. There are two main classifications of data decay: mechanical and logical.

 

Mechanical Data Decay

Mechanical data decay is probably the more well-known form of data decay. Server crashed, hard disk corrupted and all your CRM records vanished without a trace? That’s mechanical data decay at work, and as you can probably tell by now, it’s not a pretty thought.

Data decay happens on a mechanical level every day, even if you do nothing to your data. Every time data is written to or read from a storage medium, there’s a chance that corruption will occur or that the medium will fail.

According to a Backblaze article, some of the drives they used had an annual failure rate as high as 25.4%.

There is only one solution to mechanical data decay: backups. A solid backup strategy is essential for any system, and thankfully most companies realize this and back up their data regularly.


However, just backing up data isn’t enough. Sometimes, unbeknownst to you, corrupted data can go undetected and find its way into your backups.

 

Here, another key data management concept comes into play: integrity checking.

Most of you, at some point, have probably edited a Microsoft Word document, experienced a crash and when you returned, Word prompted you with a restoration option from an auto-saved version. This is integrity checking in its simplest form.


A typical editing / saving process of a Microsoft Word document goes something like this:

  1. A copy of the document is created, and edits are performed on this copy.
  2. Regular auto-saves also go into this copy.
  3. Whenever an actual save is triggered or Microsoft Word exits normally, the copy is checked against the original document, and if there are differences, the original file is updated.
  4. If Microsoft Word crashes unexpectedly and the original file is opened again, Microsoft Word will check whether any temporary copies of the document exist and, if they do, ask the user whether they want to restore data from them.

Probably the best way to get integrity checking for every single file is to use ZFS as the file system, since it checksums the data it stores. Unfortunately, ZFS isn’t natively supported on Windows, so Windows users mostly have to rely on RAID 5 or 6 to manage the checks for them.
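
To make the idea of integrity checking a little more concrete, here is a minimal sketch in Python of one common approach: record a checksum of a file when it is backed up, then recompute it later and compare. The function names (sha256_of, is_intact) are made up for illustration, and this is not how Word, ZFS or RAID work internally; it only shows the general principle.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large backup files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_digest: str) -> bool:
    """True if the file still matches the digest recorded at backup time."""
    return sha256_of(path) == expected_digest
```

In this sketch you would store the result of sha256_of alongside each backup, then run is_intact on a schedule. A mismatch means the file has changed since the checksum was recorded, which is the cue to restore from another copy.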

 

Logical Data Decay

Now, I’m going to talk about the silent killer of data: data decay due to logical issues.

Ever called a client or a company and found out, to your horror, that the contact number is not valid? Tried to visit a shop and found that they have already moved? Have you received a personalized promotional mail in your letterbox, sent to your address, but not addressed to anyone staying there?

These are situations that people experience daily, and the reason for this occurrence is simple – data is time sensitive.

Every business keeps contact information for its vendors and customers in one form or another, be it a pen-and-paper list or data stored within a computer system. However, how often do you interact with people in your contact list? How often do you check if their contact information is valid?

 

Let’s look at two types of data that can get outdated very quickly.
Disclaimer: The calculations, while based on the most recent statistics I could find, take only a limited number of parameters into account and are estimates.

Referring to statistics released by the Ministry of Manpower, the average monthly recruitment rate and reassignment rate for 2014 are 2.6% and 2.0% respectively. This means that you can expect about 4.6% of your list to experience a job change each month, depending on the industry your business is in.

Let’s take a look at another piece of data: Telecom Service data from IDA for the first four months of 2015.

Here is the total number of mobile line subscriptions for each month, with the change from the previous month in brackets:

January: 8,106,700
February: 8,092,700 (-14,000)
March: 8,103,800 (+11,100)
April: 8,123,900 (+20,100)

Let’s work with the smallest increase and assume that 11,100 new mobile phone lines are registered each month. This could be due to number changes, telco changes, or simply new lines being registered.

According to SingStat, the population of Singapore was ~5.47m in June 2014. Granted, this figure is 9 months out of date (as of March 2015), but it’s probably not too far off.

Assuming that the 11,100 new mobile phone subscriptions are spread across the entire population, 11,100/5,470,000*100 = ~0.2% of the people in Singapore might get a new mobile subscription each month, and the records of their contact number might in turn become invalid. Of course, this does not separate out the people signing up for a number for the first time, and is therefore an over-estimate.

Looking at just jobs and new mobile phone line subscriptions, we are already seeing a 2.6% + 2.0% + 0.2% = ~4.8% chance (assuming that they are mutually exclusive groups of people) of your customer data getting outdated on a month-to-month basis. (Again, there are a lot of assumptions that can both inflate and deflate this number significantly.) Once we consider more factors like house sales, migration rates, death rates, etc., the real figure will be higher, so let’s work with a rough monthly decay rate of ~6.5% for now.

Naturally, you will have frequent contact with some customers and hence get regular updates on their information. But just how many of your customers do you actually interact with frequently, and how valid is the rest of your data?

 

Data Validity

To understand this better, I’ll use numbers from the Customer Relationship Management (CRM) system of an (undisclosed) beverage company that I’ve worked with.

They had the contact details of 7213 customers in one segment of their CRM. Out of these 7213 customers, 3105 had made an order within the past 6 months. So, let’s assume that those 3105 customers have valid information.

7213-3105 = 4108 customers whose contact information had not been validated in the last 6 months.

Let’s take the monthly data decay rate of 6.5% and compound it over 6 months:


Month 1: 4108 * (100-6.5)% = 3841
Month 2: 3841 * (100-6.5)% = 3591
Month 3: 3591 * (100-6.5)% = 3358
Month 4: 3358 * (100-6.5)% = 3140
Month 5: 3140 * (100-6.5)% = 2936
Month 6: 2936 * (100-6.5)% = 2745

In just 6 months, the predicted number of reliable contacts dropped to 2745. Adding the 3105 that we assumed are correct, we had 5850 customers whose data was probably still valid.

5850/7213*100 = ~81.1%

If the company were to use the contact information in their CRM system for a campaign, their actual reach would only be ~81.1% of what they envisioned. This is using a change rate of 6.5% per month over 6 months. Extend this to a year at 10-15% per month, and as much as half the data in the CRM could already be obsolete.
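
If you want to try the same estimate on your own numbers, the compounding can be sketched in a few lines of Python. This is only a rough model under the same assumptions as above (a constant monthly decay rate, and recently active customers treated as fully valid); the function names are hypothetical.

```python
def remaining_valid(contacts: int, monthly_decay_pct: float, months: int) -> int:
    """Estimate how many contacts are still valid after compounding decay."""
    keep_rate = 1 - monthly_decay_pct / 100
    return round(contacts * keep_rate ** months)


def estimated_reach_pct(total: int, recently_validated: int,
                        monthly_decay_pct: float, months: int) -> float:
    """Estimate the share of a contact list that is still reachable.

    Assumes recently validated contacts are all still valid, and that the
    rest have been decaying at a constant monthly rate.
    """
    unvalidated = total - recently_validated
    still_valid = recently_validated + remaining_valid(
        unvalidated, monthly_decay_pct, months
    )
    return 100 * still_valid / total


# Example call using the beverage company's figures from this article.
print(round(estimated_reach_pct(7213, 3105, 6.5, 6), 1))
```

Changing the decay rate or the number of months lets you see how quickly the estimate moves, which matters more than any single figure given how rough the inputs are.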

 

What can be done to prevent this?

Many companies I’ve seen typically place the job of updating / verifying contact information on just the sales department. This isn’t sufficient, and data integrity checks should be prioritized as one of the key objectives of the entire company.

Take the beverage company, for example. Instead of just relying on sales to update information, they adopted some practices on our recommendation:

1) A newsletter was set up, and contests / promotions were regularly released through it to capture their clients’ interest, making it more likely that clients will keep their contact information up to date so that they won’t miss these promotions.

2) Clients were monitored more closely, and if a client didn’t make any new orders or initiate any kind of interaction with the company within 3 months, attempts were made to check whether that client’s contact details were still valid.

These may sound very mundane or time consuming, but they are tasks that have to be carried out to ensure that your marketing efforts are efficient, so you do not lose potential sales.
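
The three-month check in the second practice is easy to automate if your system records when each client last interacted with you. Here is a minimal sketch in Python, assuming a simple mapping from client name to last interaction date and treating 3 months as 90 days. The function name and data layout are hypothetical, not taken from any particular CRM.

```python
from datetime import date, timedelta


def contacts_to_verify(last_interaction: dict[str, date], today: date,
                       max_idle_days: int = 90) -> list[str]:
    """Return clients with no interaction in the last max_idle_days days."""
    cutoff = today - timedelta(days=max_idle_days)
    # Anyone whose last interaction is older than the cutoff needs a check.
    return sorted(
        name for name, last_seen in last_interaction.items()
        if last_seen < cutoff
    )


# Made-up sample data, for illustration only.
sample = {
    "Client A": date(2015, 1, 5),
    "Client B": date(2015, 5, 20),
    "Client C": date(2015, 2, 28),
}
print(contacts_to_verify(sample, today=date(2015, 6, 1)))
```

The resulting list can be handed to whoever is responsible for follow-up calls or emails, so the check does not depend on the sales team remembering to do it.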

Is your business experiencing this problem? Let us know how we can help!