Salesforce and the Costs of Dirty Data

Cindy Bender | Principal Consultant, Salesforce

January 9, 2023

In a previous post, we discussed how Salesforce can be used to collect data the right way. Now we take a look at what constitutes dirty data and its costs.

Dirty data is much more than typos or having Street, St, and ST in an address line. Answering yes to any of the following questions likely indicates dirty data.

  • Is sensitive data, or data related to digital consent (such as opt-in fields), left unsecured?
  • Are there duplicate records in a single dataset?
  • Is the same piece of data stored in multiple places?
  • When reviewing data, often by running a report, does a CMO, CTO, or an account or sales rep cite inaccuracies?
  • Is information missing from what was supposed to have been collected?

Data Security Impacts Data Hygiene

Not complying with consumer data and privacy regulations like GDPR and CCPA can result in fines to an organization. When taking stock of the information an organization stores, consider whether digital consent fields, including opt-in fields, are updated appropriately. And when two or more records are merged and the values for the consent field (or any other field) differ, how is the correct value determined?
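One way to make the merge question concrete is to write the rule down as code. The sketch below is a minimal illustration in Python, not a Salesforce feature: the field names `email_opt_in` and `consent_updated` are hypothetical, and the rule itself (the most recent update wins, otherwise fall back to opted out) is an assumption that each organization would need to confirm against its own consent policy.

```python
from datetime import date
from typing import Optional, TypedDict


class ConsentRecord(TypedDict):
    # Hypothetical field names, used only for illustration.
    email_opt_in: bool
    consent_updated: Optional[date]


def merge_consent(a: ConsentRecord, b: ConsentRecord) -> bool:
    """Return the opt-in value to keep when two records are merged."""
    if a["email_opt_in"] == b["email_opt_in"]:
        return a["email_opt_in"]

    # The values conflict: prefer the record whose consent changed most recently.
    a_date, b_date = a["consent_updated"], b["consent_updated"]
    if a_date and b_date and a_date != b_date:
        newer = a if a_date > b_date else b
        return newer["email_opt_in"]

    # No usable dates: assume the most restrictive value (opted out).
    return False
```

Falling back to opted out when the dates cannot settle the conflict is the cautious choice here, since emailing someone who withdrew consent is usually the costlier mistake.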

Duplicative, Missing, and Inaccurate Data

Having duplicate records increases the odds of having inaccurate data. Let’s use the case of a record where all of a client’s data is stored. If there are multiple records for this client, it becomes increasingly difficult to keep each record updated as new data is added. For example, an address change may be made to one record. Another member of the team may learn of a name change and enter it on a second record. With conflicting information on different records, it becomes time-consuming to determine which pieces of information are accurate.

Another scenario we often encounter is one in which the same information is stored in different databases or on different records in the same database. Using email address as an example, if a client’s email address is stored in Salesforce on their contact record but also on an opportunity or event registration, it’s easy for the copies to fall out of sync when only one of them is updated.
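A simple check can surface copies that have drifted apart. The Python sketch below assumes the contact record holds the authoritative email address and that related records have been exported as plain dictionaries; the function name and the `email` and `id` keys are hypothetical.

```python
def find_email_mismatches(contact_email, related_records):
    """Return related records whose email copy differs from the contact's."""
    canonical = contact_email.strip().lower()
    mismatches = []
    for record in related_records:
        # Compare case-insensitively and ignore stray whitespace.
        # A missing or empty copy also counts as a mismatch.
        copy = (record.get("email") or "").strip().lower()
        if copy != canonical:
            mismatches.append(record)
    return mismatches


related = [
    {"id": "opp-1", "email": "rob@example.com "},
    {"id": "reg-7", "email": "rjones@old.example"},
]
print([r["id"] for r in find_email_mismatches("Rob@Example.com", related)])
# ['reg-7']
```

The longer-term fix is usually to store the value once and reference it from related records, so that there is nothing to reconcile.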

But how can something that is missing be the cause of dirty data? Duplicate-checking tools typically compare field values, such as a name or email address, to determine whether two records are candidates for a merge. If those values are missing, the records may be ignored, which contributes to the aforementioned duplication issue. Additionally, when data is missing, a record may be left out of a report or campaign filter, causing missed opportunities to connect with a client.
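To see how missing values hide records from a duplicate check, consider this minimal Python sketch. It assumes a matcher that compares `email` and `last_name` (hypothetical field names chosen for illustration) and separates the records it can compare from the ones it would silently skip.

```python
# Fields a hypothetical duplicate matcher relies on.
MATCH_FIELDS = ("email", "last_name")


def split_by_completeness(records):
    """Split records into those a matcher can compare and those it cannot."""
    matchable, incomplete = [], []
    for record in records:
        # A field counts as present only if it holds non-blank text.
        if all((record.get(field) or "").strip() for field in MATCH_FIELDS):
            matchable.append(record)
        else:
            incomplete.append(record)
    return matchable, incomplete
```

Reviewing the `incomplete` list before running a duplicate check shows how many records the check cannot see at all.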

When reviewing data, look for these signs of dirty data.

  • Is there more than one record for the same account or contact due to nicknames or abbreviations (Robert Jones and Rob Jones or Acme, Inc. and Acme)?
  • Are there typos in a key field, such as a name field, resulting in difficult-to-find duplicate records?
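Both signs can be approximated in a few lines of Python. The sketch below is an illustration only and does not reflect how Salesforce duplicate rules work: the nickname table, the suffix list, the function names, and the 0.85 similarity threshold are all assumptions. It normalizes names into comparison keys, then uses the standard library's `difflib` to catch near matches caused by typos.

```python
import re
from difflib import SequenceMatcher

# Tiny illustrative lookup tables; real lists would be much longer.
NICKNAMES = {"rob": "robert", "bob": "robert", "bill": "william"}
COMPANY_SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}


def _words(text):
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def contact_key(full_name):
    """Comparison key for a contact: expands a known nickname in the first name."""
    words = _words(full_name)
    if words:
        words[0] = NICKNAMES.get(words[0], words[0])
    return " ".join(words)


def account_key(name):
    """Comparison key for an account: strips trailing legal suffixes."""
    words = _words(name)
    while words and words[-1] in COMPANY_SUFFIXES:
        words.pop()
    return " ".join(words)


def likely_same(key_a, key_b, threshold=0.85):
    """Flag two keys as probable duplicates when they are nearly identical."""
    return SequenceMatcher(None, key_a, key_b).ratio() >= threshold


print(contact_key("Rob Jones") == contact_key("Robert Jones"))  # True
print(account_key("Acme, Inc.") == account_key("Acme"))         # True
```

A match on these keys is a candidate for review, not proof that two records are the same; a person should still confirm before merging.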

Salesforce and the Cost of Dirty Data

Data hygiene should be a priority for at least one team in every organization. The team name will vary by company, but it typically includes people in IT and information security, Salesforce admins, and sometimes marketing.

Plain and simple, dirty data costs an organization. Consider the industry-standard 1-10-100 rule.

  • It costs $1 to verify the data as it’s being entered.
  • It costs $10 to clean the data once records are infected with dirty data.
  • It costs $100 if you do nothing.
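To put the rule in concrete terms, here is a short Python sketch of the arithmetic. It assumes the three figures are read as per-record costs, which is the usual interpretation, and uses a made-up batch of 500 records; the strategy names are hypothetical labels.

```python
# Per-record costs from the 1-10-100 rule, in US dollars.
COST_PER_RECORD = {"verify_at_entry": 1, "clean_later": 10, "do_nothing": 100}


def estimated_cost(record_count, strategy):
    """Estimated cost in dollars of handling record_count records."""
    return record_count * COST_PER_RECORD[strategy]


for strategy in COST_PER_RECORD:
    print(f"{strategy}: ${estimated_cost(500, strategy):,}")
# verify_at_entry: $500
# clean_later: $5,000
# do_nothing: $50,000
```

Each step of delay multiplies the cost by ten, so the gap between verifying at entry and doing nothing is a factor of one hundred.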

How can doing nothing cost more?

When dirty data exists, future sales and an organization’s reputation could be on the line, and financial penalties may follow. In healthcare, patient care can be compromised when duplicate or incomplete data exists. Dirty data lacks credibility, causing those relying on the data to spend extra time validating its accuracy. Incomplete or disjointed data can cause an organization to miss opportunities to promote its offerings.

Further Considerations

Recent studies have estimated that dirty data costs the U.S. healthcare industry approximately $300 billion annually. Furthermore, dirty data costs businesses an average of 15% to 25% of revenue, in turn costing the U.S. economy $3 trillion annually.

A recent Salesforce post indicated that by global volume, client data doubles every 12 hours. Pause for just a moment and let that sink in. If one shuts down for the night at 5 PM, by the time they return the next morning, the amount of client data globally has doubled. Now imagine if the slice of data a company uses isn’t clean. That’s a lot of dirty data to remedy.

We hope you enjoyed reading the Phase2 blog! Please subscribe below for regular updates and industry insights.
