This contributed story by Dr. Michael Feindt, strategic advisor at Blue Yonder, originally appeared in Forbes on Aug. 3, 2021. Excerpts from the story appear below.

Business decision makers, and especially CIOs, are facing an inconvenient truth: a successful artificial intelligence (AI) and machine learning (ML) project requires a single source of truth for their data. Among larger companies in particular, the data deluge has reached almost unmanageable levels. Big data as an all-encompassing concept is not a new challenge; repeat data, however, is. This is data that exists in numerous places across an enterprise and isn't necessarily consistent.

In a retail context, you may have a data set — such as prices, current stock levels, a future demand prediction or seasonal statistics — in a local store. You may have that same data at headquarters, on a different system, in a different country or being merged and transferred following an acquisition. Multiple computers, databases and data warehouses are a common consequence of the big data surge, and it takes only an anomaly in one location to threaten the validity of the data.

This is why so many organizations are investing in data integration projects as part of their digital transformation efforts. It's a way to bring those disparate systems together and establish a single "source of truth."