Synopsis: In this Techstrong AI interview, Soniya Bopache, VP and general manager of data compliance at Veritas Technologies, explains how dark data (unstructured, untagged and unused data) can, when not effectively managed, inadvertently expose sensitive data to a large language model (LLM).

In this episode of the Techstrong AI video series, host Mike Vizard speaks with Soniya Bopache, VP and general manager for data compliance and governance at Veritas, about the growing issue of dark data. Bopache explains that dark data is akin to forgotten items stored in an attic: data that is collected but not actively used or managed, such as emails, documents, and log files. She emphasizes the importance of proactively managing this data through robust data monitoring systems, regular audits, and strong data governance practices, especially in the context of AI, where unstructured and unprocessed data can introduce biases or security risks. The discussion highlights how dark data, if not properly governed, can hinder AI outcomes, making comprehensive data management strategies essential.

Bopache further discusses the collaboration needed across different teams in organizations to manage dark data effectively, noting that the responsibility extends beyond just IT or compliance teams to include all data producers. She stresses the importance of fostering a culture where all employees are mindful of the data they generate. The conversation also touches on the need for organizations to adapt their data governance frameworks to meet evolving regulations and AI standards, ensuring data accuracy and compliance. Ultimately, as AI technologies evolve, organizations must integrate advanced tools and collaborative approaches to handle data proactively and responsibly.