Tech Trends: Synthetic data: artificial data; real solutions

In this podcast, Alexy Thomas, EY India Technology Consulting Partner, talks about synthetic data, which is being considered the future of the artificial intelligence space.


Podcast host Silloo Jangalwala, Associate Director, BMC, speaks to Alexy Thomas from Tech Consulting at EY India, addressing predominant questions surrounding synthetic data, its potential to become the future of AI, its capability to solve privacy concerns and whether it is a one-stop solution for all AI data needs.

Background: industries need large amounts of high-quality data to train new AI models. Because of emerging data privacy concerns and stringent regulations on data sharing, gathering and accessing real, high-quality data is becoming difficult. Synthetic data is generated artificially, with or without the help of real data sets, for the purpose of training AI models. It may address some of the problems faced with real data.

 Key takeaways

  • While actual data may lack quality, volume, or variety, synthetic data can overcome these limitations and be generated in all the permutations and combinations of any given condition. Real data may also be unavailable for unseen conditions and events.
  • Compared with real data sets, synthetic data can train AI models and test systems more effectively and help build better prototypes.
  • It can also provide faster turnaround for AI testing, which requires large numbers of iterations and inputs. In the coming years, synthetic data is expected to overshadow real data in AI models.
  • In sectors like financial services, synthetic data can help to evaluate market behavior and to develop new and innovative products, which is what large and small financial services organizations are trying to do.
  • Synthetic data comes with significant risks and limitations, since its quality depends on the quality of the model that created it. If the input has errors or biases, the data generated from it will produce false insights and, in turn, erroneous decisions.

Digitally generated data can have the same predictive power as real data because it replicates the statistical characteristics of the existing dataset. Where actual data lacks quality, volume, or variety, synthetic data can overcome these weaknesses, since it can also be generated for unseen conditions and events.
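
To make the idea of "replicating the statistical characteristics of the existing dataset" concrete, here is a minimal sketch in Python. It is not a method described in the podcast: it assumes a simple two-column data set and a bivariate Gaussian as the generator, and the "real" income and spend values are themselves simulated stand-ins. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / len(rows)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(params, n, rng):
    """Draw n synthetic rows from the fitted bivariate Gaussian."""
    rho = params["rho"]
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mx"] + params["sx"] * z1
        # Mix z1 and z2 so that x and y keep the fitted correlation rho.
        y = params["my"] + params["sy"] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        out.append((x, y))
    return out


rng = random.Random(0)

# Stand-in for a "real" data set (assumption: hypothetical income and spend).
real = []
for _ in range(2000):
    income = rng.gauss(50.0, 10.0)
    spend = 0.6 * income + rng.gauss(0.0, 4.0)
    real.append((income, spend))

real_params = fit_gaussian(real)
synthetic = sample_synthetic(real_params, 5000, rng)
synth_params = fit_gaussian(synthetic)

# Compare the summary statistics of the real and synthetic data side by side.
for key in ("mx", "my", "sx", "sy", "rho"):
    print(f"{key}: real={real_params[key]:.2f} synthetic={synth_params[key]:.2f}")
```

The synthetic rows contain no original record, yet their means, spreads, and correlation track those of the source data. The same property illustrates the risk noted in the takeaways: any error or bias in the source statistics is reproduced in every synthetic row, and a generator this simple captures nothing beyond the statistics it was fitted on.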

Podcast: Episode 03

Duration: 6m 24s
