‘Data-Centric AI Community’ to Foster Community-Driven and Expert-Guided Transformations for Better AI

YData's 'Data-Centric AI Community' aims to help data scientists improve the quality of their datasets for improved ML models

Recent research suggests that there will be no digital transformation without high-quality data. In response to the paradigm shift in AI development from model-centric to data-centric, YData, a tech startup, created the Data-Centric AI Community to facilitate community-driven and expert-guided transformations for better AI development.


The Data-Centric AI Community established by YData, which created the first development platform for data quality to accelerate the development of AI solutions, aims to break down barriers for data science teams, researchers, and beginner learners and create a friendly place where data quality issues are discussed and solved.


YData has always been a pioneer in open-source development and community-driven AI transformation, launching the Synthetic Data Community in 2020. In 2021, YData open-sourced two notable libraries, ydata-synthetic and ydata-quality, with the sole goal of ensuring data science teams have access to high-quality data.


YData’s Synthesizer leverages state-of-the-art deep learning techniques to learn the statistical properties of real data and reproduce them in a new, synthetic dataset. YData’s Pandas Profiling lets users profile raw data and assess its quality in a few lines of code.


“We understand that a community driving the paradigm shift to data-centric AI is essential, and we aim to focus on data profiling, synthetic data and data labelling, the most significant pain points of the data scientists,” states YData co-founder Gonçalo Martins Ribeiro.


With experts like Andrew Ng raising awareness for the data-centric approach and the first competitions and workshops conducted, the Data-Centric AI community stands out as the missing piece of the data-centric movement.


“We believe that having quality data is truly a game-changer and that by creating high-quality data that resembles real-world data that was initially inaccessible, endless possibilities can be unlocked,” explains Ribeiro. “Being able to profile and understand data, early in the development, is crucial and can save a lot of time and money to organizations,” he adds, stressing the importance of data profiling.


“Not every company, researcher, or student has access to the most valuable data like some tech giants do. As ML algorithms and coding frameworks evolve rapidly, it’s safe to say the scarcest resource in AI is high-quality data at scale. We need to find ways to improve the data used for AI development. The Data-Centric AI Community is a step towards addressing that.”


YData’s development platform follows a data-centric mindset, bringing together the major data science frameworks with proprietary tools for data access and profiling, synthetic data generation, and labelling to deliver better data quality for AI. Better data means fewer errors and biases, and a representative dataset that helps ensure AI is built responsibly. Organizations in the financial services, utilities, and telecoms sectors have already adopted the company’s technology.

