March 20, 2026
Inputs: API, Tabular, Text
Outputs: Tabular, API, Text
Easy data preparation with AI-powered operators.

Overview

OpenDCAI/DataFlow is a data preparation and training tool. It is intended to generate, refine, evaluate, and filter high-quality data for AI from noisy sources such as PDFs, plain text, and low-quality QA pairs.

This tool aims to improve the performance of large language models (LLMs) through targeted training in specific domains like healthcare, finance, legal, and academic research.

The system uses an operator-based design that turns the entire data cleaning workflow into a reproducible, reusable, and shareable pipeline. It is positioned as core infrastructure for the Data-Centric AI community.
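
To make the operator-based idea concrete, here is a minimal sketch in plain Python. It does not use DataFlow's actual API: the `Operator` class, `run_pipeline`, and the two cleaning steps are hypothetical names chosen for illustration. The point is only that each step is a small named unit, and a pipeline is an ordered list of such units that can be rerun on new data or shared as-is.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # one data sample, e.g. {"text": "..."}


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator: a named, reusable transformation over records."""
    name: str
    fn: Callable[[list], list]

    def __call__(self, records: list) -> list:
        return self.fn(records)


def strip_whitespace(records: list) -> list:
    # Return new records with leading/trailing whitespace removed from "text".
    return [{**r, "text": r["text"].strip()} for r in records]


def drop_short(min_chars: int) -> Callable[[list], list]:
    # Build a filter that keeps only records whose text has at least min_chars characters.
    def _filter(records: list) -> list:
        return [r for r in records if len(r["text"]) >= min_chars]
    return _filter


def run_pipeline(operators: list, records: list) -> list:
    # Apply each operator in order; the output of one is the input of the next.
    for op in operators:
        records = op(records)
    return records


# The pipeline is just data (an ordered list), so it can be reused or shared.
pipeline = [
    Operator("strip_whitespace", strip_whitespace),
    Operator("drop_short", drop_short(min_chars=10)),
]

raw = [
    {"text": "  What is overfitting in machine learning?  "},
    {"text": " ok "},
]
cleaned = run_pipeline(pipeline, raw)
```

Because the operators return new lists rather than mutating their input, the same `pipeline` can be applied to another batch of records without side effects, which is the property that makes such a workflow reproducible.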

Additionally, OpenDCAI/DataFlow has an intelligent agent capability that can dynamically assemble new pipelines by either recombining existing operators or creating new ones based on demand.

This tool assists in generating high-quality LLM training datasets from raw data using visual, low-code pipelines with flexible orchestration across domains and use cases.

The tool also includes text, math, and code data generation, as well as tools like AgenticRAG and Text2SQL for data creation. Other features include large-scale PDF to QA conversion and structured data extraction.


Releases

Initial release
March 20, 2026
Initial release of DataFlow.

Pricing

Pricing model: Free
Paid options from: Free