Git for Data
The industry's first database to bring Git-style version control to data. Unify transactional, analytical, vector, and full-text workloads in a single system — MySQL compatible, AI-native, cloud-native.
Ingest multimodal data, process it with AI, and run hybrid workloads
Say goodbye to 4 databases, multiple ETL jobs, hours of data lag, and sync nightmares
The first database with Git for Data — every data change is traceable, reversible, and collaborative
Zero-copy snapshots in milliseconds, no storage explosion
Query data as it existed at any point in history
Test migrations and transformations in isolated branches
Restore to any previous state without full backups
Track every data change with immutable history
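The snapshot, time-travel, branch, and restore operations listed above can be illustrated with a toy in-memory model. This is only a conceptual sketch: `VersionedTable` and all of its methods are hypothetical names invented for illustration, not MatrixOne's API or storage design. The assumption is that rows live in a mapping that is never mutated in place, so a snapshot or branch only records a reference to the current mapping and copies nothing.

```python
class VersionedTable:
    """Toy key-value table with Git-style snapshots and branches (hypothetical)."""

    def __init__(self, rows=None):
        # The current version. It is replaced on every write, never mutated.
        self._rows = rows if rows is not None else {}
        # Snapshot name -> reference to a past version of the rows.
        self._snapshots = {}

    def put(self, key, value):
        # Copy-on-write: build a new mapping so older versions stay intact.
        # (A real engine would share structure instead of copying every row.)
        self._rows = {**self._rows, key: value}

    def get(self, key, snapshot=None):
        # Read the current version, or the version recorded under a snapshot name.
        rows = self._rows if snapshot is None else self._snapshots[snapshot]
        return rows.get(key)

    def snapshot(self, name):
        # Zero-copy: store a reference to the current version only.
        self._snapshots[name] = self._rows

    def branch(self):
        # The branch starts from the same version and diverges on its first write.
        return VersionedTable(self._rows)

    def restore(self, name):
        # Roll the current version back to a snapshot without any backup file.
        self._rows = self._snapshots[name]

    def merge(self, other):
        # Naive merge for illustration: the branch's values win on conflict.
        self._rows = {**self._rows, **other._rows}


table = VersionedTable()
table.put("price", 100)
table.snapshot("before_migration")

staging = table.branch()        # isolated environment for a risky change
staging.put("price", 120)

print(table.get("price"))       # main is untouched by the branch
table.merge(staging)
print(table.get("price"))       # change is visible after the merge
print(table.get("price", snapshot="before_migration"))  # time-travel read
```

In this sketch a snapshot costs one dictionary entry regardless of table size, which is the property the "zero-copy" claim refers to. The write path here copies the whole mapping for simplicity, whereas a production system would avoid that.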
Built-in vector search, full-text search, and Agent data sandbox — no external vector databases needed
Built-in IVF and HNSW vector indexes and a full-text search engine. Supports billion-scale vector retrieval, directly powering RAG applications and semantic search.
Git for Data branching naturally provides isolated data environments for AI Agents — one branch per agent, no interference, safe merging.
As the core data engine of MatrixOne Intelligence, deeply integrated with MatrixPipeline, MatrixGenesis, and other AI components — providing a unified data foundation for enterprise AI applications.
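To make the vector retrieval described above concrete, the following sketch shows an exact, brute-force L2 search in plain Python. It assumes that "L2 distance" means ordinary Euclidean distance. The function names `l2_distance` and `search` and the sample titles are hypothetical and stand in for whatever the database does internally. Indexes such as IVF and HNSW exist to approximate this result without scanning every row.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, items, max_distance):
    """Return (title, distance) pairs closer than max_distance, nearest first.

    `items` maps a title to its embedding (hypothetical sample data).
    """
    scored = ((title, l2_distance(emb, query)) for title, emb in items.items())
    hits = [(title, dist) for title, dist in scored if dist < max_distance]
    return sorted(hits, key=lambda hit: hit[1])


articles = {
    "Intro to vector search": [0.2, 0.3, 0.4, 0.25, 0.30],
    "Unrelated article": [0.9, 0.9, 0.9, 0.9, 0.9],
}
query = [0.2, 0.3, 0.4, 0.25, 0.35]

for title, dist in search(query, articles, max_distance=0.1):
    print(f"{title}: {dist:.3f}")
```

A full scan like this costs time proportional to the number of rows, which is why a vector index matters at large scale.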
from matrixone import Client

client = Client()
client.connect(database='demo')

# Vector search
# Article is assumed to be an ORM model with `title` and `embedding`
# columns, defined elsewhere; it is not shown in this snippet.
query = [0.2, 0.3, 0.4, 0.25, 0.35]
results = client.query(
    Article.title,
    Article.embedding.l2_distance(query)
).filter(
    Article.embedding.l2_distance(query) < 0.1
).execute()

Built for Consolidation, Scale, and Intelligence
Disaggregated storage-compute design — each layer scales independently
Raft Shared Log
S3 Object Storage
From single-node to distributed, from private cloud to public cloud
Primary-replica architecture for small to medium workloads
Single node + S3 object storage, balancing simplicity and elasticity
Fully distributed deployment with unlimited horizontal scaling
Start your AI data journey today