Privacy First.
As it should be.

Conduct ETL, business analytics, and machine learning without ever seeing raw row-level data

Fully auditable with mathematical proofs

Toggleable privacy features

Privacy - Key Building Blocks

Differential Privacy

Injecting calibrated noise into computations over a dataset to protect individual data points. DataFleets employs this mathematical definition of privacy to guarantee that no individual’s data can be “reverse-engineered” from statistical queries or machine learning models.
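
To make the idea concrete, here is a minimal sketch of one textbook differential privacy technique, the Laplace mechanism, applied to a counting query. It is an illustration only, not DataFleets' implementation; the function names, the sample data, and the epsilon value are assumptions made for this example.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the values matching `predicate`."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1
    # (sensitivity 1), so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: how many people are 40 or older?
ages = [23, 35, 41, 29, 62, 57, 38, 44]
rng = random.Random(0)
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(round(noisy, 2))  # close to the true count of 4, but not exact
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means more accurate answers.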

Attack-Based Evaluations

Testing user-submitted models by subjecting them to privacy attacks known from the scientific literature. If a model is compromised in testing, DataFleets automatically “turns the dial” on one or more of its privatizing techniques until the model is secured. No model is released from DataFleets to the user until it withstands every attack in the evaluation.
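
The “turn the dial” loop can be pictured roughly as follows. This is a sketch under stated assumptions: `attack_success` is a hypothetical callable that retrains or re-evaluates a model at a given privacy setting and reports how often an attack succeeds, and `toy_attack` is a made-up stand-in for it, not a real attack or a DataFleets API.

```python
from typing import Callable, Optional

def tighten_until_safe(
    attack_success: Callable[[float], float],
    epsilon: float = 8.0,
    max_success: float = 0.58,
    shrink: float = 0.5,
    min_epsilon: float = 0.01,
) -> Optional[float]:
    """Return the first epsilon at which the attack succeeds no more often
    than `max_success`, or None if no setting in range is safe."""
    while epsilon >= min_epsilon:
        if attack_success(epsilon) <= max_success:
            return epsilon
        epsilon *= shrink  # smaller epsilon = more noise = stronger privacy
    return None

def toy_attack(epsilon: float) -> float:
    # Hypothetical stand-in: the attack does better as privacy loosens.
    # A real evaluation would run an actual attack against the model.
    return min(1.0, 0.5 + 0.05 * epsilon)

chosen = tighten_until_safe(toy_attack)
print(chosen)  # 1.0
```

Returning None when no setting passes mirrors the rule that a model is withheld rather than released unsecured.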

Synthetic Data Generation

In DataFleets, no user gets access to row-level plaintext data. Synthetic data takes its place, helping you “get your teeth into” data as you normally would for ETL, entity-linking, structuring analysis, etc. Upon connection to a dataset, DataFleets automatically generates synthetic data that is structurally representative of the underlying plaintext. Analytics themselves are always run on the raw data.
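
As a rough illustration of “structurally representative”, the sketch below generates placeholder rows from column metadata alone, so no raw value is ever read. The schema format, column names, and value ranges are all hypothetical; DataFleets' actual generator is not described here and may work quite differently.

```python
import random
import string

# Hypothetical column metadata; a real system would derive this from the dataset.
SCHEMA = {
    "patient_id": {"type": "id", "length": 8},
    "age": {"type": "int", "min": 18, "max": 90},
    "state": {"type": "category", "values": ["CA", "NY", "TX"]},
    "balance": {"type": "float", "min": 0.0, "max": 5000.0},
}

def synthetic_value(spec, rng):
    """Draw one random value that matches a column's declared shape."""
    kind = spec["type"]
    if kind == "id":
        alphabet = string.ascii_uppercase + string.digits
        return "".join(rng.choice(alphabet) for _ in range(spec["length"]))
    if kind == "int":
        return rng.randint(spec["min"], spec["max"])
    if kind == "float":
        return round(rng.uniform(spec["min"], spec["max"]), 2)
    if kind == "category":
        return rng.choice(spec["values"])
    raise ValueError(f"unknown column type: {kind}")

def synthetic_rows(schema, n, seed=0):
    """Generate n rows with the same columns and types as the schema."""
    rng = random.Random(seed)
    return [
        {col: synthetic_value(spec, rng) for col, spec in schema.items()}
        for _ in range(n)
    ]

for row in synthetic_rows(SCHEMA, 3):
    print(row)
```

Rows like these are enough to write and debug ETL or analysis code, which can then be submitted to run against the real data.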

Federated World.

Connect across data silos and devices with Federated Learning and SQL

Accelerate internal access provisioning and data sharing initiatives with third parties

Reduce costs by eliminating data copy

Consolidate redundant operations across jurisdictions

Link entities across silos without betraying privacy

Federated Analytics - Key Building Blocks

Federated Learning

A machine learning technique that trains models across decentralized datasets by distributing and aggregating model parameters instead of the data itself. DataFleets uses federated learning to analyze data across borders and silos in fragmented architectures, as if they were all in one dataframe.
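
The sketch below shows the general shape of federated averaging on a toy linear model: each silo trains locally and only the model parameters travel to the aggregator. The silo data, learning rate, and function names are assumptions chosen for illustration, and this is not a description of DataFleets' internals.

```python
def local_update(w, b, data, lr=0.05, epochs=20):
    """Run a few gradient descent steps on one silo's private (x, y) pairs."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def federated_average(updates, sizes):
    """Combine silo parameters, weighting each silo by its row count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b

# Hypothetical silos; every point lies on y = 2x + 1, but no silo sees the others.
silos = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0), (1.5, 4.0), (2.5, 6.0)],
]

w, b = 0.0, 0.0
for _ in range(50):  # communication rounds
    updates = [local_update(w, b, data) for data in silos]
    w, b = federated_average(updates, [len(data) for data in silos])

print(round(w, 3), round(b, 3))  # approaches 2.0 and 1.0
```

Only `(w, b)` pairs cross silo boundaries in this sketch; the raw `(x, y)` rows never leave the silo that holds them.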

In addition to machine learning, DataFleets offers privacy-preserving distributed SQL.
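
One way to picture distributed SQL over silos: each silo answers a local aggregate query, and a coordinator combines the partial results. The sketch uses in-memory SQLite databases as stand-in silos. The table, columns, and helper names are hypothetical, and a privacy-preserving system would add protections such as noise on top of this.

```python
import sqlite3

def make_silo(rows):
    """Create a stand-in silo: an in-memory database with a `visits` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (region TEXT, cost REAL)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    return conn

# Each silo returns only a sum and a count, never individual rows.
LOCAL_QUERY = (
    "SELECT COALESCE(SUM(cost), 0.0), COUNT(*) FROM visits WHERE region = ?"
)

def federated_avg_cost(silos, region):
    """Combine per-silo partial aggregates into a global average."""
    total, count = 0.0, 0
    for conn in silos:
        partial_sum, partial_count = conn.execute(LOCAL_QUERY, (region,)).fetchone()
        total += partial_sum
        count += partial_count
    return total / count if count else None

silos = [
    make_silo([("EU", 100.0), ("EU", 300.0), ("US", 50.0)]),
    make_silo([("EU", 200.0), ("US", 150.0)]),
]
print(federated_avg_cost(silos, "EU"))  # 200.0
```

Averages decompose into sums and counts, which is why the global answer can be computed without copying rows out of any silo.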

Private Entity Linking

We support privacy-preserving entity linking based on unique identifiers, but we also understand that real-world enterprise data is messy. Our fuzzy matching capabilities resolve the inevitable misspellings, maiden versus married names, scrambled numbers, and other forms of disorder, all while maintaining privacy and security.
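
A common approach in the record-linkage literature, shown here as a sketch and not as DataFleets' method, is to encode each name as a set of keyed-hash positions derived from its character bigrams and then compare the encodings. Similar spellings share most positions, so they can be matched without exchanging the names themselves. The key, bit length, and sample names are assumptions for the example.

```python
import hashlib
import hmac

def bigrams(name: str) -> set:
    """Split a padded, lowercased name into overlapping two-character pieces."""
    s = f"_{name.lower().strip()}_"
    return {s[i:i + 2] for i in range(len(s) - 1)}

def encode(name: str, key: bytes, bits: int = 1024, hashes: int = 4) -> set:
    """Map each bigram to `hashes` positions in a Bloom-filter-style bit set."""
    positions = set()
    for gram in bigrams(name):
        for k in range(hashes):
            digest = hmac.new(key, f"{k}:{gram}".encode(), hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:4], "big") % bits)
    return positions

def dice(a: set, b: set) -> float:
    """Dice coefficient: 1.0 for identical encodings, near 0.0 for unrelated ones."""
    return 2 * len(a & b) / (len(a) + len(b))

key = b"secret-shared-by-the-silos"  # hypothetical shared key
close = dice(encode("Jonathan Smith", key), encode("Jonathon Smith", key))
far = dice(encode("Jonathan Smith", key), encode("Maria Garcia", key))
print(close > far)  # True: the misspelling scores far higher than a different name
```

Production schemes harden this basic encoding further, since plain bigram encodings are known to be open to frequency analysis.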

Secure Multi-Party Computation

A cryptographic protocol that ensures no party can see another’s data in multi-party, distributed, “data sharing” settings.
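
A small example of the idea is additive secret sharing, one standard building block for such protocols. Each party splits its private number into random shares, so the group can compute a total while no individual share reveals anything. The party values and the modulus are assumptions for illustration, and the whole exchange is simulated in one process.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; must exceed any possible total

def share(value: int, n_parties: int) -> list:
    """Split `value` into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def secure_sum(private_values: list) -> int:
    """Simulate parties summing their inputs without revealing them."""
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return sum(partial_sums) % PRIME

print(secure_sum([120, 45, 300]))  # 465
```

Any single share, or any single published partial sum, is uniformly random on its own; only the combination of all of them yields the total.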


Homomorphic Encryption

An encryption technique that enables computation on encrypted data without decrypting it first.
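
For a feel of what computing on encrypted data means, here is a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme: multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The primes are deliberately tiny and insecure, the function names are invented for this sketch, and nothing here implies which scheme DataFleets uses.

```python
import math
import secrets

def keygen(p: int = 1789, q: int = 1861):
    """Build a toy key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid for generator g = n + 1
    return n, (lam, mu, n)

def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) under public key n with fresh randomness."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 simplifies to 1 + m*n.
    return ((1 + m * n) * pow(r, n, n2)) % n2

def decrypt(private_key, c: int) -> int:
    lam, mu, n = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)

public_key, private_key = keygen()
c_total = add_encrypted(public_key, encrypt(public_key, 20), encrypt(public_key, 22))
print(decrypt(private_key, c_total))  # 42
```

The party doing the addition never needs the private key, so it learns neither the inputs nor the result.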

DataFleets Architecture Overview


DataFleets' Python and SQL APIs let you access the full breadth of federated intelligence capabilities from your existing IDE or workflow system.

Fleet Coordinator

The Fleet Coordinator sits in your data silo infrastructure, securely managing federated jobs and aggregating machine learning gradients.

Fleet Runner

The Fleet Runner is an extensible endpoint daemon that can be placed in any silo or edge device, with a low-level SDK to match a variety of environments.

Ready to Give DataFleets a Try?

Put Your Distributed Data to Work