Injecting noise into a dataset to protect individual data points. DataFleets employs differential privacy, a mathematical definition of privacy, to guarantee that no individual’s data can be “reverse-engineered” from statistical queries or machine learning.
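As a rough illustration of noise injection, here is a minimal sketch of the Laplace mechanism applied to a mean query. This is not DataFleets' implementation; the function name `private_mean`, the clipping bounds, and the `epsilon` parameter are assumptions chosen for the example.

```python
import random

def private_mean(values, lower, upper, epsilon, rng=None):
    """Illustrative differentially private mean using the Laplace mechanism."""
    rng = rng or random.Random()
    # Clip each value so one record's influence on the mean is bounded.
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record changes the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is a Laplace(0, scale) sample.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return sum(clipped) / len(clipped) + noise
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives a more accurate answer with a weaker guarantee.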
Dummy testing user-submitted models by subjecting them to privacy attacks known from the scientific literature. If a model is compromised in testing, DataFleets automatically “turns the dial” on one or more of its privatizing techniques until the model is secured. No model is released from DataFleets to the user until it’s entirely secured against all attacks.
In DataFleets, no user gets access to row-level plaintext data. Synthetic data takes its place, helping you “get your teeth into” data as you normally would for ETL, entity-linking, structuring analysis, etc. Upon connection to a dataset, DataFleets automatically generates synthetic data that is structurally representative of the underlying plaintext. Analytics themselves are always run on the raw data.
A machine learning technique that trains models across decentralized datasets by distributing and aggregating model parameters instead of the data itself. DataFleets uses federated learning to analyze data across borders and silos in fragmented architectures, as if they were all in one dataframe.
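The idea of distributing and aggregating parameters instead of data can be sketched with federated averaging on a tiny linear model. This is a toy under stated assumptions, not DataFleets' API: `local_update`, `federated_average`, and `train` are hypothetical names, and each "silo" is just a list of `(x, y)` pairs.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """A silo fits y ~ w*x + b on its own rows; only (w, b) leaves the silo."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error, computed from the current (w, b).
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Coordinator combines parameters, weighting each silo by its row count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def train(silos, rounds=200):
    """Run several rounds of local training followed by aggregation."""
    weights = (0.0, 0.0)
    sizes = [len(data) for data in silos]
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in silos]
        weights = federated_average(updates, sizes)
    return weights
```

For example, `train([[(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)], [(0.25, 1.5), (0.75, 2.5)]])` approaches `w = 2, b = 1` without either silo's rows ever being pooled.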
In addition to machine learning, DataFleets offers privacy-preserving distributed SQL.
We support privacy-preserving entity linking based on unique identifiers, but we also understand that real-world enterprise data is messy. So we have fuzzy matching capabilities that resolve the inevitable misspellings, maiden vs. married names, scrambled numbers, and other forms of disorder, all while maintaining privacy and security.
A cryptographic protocol that ensures no party can see another’s data in multi-party, distributed, “data sharing” settings.
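One common building block for this kind of protocol is additive secret sharing. The sketch below computes a sum across parties without any party seeing another's input. It is an illustrative toy with everything simulated in one process, not the protocol DataFleets uses; `MODULUS`, `share`, and `secure_sum` are hypothetical names.

```python
import secrets

# All arithmetic is done modulo this value; it must exceed any possible sum.
MODULUS = 2**61 - 1

def share(secret, n_parties):
    """Split a secret into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs):
    """Each party shares its input; only combined partial sums are revealed."""
    n = len(private_inputs)
    # Row i holds the shares that party i sends out, one per party.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds up the shares it received (column j); each looks random alone.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    # Combining the partial sums reveals the total and nothing else.
    return sum(partial_sums) % MODULUS
```

Here `secure_sum([12, 30, 0])` returns `42`, while any single share on its own is a uniformly random number that says nothing about the input it came from.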
An encryption technique that enables computation on encrypted data without decrypting it first.
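To make "computation on encrypted data" concrete, here is a toy version of the Paillier cryptosystem, which supports addition of ciphertexts. It is a sketch for intuition only: the primes are far too small to be secure, the scheme is additively (not fully) homomorphic, and nothing here is claimed to reflect the scheme DataFleets uses. The function names are hypothetical, and `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy Paillier keys from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid for the generator g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)
```

With `n, key = keygen()`, the call `decrypt(n, key, add_encrypted(n, encrypt(n, 20), encrypt(n, 22)))` returns `42`, even though the party doing the addition never sees 20 or 22.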
DataFleets' Python and SQL APIs let you access the full breadth of federated intelligence capabilities from your existing IDE or workflow system.
The Fleet Coordinator sits in your data silo infrastructure, securely managing federated jobs and aggregating machine learning gradients.
The Fleet Runner is an extensible endpoint daemon that can be placed in any silo or edge device, with a low-level SDK to match a variety of environments.