Data exports aren’t ETLs
A common misconception is that building a data export resembles setting up an ETL (Extract, Transform, Load) process. While both involve data movement, they serve different purposes. ETLs move data between one provider and one consumer, often within a single organization’s systems. Data exports, on the other hand, are designed for external sharing — pushing data from one provider to multiple consumers. This requires handling multi-tenancy, permissions, and complex customer support that ETL tools aren’t equipped to handle.
Enterprises are the core customer
Each enterprise has unique infrastructure, security, and compliance needs. Data exports need to be able to support a wide variety of data platforms, connection methods, and transfer types while ensuring security, reliability, and speed.
Capabilities overview
At a high level, data exports need to be able to:
- Connect to your data platform
- Connect to your customers’ data platforms
- Read from your data platform and write to your customers’ data platforms
- Provide data security and comply with regulatory requirements
- Deliver an excellent customer experience from onboarding to support
- Monetize your data export product
- Use tools, including APIs, to automate workflows
Connect to your data platform
Obviously, vendors will need to be able to connect to your platform, and you’ll know best what’s required to make that happen. We recommend checking the vendor’s developer documentation to understand the feasibility and complexity of that connection.
Connect to your customers’ data platforms
For planning purposes, most software companies need to support a minimum of seven data platforms to cover the majority of their customers. However, a single product supporting every major data platform minimizes the cost and complexity of support.
Destinations
The most popular data warehouses, databases, and object storage services include:
Data warehouses
- Snowflake
- Google BigQuery
- Amazon Redshift
- Databricks
- Amazon Athena
- ClickHouse
Databases
- PostgreSQL
- Amazon Aurora
- MariaDB
- MySQL
Object storage
- Google Cloud Storage
- Amazon S3
- Microsoft Azure Blob Storage
- SFTP
Connection modalities
Each enterprise has its own setup and security requirements. Plan to support:
- Username and password authentication
- Role-based authentication
- IP whitelisting
- SSH tunneling
Read from your data platform and write to your customers’ data platforms
Products also need to read from the source and write to destinations.
Column & schema-based tenancy
How does your company organize user information? Does it keep all customer information in the same table and use a column, like customer_id, to identify which customer the data belongs to? Or does each customer have its own schema? You’ll need to confirm that your data export product is compatible with your method of choice.
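As a rough illustration of the column-based approach, here is a minimal sketch using SQLite with a hypothetical `orders` table and `customer_id` tenant column (all names are illustrative, not from any particular product). Every customer's rows live in one table, and the export reads only the rows matching the tenant column; schema-based tenancy would instead point the same query at a per-customer schema.

```python
import sqlite3

# Column-based tenancy: all customers share one table, and a tenant
# column (customer_id) identifies who each row belongs to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 10.0), (2, "acme", 20.0), (3, "globex", 5.0)],
)

def export_rows(customer_id):
    """Read only the rows that belong to one customer."""
    cur = conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return cur.fetchall()

print(export_rows("acme"))  # only acme's rows are exported
```

The key point for evaluation: if your warehouse uses one schema per customer instead, the product must be able to template the schema name per tenant rather than filter on a column.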
Max throughput per destination
How much data do your customers need to sync? How often? The answers tell you the volume and frequency your product needs to support.
Version-controlled schemas
Schemas should be version-controlled and managed using a solution like GitHub so that changes over time can be tracked and managed along with other code.
Schema migration
What happens downstream when schemas change? Products should have a way to cascade changes from the source to each destination in a way that doesn’t break downstream pipelines. Products should support the addition and removal of both columns and tables.
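One way to reason about this requirement: before cascading a change, the product needs to know exactly what differs between the old and new schema versions. A minimal sketch of that diff step, treating a schema as a hypothetical `{column_name: type}` mapping:

```python
# Compute which columns were added, removed, or retyped between two
# schema versions, so the change can be cascaded to each destination
# without breaking downstream pipelines.

def schema_diff(old, new):
    """Schemas are {column_name: type} mappings (illustrative format)."""
    return {
        "added": {c: t for c, t in new.items() if c not in old},
        "removed": {c: t for c, t in old.items() if c not in new},
        "retyped": {
            c: (old[c], new[c])
            for c in old.keys() & new.keys()
            if old[c] != new[c]
        },
    }

old = {"id": "int", "name": "text", "age": "int"}
new = {"id": "bigint", "name": "text", "email": "text"}
changes = schema_diff(old, new)  # one added, one removed, one retyped
```

A real product would apply the same idea at the table level too, since the section above calls for adding and removing both columns and tables.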
Types of transfers
Products should support:
1. Full transfers
Products should push all relevant data in the initial sync. Then, users should be able to trigger a full refresh when needed.
2. Incremental transfers
After the initial sync, products should be able to push new data to each destination.
3. Windowed transfers
When transfers are very large, products should be able to split that transfer into smaller chunks so that customers don’t have to wait for full transfers to complete to start working.
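The incremental and windowed modes above can be sketched together: incremental transfers track a high-water mark (the largest change timestamp already synced), and windowed transfers split whatever remains into fixed-size chunks. The field names below (`updated_at`, `id`) are assumptions for illustration.

```python
# Incremental transfer: select only rows newer than the last synced
# position (the high-water mark).
def incremental_rows(rows, high_water_mark):
    return [r for r in rows if r["updated_at"] > high_water_mark]

# Windowed transfer: split a large batch into smaller chunks so the
# customer can start working before the full transfer completes.
def plan_windows(row_ids, chunk_size):
    return [row_ids[i:i + chunk_size] for i in range(0, len(row_ids), chunk_size)]

rows = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 200},
    {"id": 3, "updated_at": 300},
]
new = incremental_rows(rows, high_water_mark=150)          # rows 2 and 3
windows = plan_windows([r["id"] for r in new], chunk_size=1)
```

A full transfer is just the degenerate case: high-water mark of zero, everything selected.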
Eventual consistency
Products should have a method to prevent data loss when records arrive out of order during the transfer process.
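One common pattern for this (a sketch, not any specific product's mechanism) is a watermark with an allowed-lateness window: only records older than the watermark are committed downstream, while newer records stay buffered so a late arrival isn't silently dropped.

```python
# Commit only events at or below the watermark (max event time seen
# minus an allowed-lateness window); keep the rest buffered so
# out-of-order arrivals are not lost.

def split_by_watermark(events, allowed_lateness):
    """Return (ready_to_commit, still_buffered) given events with a 'ts' field."""
    if not events:
        return [], []
    watermark = max(e["ts"] for e in events) - allowed_lateness
    ready = sorted(
        (e for e in events if e["ts"] <= watermark), key=lambda e: e["ts"]
    )
    buffered = [e for e in events if e["ts"] > watermark]
    return ready, buffered
```

With `allowed_lateness=10` and events at times 10, 50, and 45, the watermark sits at 40: the event at 10 is safe to commit, while 45 and 50 wait in the buffer in case earlier records are still in flight.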
Data integrity checks
Products should include metrics that help both the sender and receiver understand transfer health and verify that data transfers are complete.
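A simple form such metrics can take, sketched here as an assumption rather than a standard: sender and receiver each compute a row count plus an order-independent checksum over the transferred rows, and matching fingerprints indicate the transfer arrived complete and intact.

```python
import hashlib

# Sender and receiver each compute (row_count, checksum). XOR-folding
# the per-row hashes makes the checksum independent of arrival order.

def transfer_fingerprint(rows):
    digest = 0
    for row in rows:
        h = hashlib.sha256(repr(row).encode()).hexdigest()
        digest ^= int(h, 16)
    return len(rows), f"{digest:064x}"

sender = transfer_fingerprint([("a", 1), ("b", 2)])
receiver = transfer_fingerprint([("b", 2), ("a", 1)])  # different arrival order
assert sender == receiver  # same rows, so same fingerprint
```

A mismatch in either the count or the checksum tells both parties a transfer is incomplete before anyone builds on bad data.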
Provide data security and comply with regulatory requirements
Your data export product should be at least as careful with customer data as the rest of your systems.
Deployment
Can the product be deployed privately behind your company’s firewall, or will you need to evaluate the vendor’s own system security?
Data retention
Does the system retain any of the data it transfers? Or does it use an ephemeral server to read from one system and write to the other?
Data residency
Where is the data processed? Does the product help prevent data from being sent to regions where it isn’t allowed to go?
Certifications & testing
Does the product have the certifications and attestations your business needs for a vendor that handles customer data, such as SOC 2, and does it support compliance with GDPR, CCPA, and HIPAA?
Deliver an excellent customer experience from onboarding to support
The data export product should offer a customer experience that matches the standards of your existing product.
Embedding
Where do customers interact with the product? Can it be embedded in your website or app, or do customers need to go somewhere else for help with setup and support?
White-labeling
Does the vendor offer a white-labeled product, or will you need to introduce your customers to a third party?
Magic links
Can you send your customers a unique link to begin the onboarding process?
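A minimal sketch of how a magic link can work under the hood, assuming an HMAC-based design (the secret, URL, and parameter names are all illustrative): the link embeds a signed, per-customer token, so the onboarding page can verify who clicked it without a separate login.

```python
import base64
import hashlib
import hmac

# Illustrative server-side secret; a real deployment would load this
# from a secrets manager, never hard-code it.
SECRET = b"server-side-secret"

def make_magic_link(customer_id):
    """Build a unique, signed onboarding link for one customer."""
    sig = hmac.new(SECRET, customer_id.encode(), hashlib.sha256).digest()
    token = base64.urlsafe_b64encode(sig).decode().rstrip("=")
    return f"https://example.com/onboard?customer={customer_id}&token={token}"

def verify(customer_id, token):
    """Check that the token was issued for this customer."""
    expected = make_magic_link(customer_id).rsplit("token=", 1)[1]
    return hmac.compare_digest(expected, token)
```

When evaluating vendors, the question is simply whether they generate and verify such links for you, so each customer's onboarding starts from a trusted, pre-identified entry point.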
Data and frequency selection
Customers should be able to choose which data they want to receive and how often they want it synced.
Documentation
Does the product provide all the documentation customers need to set up and use your data export feature?
Direct support
Will you have direct access to the vendor for customer support? Does the SLA support your existing SLAs with customers?
Use tools, including APIs, to automate workflows
Products should deliver an excellent developer experience, including access to all the tools needed to automate and streamline workflows.
Webhooks & alerts
Products should be capable of sending HTTP callbacks. Ideally, they should also integrate with commonly used platforms like PagerDuty and Slack for easy monitoring.
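An HTTP callback of this kind is straightforward to picture. The sketch below (payload fields are illustrative, not a standard) builds a JSON body describing a completed transfer and POSTs it to whatever URL the consumer registered:

```python
import json
import urllib.request

# Build the JSON body for a post-sync webhook. Field names here are
# assumptions for illustration.
def build_webhook_payload(transfer_id, status, rows):
    return json.dumps(
        {"transfer_id": transfer_id, "status": status, "rows_synced": rows}
    ).encode()

# Fire the callback at the consumer's registered URL.
def send_webhook(url, payload):
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

The same payload can be routed to PagerDuty or Slack through their inbound webhook endpoints, which is why native integrations with those platforms are worth asking about.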
API
Products should expose a RESTful API to handle most common actions, like configuring sources and destinations, initiating data transfers, monitoring activity, and more.
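What that looks like from the developer's side can be sketched with plain HTTP verbs. The base URL, paths, and request fields below are assumptions for illustration only; the real routes would come from the vendor's API reference.

```python
import json
import urllib.request

BASE = "https://exports.example.com/v1"  # hypothetical API host

def build_request(method, path, body=None):
    """Assemble a JSON request against the (hypothetical) export API."""
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        BASE + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )

# Typical workflow, one verb per action:
#   build_request("POST", "/destinations", {"type": "snowflake"})  # configure
#   build_request("POST", "/transfers", {"mode": "incremental"})   # initiate
#   build_request("GET", "/transfers/abc123")                      # monitor
```

The evaluation question is whether every action you can take in the vendor's UI is also reachable this way, so your team can script configuration, kickoff, and monitoring end to end.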
Developer documentation
Developers should have access to clear, comprehensive documentation to build and manage data export products confidently.
Monetize your data export product
If you’re going to charge for data export, the product will need the features required to bill customers for the service.
Affordable
The product should offer a cost model that can sustain your business. That likely means hitting a cost target that lets you maintain your existing profit margins.
Client cost model
The product should offer a cost model that allows you to allocate costs to clients.
Building an enterprise-class product
Getting the enterprise experience right takes careful planning. Now that you have a comprehensive list of the features your customers may need, explore available data export market guides to find the best tools for building your product.