Developer Implementation Guide
Provide Data Portability Exports
Give people a complete, portable copy of their data in a structured, commonly used, machine-readable format. Build a repeatable pipeline with strong verification, clear schemas, secure delivery, and proofs of what was sent.
Scope and format
Include data the person provided and data observed from their use of the service, where feasible.
Exclude secrets, internal risk scores, and other users’ personal data. Redact third-party identifiers or replace with pseudonyms.
Prefer JSON Lines for records and CSV for simple tables. Bundle large media as files. Provide a data dictionary.
Use ISO-8601 UTC timestamps, stable IDs, and UTF-8 everywhere.
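The conventions above can be sketched as a single exported record; the field names here are illustrative, not a required schema:

```python
import json
from datetime import datetime, timezone

# One example JSONL record following the conventions above:
# stable opaque IDs, ISO-8601 UTC timestamps, explicit nulls, UTF-8 JSON.
record = {
    "record_id": "evt_000000000042",               # stable, opaque ID
    "occurred_at": datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
        .isoformat().replace("+00:00", "Z"),       # ISO-8601 UTC, Z suffix
    "kind": "login",
    "note": None,                                  # explicit null, never ""
}
line = json.dumps(record, ensure_ascii=False)      # one line of the JSONL file
print(line)
```

Keeping nulls explicit and timestamps in one canonical form makes the data dictionary simpler and downstream imports predictable.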
Intake, verification, and status
Provide self-service and admin tools to request an export.
Re-verify identity and require MFA for high-risk accounts.
Use idempotency keys and a simple state machine:
received → verified → building → ready → delivered → deleted.
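A minimal sketch of that state machine with idempotency keys, so a retried request never creates a duplicate export (the `ExportStore` name and in-memory dict are illustrative stand-ins for a real database table):

```python
# Linear state machine: each state may only advance to the next one.
STATES = ["received", "verified", "building", "ready", "delivered", "deleted"]
ALLOWED = {s: n for s, n in zip(STATES, STATES[1:])}

class ExportStore:
    def __init__(self):
        self._by_key = {}  # idempotency_key -> current state

    def request(self, idempotency_key: str) -> str:
        # Re-submitting the same key returns the existing request's state
        # instead of creating a second export.
        return self._by_key.setdefault(idempotency_key, "received")

    def advance(self, idempotency_key: str, to_state: str) -> str:
        cur = self._by_key[idempotency_key]
        if ALLOWED.get(cur) != to_state:
            raise ValueError(f"illegal transition {cur} -> {to_state}")
        self._by_key[idempotency_key] = to_state
        return to_state

store = ExportStore()
store.request("req-123")            # creates the request: received
store.request("req-123")            # idempotent: still one request
store.advance("req-123", "verified")
```

Enforcing strictly linear transitions means a stalled or replayed job can never skip verification or jump straight to delivery.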
Export design
Create per-domain extractors with consistent field naming and null handling.
Stream large tables to JSONL to avoid memory spikes. Paginate by primary key.
Represent joined records by reference only. Do not inline other users’ PII.
Include attachments from object storage and rewrite URLs to relative file paths.
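A sketch of a streaming JSONL extractor with keyset pagination on the primary key, so memory stays flat regardless of table size. `fetch_page` is a stand-in for a real database query (e.g. `WHERE id > :after ORDER BY id LIMIT :n`):

```python
import io
import json

def fetch_page(rows, after_id, limit):
    # Stand-in for a keyset-paginated query against the database.
    return [r for r in rows if r["id"] > after_id][:limit]

def stream_jsonl(rows, out, page_size=2):
    after = 0
    while True:
        page = fetch_page(rows, after, page_size)
        if not page:
            break
        for r in page:
            out.write(json.dumps(r, ensure_ascii=False) + "\n")
        after = page[-1]["id"]  # resume after the last key seen

rows = [{"id": i, "event": f"e{i}"} for i in range(1, 6)]
buf = io.StringIO()
stream_jsonl(rows, buf)
```

Keyset pagination avoids the deep-OFFSET slowdown of page-number pagination and gives a natural resume point if a build job is interrupted.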
Delivery and security
Package as a ZIP and encrypt the archive with AES-GCM, or provide an expiring, single-use download link to object storage.
Require authenticated session plus one-time token. Expire links within 7 days. Auto-delete export artifacts after expiry.
Sign the manifest and store checksums for later verification.
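One way to sketch a tamper-evident manifest: per-file SHA-256 checksums plus an HMAC signature over the canonical manifest JSON. The hard-coded `SIGNING_KEY` is a placeholder; in production the key would live in a KMS, or you would use asymmetric signing:

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"example-key-do-not-use"  # placeholder; use a KMS-held key

def build_manifest(files: dict) -> dict:
    # Per-file SHA-256 checksums for later verification.
    entries = {name: hashlib.sha256(data).hexdigest()
               for name, data in files.items()}
    # Sign the canonical (sorted-keys) JSON so any change is detectable.
    body = json.dumps(entries, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return {"files": entries, "signature": sig}

def verify_manifest(manifest: dict) -> bool:
    body = json.dumps(manifest["files"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, manifest["signature"])

m = build_manifest({"profile.jsonl": b'{"id": 1}\n'})
```

Storing the manifest alongside the delivery record lets you prove later exactly which files, at which checksums, were sent.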
Redaction and third-party references
Replace other users’ identifiers with consistent pseudonyms per export.
Truncate or mask risky fields by default, with a clear explanation in the dictionary.
Omit server logs and debug traces unless they are already privacy-safe.
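Consistent per-export pseudonyms can be built from a keyed hash: the same third-party ID always maps to the same pseudonym within one export, while a fresh per-export salt prevents linking pseudonyms across exports. A minimal sketch (class name illustrative):

```python
import hashlib
import hmac
import secrets

class Pseudonymizer:
    def __init__(self):
        # Fresh random salt per export: pseudonyms are stable within
        # this export but cannot be correlated across exports.
        self._salt = secrets.token_bytes(16)

    def pseudonym(self, other_user_id: str) -> str:
        digest = hmac.new(self._salt, other_user_id.encode(),
                          hashlib.sha256).hexdigest()
        return f"user_{digest[:12]}"

p = Pseudonymizer()
```

Using HMAC rather than a plain hash means an attacker cannot reverse the pseudonyms by brute-forcing known user IDs without the salt.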
Vendors and processors
Pull subject data you store in vendor systems via their APIs where contracts allow.
Include vendor source labels and timestamps. Keep the raw vendor receipts in the audit trail.
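A sketch of the vendor-pull shape, assuming a hypothetical `fetch` callable standing in for a real vendor API client: each pulled record gets a source label and pull timestamp, and the raw response is kept as an audit receipt:

```python
import json
from datetime import datetime, timezone

def pull_from_vendor(vendor_name, fetch, subject_id, audit_log):
    raw = fetch(subject_id)                        # raw vendor response
    pulled_at = datetime.now(timezone.utc).isoformat()
    # Keep the unmodified vendor receipt in the audit trail.
    audit_log.append({"vendor": vendor_name, "pulled_at": pulled_at,
                      "receipt": raw})
    records = json.loads(raw)
    for r in records:
        r["source"] = vendor_name                  # vendor source label
        r["pulled_at"] = pulled_at
    return records

audit = []
fake_fetch = lambda sid: json.dumps([{"email": "x@example.com"}])
recs = pull_from_vendor("mailer-vendor", fake_fetch, "dsid-1", audit)
```

Separating the labeled records (which go in the export) from the raw receipts (which stay in the audit trail) keeps the export clean while preserving proof of what each vendor returned.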
Retention and cleanup
Keep export artifacts only as long as needed to deliver, commonly 7–14 days.
Record delivery events and delete the package. Never retain a permanent copy.
Monitoring and SLAs
Track median time to build and deliver, failure rates by dataset, and size distributions.
Alert on stalled exports, token reuse attempts, or checksum mismatches.
Data dictionary snippet
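One plausible shape for a data dictionary entry; the dataset, field names, and masking policy here are illustrative, not a required schema:

```json
{
  "dataset": "login_events",
  "format": "jsonl",
  "fields": {
    "record_id": {
      "type": "string",
      "description": "Stable, opaque event ID."
    },
    "occurred_at": {
      "type": "string",
      "description": "ISO-8601 UTC timestamp, e.g. 2024-05-01T12:30:00Z."
    },
    "ip_address": {
      "type": "string",
      "description": "Masked by default; see redaction notes."
    }
  }
}
```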
Quick portability checklist
Stable data subject ID (DSID) lookup and strong re-verification
Structured JSONL/CSV plus media, with schemas and a dictionary
Streaming exporters with pagination and memory safety
Redaction of other users’ data and risky fields
Tamper-evident manifest with per-file checksums
Secure delivery with short-lived, single-use links or encrypted ZIPs
Vendor pulls with receipts and clear source labels
Auto-deletion of export artifacts and full audit logging
Conclusion
A clear, secure export pipeline gives people control of their data and gives you proof of compliance. With consistent schemas, safe redaction, streaming builders, and verifiable delivery, you make data portability reliable for users and low risk for engineering and compliance.