External Data
Not all data belongs inside Salesforce. A CTA must know when to bring data onto the platform, when to leave it external, and how to bridge the two. This decision affects storage costs, performance, integration complexity, and user experience.
The Fundamental Question
Before designing any data architecture, ask: Should this data live in Salesforce?
graph TD
A[Data Source Identified] --> B{Do Salesforce users<br/>need this data?}
B -->|No| C[Keep external.<br/>No integration needed.]
B -->|Yes| D{How do they<br/>need it?}
D -->|Read-only reference| E{Data volume?}
D -->|Read-write,<br/>interactive| F{Is Salesforce the<br/>system of record?}
E -->|Small < 100K| G[Replicate via ETL<br/>into custom objects]
E -->|Large > 100K| H{Real-time needed?}
H -->|Yes| I[Salesforce Connect<br/>External Objects]
H -->|No| J[Nightly ETL sync<br/>or Data Cloud]
F -->|Yes| K[Store in Salesforce<br/>Standard/Custom objects]
F -->|No| L{Latency tolerance?}
L -->|Some latency OK| I
L -->|Must be instant| K
style C fill:#868e96,color:#fff
style K fill:#51cf66,color:#fff
style I fill:#4c6ef5,color:#fff
style G fill:#ffd43b,color:#333
style J fill:#ffd43b,color:#333
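The decision tree above can be sketched as a small function. This is only an illustration of the branching order: the `DataSource` fields and the returned labels are hypothetical names chosen for this example, and the 100K threshold is the rule of thumb from the diagram, not a platform limit.

```python
from dataclasses import dataclass

# Rule-of-thumb threshold from the diagram; not a Salesforce-enforced limit.
SMALL_VOLUME_LIMIT = 100_000


@dataclass
class DataSource:
    """Hypothetical description of a candidate data source."""
    needed_by_users: bool
    read_write: bool = False
    salesforce_is_system_of_record: bool = False
    must_be_instant: bool = False
    record_count: int = 0
    real_time_needed: bool = False


def recommend_placement(src: DataSource) -> str:
    """Walk the decision tree top to bottom and return the suggested placement."""
    if not src.needed_by_users:
        return "keep external"
    if src.read_write:
        # Interactive data: store it locally if Salesforce masters it
        # or if users cannot tolerate any callout latency.
        if src.salesforce_is_system_of_record or src.must_be_instant:
            return "store in Salesforce"
        return "Salesforce Connect"
    # Read-only reference data: volume first, then freshness.
    if src.record_count < SMALL_VOLUME_LIMIT:
        return "replicate via ETL"
    if src.real_time_needed:
        return "Salesforce Connect"
    return "nightly ETL or Data Cloud"
```

For example, a read-only source with 5 million rows and no real-time requirement lands on the nightly sync branch.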
Salesforce Connect
Salesforce Connect enables real-time access to external data through External Objects, without copying data into Salesforce.
How It Works
sequenceDiagram
participant User
participant Salesforce
participant Adapter
participant ExternalSystem
User->>Salesforce: View related list / run report
Salesforce->>Adapter: OData query
Adapter->>ExternalSystem: Translate to native query
ExternalSystem->>Adapter: Return results
Adapter->>Salesforce: OData response
Salesforce->>User: Display external data
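To make the "OData query" step more concrete, the sketch below builds a URL using standard OData system query options (`$filter`, `$select`, `$top`, `$skip`). The host, entity set, and field names are made up for illustration, and the exact requests Salesforce issues depend on the adapter and its configuration, so treat this as an approximation of the shape of a request rather than a capture of real traffic.

```python
from typing import List, Optional
from urllib.parse import quote, urlencode


def build_odata_url(
    service_root: str,
    entity_set: str,
    *,
    filter_expr: Optional[str] = None,
    select: Optional[List[str]] = None,
    top: Optional[int] = None,
    skip: Optional[int] = None,
) -> str:
    """Assemble an OData query URL from standard system query options."""
    options = {}
    if filter_expr:
        options["$filter"] = filter_expr
    if select:
        options["$select"] = ",".join(select)
    if top is not None:
        options["$top"] = top
    if skip is not None:
        options["$skip"] = skip
    # Percent-encode spaces but keep "$", "," and "'" readable.
    query = urlencode(options, quote_via=quote, safe="$,'")
    url = service_root.rstrip("/") + "/" + entity_set
    return url + ("?" + query if query else "")


# Hypothetical endpoint and fields, for illustration only.
print(build_odata_url(
    "https://erp.example.com/odata/",
    "Orders",
    filter_expr="CustomerId eq 'C-1001'",
    select=["OrderId", "Total"],
    top=50,
    skip=100,
))
```

Each such request is a network round trip, which is why related lists backed by external objects render more slowly than local data.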
Adapters
| Adapter | Connects To | Best For |
|---|---|---|
| OData 2.0 | Any OData 2.0 endpoint | SAP, Microsoft Dynamics, custom APIs |
| OData 4.0 | Any OData 4.0 endpoint | Modern REST APIs with OData support |
| Cross-org | Another Salesforce org | Multi-org architectures |
| Custom (Apex) | Any system via Apex | Systems without OData support |
Adapter Architecture Comparison
Each adapter type has a different data flow path. Understanding these paths is critical for latency analysis and troubleshooting.
graph TD
subgraph ODataFlow["OData Adapter Flow"]
U1[User Action] --> SF1[Salesforce Platform]
SF1 --> OA[OData Adapter<br/>translates to OData request]
OA --> MW[OData Endpoint<br/>middleware or direct]
MW --> EXT1[External Database<br/>SAP, Dynamics, etc.]
end
subgraph CrossOrgFlow["Cross-Org Adapter Flow"]
U2[User Action] --> SF2[Salesforce Org A]
SF2 --> XO[Cross-Org Adapter<br/>uses REST API]
XO --> SF3[Salesforce Org B<br/>direct API call]
end
subgraph CustomFlow["Custom Apex Adapter Flow"]
U3[User Action] --> SF4[Salesforce Platform]
SF4 --> CA[Custom Apex Class<br/>DataSource.Connection]
CA --> ANY[Any External System<br/>REST, SOAP, GraphQL, etc.]
end
style OA fill:#4c6ef5,color:#fff
style XO fill:#51cf66,color:#fff
style CA fill:#ffd43b,color:#333
Latency differences
OData adds translation overhead (Salesforce-to-OData-to-native query). Cross-org is faster for Salesforce-to-Salesforce because it uses the native REST API with no translation layer. Custom adapters have the most flexibility but require Apex development and testing for each external system.
Cross-Org Adapter
The cross-org adapter is particularly important for CTA scenarios involving multi-org strategies:
- Connects two Salesforce orgs without middleware
- Uses standard Salesforce APIs under the hood
- Supports SOQL-like queries across orgs
- Subject to API limits on both orgs
- Useful for franchise models, acquisitions, or multi-cloud architectures
Salesforce Connect Limits
| Limit | Value |
|---|---|
| External objects per org | 100 |
| Rows returned per query | 2,000 (page-based) |
| Named credentials | 50 per org |
| Callout time limit | 120 seconds |
| Monthly callout limit | Based on license type |
User experience impact
External objects add latency to every page load and related list render. Users accustomed to sub-second Salesforce response times will notice 1-3 second delays for external data. Set expectations and consider caching strategies.
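One possible caching strategy is a short time-to-live (TTL) cache in front of the external system, for example in the middleware that serves the OData endpoint. The class below is a minimal sketch of that idea; `TTLCache`, its parameters, and the 60-second default are assumptions for illustration, and a real design would also need an invalidation story for data that changes at the source.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache fetched values for a fixed number of seconds."""

    def __init__(
        self,
        fetch: Callable[[Hashable], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # function that performs the slow external call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: skip the external call
        value = self._fetch(key)     # expired or missing: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

The trade-off is explicit: a longer TTL saves more callouts but gives up some of the real-time freshness that motivated virtualization in the first place.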
External Objects Deep Dive
External objects (__x suffix) represent data stored outside Salesforce.
Capabilities and Limitations
| Feature | Supported? | Notes |
|---|---|---|
| Related lists | Yes | Appear on parent records |
| List views | Yes | With filter limitations |
| SOQL | Partial | Subset of operators, no aggregate queries |
| SOSL (search) | Limited | Only when search is enabled on the external data source and the external object; the external system evaluates the search |
| Triggers | Limited | After-insert only, asynchronous |
| Flows | Limited | Some actions supported |
| Reports | Limited | Can be included in custom report types |
| Validation rules | No | Validation must happen in external system |
| Workflow rules | No | Use triggers or flows instead |
| Approval processes | No | Not supported |
Relationship Types for External Objects
| Relationship | Description |
|---|---|
| External lookup | Child standard, custom, or external object looks up to a parent external object (by the parent's External ID) |
| Lookup | External object looks up to standard/custom object (by 18-char Salesforce ID) |
| Indirect lookup | External object looks up to standard/custom object (by unique external ID field) |
Indirect lookups are key
Indirect lookups let you relate external objects to Salesforce objects without the external system knowing Salesforce IDs. The external system uses its own ID, and Salesforce matches it against a unique external ID field on the parent object. This is the recommended approach for most scenarios.
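The matching that an indirect lookup performs can be pictured as a join on the external key. The sketch below imitates it over plain dictionaries; the field names (`ERP_Customer_Number__c`, `customerNumber`) and record IDs are hypothetical, and this is a model of the concept, not of how the platform implements it.

```python
from typing import Any, Dict, List


def resolve_indirect_lookup(
    external_rows: List[Dict[str, Any]],
    parents: List[Dict[str, Any]],
    external_key: str,
    parent_key: str,
) -> List[Dict[str, Any]]:
    """Attach a parent Salesforce Id to each external row by matching keys."""
    index: Dict[Any, str] = {}
    for parent in parents:
        value = parent[parent_key]
        if value in index:
            # The parent field must be unique, otherwise the match is ambiguous.
            raise ValueError(f"duplicate external ID on parent: {value!r}")
        index[value] = parent["Id"]
    # Rows with no matching parent get ParentId = None.
    return [{**row, "ParentId": index.get(row[external_key])} for row in external_rows]


accounts = [
    {"Id": "001000000000001AAA", "ERP_Customer_Number__c": "C-1001"},
    {"Id": "001000000000002AAA", "ERP_Customer_Number__c": "C-1002"},
]
orders = [
    {"orderId": "SO-1", "customerNumber": "C-1002"},
    {"orderId": "SO-2", "customerNumber": "C-9999"},  # no parent in Salesforce
]
linked = resolve_indirect_lookup(orders, accounts, "customerNumber", "ERP_Customer_Number__c")
```

Note that the external system only ever supplies its own customer number; the Salesforce ID appears solely on the Salesforce side of the match.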
Big Objects
Big Objects (__b suffix) store massive volumes of data on the Salesforce platform itself. They complement external storage by keeping data within the platform’s trust boundary.
When to Use Big Objects
| Scenario | Why Big Objects |
|---|---|
| Audit trail archival | Store field history beyond 18-month limit |
| Historical transactions | Transaction logs, payment history, event logs |
| IoT telemetry | Sensor data, device events |
| Regulatory compliance | Long-term record retention on-platform |
| LDV archival | Move aged records from standard objects |
Big Object Constraints
| Constraint | Detail |
|---|---|
| Query | Standard SOQL on indexed fields (Async SOQL retired as of Summer '25) |
| DML | Database.insertImmediate() — no standard insert/update |
| Index | Defined at creation, immutable after |
| Triggers | Not supported |
| Reports | Not supported directly (query results into custom objects) |
| Relationships | Can have lookups but cannot be child in master-detail |
| Storage | Counts toward Big Object storage, not data storage |
Big Objects vs Custom Objects
Choosing between Big Objects and regular custom objects is a critical architectural decision. The diagram below highlights the key differentiators.
graph TD
A[Need to store<br/>structured data] --> B{Expected record<br/>volume?}
B -->|"< 10M records"| C{Need triggers,<br/>reporting, SOQL?}
B -->|"10M - 1B records"| D{Need real-time<br/>query access?}
B -->|"> 1B records"| E[Big Objects<br/>or External Storage]
C -->|Yes| F[Custom Object]
C -->|No| G{Audit / compliance<br/>archival use case?}
G -->|Yes| E
G -->|No| F
D -->|Yes| H{Can you use<br/>indexed fields only?}
D -->|No - batch OK| E
H -->|Yes| E
H -->|No - complex queries| I[Keep in Custom Object<br/>+ optimize with LDV<br/>strategies]
style F fill:#51cf66,color:#fff
style E fill:#4c6ef5,color:#fff
style I fill:#ffd43b,color:#333
| Capability | Custom Object | Big Object |
|---|---|---|
| Record scale | Millions (with LDV tuning) | Billions |
| SOQL | Full SOQL support | Standard SOQL on indexed fields (Async SOQL retired as of Summer '25) |
| DML | Standard insert/update/delete | Database.insertImmediate() only |
| Triggers | Full support | Not supported |
| Reporting | Full support | Not supported (query into custom objects) |
| Index changes | Configurable anytime | Immutable after creation |
| Storage type | Counts toward data storage | Separate Big Object storage |
| Use case | Operational data | Archival, audit, telemetry, historical |
Big Object Index Design
Big Object indexes are defined at creation and cannot be changed. This makes upfront design critical.
- Index fields define query capability (you can only query by index fields)
- First index field is the most significant (leftmost in the composite key)
- Index determines sort order of results
- Maximum 5 fields in the index
Immutable indexes
If you get the Big Object index wrong, you must delete the Big Object and recreate it. All data is lost. Design the index based on how you will query the data, not how you will insert it.
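Because the index behaves like a composite key, a query is only usable if its filters follow the index from the first field onward. The checker below sketches that rule as it is commonly described for Big Object SOQL: filters start at the first index field, follow index order without gaps, use `=` on every field except the last, and may use a range operator or `IN` only on the last filtered field. The function and field names are hypothetical, it assumes one condition per field, and the exact rules should be confirmed against the Big Objects Implementation Guide.

```python
from typing import List, Sequence, Tuple

# Operators assumed to be allowed on the last filtered field.
LAST_FIELD_OPS = {"=", "<", ">", "<=", ">=", "IN"}


def is_valid_index_filter(
    index_fields: Sequence[str],
    filters: List[Tuple[str, str]],
) -> bool:
    """Check (field, operator) pairs, in WHERE-clause order, against the index."""
    if not filters or len(filters) > len(index_fields):
        return False
    last = len(filters) - 1
    for position, (field, operator) in enumerate(filters):
        if field != index_fields[position]:
            return False  # skipped a field or used fields out of index order
        if position < last and operator != "=":
            return False  # only the last filtered field may use a range or IN
        if position == last and operator not in LAST_FIELD_OPS:
            return False
    return True


# Hypothetical index: account first, then event date, then event type.
index = ["Account__c", "Event_Date__c", "Event_Type__c"]
print(is_valid_index_filter(index, [("Account__c", "="), ("Event_Date__c", ">=")]))
print(is_valid_index_filter(index, [("Event_Date__c", ">=")]))
```

With this index, "all events for one account since a date" is answerable, but "all events since a date across accounts" is not, which is why the index should be designed around the queries rather than the insert order.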
Data Cloud (now Data 360)
Salesforce Data Cloud (rebranded as Data 360 in October 2025; formerly Salesforce CDP and, before that, Customer 360 Audiences) is the platform’s answer to unified data management at scale.
Data Cloud Capabilities
| Capability | Description |
|---|---|
| Data ingestion | Ingest from Salesforce CRM, external databases, cloud storage, streaming |
| Identity resolution | Unify customer profiles across sources |
| Segmentation | Create audiences based on unified data |
| Activation | Push segments to marketing channels, CRM, or external systems |
| Analytics | Query large datasets without platform limits |
| Calculated insights | Computed metrics available in Salesforce records |
Data Cloud Architecture Pipeline
Data Cloud processes data through a defined pipeline: ingest, unify, analyze, and activate. Each stage transforms raw data into actionable customer insights.
graph LR
subgraph Ingest["1. Ingest"]
S1[Salesforce CRM]
S2[Marketing Cloud]
S3[External DBs<br/>hundreds of connectors]
S4[Streaming APIs<br/>Web SDK, Mobile]
end
subgraph Unify["2. Unify"]
DM[Data Model<br/>Objects / DMOs]
IR[Identity Resolution<br/>Match + Reconcile]
UP[Unified Profile]
end
subgraph Analyze["3. Analyze"]
CI[Calculated Insights<br/>Aggregated metrics]
SEG[Segmentation<br/>Audience creation]
DG[Data Graphs<br/>Related DMOs]
end
subgraph Activate["4. Activate"]
CRM[CRM Actions<br/>Flows, Apex]
MKT[Marketing<br/>Journeys, Ads]
EXT[External Systems<br/>Data Actions]
end
Ingest --> Unify --> Analyze --> Activate
style Ingest fill:#4c6ef5,color:#fff
style Unify fill:#51cf66,color:#fff
style Analyze fill:#ffd43b,color:#333
style Activate fill:#ff6b6b,color:#fff
Key architectural details:
- Storage Native Change Events (SNCE) notify when data changes; Change Data Feed (CDF) identifies what changed — making the platform reactive rather than polling-based
- Identity resolution uses matching rules (exact and fuzzy) and reconciliation rules to merge duplicate profiles. Near-real-time pipelines target sub-five-minute turnaround
- Calculated insights define aggregated metrics (e.g., lifetime value, engagement score) that can be surfaced on CRM records
- Activation pushes segments and insights to any downstream channel — Marketing Cloud journeys, ad platforms, CRM flows, or external systems
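The match-then-reconcile idea behind identity resolution can be illustrated with a toy example. The sketch below is not how Data Cloud implements it: the record fields, the 0.85 similarity threshold, and the "most recent non-empty value wins" reconciliation rule are all assumptions chosen to keep the example small, and the fuzzy match uses Python's `difflib` as a stand-in for a real matching rule.

```python
from difflib import SequenceMatcher
from typing import Any, Dict, List

Record = Dict[str, Any]


def same_person(a: Record, b: Record, threshold: float = 0.85) -> bool:
    """Exact match on normalized email, else fuzzy match on name."""
    email_a = (a.get("email") or "").strip().lower()
    email_b = (b.get("email") or "").strip().lower()
    if email_a and email_a == email_b:
        return True
    name_a = (a.get("name") or "").strip().lower()
    name_b = (b.get("name") or "").strip().lower()
    if not name_a or not name_b:
        return False
    return SequenceMatcher(None, name_a, name_b).ratio() >= threshold


def reconcile(cluster: List[Record]) -> Record:
    """Merge a cluster: the most recently updated non-empty value wins."""
    profile: Record = {}
    for rec in sorted(cluster, key=lambda r: r["updated"]):
        for field, value in rec.items():
            if value not in ("", None):
                profile[field] = value
    return profile


def unify(records: List[Record], threshold: float = 0.85) -> List[Record]:
    """Greedily group matching records, then reconcile each group."""
    clusters: List[List[Record]] = []
    for rec in records:
        for cluster in clusters:
            if any(same_person(rec, member, threshold) for member in cluster):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return [reconcile(cluster) for cluster in clusters]
```

Name-only fuzzy matching would over-merge in real data (two different people called "John Smith"), which is why production matching rules typically combine several attributes.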
When Data Cloud vs Traditional Integration
| Factor | Data Cloud | Traditional ETL/Integration |
|---|---|---|
| Primary goal | Unified customer view, analytics | Transactional data sync |
| Data volume | Billions of records | Millions of records |
| Latency | Near real-time (minutes) | Real-time (seconds) to batch |
| Query model | SQL-like on data lake | SOQL on platform objects |
| Cost | Separate Data Cloud license | Integration tool license |
| Complexity | Data model mapping, identity rules | Field mapping, error handling |
Data Virtualization vs Replication
This is a core architectural decision that a CTA must articulate clearly.
Comparison
| Dimension | Virtualization (Salesforce Connect) | Replication (ETL/API Sync) |
|---|---|---|
| Data freshness | Real-time (live query) | Depends on sync frequency |
| Performance | Slower (external callout per query) | Faster (local data) |
| Storage cost | No Salesforce storage consumed | Consumes Salesforce data storage |
| Offline access | Not available | Available |
| SOQL support | Limited subset | Full SOQL |
| Reporting | Limited | Full reporting |
| Trigger/Flow support | Minimal | Full support |
| Complexity | Lower (no sync logic) | Higher (sync, conflict resolution) |
| Availability | Dependent on external system uptime | Independent of external system |
Decision Flowchart
graph TD
A[External data needed<br/>in Salesforce] --> B{Write access<br/>needed?}
B -->|Yes| C[Replicate into<br/>Salesforce objects]
B -->|No| D{Users need it in<br/>reports/dashboards?}
D -->|Yes| E{Volume > 100K<br/>records?}
D -->|No| F{Real-time freshness<br/>critical?}
E -->|Yes| G[Data Cloud or<br/>external reporting tool]
E -->|No| C
F -->|Yes| H[Salesforce Connect<br/>External Objects]
F -->|No| I{Sync frequency<br/>tolerance?}
I -->|Hourly OK| J[Scheduled ETL sync]
I -->|Daily OK| J
I -->|Must be live| H
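Read top to bottom, the flowchart reduces to four checks in a fixed order. The function below is a compact restatement under that reading; the parameter names and returned labels are hypothetical, and the final "sync frequency" question is folded into the real-time check because hourly and daily tolerances both lead to the same scheduled sync.

```python
def choose_access_pattern(
    *,
    write_access: bool,
    needed_in_reports: bool,
    record_count: int,
    real_time_critical: bool,
) -> str:
    """Return the flowchart's recommendation for one external dataset."""
    if write_access:
        return "replicate"
    if needed_in_reports:
        # Large reporting datasets go to Data Cloud or an external tool.
        if record_count > 100_000:
            return "Data Cloud or external reporting tool"
        return "replicate"
    if real_time_critical:
        return "Salesforce Connect"
    return "scheduled ETL sync"
```

Note the ordering: a write requirement overrides everything else, so freshness and volume only matter for read-only data.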
Hybrid Patterns
Most enterprise architectures use a combination of approaches:
Pattern 1: Core + Extended
- Core data (Accounts, Contacts, Opportunities) — replicated in Salesforce
- Extended data (transaction history, audit logs) — external via Connect or Big Objects
- Analytics data (clickstream, IoT) — Data Cloud or external data warehouse
Pattern 2: Warm + Cold Storage
- Warm data (recent, actively accessed) — Salesforce standard objects
- Cold data (aged, infrequently accessed) — Big Objects or external storage
- Archival — External data lake with Salesforce Connect for occasional access
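The warm/cold split usually comes down to an age cutoff. The sketch below partitions records by a last-accessed date; the `last_accessed` field and the 365-day window are assumptions for illustration, and in practice the cutoff comes from retention and access requirements.

```python
from datetime import date, timedelta
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_warm_cold(
    records: List[Record],
    today: date,
    warm_days: int = 365,
    date_field: str = "last_accessed",
) -> Tuple[List[Record], List[Record]]:
    """Partition records into warm (recent) and cold (archive candidates)."""
    cutoff = today - timedelta(days=warm_days)
    warm = [r for r in records if r[date_field] >= cutoff]
    cold = [r for r in records if r[date_field] < cutoff]
    return warm, cold


cases = [
    {"id": "C-1", "last_accessed": date(2025, 5, 1)},
    {"id": "C-2", "last_accessed": date(2022, 1, 15)},
]
warm, cold = split_warm_cold(cases, today=date(2025, 6, 1))
```

The cold list is what an archival job would copy to Big Objects or external storage before deleting the originals from the standard object.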
Pattern 3: System of Record + Reference
- System of record data — mastered and stored in Salesforce
- Reference data (product catalogs, pricing from ERP) — virtualized via Connect
- Enrichment data (firmographics, credit scores) — periodic batch sync
CTA Scenario Considerations
When evaluating external data in a CTA scenario, address these questions:
- Who is the system of record? — Where is each entity mastered?
- What is the access pattern? — Read-only reference or interactive read-write?
- What is the acceptable latency? — Sub-second or minutes OK?
- What is the data volume? — Hundreds, thousands, or millions of records?
- What reporting is needed? — Standard reports or just inline visibility?
- What is the availability requirement? — Can the solution tolerate external system downtime?
- What are the licensing implications? — Salesforce Connect licenses, Data Cloud licenses?
Cross-Domain Impact
- Integration — External data access is an integration pattern (Integration)
- Security — External data must respect org sharing model (Security)
- LDV — External storage is an LDV archival strategy (Large Data Volumes)
- Data Modeling — External objects have different relationship types (Data Modeling)
- System Architecture — Multi-org + external data affects org strategy (System Architecture)