
Best Practices & Anti-Patterns

Each best practice is paired with the anti-pattern it prevents, connecting the design principle to the failure mode it guards against.

Data Architecture Decision Flow

End-to-end flowchart from new data requirement through object type selection, relationship type choice, LDV threshold check, archival policy definition, and ERD documentation.
Figure 1. Every new data requirement should follow this complete path: object type selection, relationship type decision, LDV threshold check, and archival policy definition, all before the ERD is finalized. Skipping the LDV check during design is how objects reach millions of records without an indexing or archival strategy in place.

Data Modeling Best Practices

Best Practices

1. Start with standard objects, justify custom

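Always evaluate whether a standard object can serve the need before creating a custom object. Standard objects come with built-in platform behavior (for example, Lead conversion or Case assignment and escalation rules) that is costly to replicate on a custom object. Document why you rejected the standard option.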

2. Choose relationships deliberately

Never default to lookup because it is easier. Evaluate each relationship against the five criteria: child independence, roll-up needs, sharing inheritance, reparenting, and cascade delete. See Decision Guides for the full flowchart.
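The five criteria can be captured as a small decision helper. This is a minimal sketch, not the flowchart from the Decision Guides: the `RelationshipNeeds` structure, the function name, and the rule that conflicting answers return "review" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RelationshipNeeds:
    """Answers to the five criteria for one parent-child pair (hypothetical structure)."""
    child_exists_without_parent: bool
    needs_rollup_summary: bool
    inherits_parent_sharing: bool
    reparenting_required: bool
    cascade_delete_wanted: bool


def recommend_relationship(needs: RelationshipNeeds) -> str:
    """Return 'master-detail', 'lookup', or 'review' when the answers conflict."""
    favors_master_detail = (
        needs.needs_rollup_summary
        or needs.inherits_parent_sharing
        or needs.cascade_delete_wanted
    )
    # This sketch treats required reparenting as a reason to prefer lookup,
    # although master-detail can be configured to allow reparenting.
    favors_lookup = needs.child_exists_without_parent or needs.reparenting_required
    if favors_master_detail and favors_lookup:
        return "review"  # conflicting answers need a documented trade-off
    return "master-detail" if favors_master_detail else "lookup"
```

Recording the five answers per relationship, even in a spreadsheet, gives the review board the justification behind each choice rather than just the outcome.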

3. Design for sharing from the start

The relationship type determines the sharing model. If child records must inherit parent sharing, master-detail is required. Changing relationship types after data is populated is painful and sometimes impossible: converting a lookup to master-detail, for example, requires every existing child record to have the lookup field populated.

4. Use External IDs on every migrated object

External IDs enable upserts, prevent duplicates during re-runs, and provide traceability back to the source system. There is no reason not to create them.
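Why an External ID makes re-runs safe can be shown with an in-memory stand-in for the target object. This is an illustrative sketch only: the dictionary plays the role of the destination object, and `Legacy_Id__c` is a hypothetical external ID field name.

```python
def upsert(target: dict, rows: list, key: str = "Legacy_Id__c") -> dict:
    """Insert or update rows in `target`, matching on an external ID field.

    `target` stands in for the destination object, keyed by external ID.
    """
    counts = {"created": 0, "updated": 0}
    for row in rows:
        ext_id = row[key]
        if ext_id in target:
            target[ext_id].update(row)   # existing record: update in place
            counts["updated"] += 1
        else:
            target[ext_id] = dict(row)   # new record: create it
            counts["created"] += 1
    return counts


accounts = {}
batch = [
    {"Legacy_Id__c": "ERP-001", "Name": "Acme"},
    {"Legacy_Id__c": "ERP-002", "Name": "Globex"},
]
first_run = upsert(accounts, batch)
second_run = upsert(accounts, batch)  # re-running the same load creates nothing new
```

The second run updates the two existing records instead of creating duplicates, which is the property that makes trial migrations repeatable.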

5. Plan record types before data exists

Adding record types after hundreds of thousands of records exist requires backfilling the RecordTypeId on every record. Design record types during the data model phase, not during UAT.

6. Keep objects focused

Each object should represent one business concept. If an object has 400+ fields spanning 5 different business processes, it is a God Object. Split it.

7. Document your ERD

Maintain a current entity-relationship diagram. It is the single most referenced artifact in a CTA review board. Keep it updated as the model evolves.

Anti-Patterns

God Object

Stuffing unrelated data into Account (or any single object) because “it is related to the customer.” Result: 500+ fields, 12 record types, unmaintainable validation rules, deployment conflicts between teams, and page load times that push users to spreadsheets.

Lookup when Master-Detail needed

Using lookup because “we might need to reparent” when the business process never reparents. Result: no native roll-up summaries (so teams build fragile trigger-based alternatives), no sharing inheritance (so teams build manual sharing rules), and no cascade delete (so orphan records accumulate).

Over-engineering with junction objects

Creating junction objects for relationships that are actually one-to-many. Not every relationship is many-to-many. A junction object adds query complexity and an extra object to maintain.


LDV Best Practices

Best Practices

1. Monitor data growth proactively

Track record counts per object monthly. Set alerts at 500K, 1M, and 5M thresholds. By the time users complain about performance, optimization is already 6 months overdue.
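The threshold alerts can be driven by comparing two monthly snapshots. A minimal sketch, assuming the counts are collected elsewhere; the function name is hypothetical and the thresholds are the ones named above.

```python
THRESHOLDS = (500_000, 1_000_000, 5_000_000)


def crossed_thresholds(previous_count: int, current_count: int) -> list:
    """Return the thresholds passed between two monthly record-count snapshots."""
    return [t for t in THRESHOLDS if previous_count < t <= current_count]
```

Alerting only on a crossing, rather than on every month above a threshold, keeps the signal from turning into noise.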

2. Design queries for selectivity

Every query on an LDV object should be selective. Use indexed fields in WHERE clauses. Check the Query Plan tool during development, not after deployment.

3. Request custom indexes early

Fields marked as External ID or Unique are indexed automatically, but other custom indexes require a Salesforce Support case and take time to provision. Identify candidates during design and submit requests before the data reaches LDV thresholds.

4. Address data skew before it hurts

Identify potential skew during data modeling: Will one Account have 100K child records? Will a single queue own 500K Cases? Design mitigation before the skew causes record locking and sharing timeouts.
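Skew candidates can be found from an extract of parent IDs, one per child record. This is a sketch under assumptions: the 10,000-children limit is the commonly cited guideline for a single parent or owner, and the function name is hypothetical.

```python
from collections import Counter

SKEW_LIMIT = 10_000  # assumed guideline: children per parent (or records per owner)


def find_skewed_parents(parent_ids, limit: int = SKEW_LIMIT) -> dict:
    """Return {parent_id: child_count} for parents above the limit."""
    counts = Counter(parent_ids)
    return {pid: n for pid, n in counts.items() if n > limit}
```

The same function works for ownership skew if it is fed owner IDs instead of parent IDs.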

5. Implement archival before reaching limits

Archival is a proactive strategy, not an emergency response. Define retention policies during design and put archival processes in place before objects hit LDV thresholds.

6. Use PK chunking for bulk extracts

When extracting large datasets via Bulk API, enable PK chunking to avoid timeouts. This splits the query into smaller chunks based on record ID ranges.
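The chunking idea can be illustrated with plain integers standing in for record IDs. This is a conceptual sketch, not the platform's implementation: in the Bulk API the feature is switched on with the `Sforce-Enable-PKChunking` request header, and the function below is a hypothetical helper that only shows how a key range is cut into bounded sub-ranges.

```python
def pk_chunks(first_id: int, last_id: int, chunk_size: int):
    """Yield inclusive (start, end) key ranges that cover first_id..last_id."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    start = first_id
    while start <= last_id:
        end = min(start + chunk_size - 1, last_id)
        yield (start, end)
        start = end + 1
```

Each range becomes its own bounded query on the primary key, so no single query has to scan the whole table.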

7. Right-size Batch Apex scope

The default scope of 200 records per execute call is not always right. Reduce the scope for complex processing with many DML operations. Increase it (up to 2,000 for QueryLocator-based jobs) for simple, read-heavy jobs.
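The effect of the scope on job size is simple arithmetic. A small sketch, with a hypothetical function name, that shows how many execute calls a job needs at a given scope:

```python
import math


def batch_executions(record_count: int, scope: int = 200) -> int:
    """Number of execute calls needed to process record_count at the given scope."""
    if not 1 <= scope <= 2000:
        raise ValueError("scope must be between 1 and 2,000")
    return math.ceil(record_count / scope)
```

For one million records, a scope of 200 means 5,000 executions and a scope of 2,000 means 500. Fewer executions reduce overhead, but each one does more work against the per-transaction limits.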

Anti-Patterns

Full table scans in production

Deploying list views, reports, or SOQL queries without indexed filters on objects with millions of records. The query works in the sandbox (10K records) and times out in production (5M records). Always test with production-representative data volumes.

Ignoring data skew

Allowing a single Account to accumulate 200K Contacts, or a single user to own 1M records, without mitigation. Result: Record locking, sharing recalculation timeouts, and degraded performance for the entire org, not just the skewed records.

No archival strategy

Keeping every record forever because “we might need it.” Data grows, performance degrades, storage costs increase, and users lose trust in the platform. Define retention policies and implement them.


Migration Best Practices

Migration phases, tool selection, load sequencing, trial migrations, cutover strategies, and anti-patterns are covered in Data Migration. The key principles:

  • Three trial runs minimum: first finds structural issues, second validates fixes, third proves repeatability
  • Profile source data before mapping: discover quality issues before cutover
  • Load parents before children: use External IDs for relationship resolution
  • Disable automations during load: re-enable with a checklist and assigned owners
  • Plan for rollback: document how to restore the system if migration fails
  • Validate with business stakeholders: technical checks alone are not enough
  • Freeze source systems during cutover: or implement a delta migration strategy
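The "parents before children" principle is a topological sort over object dependencies. A minimal sketch using the Python standard library (3.9+); the object list is only an example and a real migration would derive it from the data model.

```python
from graphlib import TopologicalSorter

# Each object maps to the set of parent objects that must be loaded first.
dependencies = {
    "Account": set(),
    "Contact": {"Account"},
    "Opportunity": {"Account"},
    "OpportunityContactRole": {"Opportunity", "Contact"},
}

# static_order() yields every object after all of its parents.
load_order = list(TopologicalSorter(dependencies).static_order())
```

A circular dependency raises `graphlib.CycleError`, which is a useful early warning that one of the relationships has to be loaded in a second pass.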

Top migration anti-patterns

Testing with data subsets instead of production-representative volumes. No rollback plan. Forgetting to re-enable automations after load. See Data Migration for the full list.


Data Quality Best Practices

Best Practices

1. Prevent duplicates at the point of entry

Configure matching rules and duplicate rules on key objects (Account, Contact, Lead). Alert users when creating potential duplicates. Gradually tighten from alert to block as matching rules prove accurate.

2. Define data entry standards

Required fields, picklist values, naming conventions, and format standards should be documented and enforced through validation rules, not just training.

3. Assign data stewards

Every business-critical object should have a named data steward responsible for monitoring quality, resolving issues, and approving changes to data standards.

4. Build quality dashboards

Create dashboards that show completeness rates, duplicate counts, stale record percentages, and orphan record counts. Review them monthly with data stewards.
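A completeness rate is the share of records with a usable value in a given field. A minimal sketch over a list of record dictionaries; the function name and the choice to treat empty strings as missing are assumptions.

```python
def completeness_rate(records: list, field: str) -> float:
    """Fraction of records whose `field` is neither missing, None, nor empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)
```

Tracking this number per field and per month shows whether a validation rule or training effort is actually moving data quality.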

5. Address quality at integration boundaries

Every integration point is a data quality risk. Inbound integrations should validate data before insert. Outbound integrations should handle dirty data gracefully.

6. Run regular deduplication scans

Run batch deduplication scans weekly or monthly, even with real-time duplicate prevention. Duplicates still slip through batch imports, API creates, and edge cases.
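A batch scan boils down to grouping records by a normalized match key. The sketch below uses a deliberately simplistic rule (lowercased, trimmed email) as an assumption; real matching rules are usually fuzzier, and the function names are hypothetical.

```python
from collections import defaultdict


def match_key(record: dict) -> str:
    """Normalize the email into a comparison key (an assumed, simplistic rule)."""
    return (record.get("Email") or "").strip().lower()


def find_duplicate_groups(records: list) -> list:
    """Return lists of record Ids that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record)
        if key:  # records without an email cannot be matched by this rule
            groups[key].append(record["Id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

The output is a review queue for data stewards, not an automatic merge list.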

7. Implement data lifecycle management

Define what happens to data as it ages. Active data is maintained. Stale data is reviewed. Aged data is archived. Expired data is deleted. See Data Quality & Governance.
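The four stages can be expressed as an age-based classification. A minimal sketch: the day boundaries below are placeholders chosen for illustration, and real values come from the retention policy agreed with legal and compliance.

```python
from datetime import date

# Assumed age boundaries in days; replace with the documented retention policy.
STAGE_LIMITS = [(365, "active"), (730, "stale"), (2555, "aged")]


def lifecycle_stage(last_activity: date, today: date) -> str:
    """Classify a record as active, stale, aged, or expired by its age in days."""
    age_days = (today - last_activity).days
    for limit, stage in STAGE_LIMITS:
        if age_days <= limit:
            return stage
    return "expired"
```

Keeping the boundaries in one table makes the policy auditable and lets it vary per object without changing the logic.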

Anti-Patterns

Clean up later

Loading dirty data with the plan to “clean it up after go-live.” Post-go-live teams are busy with support tickets, training, and enhancement requests. Data cleanup never gets prioritized. Clean before migration.

No data stewardship

No one is accountable for data quality. Everyone assumes someone else is handling it. Quality degrades over time as users find workarounds and integrations inject bad data.

Over-trusting source systems

Assuming source system data is accurate because “it has been in production for 10 years.” Legacy systems accumulate technical debt in data just like in code. Profile everything.


Governance Best Practices

Best Practices

1. Classify data by sensitivity

Not all data is equal. Classify data into tiers (Public, Internal, Confidential, Restricted) and apply appropriate controls (encryption, FLS, sharing) based on classification.
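The link between tier and controls can be kept as a simple lookup. The mapping below is an illustrative assumption, not a recommended policy; each organization decides which controls apply to which tier.

```python
# Hypothetical mapping from classification tier to required controls.
CONTROLS_BY_TIER = {
    "Public": set(),
    "Internal": {"sharing"},
    "Confidential": {"sharing", "field-level security"},
    "Restricted": {"sharing", "field-level security", "encryption"},
}


def controls_for(tier: str) -> set:
    """Return the controls required for a tier, rejecting unclassified data."""
    try:
        return CONTROLS_BY_TIER[tier]
    except KeyError:
        raise ValueError(f"Unclassified tier: {tier}") from None
```

Raising an error for an unknown tier forces every field to be classified before controls are chosen.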

2. Document retention policies

For every object, document how long records should be retained, where they should be archived, and when they should be deleted. Align with legal and compliance requirements.

3. Implement Field Audit Trail for compliance

Standard field history tracking keeps history accessible in the org for up to 18 months, and for up to 24 months through the API. If compliance requires longer retention, use Salesforce Shield Field Audit Trail (up to 10 years) or export to an external audit system.

4. Design for GDPR from day one

If any data subjects are in the EU (or subject to similar privacy laws), design the ability to find, export, rectify, and delete a data subject’s data from the start. Retrofitting GDPR compliance costs far more than building it in.

5. Separate data governance from development governance

Data governance (quality, retention, stewardship) is a business function. Development governance (CI/CD, deployment, testing) is a technical function. They need different processes, roles, and cadences.

6. Audit access patterns

Use Shield Event Monitoring to understand who accesses what data, how often, and through which channels. This informs both security design and data lifecycle decisions.

Anti-Patterns

Encrypt everything

Applying Shield Platform Encryption to every field “for security.” Depending on the encryption scheme, encrypted fields lose or restrict sorting, filtering, formula references, and other features. Encrypt based on data classification, not paranoia.

No data lifecycle

Keeping all data forever with no archival or deletion strategy. Storage costs grow, query performance degrades, and compliance risk increases (you cannot comply with “right to erasure” if you do not know where data lives).

Shadow IT data stores

Users maintaining spreadsheets, personal databases, or unauthorized cloud apps because the Salesforce data model does not meet their needs. The CTA solution must address these shadow systems proactively.


Checklist: Data Architecture Review

Use this checklist before presenting a data architecture to the CTA review board:

  • Every object justified: standard first, custom only when needed
  • Every relationship type chosen deliberately (not defaulted to lookup)
  • Sharing model implications of each relationship documented
  • External IDs on all migrated objects
  • LDV objects identified with growth projections
  • Index strategy defined for LDV objects
  • Data skew risks identified and mitigated
  • Archival strategy defined with retention policies
  • Migration sequence documented (parent before child)
  • Migration cutover strategy chosen with justification
  • Data quality controls designed (dedup, validation, stewardship)
  • Compliance requirements addressed (GDPR, data residency, encryption)
  • ERD current and complete

Sources

Personal study notes for the Salesforce CTA exam. Content compiled from VJ's study notes, official Salesforce documentation, community sources, and online publicly available content, then organized and presented with AI assistance. Not affiliated with Salesforce. © 2025–2026 VJ Srivastava.