Domain Grilling: D3 Data

Data architecture decisions cascade through security, integration, and performance. Judges test whether candidates understand the downstream consequences of modeling choices, can handle LDV scenarios, have realistic migration strategies, and can articulate Data 360 (formerly Data Cloud, rebranded Winter ‘26) architecture and data governance.

Type 1: Invalid - “Your Solution Won’t Work”

The judge believes your approach is technically incorrect or impossible.

Q1.1: Relationship type selection

Judge: “You proposed a Master-Detail relationship between Account and Financial Transactions. Compliance requires 7-year retention, but deleting the Account cascade-deletes all transactions. Your solution won’t work.”

What they’re testing: Understanding that Master-Detail cascade delete conflicts with data retention and compliance obligations.

Model answer: “That is a critical flaw. Deleting the master record in a Master-Detail relationship automatically deletes all detail records, violating the 7-year retention requirement. I will change the relationship to a Lookup, which removes the cascade delete: depending on the lookup’s delete setting, deleting the parent either clears the lookup value (orphaning the child records) or is blocked outright. Because we lose native roll-up summaries, I will implement a Flow or DLRS for the required calculations. For the retention guarantee, I will also remove Delete permission on Account from non-admin profiles and set the lookup to disallow deletion of a parent that still has Financial Transaction records, backed by a before-delete trigger on Account if finer conditions are needed. A validation rule cannot do this job because validation rules do not run on delete.”

Follow-up: “If the business also needs roll-up summaries of transaction amounts on Account, how do you replace that functionality after switching to Lookup?”

Q1.2: LDV query selectivity

Judge: “You recommended a Skinny Table for the 5-million-record Case object, but the main query users run filters for all Cases where Status is not ‘Closed’, returning about 2 million records. Your skinny table solution won’t fix the performance issue.”

What they’re testing: Whether you understand that the query optimizer bypasses skinny tables and indexes when the query is non-selective, and that NOT operators are generally non-selective.

Model answer: “Correct. The query optimizer relies on selectivity. Custom indexes and skinny tables only help if the filter targets fewer than 10% of the first million records and 5% of the records beyond that. The NOT operator is generally non-selective, and targeting 2 million of 5 million records (40%) guarantees a full table scan regardless of skinny tables. I will modify the query to use positive selective filters: a specific CreatedDate range combined with a selective Status value, such as WHERE Status = 'Open' AND CreatedDate = LAST_N_DAYS:90. This lets the optimizer choose a cheaper index path. The skinny table then helps by reducing I/O for the selective query.”
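The selectivity arithmetic in this answer can be checked with a short calculation. The following is a minimal sketch in Python, not platform code: the function names are hypothetical, and the 333,333-row ceiling is an addition based on the commonly documented custom-index limit rather than something stated in the answer.

```python
def custom_index_threshold(total_records: int) -> int:
    """Largest row count a filter may target and still be treated as selective."""
    first_million = min(total_records, 1_000_000)
    beyond = max(total_records - 1_000_000, 0)
    # 10% of the first million rows plus 5% of the rest (integer arithmetic).
    threshold = first_million * 10 // 100 + beyond * 5 // 100
    # Assumption: the documented ceiling of 333,333 rows for custom indexes.
    return min(threshold, 333_333)


def is_selective(targeted_rows: int, total_records: int) -> bool:
    """True when the filter targets fewer rows than the threshold."""
    return targeted_rows < custom_index_threshold(total_records)


total_cases = 5_000_000
print(custom_index_threshold(total_cases))   # 300000
print(is_selective(2_000_000, total_cases))  # False
```

On 5 million Cases the filter must target fewer than 300,000 rows, so a 2-million-row result misses the threshold by a wide margin.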

Follow-up: “What if the business insists on seeing all non-closed cases without a date filter?”

Q1.3: Big Object limitations

Judge: “You archived 50 million historical IoT records into a Big Object but also stated the operations team will use standard Salesforce dashboards to analyze this data. Your solution won’t work.”

What they’re testing: Whether you know that Big Objects do not support standard Salesforce reporting.

Model answer: “You are right. Standard Salesforce reporting and dashboards are not supported on Big Objects. To meet the analytics requirement, I will use Batch Apex to query and aggregate the necessary data from the Big Object, and write those aggregated results into a standard custom object. The operations team builds dashboards on that summary object. Note: Async SOQL was retired in Summer 2023, so Batch Apex and Bulk API are the current methods for querying Big Objects. For ad-hoc deep-dive analysis, I would route the data to CRM Analytics, which can handle larger datasets without governor limits. The trade-off must be explicit: Big Objects preserve on-platform retention but sacrifice direct reporting capability.”

Follow-up: “How often should the aggregation job run, and what happens if it fails?”

Q1.4: Shield encryption vs LDV

Judge: “For the 10 million Patient records, you applied a custom index on Medical_ID__c and also applied Shield Platform Encryption on the same field. Your LDV strategy won’t work.”

What they’re testing: Whether you understand that Shield encryption disables custom index filtering capability, breaking query selectivity strategies for LDV.

Model answer: “Correct. Applying Shield Platform Encryption with probabilistic encryption disables filtering on the encrypted field, so the query optimizer ignores the custom index and defaults to a full table scan on 10 million records. Two fixes: first, configure Deterministic Encryption for Medical_ID__c so it can still support exact match filtering and retain index usage. Second, if the field must use probabilistic encryption for stronger security, index a non-encrypted external ID proxy field instead while encrypting only the sensitive Medical_ID__c field. The key principle: never encrypt and index the same field with probabilistic encryption.”

Follow-up: “What other platform capabilities does deterministic encryption still limit compared to an unencrypted field?”

Q1.5: Data 360 zero-copy vs ingestion

Judge: “You recommended Data 360 with zero-copy integration to Snowflake for real-time customer segmentation. Walk me through the trade-offs around using that zero-copy data in your identity resolution ruleset.”

What they’re testing: Whether you understand the current capabilities of Data 360 zero-copy (Live Query) versus ingestion, including where each pattern fits for identity resolution, segmentation, and calculated insights, and where credit economics or latency might still push you toward ingestion.

Model answer: “Data 360’s zero-copy partner network provides bidirectional access to Snowflake via Apache Iceberg. With Data 360 Live Query (the current name for the zero-copy feature), federated tables can be mapped to Data Lake Objects and then Data Model Objects, and Live Query now supports unification and segmentation over federated data without requiring a full ingest. So the outdated framing of ‘zero-copy = analytics only, ingestion = identity’ is no longer accurate. The trade-offs I’d actually evaluate: first, latency — Live Query hits Snowflake at request time, so identity resolution run times depend on Snowflake warehouse sizing and the query plan, which can be slower than resolution over locally materialized DLOs. Second, credit economics — Live Query consumes Live Query-specific credits per federated read plus Snowflake compute, while ingestion trades a one-time data-ingested credit against cheaper local resolution. At 200M clickstream events, ingestion pricing is prohibitive; at a few million tightly-scoped identity fields, ingestion is often cheaper than the per-query cost of Live Query. Third, feature support — certain calculated insights, data actions, and activation flows still work better against ingested data, so I’d validate that the specific activations this scenario needs are supported over Live Query. My recommendation for this scenario: ingest the high-signal identity fields (name, email, phone, address) for low-latency resolution and reliable calculated insights, while keeping the large analytical datasets (transaction history, clickstream) on Live Query for segment enrichment. Dual-path, driven by latency and credit math, not a hard capability blocker.”

Follow-up: “If the warehouse team tells you Snowflake query latency on the clickstream table is 8-12 seconds, how does that change your decision between Live Query and ingestion for that dataset?”

Type 2: Missed - “You Haven’t Addressed…”

The judge is pointing out a gap in your design.

Q2.1: Data skew identification

Judge: “You modeled all 150,000 B2C Contact records under a single ‘Retail Customers’ Account. You have not addressed how this data distribution impacts system performance.”

What they’re testing: Whether you recognize Account Data Skew and its cascading impact on sharing and locking.

Model answer: “You are correct — I missed addressing Account Data Skew. Salesforce’s documented threshold is 10,000 child records per parent Account; beyond that, ownership changes, sharing rule modifications, and bulk updates on the parent trigger a sharing recalculation cascade and locking contention that grows linearly with child count. At 150,000 children the pain is severe: list view timeouts, locked records during bulk DML, slow sharing rule deployments, and noisy integrations. I will restructure the data model to eliminate the mega-account by either splitting Contacts into regional sub-accounts or enabling Person Accounts so each consumer has their own distinct Account record, avoiding the skew entirely.”
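As an illustration of how the 10,000-child threshold could be checked against an export of Contact records, here is a minimal Python sketch. The record IDs and helper names are hypothetical, and the bucket count is only the arithmetic minimum for a regional split, not a recommended design.

```python
from collections import Counter

SKEW_THRESHOLD = 10_000  # child records per parent, the threshold cited above


def find_skewed_parents(child_parent_ids, threshold=SKEW_THRESHOLD):
    """Return {parent_id: child_count} for parents above the threshold."""
    counts = Counter(child_parent_ids)
    return {pid: n for pid, n in counts.items() if n > threshold}


def buckets_needed(child_count, threshold=SKEW_THRESHOLD):
    """Minimum number of sub-accounts so that none holds more than the threshold."""
    return -(-child_count // threshold)  # ceiling division


# Hypothetical data: 150,000 Contacts under one Account, 200 under another.
parents = ["001RETAIL"] * 150_000 + ["001ACME"] * 200
print(find_skewed_parents(parents))  # {'001RETAIL': 150000}
print(buckets_needed(150_000))       # 15
```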

Follow-up: “What if Person Accounts are not an option because of AppExchange package incompatibility?”

Q2.2: Data quality strategy

Judge: “You are migrating 2 million records from three legacy ERPs and maintaining ongoing API integrations, but you have not addressed how you will prevent and manage duplicate records.”

What they’re testing: Whether you treat data quality as an architectural concern, not an afterthought.

Model answer: “I will implement a three-layered deduplication strategy. First, prior to migration, I will profile and cleanse data using the ETL tool to resolve duplicates before they reach Salesforce. Second, for real-time prevention, I will configure native Salesforce Matching and Duplicate Rules to alert or block at the point of entry. Third, because batch imports and integrations can bypass real-time rules (Bulk API does not evaluate duplicate rules by default), I will schedule regular batch deduplication scans. For the integration layer specifically, all upsert operations will use External ID fields with source-system prefixes to prevent cross-source collisions.”
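The source-prefixed External ID idea can be sketched as follows. This is an illustrative Python example with made-up source names and key format; the real key layout would depend on the three ERPs.

```python
def external_id(source_system: str, legacy_id: str) -> str:
    """Prefix the legacy key with its source so keys from different ERPs stay distinct."""
    return f"{source_system.upper()}-{legacy_id.strip()}"


def find_collisions(records):
    """Return raw legacy ids that appear in more than one source system.

    records: iterable of (source_system, legacy_id) pairs.
    """
    seen = {}
    for source, legacy_id in records:
        seen.setdefault(legacy_id.strip(), set()).add(source)
    return {key: sorted(sources) for key, sources in seen.items() if len(sources) > 1}


# Hypothetical extract: two ERPs reuse the same numeric key.
rows = [("erp_a", "1001"), ("erp_b", "1001"), ("erp_c", "2002")]
print(find_collisions(rows))                  # {'1001': ['erp_a', 'erp_b']}
print([external_id(s, k) for s, k in rows])   # ['ERP_A-1001', 'ERP_B-1001', 'ERP_C-2002']
```

Note that the prefix only prevents key collisions during upsert. Whether `ERP_A-1001` and `ERP_B-1001` describe the same customer is still a matching question for the deduplication layers.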

Follow-up: “How do you handle the scenario where the three legacy ERPs have overlapping customer records that are legitimate duplicates?”

Q2.3: Data classification and compliance

Judge: “The custom objects include fields capturing Social Security Numbers and patient health information, but you have not addressed how this data will be governed or secured differently than standard business data.”

What they’re testing: Whether you apply data classification to drive security architecture decisions.

Model answer: “I will classify SSNs and health information under the ‘Restricted’ data tier. Based on this classification: strict Field-Level Security so only authorized roles can view the data, Shield Platform Encryption to encrypt these fields at rest (deterministic for SSN if exact-match search is needed, probabilistic for maximum security otherwise), and Field Audit Trail to retain history for up to 10 years for compliance. I will also implement data masking in sandbox environments and a right-to-erasure capability based on field-level anonymization or deletion, because crypto-shredding destroys every record encrypted under the key and is not a per-individual tool. Classification should happen during data model design, not as a retrofit, because adding Shield encryption after data exists requires encrypting all existing records.”

Follow-up: “What impact does encrypting the SSN field have on any custom indexes you planned for LDV optimization on that object?”

Q2.4: Cross-system right-to-erasure

Judge: “Your solution processes EU customer personal data in Salesforce, including names, email addresses, and purchase history. You designed the data model and integration layer, but you have not addressed how you will handle a GDPR Article 17 right-to-erasure request that spans Salesforce, the data warehouse, and three integrated marketing systems.”

What they’re testing: Whether you architect for compliance as a first-class concern, with erasure cascading across all systems that hold personal data, not just the CRM.

Model answer: “I missed the cross-system erasure architecture. A right-to-erasure request must cascade across every system holding that individual’s personal data. In Salesforce, I will implement a two-layer approach. First, for active CRM records: an erasure Flow triggered by a Data Subject Request record that anonymizes or deletes personal fields on Contact, related Cases, Emails, and custom objects, respecting any legal retention exemptions. That field-level anonymization is the mechanism for single-record GDPR erasure on Salesforce data. Second, an important nuance on Shield crypto-shredding: Shield Platform Encryption tenant secrets are per-org, per-key-type (not per-record). Destroying an active tenant secret for a given key type renders all data encrypted with that key permanently unrecoverable, meaning every record in every object that used that key. Crypto-shredding is therefore an all-or-nothing operation at the key-scope level, so it’s a fit for end-of-life erasure (org decommission, product sunset, closing an entire customer segment) rather than per-individual GDPR requests. Supplying customer-managed key material through BYOK or Cache-Only Keys changes who controls the key, not its scope, so for targeted per-individual erasure on encrypted fields I rely on the Flow-based anonymization or deletion path. For the downstream cascade: the erasure event must propagate to the data warehouse and all three marketing systems. I will publish a Platform Event (DataSubjectErasure__e) that the integration middleware subscribes to and fans out as deletion requests to each system. Each system must acknowledge completion. I will also add a Data Subject Request tracking object that monitors erasure status across all systems, with a reconciliation dashboard showing completion per system. The GDPR one-month response deadline (tracked here as a 30-day SLA) drives the timing of this entire pipeline.”
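The acknowledgement tracking described in this answer can be sketched as a small state model. This is a hedged illustration in Python, not the Salesforce implementation: the system names, the `ErasureRequest` class, and its fields are all hypothetical stand-ins for the Data Subject Request tracking object.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical names standing in for Salesforce, the warehouse, and three marketing systems.
SYSTEMS = ("salesforce", "warehouse", "marketing_a", "marketing_b", "marketing_c")


@dataclass
class ErasureRequest:
    subject_id: str
    received: date
    sla_days: int = 30  # the 30-day window used in the answer above
    acks: dict = field(default_factory=dict)  # system -> date it confirmed erasure

    def acknowledge(self, system: str, when: date) -> None:
        if system not in SYSTEMS:
            raise ValueError(f"unknown system: {system}")
        self.acks[system] = when

    @property
    def due(self) -> date:
        return self.received + timedelta(days=self.sla_days)

    def pending(self):
        """Systems that have not yet confirmed erasure."""
        return [s for s in SYSTEMS if s not in self.acks]

    def status(self, today: date) -> str:
        if not self.pending():
            return "complete"
        return "overdue" if today > self.due else "in_progress"


req = ErasureRequest("DSR-0001", received=date(2025, 3, 1))
for system in ("salesforce", "warehouse", "marketing_a", "marketing_b"):
    req.acknowledge(system, date(2025, 3, 10))
print(req.pending())                  # ['marketing_c']
print(req.status(date(2025, 3, 20)))  # in_progress
print(req.status(date(2025, 4, 5)))   # overdue
```

A reconciliation dashboard would then report `pending()` per request, which is what surfaces a system that cannot confirm deletion before the deadline.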

Follow-up: “What happens if one of the three marketing systems cannot complete the erasure within the 30-day window because it archives data to cold storage with no deletion API?”

Q2.5: MDM and golden record strategy

Judge: “You have five integrated systems feeding customer data into Salesforce, but you have not addressed which system is the master for each data entity or how conflicts are resolved when two systems update the same customer record.”

What they’re testing: Whether you treat MDM as an architectural concern with explicit system-of-record designation and conflict resolution strategy, not an afterthought.

Model answer: “I missed the MDM strategy entirely. With five systems contributing customer data, I need to designate a system of record per data entity and field group. Account firmographic data (name, industry, revenue) is mastered in the ERP. Contact information (email, phone, title) is mastered in Salesforce because sales reps maintain it. Billing addresses are mastered in the finance system. For conflict resolution, I will implement a survivorship ruleset: when two systems update the same field, the designated master wins. In the integration layer, the middleware applies survivorship rules at the field level before writing to Salesforce, using a last-write-wins policy only for non-mastered fields. I will add Source_System__c and Last_Modified_By_Source__c fields on each entity so the integration layer can evaluate provenance. For ongoing governance, a Data Steward role reviews conflict logs weekly. The alternative approach is a dedicated MDM hub like Informatica MDM that mediates all updates, but that adds licensing cost and architectural complexity that must be justified by the conflict volume.”
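The survivorship rule in this answer (designated master wins, last write wins for non-mastered fields) can be sketched as below. This is an illustrative Python example: the field-to-master mapping mirrors the examples in the answer, while the function name, tuple layout, and the choice to drop non-master updates entirely are assumptions.

```python
from datetime import datetime

# Hypothetical field-to-master mapping mirroring the examples in the answer above.
FIELD_MASTER = {
    "Name": "ERP", "Industry": "ERP", "AnnualRevenue": "ERP",
    "Email": "Salesforce", "Phone": "Salesforce", "Title": "Salesforce",
    "BillingStreet": "Finance",
}


def resolve(field_name, updates):
    """Pick the surviving value for one field.

    updates: list of (source_system, value, modified_at) tuples.
    Mastered field: only the master's updates are eligible.
    Non-mastered field: last write wins across all sources.
    Returns None when no update is eligible.
    """
    master = FIELD_MASTER.get(field_name)
    if master is not None:
        updates = [u for u in updates if u[0] == master]
    if not updates:
        return None
    return max(updates, key=lambda u: u[2])[1]


conflict = [
    ("Salesforce", "Retail", datetime(2025, 1, 2)),
    ("ERP", "Wholesale", datetime(2025, 1, 1)),
]
print(resolve("Industry", conflict))     # Wholesale (ERP is master despite the older timestamp)
print(resolve("Description", conflict))  # Retail (non-mastered field, last write wins)
```

Returning `None` for a rejected update is also a natural place to write to the conflict log that the Data Steward reviews.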

Follow-up: “The sales team insists they should be able to override ERP-mastered Account data directly in Salesforce. How do you handle that without breaking the MDM model?”

Type 3: Suboptimal - “Have You Considered…?”

The judge suggests a better approach.

Q3.1: Data model optimization

Judge: “You designed a junction object to capture the multiple ‘Interests’ a Contact selects on their web form. Since users only need to see these values for basic filtering, wouldn’t a multi-select picklist be simpler?”

What they’re testing: Whether you can distinguish when a junction object is over-engineering versus when it is necessary.

Model answer: “Valid point. A junction object adds query complexity and an extra object to maintain, which is over-engineering when the relationship does not require tracking additional metadata like enrollment date or status. Because the business only needs simple, flat values for basic filtering, a multi-select picklist or a set of checkbox fields is the simpler and more maintainable solution. The junction object pattern is warranted when the intersection itself has attributes (start date, priority, status) or when reporting requires joining through the relationship.”

Follow-up: “What if the requirement changes next quarter to track when each interest was added and by whom?”

Q3.2: Replication vs virtualization

Judge: “You designed a nightly ETL process to replicate 5 million historical invoice records from the ERP into Salesforce custom objects. Since users only occasionally view these for reference, wouldn’t Salesforce Connect and External Objects be more optimal?”

What they’re testing: Whether you can identify when virtualization is more appropriate than replication.

Model answer: “That is a better approach. I originally proposed ETL to allow full SOQL and reporting capabilities. However, replicating 5 million read-only historical records consumes expensive Salesforce data storage. Because the data is large, read-only, and only occasionally referenced, virtualizing it using Salesforce Connect and External Objects is the optimal choice. This provides real-time access to the ERP data via live queries without copying it onto the platform, saving storage costs while still allowing the data to appear in related lists. The trade-off: External Objects have limited reporting support (custom report types only, not standard reports), do not support standard triggers (only CDC event triggers), and have limited Flow support.”

Follow-up: “What if the finance team then asks for a Salesforce report comparing current-quarter invoices with historical ones?”

Q3.3: Data 360 zero-copy vs ETL

Judge: “You have designed a full ETL pipeline using MuleSoft to ingest 200 million clickstream events into Data 360 for customer segmentation. Since this data already resides in your Snowflake data warehouse and Data 360 supports zero-copy with Snowflake, wouldn’t zero-copy be more optimal?”

What they’re testing: Whether you can evaluate when zero-copy is the better pattern versus full data ingestion, considering credit consumption, data freshness, and processing requirements.

Model answer: “That is a better approach for this scenario. Ingesting 200 million clickstream events consumes significant Data 360 credits for data ingestion and storage, and the ETL pipeline adds maintenance overhead and latency. Since the data already lives in Snowflake and Data 360’s zero-copy partner network (Live Query) supports bidirectional access via Apache Iceberg, I can query the clickstream data in place without copying it. This saves ingestion credits, eliminates ETL maintenance, and gives near real-time access to the latest Snowflake data. The trade-off at this scale is primarily latency and Live Query credit consumption per read: warehousing 200M events locally gives tighter resolution and calculated-insight performance, while Live Query spreads cost across every query execution. Live Query now supports segmentation and unification over federated data, so zero-copy is a first-class segmentation source, not analytics-only. For identity matching on high-signal fields (email, phone, hashed identifiers) I’d still consider a narrow ingest pattern if Live Query latency on the match job is too slow at scale. The credit savings are substantial: ingestion of 200M records could consume tens of thousands of credits versus per-query Live Query credits for zero-copy.”
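The credit trade-off in this answer comes down to a break-even count: how many query runs it takes before a one-time ingest costs less than paying for a federated read every time. The sketch below shows only that arithmetic. All credit figures are placeholders chosen for illustration, not Data 360 rates, and the function name is hypothetical.

```python
import math


def break_even_queries(ingest_credits: float,
                       credits_per_live_query: float,
                       credits_per_local_query: float = 0.0) -> int:
    """Number of query runs after which ingesting once is cheaper than federating each read."""
    saving_per_query = credits_per_live_query - credits_per_local_query
    if saving_per_query <= 0:
        raise ValueError("federated reads are not more expensive per run; ingestion never breaks even")
    return math.floor(ingest_credits / saving_per_query) + 1


# Placeholder figures only: 20,000 credits to ingest, 40 per federated read, 15 per local read.
print(break_even_queries(20_000, 40, 15))  # 801
```

With these placeholder numbers the two paths cost the same at 800 runs, so ingestion only wins from run 801 onward. The real inputs would come from the contract rate card and observed query volumes.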

Follow-up: “What if the marketing team needs sub-second latency on clickstream-based segments for real-time website personalization?”

Type 4: Rationale - “WHY Did You Choose…?”

The judge probes your reasoning. This is a gift question.

Q4.1: Relationship type rationale

Judge: “You modeled Account to Vehicle as a Lookup. Why not Master-Detail?”

What they’re testing: Whether you can articulate the specific business drivers that forced the Lookup choice.

Model answer: “I chose Lookup because the business requirements dictate that the child record needs its own independent owner and sharing model. Vehicles are occasionally sold or reassigned, requiring flexible reparenting, which is a native Lookup capability but restricted in Master-Detail. And we must prevent cascade deletes: if an Account is deleted, historical Vehicle records must remain in the system for warranty tracking, which Lookup guarantees by simply orphaning the record. The trade-off is losing native roll-up summaries, which I will replace with Flow-based aggregation.”

Follow-up: “What if the sales director says they need a real-time count of active vehicles per account on the Account record?”

Q4.2: Migration tool selection

Judge: “You selected MuleSoft for the migration instead of Data Loader. This is a straightforward CRM-to-Salesforce migration with 5 million records across 8 standard objects. Why not the free tool?”

What they’re testing: Whether you can justify a licensed middleware tool over free alternatives when complexity warrants it.

Model answer: “Data Loader could handle the raw volume through Bulk API 2.0. But three complexity factors push toward MuleSoft. First, transformation complexity: the source CRM uses a different data model with custom field types and concatenated address fields that do not map one-to-one to Salesforce. Data Loader has no transformation layer; I would need to write external scripts and manage handoffs between transformation and loading manually. MuleSoft handles transformation inline with DataWeave as a single pipeline. Second, orchestration: the 8 objects have dependency chains requiring precise sequencing. Data Loader requires manual execution of each load. MuleSoft orchestrates the full sequence, pausing between steps to verify counts and integrity. Third, error handling: Data Loader writes errors to a CSV; MuleSoft routes failed records to an error queue with automatic retry. For a 48-hour cutover window, I cannot afford to discover errors from CSV at hour 40. If the migration were a single object with clean data, Data Loader is the right tool.”
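The sequencing and count-verification steps in this answer can be illustrated with a dependency-ordered load plan. This is a minimal Python sketch using the standard library's `graphlib`; the eight objects and their dependencies are hypothetical examples, and the sketch stands in for orchestration logic rather than showing MuleSoft itself.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each object lists the objects that must load first.
DEPENDS_ON = {
    "User": set(),
    "Product2": set(),
    "Account": {"User"},
    "Contact": {"Account"},
    "Opportunity": {"Account"},
    "OpportunityContactRole": {"Opportunity", "Contact"},
    "Case": {"Account", "Contact"},
    "Task": {"Contact", "Opportunity", "Case"},
}


def load_order(depends_on):
    """Return a load sequence in which every parent precedes its children."""
    return list(TopologicalSorter(depends_on).static_order())


def verify_counts(expected, loaded):
    """Return the objects whose loaded row count differs from the source count."""
    return [obj for obj in expected if expected[obj] != loaded.get(obj, 0)]


print(load_order(DEPENDS_ON))
print(verify_counts({"Account": 10, "Contact": 5}, {"Account": 10, "Contact": 4}))  # ['Contact']
```

Running `verify_counts` between steps is the pause-and-check behaviour the answer describes: a mismatch stops the sequence before dependent objects are loaded against incomplete parents.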

Follow-up: “If the client does not have a MuleSoft license and cannot procure one before the deadline, how would you replicate this orchestration?”

Q4.3: Denormalization rationale

Judge: “You chose to denormalize the Account billing address directly onto Invoice__c instead of using a cross-object formula. Walk me through why.”

What they’re testing: Whether you can articulate when denormalization is a deliberate, justified design choice rather than laziness.

Model answer: “Three factors drove the denormalization. First, point-in-time accuracy: the invoice must reflect the billing address at the time of generation, not the current address. A cross-object formula recalculates on every read and always shows the current Account address. When the address changes post-invoicing, historical invoices would display the new address, which is legally incorrect for tax compliance. Second, performance at scale: Invoice__c is projected to reach 10 million records within 2 years, and every page load or report using a cross-object formula adds a join. With the address directly on Invoice, I eliminate the dependency. Third, the Invoice object may need to exist independently of Account in certain scenarios. The trade-off: I add a process to propagate address updates to Draft invoices while freezing the address on Sent or Paid invoices.”
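The freeze-on-send rule described in the trade-off can be sketched as follows. This is an illustrative Python model, not Flow or Apex: the `Invoice` class and its fields are hypothetical simplifications of Invoice__c.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    number: str
    status: str           # "Draft", "Sent", or "Paid"
    billing_address: str  # copied from the Account when the invoice is generated


def propagate_address(invoices, new_address):
    """Apply an Account address change to Draft invoices only; return the numbers updated."""
    updated = []
    for inv in invoices:
        if inv.status == "Draft":
            inv.billing_address = new_address
            updated.append(inv.number)
    return updated


invoices = [
    Invoice("INV-1", "Paid", "1 Old Street"),
    Invoice("INV-2", "Sent", "1 Old Street"),
    Invoice("INV-3", "Draft", "1 Old Street"),
]
print(propagate_address(invoices, "9 New Avenue"))  # ['INV-3']
print(invoices[0].billing_address)                  # 1 Old Street
```

The Paid and Sent invoices keep the address that was valid when they were issued, which is the point-in-time behaviour a cross-object formula cannot provide.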

Follow-up: “What if auditors require tracking exactly when and why the billing address on an Invoice was changed?”

Q4.4: Archival threshold rationale

Judge: “You chose a 90-day archival threshold for IoT Telemetry records instead of the standard 1-2 year retention period. Why so short?”

What they’re testing: Whether the archival threshold is driven by data volume analysis, not arbitrary convention.

Model answer: “The 90-day threshold is driven by volume math. IoT telemetry generates extreme volumes that would push the standard custom object into the High LDV danger zone of 10-100 million records within months, degrading query, list view, and report performance. Access frequency for this data drops to monthly or occasional after 90 days. Moving it to a Big Object keeps the operational database lean while maintaining on-platform access for historical analysis and compliance via Batch Apex or Bulk API (Async SOQL was retired in Summer 2023). The threshold is volume-driven, not arbitrary — if the ingestion rate were lower, I would extend it.”
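The "volume math" behind the threshold can be made explicit. The Python sketch below uses invented figures (5,000 devices reporting every 15 minutes, and a 50-million-row ceiling for the live object) purely to show the calculation; none of these numbers come from the scenario.

```python
def records_retained(devices: int, readings_per_device_per_day: int, retention_days: int) -> int:
    """Rows held in the live object for a given retention window."""
    return devices * readings_per_device_per_day * retention_days


def max_retention_days(devices: int, readings_per_device_per_day: int, record_ceiling: int) -> int:
    """Longest retention window that keeps the live object under record_ceiling rows."""
    return record_ceiling // (devices * readings_per_device_per_day)


# Hypothetical fleet: 5,000 devices, one reading every 15 minutes (96 per day).
print(records_retained(5_000, 96, 90))             # 43200000
print(records_retained(5_000, 96, 365))            # 175200000
print(max_retention_days(5_000, 96, 50_000_000))   # 104
```

Under these assumed numbers a one-year window would hold about 175 million rows, while 90 days stays near 43 million. A lower ingestion rate would push `max_retention_days` up, which is the basis for extending the threshold.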

Follow-up: “What if the operations team needs real-time dashboards showing 6-month trend lines?”

Q4.5: Person Accounts architecture decision

Judge: “You recommended enabling Person Accounts for this mixed B2B and B2C organization. This is an irreversible change. Walk me through why you made that call.”

What they’re testing: Whether the candidate has evaluated the full risk profile of an irreversible platform decision, including AppExchange compatibility, sharing model implications, and downstream technical debt.

Model answer: “Three factors drove the recommendation despite the irreversibility. First, the B2C volume: 500,000 individual consumers represent 80% of the customer base. Without Person Accounts, each consumer requires both an Account and Contact record, doubling data volume, complicating deduplication, and creating an unnatural data model where artificial ‘Household’ or ‘Individual’ Accounts clutter the Account object. Person Accounts collapse this into a single record per consumer. Second, native platform support: Person Accounts work with standard Salesforce features including campaigns, leads (with blank Company field during conversion), and most Flows. Third, the B2B segment (20%) continues using standard Business Accounts unaffected. I evaluated the risks: once enabled, Person Accounts cannot be disabled. Contact sharing becomes Controlled by Parent, which means B2C contacts always inherit Account sharing. I audited the three AppExchange packages in the org and confirmed Person Account compatibility. The rejected alternative was a custom Individual__c object, which avoids irreversibility but loses native campaign membership, lead conversion, and standard UI support. Given 80% B2C volume, the native platform path is worth the irreversibility.”

Follow-up: “Six months after go-live, the business acquires a B2B company and needs to install an AppExchange CPQ package that does not support Person Accounts. What now?”

Type 5: Cascading - “If You Change X, What Happens to Y?”

The judge tests cross-domain awareness.

Q5.1: Master-Detail to Lookup cascade

Judge: “You originally designed the Order and Order_Line_Item relationship as Master-Detail. Now the business wants to convert it to Lookup so line items can be reparented between Orders. Walk me through every downstream system that breaks.”

What they’re testing: Whether you can trace the full cascade of consequences from a relationship type change across sharing, roll-ups, integrations, and reports.

Model answer: “Converting Master-Detail to Lookup triggers a cascade across at least six areas. First, sharing model: Order_Line_Item currently inherits sharing from Order. After conversion, every line item gets its own OwnerId and needs its own sharing configuration. If Order is Private, line items need new OWD and sharing rules. At 2 million line items, the recalculation could take hours. Second, roll-up summaries: every roll-up on Order (Total_Amount__c as SUM, Line_Count__c as COUNT) breaks immediately. I need Flow-based or DLRS replacements before converting. Third, cascade delete: deleting an Order no longer deletes line items. Orphaned line items accumulate without cleanup automation. Fourth, integrations: every integration creating Orders must now explicitly set OwnerId on each line item. Fifth, reports: the Order with Line Items custom report type may need recreation for the new relationship type. Sixth, deployment: relationship type conversions are data operations, not metadata — they must run directly in production with a maintenance window.”

Follow-up: “If the sharing recalculation takes much longer than expected during the maintenance window, what is your rollback plan?”

Q5.2: Archival impact on analytics

Judge: “You archived Cases older than 12 months to a Big Object. The analytics team relies on 3 years of Case data for trend analysis, SLA compliance, and customer health scoring. What just happened to their analytics?”

What they’re testing: Whether you trace the downstream impact of archival on analytics and reporting stakeholders.

Model answer: “I broke their analytics. The Case object now only contains 12 months of data, but analytics needs 36 months. Their trend dashboards show a cliff at the 12-month boundary. SLA compliance reports compute against a partial dataset and produce incorrect results. The customer health score drops for every account because 2 years of history disappeared. Fix: create a monthly aggregation batch job that runs before archival, summarizing key metrics per Account per month into a Case_Analytics_Summary__c object. The analytics team pivots to the summary object for historical data. For the health score, switch the data source to a combination of the summary object for historical months and the live Case object for the current period. For ad-hoc deep-dive, route to CRM Analytics which can ingest both sources. The trade-off is maintaining three data paths, but archival was necessary for operational performance.”
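The pre-archival aggregation step can be sketched as a group-by over Account and month. This is a hedged Python illustration of the logic a batch job would implement: the input field names and the two metrics are assumptions, and the output rows stand in for Case_Analytics_Summary__c records.

```python
from collections import defaultdict
from datetime import date


def summarize_cases(cases):
    """Aggregate cases into one row per (account_id, 'YYYY-MM').

    cases: iterable of dicts with account_id, closed_on (date), and sla_met (bool).
    """
    summary = defaultdict(lambda: {"case_count": 0, "sla_met_count": 0})
    for case in cases:
        key = (case["account_id"], case["closed_on"].strftime("%Y-%m"))
        summary[key]["case_count"] += 1
        summary[key]["sla_met_count"] += int(case["sla_met"])
    return dict(summary)


# Hypothetical sample of cases about to be archived.
sample = [
    {"account_id": "001A", "closed_on": date(2024, 1, 5), "sla_met": True},
    {"account_id": "001A", "closed_on": date(2024, 1, 20), "sla_met": False},
    {"account_id": "001A", "closed_on": date(2024, 2, 2), "sla_met": True},
    {"account_id": "001B", "closed_on": date(2024, 1, 9), "sla_met": True},
]
result = summarize_cases(sample)
print(result[("001A", "2024-01")])  # {'case_count': 2, 'sla_met_count': 1}
```

Storing counts rather than precomputed percentages lets the health score recombine historical months with the live Case data without averaging averages.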

Follow-up: “What if the customer health score is consumed by an external system via API that was querying Case records directly?”

Q5.3: OWD change cascade

Judge: “The security team changes Account OWD from Public Read/Write to Private. You have 8 million Account records. Walk me through every downstream impact.”

What they’re testing: Whether you can trace the full cascade of an OWD change on sharing, performance, integrations, and user experience.

Model answer: “This cascades across every system that touches Account data. First, sharing recalculation: the platform recalculates the share table for 8 million Accounts plus every child object inheriting sharing. This could take 6-24 hours. During recalculation, users see inconsistent access. Second, share table growth: with Private OWD, the platform creates explicit share rows for every sharing rule, role hierarchy grant, and manual share, so the table grows from near-zero to potentially millions of rows. Third, report visibility: every report referencing Account now returns only records the running user can see. Executive dashboards break overnight. Fourth, integration access: any integration user querying Accounts now gets only records shared with that user. If the integration user lacks View All Data, syncs break. Fifth, child objects: custom master-detail children of Account and Contacts set to Controlled by Parent inherit the new Private access, while Opportunities and Cases keep their own OWD but depend on Account access through implicit sharing and the related-object access levels on Account sharing rules, so every child object’s sharing must be reviewed. Sixth, query performance: with Private OWD, the Salesforce query optimizer enforces sharing at plan time for non-admin users by joining against the share table and the user’s group membership. For queries that already have a selective WHERE clause, the optimizer can still produce an efficient plan. For queries without selective filters (broad list views, large report result sets, integration full scans), the added sharing filter makes the plan less selective and may switch from an index path to a table scan, degrading latency noticeably as volume grows. System-context code (‘without sharing’, View All on the running user) bypasses the check entirely. Mitigation: execute during a maintenance window, pre-build all sharing rules before flipping OWD, add selective filters to list views and integration queries, and notify all integration teams.”

Follow-up: “The sharing recalculation has been running for 18 hours. Users cannot see their Accounts. Do you let it finish or roll back?”
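
The integration-access point in the model answer can be checked before the OWD flip. The following is a minimal sketch in Python, assuming you have already exported each integration user's permissions into a simple structure. The user names and the `IntegrationUser` type are hypothetical, and the check covers only the two bypass permissions named in the answer, not access gained through sharing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationUser:
    name: str
    view_all_data: bool      # system permission "View All Data"
    view_all_accounts: bool  # object-level "View All" on Account


def at_risk_after_private_owd(users):
    """Return names of users whose Account queries become sharing-filtered.

    A user holding either bypass permission keeps full visibility; everyone
    else will see only Accounts shared with them once OWD is Private.
    """
    return [u.name for u in users if not (u.view_all_data or u.view_all_accounts)]


# Hypothetical inventory of integration users.
users = [
    IntegrationUser("erp_sync", view_all_data=True, view_all_accounts=False),
    IntegrationUser("billing_feed", view_all_data=False, view_all_accounts=True),
    IntegrationUser("partner_portal_sync", view_all_data=False, view_all_accounts=False),
]

print(at_risk_after_private_owd(users))
```

Any user this flags needs either a deliberate permission grant or a sharing rule before the change goes live.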

Q5.4: Encryption cascade on LDV

Judge: “You added Shield Platform Encryption to 5 fields on the Contact object (6 million records). Contact is the most-queried object, with list views, reports, SOQL in Apex, and 4 integrations. What just changed?”

What they’re testing: Whether you understand the broad impact of encryption on a high-volume, heavily-queried object.

Model answer: “Five dimensions of impact. First, query performance: with probabilistic encryption, the fields cannot be used in SOQL filters, ORDER BY, or GROUP BY at all; with deterministic encryption, exact-match filters still work, but LIKE, range comparisons, and sorting do not. Every list view, report, or SOQL WHERE clause referencing these 5 fields must be audited. Second, formula fields: any formula referencing an encrypted field may break. Cross-object formulas spanning Account to encrypted Contact fields need evaluation. Third, existing data: turning on encryption protects only new and updated values, so all 6 million existing records must be encrypted through a background encryption job. Until that job completes, the fields contain a mix of encrypted and unencrypted data, which can produce inconsistent query results. Fourth, integrations: the 4 integrations still receive plaintext values through the API, because Shield encrypts at rest and decrypts for users with field access, but any SOQL they run that filters or sorts on encrypted fields must be refactored. Fifth, sandboxes: Developer and Partial Copy sandboxes may not have the encryption configuration, causing environment inconsistencies. Add encryption setup to the sandbox post-refresh checklist.”

Follow-up: “The sales team’s primary list view filters Contacts by Last_Name__c, one of the encrypted fields. That list view is their main work surface. What do you do?”
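
The audit step in the model answer (find every query that filters or sorts on an encrypted field) can be roughed out as a text scan. This is a hedged sketch in Python, not a SOQL parser: it uses regular expressions, ignores subqueries and string literals that happen to contain keywords, and the field names in `ENCRYPTED_FIELDS` are hypothetical apart from `Last_Name__c`, which comes from the follow-up.

```python
import re

# Hypothetical list of the encrypted Contact fields.
ENCRYPTED_FIELDS = {"Last_Name__c", "SSN__c"}


def audit_query(soql, encrypted_fields=ENCRYPTED_FIELDS):
    """Return sorted (field, usage) pairs where an encrypted field is
    filtered, LIKE-matched, or sorted on. Naive: top-level clauses only."""
    flags = re.I | re.S
    where = re.search(
        r"\bWHERE\b(.*?)(?:\bORDER\s+BY\b|\bGROUP\s+BY\b|\bLIMIT\b|$)", soql, flags
    )
    order = re.search(r"\bORDER\s+BY\b(.*?)(?:\bLIMIT\b|$)", soql, flags)

    findings = set()
    for field in encrypted_fields:
        name = r"\b" + re.escape(field) + r"\b"
        if where:
            clause = where.group(1)
            if re.search(name + r"\s+LIKE\b", clause, re.I):
                findings.add((field, "LIKE filter"))
            elif re.search(name, clause, re.I):
                findings.add((field, "filter"))
        if order and re.search(name, order.group(1), re.I):
            findings.add((field, "ORDER BY"))
    return sorted(findings)


list_view_query = (
    "SELECT Id FROM Contact WHERE Last_Name__c LIKE 'Sm%' "
    "ORDER BY Last_Name__c LIMIT 50"
)
print(audit_query(list_view_query))
```

Selecting an encrypted field is not flagged, only filtering or sorting on it, which matches the distinction the model answer draws for integrations.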

Q5.5: Data 360 identity resolution cascade

Judge: “Your Data 360 implementation has been running identity resolution for 3 months. The business discovers a 15% false merge rate — different individuals incorrectly merged into one profile. Walk me through the full impact.”

What they’re testing: Whether you understand the downstream consequences of flawed identity resolution across segments, activations, insights, and CRM data.

Model answer: “A 15% false merge rate is severe. The cascade: unified profiles contain blended data from two different individuals. Every segment built on unified attributes may include or exclude wrong individuals. A high-value-customer segment based on lifetime value is inflated because merged profiles combine two people’s spending. All activations pushed to Marketing Cloud targeted partially wrong audiences for 3 months. Calculated insights (health scores, LTV) are incorrect for 15% of profiles. Most dangerously, if Data 360 enriched CRM records via data actions, production Salesforce data now contains wrong values. Remediation: immediately pause all activations and data actions. Audit the identity resolution ruleset — tighten fuzzy match thresholds and add blocking key dimensions. Re-run identity resolution with corrected rules. For corrupted CRM records, run a remediation batch to re-compute correct values from newly resolved profiles. Timeline: 2-4 weeks for full remediation.”

Follow-up: “How would you design an ongoing monitoring system to detect false merge rates before they reach 15%?”
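
One possible starting point for the audit and monitoring discussed above is a conflict check on attributes that should never differ within one person. The sketch below is in Python over plain dictionaries and is only an illustration: the profile IDs, the `birth_date` key, and the 2% alert threshold are all assumptions, and it does not use any Data 360 API. A conflict is a signal for manual review, not proof of a false merge.

```python
def suspected_false_merges(profiles, stable_keys=("birth_date",)):
    """Return IDs of unified profiles whose source records disagree on an
    attribute that should be identical for a single individual."""
    flagged = []
    for profile_id, sources in profiles.items():
        if len(sources) < 2:
            continue  # a single-source profile cannot be a merge
        for key in stable_keys:
            values = {s[key] for s in sources if s.get(key)}
            if len(values) > 1:
                flagged.append(profile_id)
                break
    return flagged


def merge_conflict_rate(profiles, stable_keys=("birth_date",)):
    """Share of merged profiles (two or more sources) that show a conflict."""
    merged = [p for p, s in profiles.items() if len(s) >= 2]
    if not merged:
        return 0.0
    return len(suspected_false_merges(profiles, stable_keys)) / len(merged)


# Hypothetical unified profiles mapped to their contributing source records.
profiles = {
    "UP-1": [{"birth_date": "1980-01-02"}, {"birth_date": "1980-01-02"}],
    "UP-2": [{"birth_date": "1975-05-05"}, {"birth_date": "1991-11-30"}],
    "UP-3": [{"birth_date": "1969-07-20"}],
    "UP-4": [{"birth_date": "1988-03-03"}, {"birth_date": None}],
}

ALERT_THRESHOLD = 0.02  # assumed tolerance; set from business risk appetite
rate = merge_conflict_rate(profiles)
print(suspected_false_merges(profiles), rate > ALERT_THRESHOLD)
```

Running a check like this after each identity resolution run, and pausing activations when the rate crosses the threshold, would surface a drift long before it reaches 15%.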

Q5.6: Storage limit cascade

Judge: “The org is at 95% data storage. You have proposed a project management module generating 2 million records in year one. Walk me through every consequence of hitting the storage limit.”

What they’re testing: Whether you connect storage pressure to operations, integrations, and architecture decisions.

Model answer: “When storage is exceeded, every record insert fails org-wide. Sales cannot create Opportunities. Service cannot log Cases. Integrations return STORAGE_LIMIT_EXCEEDED errors. Platform Events that create records fail silently. Email-to-Case stops. The entire org halts. My data strategy changes in four ways. First, immediate: storage audit to export and delete aged data beyond retention. Second, structural: redesign the module to archive completed items to Big Objects after 90 days, and keep attachments in external storage surfaced through Files Connect, noting that this relieves file storage rather than the data storage limit in question. Third, licensing: escalate discussion about Unlimited Edition (120 MB of data storage per user versus 20 MB per user on Enterprise) or additional storage packs. Fourth, architecture: evaluate whether the module belongs in Salesforce at all. If users do not need CRM context, a purpose-built tool with bidirectional integration might be more appropriate than pushing 2M records into a constrained org.”

Follow-up: “The CEO wants everything in Salesforce. How do you reduce the 4 GB storage footprint to fit within the existing allocation?”
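
The 4 GB figure follows from Salesforce counting most records as 2 KB of data storage. A short worked calculation in Python makes the sizing explicit. The 500-license user count is a hypothetical input, and the 10 GB base allocation and 1 GB = 1024 MB conversion are assumptions to verify against the current storage allocation documentation.

```python
KB_PER_RECORD = 2  # most records count as 2 KB of data storage


def projected_storage_gb(record_count, kb_per_record=KB_PER_RECORD):
    """Data storage consumed by record_count records, in GB (1 GB = 1024 MB)."""
    return record_count * kb_per_record / (1024 * 1024)


def allocation_gb(user_licenses, mb_per_user, base_gb=10):
    """Org data storage allocation: base amount plus a per-user increment."""
    return base_gb + user_licenses * mb_per_user / 1024


module_gb = projected_storage_gb(2_000_000)

# Hypothetical org: 500 user licenses, currently 95% full.
enterprise_gb = allocation_gb(500, mb_per_user=20)
unlimited_gb = allocation_gb(500, mb_per_user=120)
headroom_gb = enterprise_gb * (1 - 0.95)

print(f"module needs   {module_gb:.2f} GB")
print(f"headroom today {headroom_gb:.2f} GB of {enterprise_gb:.2f} GB")
print(f"Unlimited allocation would be {unlimited_gb:.2f} GB")
```

Under these assumptions the module needs several times the remaining headroom, which is why the answer combines archiving, licensing, and architecture options rather than relying on cleanup alone.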

Q5.7: Data model change deployment cascade

Judge: “You need to deploy a data model change that converts a text field to a picklist, adds 3 new lookup relationships, and renames a custom object from Project__c to Engagement__c. The org has 15 active integrations. Walk me through the deployment cascade.”

What they’re testing: Whether you understand that data model changes ripple through metadata, integration contracts, and operations.

Model answer: “Three changes with different risk profiles. The object rename is highest risk. Changing only the label is safe, but changing the API name is not: a Metadata API deployment of Engagement__c creates a new object rather than renaming Project__c, and editing the API name in Setup breaks every integration and hard-coded reference that still uses Project__c. If the API name must change, I create Engagement__c, migrate data, update all references, and deprecate Project__c. Every integration referencing Project__c breaks when the old object is removed, so all 15 need an audit and a coordinated cutover. For the 3 new lookups: deploy as optional first, update integrations to populate them, then make required in a subsequent release. For text-to-picklist: inventory all distinct text values, ensure each maps to a picklist value, add a catch-all for outliers, then convert. Post-conversion, a batch job validates all records. Integrations sending freeform text must send valid picklist values. Deploy in phases with rollback triggers at each gate.”

Follow-up: “The integration team can only update 3 of the 15 integrations before go-live. The other 12 belong to external partners with their own release cycles.”
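
The text-to-picklist inventory described in the model answer can be prototyped offline on an export of the existing field values. This is a minimal sketch in Python. The picklist values, the synonym table, and the "Other" catch-all are hypothetical and would come from the business in practice.

```python
from collections import Counter

# Hypothetical target picklist and known freeform variants.
PICKLIST_VALUES = {"Planned", "In Progress", "Completed"}
SYNONYMS = {"in-progress": "In Progress", "wip": "In Progress", "done": "Completed"}
CATCH_ALL = "Other"


def map_to_picklist(raw):
    """Map one freeform text value to a picklist value, or the catch-all."""
    cleaned = " ".join(raw.split()).lower() if raw else ""
    for value in PICKLIST_VALUES:
        if cleaned == value.lower():
            return value
    return SYNONYMS.get(cleaned, CATCH_ALL)


def inventory(values):
    """Count how many records would land on each picklist value."""
    return Counter(map_to_picklist(v) for v in values)


exported = ["Done", "done ", "WIP", "  in   progress ", "???", None]
print(inventory(exported))
```

A large count under the catch-all is the signal to extend the synonym table before converting the field, and the same mapping can be handed to integration teams as the contract for valid values.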

Q5.8: Sandbox data strategy for LDV testing

Judge: “Your LDV optimization strategy was validated in a Partial Copy sandbox with 50,000 records. Production has 8 million records on the same object. The custom indexes and skinny tables you requested were approved. Walk me through every way this sandbox testing strategy gives you false confidence.”

What they’re testing: Whether you understand that sandbox data volume, distribution, and configuration gaps make LDV testing unreliable without a Full sandbox and production-representative data.

Model answer: “The testing gives false confidence across five dimensions. First, volume gap: Partial Copy sandboxes sample a subset of records using a template. 50,000 records is about 0.6% of production volume. The query optimizer behaves differently at 50K versus 8M: a filter that matches the same percentage of rows can be selective at 50K and non-selective at 8M, because the selectivity threshold is a smaller share of the table above one million records and is capped at an absolute row count. Second, data distribution: the sampled 50K records may not reflect production’s actual distribution. If production has severe data skew (one Account with 200K child records), the sample likely missed it entirely. Third, custom indexes and skinny tables: these must be requested from Salesforce Support and are applied per-environment, and skinny tables are copied automatically only to Full sandboxes. If Support applied them in production but not the sandbox (or vice versa), the test measures different infrastructure. Fourth, sharing model load: at 8M records with Private OWD, the share table in production could be millions of rows. The sandbox’s 50K-record share table adds negligible query overhead, masking the production-level sharing filter cost. Fifth, concurrent load: production has hundreds of concurrent users running list views, reports, and integrations. The sandbox test ran in isolation. Fix: request a Full Copy sandbox, load production-representative data volumes, apply the same indexes, and run load tests with simulated concurrent access before go-live.”

Follow-up: “The client cannot afford a Full Copy sandbox license. How do you validate LDV strategy without one?”
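
The volume-gap argument can be made concrete with the published selectivity thresholds: for a standard index, 30% of the first million records plus 15% of the remainder, capped at 1,000,000 rows; for a custom index, 10% of the first million plus 5% of the remainder, capped at 333,333 rows. The sketch below, in Python, applies those numbers to the sandbox and production volumes. The 8% match rate is a hypothetical filter, and the thresholds should be confirmed against current Salesforce documentation.

```python
def selectivity_threshold(total_records, custom_index=True):
    """Maximum matching rows for an indexed filter to count as selective."""
    if custom_index:
        first_pct, rest_pct, cap = 10, 5, 333_333
    else:
        first_pct, rest_pct, cap = 30, 15, 1_000_000
    first = min(total_records, 1_000_000)
    rest = max(total_records - 1_000_000, 0)
    # Integer arithmetic avoids floating-point rounding on the percentages.
    return min(first * first_pct // 100 + rest * rest_pct // 100, cap)


def is_selective(matching_records, total_records, custom_index=True):
    return matching_records < selectivity_threshold(total_records, custom_index)


MATCH_PCT = 8  # hypothetical filter matching 8% of rows in both environments

for label, total in (("sandbox", 50_000), ("production", 8_000_000)):
    matching = total * MATCH_PCT // 100
    print(
        label,
        "threshold:", selectivity_threshold(total),
        "matching:", matching,
        "selective:", is_selective(matching, total),
    )
```

The same filter passes in the sandbox and fails in production, which is the false confidence the judge is probing for.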

Always verify against official Salesforce documentation

Personal study notes for the Salesforce CTA exam. Content compiled from VJ's study notes, official Salesforce documentation, community sources, and online publicly available content, then organized and presented with AI assistance. Not affiliated with Salesforce. © 2025–2026 VJ Srivastava.