LaceySnr.com - Salesforce Development Posts by Matt Lacey

SeeAllData=true - Why I Think You Shouldn't Use It

Posted: 2012-04-05

Test coverage is something of a source of contention for developers on the force.com platform, and I'm not always a fan of — it seems to me that the more robust you make your code (i.e. with more try-catch blocks and sanity checks) the harder it is to get the coverage you require, but that's not something I plan to discuss today. Instead I want to focus on a change that came with the latest release of the API.

Spring '12 has been with us for a decent time now, but there are still instances where people are struggling to understand why new unit tests they write don't appear to find any data to work with, and unless you read the release notes, or one of the excellent summary blog posts around, there's a good chance you're not aware of a significant change that took place. If you're one of them, I'll let the documentation explain the change:

IsTest(SeeAllData=true) Annotation

For Apex code saved using SalesforceAPI version 24.0 and later, use the isTest(SeeAllData=true) annotation to grant test classes and individual test methods access to all data in the organization, including pre-existing data that the test didn’t create. Starting with Apex code saved using Salesforce API version 24.0, test methods don’t have access by default to pre-existing data in the organization.

It is not a complicated change, but it is one that was easy to miss with the rollout. Note that the documentation specifically states that for older code (i.e. classes written using API version 23.0 and older) the tests still work exactly as they used to and so will cause you no trouble. Should you update the API version for the class via its metadata, or you're writing a new test, then you will need to know what the difference entails.

For a long time, one of my biggest gripes with writing tests was that I had to modify my actual code to support the test cases, adding special scenarios for limiting the number of rows returned by queries, or for ensuring a specific record is included in a query. For example, if I had some code which queried for all contact records inserted within the last 30 days, and it took one branch if MyCustomBoolean__c == true and another otherwise, I would have to ensure that the query would find at least one contact with each value. Before now, test methods could access all existing data in an org, so if my sandbox contained 10 contacts with MyCustomBoolean__c set to true, and then (following best practice) I created two contacts myself in the test method:

Contact c1 = new Contact(LastName = 'Test 1', MyCustomBoolean__c = true);
Contact c2 = new Contact(LastName = 'Test 2', MyCustomBoolean__c = false);

insert new list<Contact>{c1, c2};

I could be sure that when calling a method with a code snippet like the one below, that both branch conditions would be catered for, i.e. block1 and block2 would get covered.

for(Contact c : [select Id, MyCustomBoolean__c from Contact where CreatedDate = LAST_N_DAYS:30 limit 100])
{
    if(c.MyCustomBoolean__c)
    {
        // block1: do a heap of calculations or something here
    }
    else
    {
        // block2: do something else here
    }
}

When I came to deploy this code to a production org however, I couldn't guarantee that there wouldn't already be 100+ contacts with MyCustomBoolean__c set to true and a further 100+ with it set to false. Depending on the order of the results returned, I may find that I actually get 100 contacts all with the same value, and bam: pop goes the test coverage with only one of the two branches covered. The impact of this type of problem only increases with the number of branches, making complex logic very difficult to test effectively and it's never long before headaches ensue.

To get around such difficulties, it is possible to add various switches and checks to ensure that data inserted by the test method gets used, but the better option is to embrace change, realising that it makes life a whole lot easier, and your code a whole lot cleaner. I imagine one of the reasons for the availability of SeeAllData=true is to save you pain when you need to update older classes to API 24.0 for some other feature or benefit, but this edge case aside, I implore you not to make use of it. Sure, it may be an easy quick-fix now, but you should strive to make sure your tests are independent of existing data; it will only save you blood, sweat and tears (and likely curse words also) later down the line.