During the last two months, I’ve been doing Selenium development, and I have somewhat mixed feelings about the Selenium platform.
Prefer CSS3 selectors
In theory, Selenium makes it fast and easy to create new tests. With a Firefox plugin available, practically anyone can take a page and record a test for it.
In practice, however, there is a catch. Most of the selectors that the Firefox plugin records use XPath to work their magic. While XPath is not a problem in itself, fewer developers are familiar with it than with, say, CSS. More aggravating to me is that Selenium’s XPath implementation is incomplete. I haven’t thoroughly checked what the differences are between Selenium’s XPath and the W3C specification, but there are enough that I will often write a valid XPath query, verified with another Firefox plugin called XPath Checker, that then fails in Selenium. The solution is often to rewrite the selector in CSS3, which seems to work better in most circumstances. What saves XPath is its great flexibility when it comes to, say, selecting a tag based on a value somewhere else on the page.
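To make that last point concrete, here is a minimal sketch of the kind of query where XPath earns its keep. It does not use Selenium at all: it runs the XPath against a made-up page fragment with Python’s standard `xml.etree.ElementTree`, which itself only supports a subset of XPath. The table, the names in it and the `role_of` helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, invented for illustration only.
PAGE = """
<table id="users">
  <tr><td>Alice</td><td>admin</td></tr>
  <tr><td>Bob</td><td>guest</td></tr>
</table>
"""


def role_of(name):
    """Return the text of the second cell in the row whose cell equals `name`."""
    root = ET.fromstring(PAGE)
    # A CSS3 selector such as "#users tr:nth-child(2) td:nth-child(2)" can
    # reach this cell by position, but it cannot say "the row that contains
    # Bob". XPath can express that dependency on another value directly.
    cell = root.find(f".//tr[td='{name}']/td[2]")
    return cell.text if cell is not None else None


print(role_of("Bob"))  # prints "guest"
```

The positional CSS3 selector breaks as soon as the rows are reordered, while the XPath version keeps following the value it depends on. That is the trade-off I keep running into.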
Get GIGO with it
Not all the problems we’ve been having with Selenium come from it.
I’ve spent a week or two putting together a meaningful, representative data set for Selenium to operate on. This is, of course, crucially important in order to get non-GIGO results.
The problem in our case is that this had not been done from the project’s inception. While migrations help a great deal with the structure and fixtures help with seeding some values, our system is complex enough that it is not realistic to simply rely on fixtures.
The first approach was to take an existing data dump from a live system. While this was quite workable, it was also hard to keep the test data, the Selenium tests and the application up to date and sufficiently separate from one another.
The second approach we took was to include fixtures on top of the dump. While this was a convenient way to quickly resolve issues with a particular test, we rapidly found out that it caused unit and functional tests that already relied on certain values to fail.
The third approach was to start from scratch and completely rebuild the database, in order to have a perfect blank, or master, from which to evolve. In practice, the system’s data was complex enough that when data was missing, it was often difficult and time-consuming to isolate the origin of the error and correct it.
The current approach is, I feel, the best one so far. It is simply to take the previous, start-from-scratch approach and automate it entirely, thereby introducing better traceability into the test data process and shortening the time required to fix initial data errors.
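As a rough sketch of what “automate it entirely” can look like, the script below rebuilds a database from a blank schema and runs a list of named seed steps in order, so that a failure points at the step that caused it. It is not our actual tooling: the schema, the seed data and every function name here are hypothetical, and it uses an in-memory SQLite database from Python’s standard library purely so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema, invented for illustration only.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL NOT NULL
);
"""


def seed_customers(db):
    db.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                   [(1, "Alice"), (2, "Bob")])


def seed_orders(db):
    db.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                   [(1, 19.99), (2, 5.00)])


# Ordered seed steps. Each one is a named function, so an error message can
# say which step the bad data came from.
STEPS = [seed_customers, seed_orders]


def rebuild(path=":memory:"):
    """Create the schema in a new database and run every seed step in order.

    `path` is assumed to name a database that does not exist yet.
    """
    db = sqlite3.connect(path)
    db.execute("PRAGMA foreign_keys = ON")  # reject rows with dangling references
    db.executescript(SCHEMA)
    for step in STEPS:
        try:
            step(db)
        except sqlite3.Error as exc:
            raise RuntimeError(f"seed step {step.__name__} failed: {exc}") from exc
        print(f"ok: {step.__name__}")
    db.commit()
    return db


if __name__ == "__main__":
    rebuild()
```

The point of the design is the traceability: when an order refers to a customer that was never created, the rebuild stops and names `seed_orders`, instead of a Selenium test failing much later for no obvious reason.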
The lessons I take from this are:
- Since you’ll be agile and testing, plan your data creation strategy from the beginning.
- Make sure that the data creation process is as automated as possible, so that when large changes to the application come about, it is possible to quickly make Selenium follow along.
- Involve domain experts. Often, creating a blank slate of the application can be daunting, especially in a large system that has evolved without one. Domain experts are your best bet for finding out what needs to be included in the creation of initial data.
- Maintain the strategy. While it would be nice to have tests on the data that the strategy creates, this is a bit of a chicken-and-egg problem. So, instead of testing the data creation strategy directly, re-examine it whenever one of its dependent tests fails.
You’ll probably want to implement your strategy using Rake or Raven, depending on your platform and preferences. Even using Rake on .NET and learning a new tool is better than relying on manual effort to create that valuable data.