Thursday, April 6, 2017

Regular Expression in both Python and Perl

Regular expressions

I would like to share some highlights of my past experience that connected to Regular Expressions. It is a summary or quick reference that will help you to use regular expressions in your duties.

Let me start with a brief explanation what we could use in Python.

Python programming language has the re API
Those functions from the API are used widely:
  • compile - this function compile expression and result could be used later in your program. So, you could improve performance of your program if this regular expression is used very often, just by reducing costs on compilation.
  • search - with this function you will be able to find first occurrence of the expression in your text.
  • match - this function tries to match the string to the regular expression starting from the beginning of the string. 
  • findall - this function returns all matches in the string.
Comparing to Python, Perl has no special API, or, let's call it - library, you could use regular expressions inside your program or script by using special syntax.


Basic syntax

What we should know about regex basics:
  • ^ - marker that notifies a regex engine to start search from the beginning of the string.
  • $ - marker that notifies a regex engine to end search with the end of the string.
  • . - dot in regular expressions means any kind of symbols except a newline.
  • | - this symbol works like logical OR operator, it allows us to use different alternate variants that could occurs in particular position.
  • ( and ) - round brackets are used as borders for operations that includes the OR symbol | or in grouping operations.


Character classes

Character classes are used to define set of characters, and only one of them should be in particular position to match regular expression. The square brackets [ and ] is used for this purpose. Regular expressions support definition of a range of symbols, using dash between first symbol and last symbol, as well as using ^ symbol to denote that any symbols, except those, that were defined in set, should be in this position to satisfy regular expression.


Quantifiers

On this step we already know how we can describe a symbol in certain position.
What will be in case if we would like to check this behavior across neighborhoods of the symbol?
For this purpose we can use quantifiers:
  • ? - means that particular symbol or expression (remember, we can use round brackets for grouping) is optional and is not required to be in checked string to satisfy to regular expression.
  • * - means that particular symbol or expression can occurs 0 or more times in a row.
  • + - means that particular symbol or expression can occurs 1 or more times in a row, so, it means that it definitely should be in this place to satisfy to regular expression.
  • { and } - curly brackets, it is last type of brackets that we didn't talk about till this time. This type of brackets is used to define an interval, that includes possible occurrence of particular symbol or expression to match to regular expression. Value can be a single number like this: {amount} (it means that particular expression should occurs exactly this number of times in a row) or it can be a range in format: {left, right}, where both borders are inclusive (in this case occurrence should lay in this range to match the regular expression).


Escaping

Definition of such set of special symbols (and it is only a basics!) raises a constraint or a question: "What we should do, if string, that should be processed, contains one or more of them?". For this case the escaping mechanism could be used. Several approaches can be used if we are talking about the Python programming language:
  • First one is related to using \ symbol, to inform regular expression processor that symbol followed by should not be recognized as a special one. But, it doesn't work within [ ] range.
  • Second one is related to using of Python's raw strings. This approach is related to putting the r symbols before the string to notify interpreter that it is a raw string. In this case any kind of combinations like r'\n' will be interpreted as two different symbols '\' and 'n', not like a special symbol '\n'.
  • Third one is related to usage of functionality of regular expression in Python, in details it is re.escape function.


Shorthands

In real life a lot of different types or classes of symbols, like english alphabet, digits, symbols for punctuation and so on. Several of such classes can be specified by using \ symbol that followed by a one special symbol or abbreviation. Those classes are:
  • Special characters (mostly used):
    • \t - symbol of tabulation.
    • \r - carriage return.
    • \n - new line.
  • Ranges of characters:
    • \s - space symbols or shortcut for range [ \t\r\n\f\v].
    • \S - all symbols except space symbols. This range can be described as [^ \t\r\n\f\v] or [^\s].
    • \w - alphabetic symbols that can be covered by following range [a-zA-Z0-9_].
    • \W - non-alphabetic symbols. The same analogue that can be noticed with \s and \S, range of symbols are [^a-zA-Z0-9_].
    • \d and \D it is a range that includes numbers from 0 to 9 inclusively and range which include all symbols, except numbers.
  • Special indicators:
    • \b - this symbol is used to match boundary of the word. The word is a set of characters from group \w, so, \b points to a border between \w and \W, as well as point between start of the string and \w and end of the string and \w.
    • \B - this symbol is opposite to \b 


Modes

Several modes can be applied to improve flexibility of regular expressions:
  • i - ignore case.
  • g - continue search after the moment when first occurrence found and returned.
  • m - multiline mode.
  • x - spaces and text after symbol # will be ignored, like a comments in regular expressions.
  • s - the . symbol will satisfy to any symbols, as well as symbol of new line.
In Python those modes are part of the re module and has two versions - short and long. For example mode to ignore case of letters can be used in following ways: re.I and re.IGNORECASE. This mode should be used as a second parameter of the compile function.

Backreference and Grouping

The grouping is useful when we would like to get more details from the match or decrease a size of regular expression. Such king of capturing calls as grouping. Round brackets ( ) are used for such purpose. Please note, that if round brackets are part of the expression, so, they should be escaped.
Index of the group is important if we what to get access to this group, make that, to information that matches to this group. Index is calculated very easily: if we consider the regular expression as a tree with full expression as a root, and groups as nodes and leaves, so the index of the group is the index of a node or a leave during the DFS (Depth First Search). Let's consider following example:

regular expression: (exp1 (exp2) exp3 (exp4)) exp5 (exp6 (exp7))
group 0: exp1 exp2 exp3 exp4 exp5 exp6 exp7
group 1: exp1 exp2 exp3 exp4
group 2: exp2
group 3: exp4
group 4: exp6 exp7
group 5: exp7

In some cases index of a group is useless to identification. Regular expressions has capability to add special labels or names to identify particular group. Syntax of group's definition is changed to the following (?P<name_or_label>expression). Please notice, that symbol P is highlighted, the reason why I did that is: this symbol is important when we are working with Python but can be omitted in Perl. Perl supports both cases: with symbol P and without it.

It is time to talk about the backreference. It is a useful capability to check that different part of the expression has the same value. It can be the check that the same year is used across the document or the same person is mentioned in different locations of a contract. To check that it is possible to mark one group that matches to particular text and use it to check that the same text in other places. Syntax of such functionality might be different and depends on language.
Python: (?P=name) for groups that has names; \index - for group that can be accessed via index
Perl: \k<name> for named groups; \gindex or \g{index} - for group that can be accessed via index; \g{-shift} or \g-shift - reference to group the occurred previously (shift steps before)

Additional and important case is related to situation when groups are used widely in regular expression and it is a good if groups that has useless information has no index, so, access via index will be only for groups with useful details or data. The following syntax of group's definition is used in this case (?:expression).


Search & Replace

The search (re.search) behavior is differ from match (re.match) behavior in Python (link), and the difference is that match start comparison from the beginning but search from a position in the string, just to be able to identify a match into the string. This behavior is default for Perl, so, it was the reason why I started all my expression previously with symbol ^.
The replace functionality is covered by function re.sub in Python and expression s/<expression_to_find>/<text_to_replace_with>/.


Lookaround

It can be required to check that sub-part of regular expression is located before or after particular position in checked string. For that purpose regular expression has feature of lookahead and lookbehind. If it is required to check that specific text occurs before the position we should use positive lookbehind feature: (?<=expression)another_expression. In this case to satisfy match, a text that satisfy to expression must occurs before a text that satisfy to another_expression. The same analogue of positive lookahead feature; it can be described as another_expression(?=expression). In this case, match will be successful only when a text in the position is matched to another_expression and after that text must match to the expression. Both features are positive lookarounds, but regular expressions has features to check or match negative lookarounds: it is negative lookahead and negative lookbehind. Templates for those features are: another_expression(?!expression) for negative lookahead and (?<!expression)another_expression for negative lookbehind.

Note:  Lookbehinds at Python needs to be fixed-width, and so you have to avoid alternations in a lookbehind pattern that are of different length.


Compilation

Both Python and Perl supports compilation of regular expressions. It is important if you will use this regular expression more than one time. Before processing of regular expression the regular expression routine will compile the expression to special object and use it to match the string. So, it is very useful to use such technique to increase performance of your application.
In Python you should use the re.compile function to do that, in Perl, when you define the regular expression, you should use the qr sequence before it.
The benefit from compilation is depends on regular expression itself. Here is just one of examples that I used to catch the speed-up.
Consider following regular expression: 'with (a|an|the|this) ' and compiled version regex_compiled = compile(regex_str)
Let's try to identify all matches of this expression in text of rfc793 (https://tools.ietf.org/html/rfc793).
The program reads the text line by line and tries to match the string and eventually it should be about 38 matches across 5248 lines. To identify those matches both functions perform_match_with_compile_regexp and perform_match_with_string_regexp was used. Detailed description can be found on my GitHub account - link
A few additional details about the implementation:
  1. To track the execution time the timeit module has been used. 
  2. Number of executions of the main method has been taken from the range [10, 50, 100, 250, 500, 750, 1000].
  3. Number of iterations to calculate the average is 10.
Result of the execution is on the following graph:


As we can see for this particular example improvement is about 3x.
Raw data from the experiment can be found  by this link.

Start of match

This functionality is available in Perl, in details you can read it here, in perldoc. In few words, this option is allows you to start next match on the same string where the latest match were finished.


Mode modifier

As it was mentioned before, several modes can be apply to regular expression. But it is possible to apply several of them only to selected part of regular expression. Such modes are i, m, s, x. Description of them can be found in special section above. This feature is possible to use with combination (?mode:expression). Of course, if it is possible to apply a modifier on a part of string, it should be possible to turn off a global mode on a part of regular expression, it can be done with following syntax (?-mode:expression).  
There is one additional feature that exists in both Perl and Python but has different meaning:

  • Perl allows to turn on/off usage of a mode till end of the regular expression. Both expressions (?mode) to turn on and (?-mode) to turn off should be used for that purpose.
  • The same options in Python is used without subexpression - they will be applied to whole expression! For the purpose such flags should be placed at the beginning of the regular expressions, otherwise result is undefined.

Atomic grouping

This sort of grouping is used to prevent look back to another attempt to identify the match, even if it is not possible to find a match with rest of expression. The following syntax is used for that purpose: (?>expression). This syntax is supported in Perl, but in Python it is possible to get the same effect by following expression: (?>expression) is equals to (?=(expression))\n, where \n is a particular group that matches to expression.
If we consider a quantors where a difficult grouping is not required, the same effect is reached by adding a + sign after the quantor, for example, for quantors *+, ?, {number, number}, the expression will be looks like that: *+++, ?+, {number, number}+.
The purpose of that capability is to reduce the time that will be required to identify match and as a result to improve performance.


Conditional

Regular expressions provides capability to choose the regular expression with dependency on previous matched subexpression. The syntax for this functionality is (?(expression)regex_true|regex_false) - this expression is an equivalent of if-then-else expression; and (?(expression)regex_true) - this expression is an equivalent of if-then expression. expression in this syntax is back reference to previous expression.


Lazy quantifiers

One of the first topics in this article was related to quantifiers, symbols or combination of symbols like *, +, ?, {number, number}.  The purpose of these quantifiers is to define how many times an expression should appears in processed string, like a string of joined substrings where each substring is matched to the expression. Those quantifiers are executed as much time as possible in the sentence, but regular expression engine also looks back to have successful match of full expression. As a result, name of such quantifiers is greedy quantifiers. Regular expression engine allows us to mark cases when it is required to execute such quantifiers as less time as possible: to do that the special mark ? should be placed after the quantifier: *?+?, ??, {number, number}?.


Books / Articles / Sites

  1. Regular Expression Pocket Reference, 2nd Edition (link) - short and sweet reference. This book helped me a lot with preparing of simple regular expressions.
  2. Programming Perl, 4th Edition (link) - of course, this book about Perl, but it also contains an article about regular expressions, so, you can find it useful or double useful if your duties connects with Perl. I read this book and it helped me with one of my tasks that was connected to regular expressions and Perl.
  3. Regular Expressions from Princeton University (link)
  4. My favorite site where we can 'play' with regular expressions and see detailed results - https://regex101.com/ but you can also use this one http://www.regexpal.com/
  5. Mastering Regular Expressions, 3rd Edition (link) - very good book if you would like to go in details of the regular expressions.
  6. You may also enjoy by this course from CodeSchool - https://www.codeschool.com/courses/breaking-the-ice-with-regular-expressions
  7. HackerRank challenges: https://www.hackerrank.com/domains/regex/re-introduction
  8. Stepik.org course prepared by me available here. You need to register to participate :-)

Saturday, December 3, 2016

Simple changes in an algorithm and how it can improve performance.

Everyone knows how it is important to implement something efficiently.

Today I tried to solve one algorithmic challenge from HackerRank.
Link on the challenge is: https://www.hackerrank.com/contests/w26/challenges/hard-homework
This challenge is marked as Hard, and at the moment, I am on my way to be able to solve any kind of Medium challenges without any problems. So, I didn't expected that my solution will be 100% optimal, but tomorrow we will see.

So, common rule that we know from interviews is: "if you don't know how to solve problem efficiently, try to implement the brute-force algorithm and optimize it.".

If you have no access to the challenge, the problem is to find three integers x, y and z (all numbers are positive), with sum of all of them equal to required number n and sum of sin(x), sin(y), sum(z) is maximal across all triples x, y, z. Here parameter of sin operation is an angle in radians.

The simple solution is:

static double getMaxSumOfSinsBruteForce(int n) {
    double maxSumOfSins = Double.MIN_VALUE;
    double currentSumOfSins;
    for (int x = 1; x < n - 1; x++) {
        for (int y = 1; x + y < n; y++) {
            currentSumOfSins = Math.sin(x) + Math.sin(y) + Math.sin(n-x-y);
            if (currentSumOfSins > maxSumOfSins) {
                maxSumOfSins = currentSumOfSins;
            }
        }
    }
    return maxSumOfSins;
}

We try to iterate over x from 1 to n-2. n-2 is because y and z should be positive, so both these numbers should have value that great or equal to 1. After that we try to check all possible values of y and calculate required sum of sinuses with parameters x, y and n-x-y.

Hm... it takes a good time to find a sum where n is equal to 10000... 
How we can do better? Oh, thanks to Tim Roughgarden and Algorithms specialization from Stanford on the coursera.org site (https://www.coursera.org/specializations/algorithms), for now I ask myself this question every time when I implement something that computationally intensive.

The first thing that we might notice is that in the for loop over y, on every iteration we calculate the Math.sin(x) where x is a constant inside this loop over y.

static double getMaxSumOfSinsOptV1(int n) {
    double maxSumOfSins = Double.MIN_VALUE;
    double currentSumOfSins;
    double sinOfX;
    for (int x = 1; x < n - 1; x++) {
        sinOfX = Math.sin(x);
        for (int y = 1; x + y < n; y++) {
            currentSumOfSins = sinOfX + Math.sin(y) + Math.sin(n-x-y);
            if (currentSumOfSins > maxSumOfSins) {
                maxSumOfSins = currentSumOfSins;
            }
        }
    }
    return maxSumOfSins;
}

Better? Yes!
What about performance? Still slow...

So, if we look on the solution from another angle, what we can see. Actually, in target sum of sinuses all summands are values from collection Math.sin(1), ..., Math.sin(n-2).
Next idea is to create a map which has angle as a key and value is a sinus of that angle. 

As a result we got the following:

static double getMaxSumOfSinsOptV2(int n) {
    double maxSumOfSins = Double.MIN_VALUE;
    double currentSumOfSins;
    double sinOfX;
    // pre-calculation
    Map<Integer, Double> preCalculation = new HashMap<>();
    for (int i = 1; i <= n-2; i++) {
        preCalculation.put(i, Math.sin(i));
    }
    // Common processing
    for (int x = 1; x < n - 1; x++) {
        sinOfX = preCalculation.get(x);
        for (int y = 1; x + y < n; y++) {
            currentSumOfSins = sinOfX + preCalculation.get(y) + preCalculation.get(n-x-y);
            if (currentSumOfSins > maxSumOfSins) {
                maxSumOfSins = currentSumOfSins;
            }
        }
    }
    return maxSumOfSins;
}

Performance? Much faster!
Can we do better? We can try!

According to problem description, x, y and z are positive numbers, so there values we can use as indexes in array. As a result, we no need to use map collection type here.
Let's do that!

static double getMaxSumOfSinsOptV3(int n) {
    double maxSumOfSins = Double.MIN_VALUE;
    double currentSumOfSins;
    // pre-calculation
    double[] preCalculation = new double[n];
    for (int i = 0; i < n; i++) {
        preCalculation[i] = Math.sin(i);
    }
    // Common processing
    for (int x = 1; x < n - 1; x++) {
        // At least one should be positive.
        if (preCalculation[x] >= 0) {
            for (int y = 1; x + y < n; y++) {
                currentSumOfSins = preCalculation[x] + preCalculation[y] + preCalculation[n - x - y];
                if (currentSumOfSins > maxSumOfSins) {
                    maxSumOfSins = currentSumOfSins;
                }
            }
        }
    }
    return maxSumOfSins;
}

At the end of the day, let's try to launch all four solution with the same value of n and compare results, especially performance or time to calculate the sum.
Assume we pass n that equals to 10000 to those functions

On my laptop results are:

* 10000
* BF: Calculated value: 2.739180081. Time to calculate: 19030
* O1: Calculated value: 2.739180081. Time to calculate: 12290
* O2: Calculated value: 2.739180081. Time to calculate: 995
* O3: Calculated value: 2.739180081. Time to calculate: 72

Time is in milliseconds.
As we can see, it is the same approach, but with few enhancements and final version works ~250 times faster.
Is it enough to pass the challenge on HackerRank? No... Another approach should be used, and it is an another story :-)

Monday, November 14, 2016

Using of Selenium WebDriver with py.test for test automation. Part #1

This article is the second part in the series of articles that cover Selenium and Unit Testing. All configuration stuff as well as usage of different approaches for unit testing was discovered in the first part.
Structure of the article is as follows:
  • Infrastructure for test development and launching
  • Special tool for communication with system under test through browser. In our case it is WebDriver.
  • Special design patterns that were developed for such kind of tasks.
Let’s talk about test automation system. It consists of:
  • The system or application that we would like to test and it is available for us through a browser like a web-based application.
  • Special driver that allows us to send special commands to the application or receive information from the browser that were required to perform our tests. In our case it is Selenium
  • Tests that we can develop or record using special tools.
  • Framework for test launch. It may support functionality for launching selected or pre-defined group(s) of tests. Most known systems for Python are unittest, nosetest and py.test
So, Selenium WebDriver is a library that allows to develop tests that communicate with browser. For example, it can receive required data from browser or send special commands to the browser that can change its behavior. Selenium has special IDE that allows test recording and code generation. Also Selenium package includes Selenium Server component, it is required if we would like to launch our tests remotely. Selenium supports more known browsers like FireFox, Chrome, Internet Explorer, etc. For our article we will use FireFox, but let’s download driver for browser Google Chrome to demonstrate how tests work with both FireFox and Google Chrome. Selenium WebDriver has the capability to launch tests remotely, using Remote browser; it should work the same way for both FireFox and Chrome.
In our case workflow will have the following structure: Tests ↔ FireFox Driver ↔ FireFox ↔ Application that we would like to test.

To isolate our experiment from the original installation of the Python we will use virtual environment. Detailed information how to install it, configure it and use with Eclipse you may find in this article.

Since Selenium is needed for web-based application automation, the following list of functionality should be reviewed. It consists of actions needed for test automation development:
  • Launch the web-browser
  • Open the web-page
  • Find the required elements (with delays and without)
  • Collect information from elements on the web-page
  • Perform the required actions with elements
  • Quit the browser.
Sometimes switching between windows and frames is required. We will look on this functionality in the end.
Selenium home page for Python could be found on official website by the following link (http://selenium.googlecode.com/git/docs/api/py/index.html). Description of Selenium API is available via this link (http://selenium.googlecode.com/git/docs/api/py/api.html). That page is also available from the home page.

Let's quickly overview the list of commands and capabilities that Selenium framework provides to us. Since we work with a browser, Selenium has commands for browser launch and quitting it. Browser is an application so we are able to work with this window too. We can open, close it, and change size or position of the window. Selenium may also work with dialogue window for file uploading. Browser’s window contains web-page. Selenium provides us with the capability to open it, both back and next navigation, and executes JS code in context of the web-page. Each page contains elements. Selenium provides API to search elements on the page. Selenium supports emulation of mouse or keyboard actions as well as shows the properties of the web-page elements.

Launching of the browser.

Launching of the browser is executed by call of constructor for required driver

from selenium import webdriver
driver = webdriver.Firefox()
driver = webdriver.Chrome()

Each browser has its own class that represents the driver. Special constructor should be called to launch an instance of the required browser. Each browser has its own settings and list of settings is different for different browsers.

All browsers has one common parameter – capabilities (link on documentation is here (https://code.google.com/p/selenium/wiki/DesiredCapabilities)). Special instructions should be passed by using this way to change browser mode.

driver = webdriver.Firefox(capabilities={'native_events': False})

Command quit is used to stop the browser. It closes all opened windows and stop the driver. It clears all temporary files too.

driver.quit()

Command close closes current browser window, if it latest window then stop the browser. In some cases Selenium cannot identify that current window is the latest one, and as a result, it can be cause of situation when temporary files presents in the system after end of work.

driver.close()

Behavior for remote launching (with Selenium Grid) are differ from local launch. The quit function clear the session (session is moved to inactive state) and session is available for launching of new browser instance. The close function only stops the browser and do not clear the session, as a result the session will be available for next launch only after timeout.

Page opening and navigation

Function get provides capability to open web-page. This function accepts full path to the required page:

driver.get("https://www.yandex.com")

This command works synchronously. It waits till the moment when page loading process is finished. It is depends on browser.
Navigation by history is supported by the following list of commands:

driver.back()
driver.refresh()
driver.forward()

A hidden problem is related when a page has completed form and during back/refresh/forward navigation dialogue window appears with question to resubmit values from the form. Selenium does not support this feature.

Search elements on page

Selenium has two commands that can be used to find element on the page:

element = driver.find_element(by, locator)
elements = driver.find_elements(by, locator)

The first command looks for first element occurrence by special condition, the second command returns list of all elements that satisfy to special condition.
Different approaches for type of search exists:
  • By.ID: the element ID should be unique across the web-page, so, the By.ID is a better and faster approach to find the element on the page.
  • By.XPATH: Initially XPath was developed for XML language. Description of XPath capabilities can be found here (http://www.w3schools.com/xml/xml_xpath.asp). It is more general approaches that has more features than By.CSS_SELECTOR locator.
  • By.LINK_TEXT: search by text of the link. This approach works only with link elements. Keep in mind that if application supports several languages, this approach may works incorrectly.
  • By.PARTIAL_LINK_TEXT: the same like for By.LINK_TEXT
  • By.NAME: this approach is used with input fields and buttons, because these elements usually has attribute name.
  • By.TAG_NAME: search by name of the html tag.
  • By.CLASS_NAME: search by class of the elements on the page.
  • By.CSS_SELECTOR: locators that described in Cascading Style Sheets (CSS) standard (http://www.w3.org/Style/CSS/)

Selenium provides functions where locator's type included into function's name.

driver.find_element_by_id(<id to search>)
driver.find_elements_by_id(<id to search>)
driver.find_element_by_xpath(<xpath to search>)
driver.find_elements_by_xpath(<xpath to search>)
driver.find_element_by_link_text(<link text to search>)
driver.find_elements_by_link_text(<link text to search>)
driver.find_element_by_partial_link_text(<partial link text to search>)
driver.find_elements_by_partial_link_text(<partial link text to search>)
driver.find_element_by_name(<name to search>)
driver.find_elements_by_name(<name to search>)
driver.find_element_by_tag_name(<tag name to search>)
driver.find_elements_by_tag_name(<tag name to search>)
driver.find_element_by_class_name(<class name to search>)
driver.find_elements_by_class_name(<class name to search>)
driver.find_element_by_css_selector(<css selector to search>)
driver.find_elements_by_css_selector(<css selector to search>)

Selenium also supports search inside web-element by surrounding search requests in chain:

element = driver.find_element_id(<id>).
                 find_element_by_xpath(<xpath>).
                 find_element_by_name(<name>)

It is equals to:

element_with_id = driver.find_element_id(<id>)
element_with_xpath = element_with_id.find_element_by_xpath(<xpath>)
element = element_with_xpath.find_element_by_name(<name>)

If find_element function does not find anything, the NoSuchElementException will be raised, but the find_elements function returns empty list in this case. The exception or empty list can be returned not at the moment when function is called - implicit wait can be used to notify WebDriver to wait required time till check element's occurrence on web-page:

driver.implicitly_wait(<time to wait>)

Selenium uses explicit wait too. It is call of function time.sleep(<small time-out: ~0.5 sec>). Selenium encapsulates such behavior into class WebDriverWait:

from selenium.webdriver.support.wait import WebDriverWait
# initialization of explicit wait object
wait = WebDriverWait(<instance of WebDriver>, <time-out to wait>)
# using explicit wait object
element = wait.until(<condition for wait something>)

With the explicit wait functionality, we can wait for special conditions that is not possible to catch by implicit wait functionality. List of expected conditions available here (http://selenium.googlecode.com/git/docs/api/py/webdriver_support/selenium.webdriver.support.expected_conditions.html) and several good examples also available here (https://blog.mozilla.org/webqa/2012/07/12/how-to-webdriverwait/). Positive conditions or events should be checked to avoid tests slowness due to using of the explicit wait mechanism. For example, if web-page should have element a1 and shouldn't have element a2, then required condition should checks presence of element a2, and does not check absence of element a1 in source code of tests.

Differences between implicit wait and explicit wait are:
  • Explicit waits works on client side, in Python source code, but implicit waits works on browser side
  • Explicit waits can wait for big list of different conditions, but implicit waits waits for element appearance in DOM model
  • Explicit waits should be written by the developer, but implicit waits work automatically
  • Explicit waits can raise the TimeoutException, but implicit waits can raise NoSuchElementException
  • Explicit waits generates a lot queries across the network, but implicit waits generates only one query that will be send to the browser.

Actions with the element

Simple actions:
  • Click on links, buttons, radiobuttons and checkboxes. Select an item from the list. Clicks are emulated by command click.
  • Enter text and attach file (enter special text into special input item). Text input emulated by command send_keys
Complicated actions:
  • Combined keyboard actions (Ctrl + something, etc.) or navigation by arrows
  • Homing or put mouse on top of element without click
  • Drag and drop (with keyboard interaction)
  • Right mouse buttons click
Now more details about two basic operations: click and send_keys. Both commands can be executed with any visible element and events will be generated during emulation, for example for the click command it will be mouse-down -> on-click -> mouse-up, and so on. The click command sends action at center of the element. The send_keys command adds text at the end of existed text in the element. The send_keys command can process key combinations, navigation arrows, and so on.
Operations with elements like select/toggle/check/uncheck can be emulated by series of click commands. But the select operation is a complicated one and its support has been added into WebDriver as a part of support package:

from selenium.webdriver.support.select import Select

dropdown = Select(element)
dropdown.select_by_index(1)

This class supports operations select_by_value and select_by_visible_text also.
For example if text should be entered at the beginning of the string, following list of commands is required:

from selenium.webdriver.common.keys import Keys

element.click()
element.send_keys(Keys.HOME)
element.send_keys(<some text>)

Both commands click and send_keys has two implementations for each browser: native and synthetic:
  • Native events works on OS level, but synthetic works inside the browser.
  • Native events is implemented on C/C++ programming language, but synthetic events is implemented using the JavaScript programming language.
  • Native events emulates user behavior more clearly than synthetic.
  • Native events requires focus on the browser, but synthetic not.
  • Not all versions of the browser supports native events, but synthetic events works in all versions of the browser.
Lets look on complicated actions. It includes both click command and send_keys command as well as following list of complicated commands:
  • move_to_element command
  • click_and_hold command
  • release command
  • key_down command
  • key_up command
Execution supported by ActionChains function. Following example demonstrates element drags and drops with the Ctrl button pressed.

Webdriver.ActionChains(driver).
    move_to_element(drag).
    key_down(Keys.CONTROL).
    click_and_hold().
    move_to_element(drop).
    release().
    key_up(Keys.CONTROL).
    perform()

Collect information about the element

  • text. This property contains only visible text of the element. Invisible element have empty text.
  • get_attribute. This function returns value of element's attribute. Sometimes this value can be changed, for example if attribute 'href' is requested. Selenium WebDriver returns absolute link (doesn't matter what kind of link was placed into the element initially).
  • boolean attributes, these attributes has two kind of values: None or True. Selenium  returns none if attribute is not set, and True if attribute exists (it doesn't matter what value is stored in this attribute)
  • Selenium returns value of the property with the same name like attribute name.
  • The is_displayed method
    • If required element locates before left border or above top border the is_displayed method returns true
    • If required element partially covered by another element, the is_displayed method returns true
    • If elements is transparent or its color are the same like background color, the is_displayed method returns true

Switch between content

The WebDriver allows switch to alert message, switch between frames and switch to another browser window. In details:
  • Switch to alert:
    • WebDriver has special method switch_to_alert, that return the alert if it exists. The NoAlertPresentException will be generated if no alert in the system, as well as UnhandledAlertException will be generated if no alert processing is implemented. Special function accept and dismiss of the alert object to close the alert message. Additional processing of alert window is covered by unexpectedAlertBehavior capability of the driver.
  • Switch to frame. This capability is covered by function switch_to_frame
  • Switch to window. Major point here is the following: the WebDriver doesn't support switching between tabs!
    • List of all browser's windows can be get by function driver.window_handles()
    • Handle of current window can be get by function driver.current_window_handle()
    • Switch to required window by its handle can be performed by function driver.switch_to_window(<handle of required window>)
    • Selenium doesn't switch back to previous window if current window was closed in program. Switch to another window (or previous window) should be done in the program too, otherwise WebDriver raises special exception.

Other browser window properties.


  • get_window_size
  • set_window_size
  • maximize_window
  • get_window_position
  • set_window_position
These functions can be used for screenshot creating, avoid scrolling or for special checks with scrolling. Keep in mind, that Selenium WebDriver automatically performs auto-scroll operation if required element locates outside the window.

Implementation points

Common approach with development using Selenium is related to PageObject pattern usage. More detailed description on this pattern is here (https://code.google.com/p/selenium/wiki/PageObjects) This pattern makes test development process easier and allows to reduce amount of duplicated code into project. Benefits of PageObject pattern are:
  • Clear difference between source code of tests and source code that depends on page structure
  • Single repository for all web-page's services and operations

Example:

As example of usage Selenium Web-Driver as well as features and patterns from above, I will prepare simple automation for creating a mailbox on www.yahoo.com.
But it is a separate case for new post.

In any case, I've prepared initial architecture with same programming logic that could be found here.