Appium, Android and Unicode: 'ascii' codec can't encode character

Appium server 1.4.16
Python 2.7.10
Appium-Python-Client 0.20
Mac OSX 10.10.5

My AUT is mostly tables composed of rows filled with dynamic data. Of course I have to read this data from the table rows during testing. On iOS this works without any effort on my part. But on Android some data will cause a codec error.

Here’s a sample attribute from a row in my table. Note the long dash (an en dash, U+2013):

text="Weekly Rewind – February 19, 2016" 

When I try to grab the page_source and parse it with ElementTree like this:

parser = ET.XMLParser(encoding="utf-8")
tree = ET.parse(StringIO(self.driver.page_source), parser=parser)

It throws a codec error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 7405: ordinal not in range(128)

The XML produced via Appium page_source appears to be UTF-8 encoded so I’m unsure exactly why I’m getting this message.

<?xml version="1.0" encoding="UTF-8"?>
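One likely cause, offered as a sketch rather than a confirmed diagnosis: page_source comes back as a unicode object, and on Python 2 the expat parser behind ElementTree wants bytes, so the unicode text gets implicitly encoded with the default 'ascii' codec before parsing. Encoding to UTF-8 explicitly and parsing the bytes avoids that step. The helper name parse_page_source is made up for illustration, and the sample XML is a stand-in for real Appium output.

```python
import xml.etree.ElementTree as ET


def parse_page_source(page_source):
    """Parse Appium page source (hypothetical helper, not part of Appium)."""
    # Text (unicode on Python 2, str on Python 3) is encoded explicitly so the
    # parser never falls back to an implicit 'ascii' encode.
    if not isinstance(page_source, bytes):
        page_source = page_source.encode("utf-8")
    # The bytes now match the encoding="UTF-8" declaration in the XML header.
    return ET.fromstring(page_source)


# Stand-in for self.driver.page_source, containing the en dash (U+2013)
sample = (u'<?xml version="1.0" encoding="UTF-8"?>'
          u'<hierarchy><node text="Weekly Rewind \u2013 February 19, 2016"/></hierarchy>')
root = parse_page_source(sample)
row_texts = [node.get("text") for node in root.iter("node")]
```

In a test this would be called as parse_page_source(self.driver.page_source), with no StringIO wrapper needed.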

I’m not always using ElementTree to chew through the page_source. Usually I grab elements directly from the Appium driver, stash their text attribute, and later use that text to find the element in the UI again via xpath, so the text has to be compatible with xpath. Originally I ran into codec errors during those operations. I’ve mostly overcome them by passing all of my text through this function:

def unicode_format_string(string_to_format):
    if isinstance(string_to_format, str):
        # Normal byte string, do nothing
        pass
    elif isinstance(string_to_format, unicode):
        # Unicode, force format as UTF-8
        string_to_format = string_to_format.encode('utf-8')
    # Further processing, escape out any quotes that will break downstream string literal encoding
    # string_to_format = string_to_format.encode('unicode-escape')  # This isn't escaping single quote like it should
    formatted_string = string_to_format.replace("'", "\\'")
    return formatted_string
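A note on the quote handling: XPath 1.0 string literals have no escape character, so a backslash before a quote is not interpreted as an escape there. A common workaround is to choose whichever quote style the text does not contain, and fall back to concat() when it contains both. The sketch below assumes the text ends up inside an XPath expression; xpath_literal is a hypothetical helper name.

```python
def xpath_literal(text):
    """Return text as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return "'%s'" % text
    if '"' not in text:
        return '"%s"' % text
    # Both quote styles present: split on single quotes and stitch the pieces
    # back together with concat(), quoting each single quote in double quotes.
    pieces = ["'%s'" % part for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


# Build a locator for a row whose text contains an en dash and an apostrophe
row_text = u"Editor's Pick \u2013 February 19, 2016"
locator = u"//*[@text=%s]" % xpath_literal(row_text)
```

The resulting locator can then be passed to the driver's xpath lookup as a unicode string, without encoding it to bytes first.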

I’m hoping to get an education from other Python users working with Appium + Android on how they successfully handle non-ASCII characters in their AUT.

I’ve seen this issue. It’s been around for a while. Here is an old thread, with a fix:!topic/appium-discuss/-oPV4pvFbwo

And here is our fix in Ruby. We mix in the Selenium::WebDriver module and override ‘text’ in class Element and class Alert. Not exactly sure how to do it in Python, but between the above and this code you should be able to get it:

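require 'unicode_utils'

module Selenium
  module WebDriver
    class Element
      # Return the element text normalized to Unicode NFC
      def text
        UnicodeUtils.nfc bridge.getElementText @id
      end
    end

    class Alert
      # Return the alert text normalized to Unicode NFC
      def text
        UnicodeUtils.nfc @bridge.getAlertText
      end
    end
  end
end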
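For the Python side, a rough equivalent of the NFC step is unicodedata.normalize from the standard library. This is only a sketch: instead of overriding the client's text property as the Ruby code does, it wraps the value after reading it, and normalized_text is a made-up helper name.

```python
import unicodedata


def normalized_text(text):
    """Return text as NFC-normalized unicode (hypothetical helper)."""
    # Byte strings are assumed to be UTF-8 and are decoded first, so the
    # result is always a text object.
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    return unicodedata.normalize("NFC", text)


# "e" followed by a combining acute accent collapses to one precomposed character
composed = normalized_text(u"e\u0301")
```

With an element in hand, the call would look like normalized_text(element.text).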