Appium server 1.4.16
Python 2.7.10
Appium-Python-Client 0.20
Mac OSX 10.10.5
My AUT is mostly tables composed of rows filled with dynamic data. Of course I have to read this data from the table rows during testing. On iOS this works without any effort on my part. But on Android some data will cause a codec error.
Here’s a sample attribute from a row in my table. Note the en dash (U+2013):
text="Weekly Rewind – February 19, 2016"
When I try to grab the page_source and parse it with ElementTree like this:
parser = ET.XMLParser(encoding="utf-8")
tree = ET.parse(StringIO(self.driver.page_source), parser=parser)
It throws a codec error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 7405: ordinal not in range(128)
The XML produced via Appium page_source appears to be UTF-8 encoded so I’m unsure exactly why I’m getting this message.
<?xml version="1.0" encoding="UTF-8"?>
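One possible explanation, offered as a guess rather than a confirmed diagnosis: under Python 2 the client hands back page_source as a unicode object, not UTF-8 bytes, and the expat parser behind ElementTree implicitly encodes unicode input with the default 'ascii' codec before it ever looks at the XML declaration. A minimal sketch of a workaround is to encode the text to UTF-8 bytes explicitly before parsing. The helper name parse_page_source and the sample XML are made up for illustration; only the text attribute value comes from my table.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET


def parse_page_source(page_source):
    """Hypothetical helper: parse Appium page source that may be unicode text."""
    # Hand the parser UTF-8 bytes so it never falls back to the ascii codec.
    if not isinstance(page_source, bytes):
        page_source = page_source.encode("utf-8")
    return ET.fromstring(page_source)


# Made-up stand-in for self.driver.page_source, containing the en dash.
sample = (
    u'<?xml version="1.0" encoding="UTF-8"?>'
    u'<hierarchy><node text="Weekly Rewind \u2013 February 19, 2016"/></hierarchy>'
)

root = parse_page_source(sample)
title = root.find("node").get("text")
```

In a test this would be called as parse_page_source(self.driver.page_source); the attribute values that come back are unicode wherever they contain non-ASCII characters.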
I’m not always using ElementTree to chew through the page_source. Usually I grab elements directly from the Appium driver, stash their text attribute, and later use that text to find the element in the UI again via XPath, so the text has to be XPath-compatible. I originally ran into codec errors during those operations, and I’ve mostly overcome them by passing all of my text through this function:
def unicode_format_string(string_to_format):
    if isinstance(string_to_format, str):
        # Normal byte string, do nothing
        pass
    elif isinstance(string_to_format, unicode):
        # Unicode, force format as UTF-8
        string_to_format = string_to_format.encode('utf-8')
    # Further processing, escape out any quotes that will break downstream string literal encoding
    # string_to_format = string_to_format.encode('unicode-escape') # This isn't escaping single quote like it should
    formatted_string = string_to_format.replace("'", "\\'")
    return formatted_string
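A caveat on the quote handling, as far as I understand it: XPath 1.0 string literals have no backslash escape, so a backslash before a single quote may not do what is intended, depending on the XPath engine behind the driver. A commonly used alternative is to pick whichever quote character the text does not contain, and fall back to concat() when it contains both. The sketch below is only an illustration; xpath_literal is a hypothetical helper name, not part of Appium or Selenium.

```python
# -*- coding: utf-8 -*-


def xpath_literal(text):
    """Hypothetical helper: wrap text as an XPath 1.0 string literal."""
    if "'" not in text:
        # No single quotes, so single-quote delimiters are safe.
        return "'" + text + "'"
    if '"' not in text:
        # Single quotes only, so double-quote delimiters are safe.
        return '"' + text + '"'
    # Both quote types present: split on single quotes and rejoin the
    # pieces with a double-quoted single quote inside concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


plain = xpath_literal(u"Weekly Rewind \u2013 February 19, 2016")
single = xpath_literal(u"It's here")
both = xpath_literal(u'say "it\'s"')
```

The result would then be dropped into a locator such as "//*[@text=%s]" % xpath_literal(row_text), where row_text is the unicode text stashed earlier. Keeping the text as unicode all the way through, rather than encoding it to UTF-8 bytes first, avoids mixing byte strings and unicode in the formatted expression.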
Hoping to get an education from other Python users working with Appium + Android and how they successfully work with non-ASCII characters in their AUT.