Best ways to double click and drag and drop on desktop?

Hi! My name is Mathieu, and I’m a quality analyst currently setting up an automated testing suite for my employer’s proprietary software. I have little experience in desktop automation; most of my past experience has been with Robot Framework for cloud and browser applications.

I’ve been setting up and using Appium for the past few months and, up to now, have managed pretty well with basic operations, like clicking around our app and entering text. However, there are two operations I’ve been entirely unable to perform: double clicking, and drag and drop.

For clicking and entering text, I’m using By2.nativeName().click() and .sendKeys(). By2 doesn’t seem to have methods natively for double clicking or drag and drop, but Actions seemed to have it, so I’ve been trying to use Actions, with code looking like this:

const actions = driver.actions({async: false}); // note: I've used async: true as well
actions.move({ x: -500, y: 0, duration: 1000, origin: By2.nativeXpath('//Pane/Group/Group/Group/Group/Group/Group[5]/Group/Group[2]') });

Or I’ve tried this one for drag and drop too:
actions.dragAndDrop(By2.nativeXpath('//Pane/Group/Group/Group/Group/Group/Group[5]/Group/Group[2]'), {x: -750, y: -100})

Unfortunately, I’m met with the following error:
UnsupportedOperationError: Currently only pen and touch pointer input source types are supported

I did find this issue on GitHub that seems to discuss this. Apparently it is an issue on the Microsoft side of things, since mouse pointers are not supported, which is fair. However, clicking works with By2, so perhaps there’s a solution that doesn’t use Actions? Using the click method twice doesn’t seem to work, either.

I’m assuming there’s gotta be a way to get drag and drop or a double click working, right? It seems like a pretty vital thing to get going on most apps, especially double clicking. How have people been able to get it working, historically? I’m not hard tied to TypeScript, but my personal programming experience is only a hobbyist’s, in JavaScript, so even TypeScript is a bit complex for me, and I was hoping to stay on it.

In short, I’m hoping to find a way to get double clicking and drag and drop working on Windows and Mac. My current setup uses WinAppDriver through Appium, with tests written in TypeScript. I’m hoping for any and all advice.

Thanks in advance for the help :slight_smile:

Edit: Just realized I should probably send my dev dependencies:
"devDependencies": {
  "@babel/core": "^7.24.1",
  "@babel/preset-env": "^7.24.1",
  "@babel/preset-typescript": "^7.24.1",
  "@types/jest": "^29.5.12",
  "@types/selenium-webdriver": "^4.1.22",
  "babel-jest": "^29.7.0",
  "jest": "^29.7.0",
  "jest-junit-reporter": "1.1.0",
  "selenium-appium": "^1.0.2",
  "selenium-webdriver": "^4.18.1",
  "typescript": "^5.4.2",
  "wdio-ywinappdriver-service": "^0.2.54",
  "winappdriver": "^0.0.7"
}

Try gesture shortcuts, for example the ones described in the appium/appium-windows-driver repository on GitHub (Appium's interface to WindowsAppDriver provided by Microsoft).

Hey Mykola, thanks a bunch for your response! Happy to see you respond, I’ve seen your responses tons when trying to figure things out haha

Could you give me a code example of how to implement that in my code? I tried something like “windows.dragAndDrop” or “” but it’s not really working out… I’m a bit inexperienced in coding. I would really appreciate it if you could give me a code example, I’m sure I can figure it out from there. Thank you :slight_smile:

Here you can find examples for double click:

Regarding drag and drop, please provide the source code you’ve tried so far and the server log.

Thanks a bunch for the code examples! Our software is written using Flutter, our dev team tells me there’s no “ID” mechanics in there, so I’ve been using nativeName through Semantics, or nativeXpath temporarily as I convince them to implement more semantics as I use them and write test cases. The examples you’ve provided used elementId, but I’ve been trying to figure out how to use XY coordinates, as I’m pretty sure I can use nativeName to find something’s coordinates… so far I’ve tried the following:

await driver.executeScript("windows: click", {x: elementLocated["62"], y: elementLocated["292"], "times": 2})
{x: ["62"], y: ["292"], "times": 2}
{x: 62, y: 292, "times": 2}
{"x": ele.location["62"], "y": ele.location["292"], "times": 2} (with this one I get “cannot find name ele”, which is why I tried elementLocated)

I also tried to find the element by nativeName then store it in a variable and use that variable…
let doubleclick = driver.findElement(By2.nativeName("Name of the thing"));
{"elementId": doubleclick, "times": 2}

Each time, I get the following error: InvalidArgumentError: Either element identifier or absolute coordinates must be provided

I’m not too certain what format I should be putting my coordinates in. I’m pretty sure that’s the last piece I’m missing of that puzzle though… or hopefully anyhow haha
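
For reference, here is a minimal sketch of building the options object from an element's rectangle, using plain numbers as in the third attempt above. The helper name centerClickArgs is hypothetical. The Rect shape matches what selenium-webdriver's WebElement.getRect() resolves to, but whether that rectangle is in the same coordinate space as the absolute coordinates the gesture expects is an assumption this thread does not settle.

```typescript
// Shape of an element rectangle, e.g. what selenium-webdriver's
// WebElement.getRect() resolves to.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Options object for "windows: click", limited to the keys used in this thread.
interface ClickArgs {
  x: number;
  y: number;
  times: number;
}

// Hypothetical helper: aim at the center of the rectangle, with whole numbers.
function centerClickArgs(rect: Rect, times: number = 2): ClickArgs {
  return {
    x: Math.round(rect.x + rect.width / 2),
    y: Math.round(rect.y + rect.height / 2),
    times,
  };
}

// Example rectangle chosen so that its center is the point used above.
const clickArgs = centerClickArgs({ x: 42, y: 282, width: 40, height: 20 });
console.log(JSON.stringify(clickArgs)); // {"x":62,"y":292,"times":2}
```

Note that the values are numbers, not strings or arrays: elementLocated["62"] looks up a property named "62" and most likely evaluates to undefined, while ["62"] is an array containing a string.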

For drag and drop, I’ve tried a couple of things, mostly different ways to hold click, move mouse, then release click, or actual methods named dragAndDrop. Here’s a couple code snippets.

var actions = new Actions(driver.getExecutor());

const actions = driver.actions({async: false});
actions.move({ x: -500, y: 0, duration: 1000, origin: By2.nativeXpath('//Pane/Group/Group/Group/Group/Group/Group[5]/Group/Group[2]') });

const actions = driver.actions({async: false});
actions.dragAndDrop(By2.nativeXpath('//Pane/Group/Group/Group/Group/Group/Group[5]/Group/Group[2]'), {x: -750, y: -100})

In all of those cases, I got the error mentioned originally, that the pointer type is not supported. With your provided link, though, I tried to figure it out and tried the following snippet:

await driver.executeScript("windows: clickAndDrag", {startX: 1095, startY: 351, endX: 500, endY: 500})

For which I got:
InvalidArgumentError: Starting drag point: Either element identifier or absolute coordinates must be provided
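
One way to rule out malformed coordinates on the test side is to build the options object through a small checker before sending it. The sketch below is only a local sanity check, not the driver's own validation. The helper names requireInt and dragArgs are hypothetical, and the key names are the ones from the snippet above.

```typescript
interface Point {
  x: number;
  y: number;
}

// Key names follow the snippet above (startX, startY, endX, endY).
interface DragArgs {
  startX: number;
  startY: number;
  endX: number;
  endY: number;
}

// Hypothetical helper: reject anything that is not a whole number.
function requireInt(label: string, value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value)) {
    throw new TypeError(`${label} must be an integer, got ${String(value)}`);
  }
  return value;
}

// Hypothetical helper: build the drag options from two checked points.
function dragArgs(start: Point, end: Point): DragArgs {
  return {
    startX: requireInt("start.x", start.x),
    startY: requireInt("start.y", start.y),
    endX: requireInt("end.x", end.x),
    endY: requireInt("end.y", end.y),
  };
}

const drag = dragArgs({ x: 1095, y: 351 }, { x: 500, y: 500 });
console.log(JSON.stringify(drag));
// {"startX":1095,"startY":351,"endX":500,"endY":500}

// Why the check matters: JSON.stringify silently drops undefined values,
// so a bad lookup can make a coordinate vanish from the request.
console.log(JSON.stringify({ startX: undefined, startY: 351 })); // {"startY":351}
```

If the checker passes and the server still reports missing coordinates, the problem is more likely in how the arguments are wrapped on their way to the server than in the numbers themselves.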

I’m thinking my mistake in both click-and-drag and click is the same, considering the error. I’m probably just not writing the coordinates right…

For server log, I’m not entirely sure how to obtain that or what it is. Could you tell me where to get that?

Thank you for your time, I greatly appreciate it :slight_smile:

Please also provide the server log. There you can see what arguments the server actually receives, which would also help to fix the client side.

I’m not entirely sure what server log refers to or how to obtain it, but I think it’d mean the output of the command prompt hosting the appium server? If so, I think this part covers the section about double clicking!

[WindowsDriver@42c3 (2db8c7c9)] Calling AppiumDriver.execute() with args: ["windows: click",[[{"x":62,"y":292,"times":2}]],"2db8c7c9-c45e-47e6-b9bc-19fa7d32cacd"]
[WindowsDriver@42c3 (2db8c7c9)] Executing extension command 'windows: click'
[WindowsDriver@42c3 (2db8c7c9)] Encountered internal error running command: InvalidArgumentError: Either element identifier or absolute coordinates must be provided
[WindowsDriver@42c3 (2db8c7c9)]     at WindowsDriver.toAbsoluteCoordinates (C:\Users\mathieu.chouinardlav\.appium\node_modules\appium-windows-driver\lib\commands\gestures.js:64:11)
[WindowsDriver@42c3 (2db8c7c9)]     at WindowsDriver.windowsClick (C:\Users\mathieu.chouinardlav\.appium\node_modules\appium-windows-driver\lib\commands\gestures.js:300:72)
[WindowsDriver@42c3 (2db8c7c9)]     at WindowsDriver.executeWindowsCommand (C:\Users\mathieu.chouinardlav\.appium\node_modules\appium-windows-driver\lib\commands\execute.js:46:57)
[WindowsDriver@42c3 (2db8c7c9)]     at WindowsDriver.execute (C:\Users\mathieu.chouinardlav\.appium\node_modules\appium-windows-driver\lib\commands\execute.js:33:23)
[WindowsDriver@42c3 (2db8c7c9)]     at commandExecutor (C:\Users\mathieu.chouinardlav\.appium\node_modules\@appium\base-driver\lib\basedriver\driver.ts:106:18)
[WindowsDriver@42c3 (2db8c7c9)]     at C:\Users\mathieu.chouinardlav\.appium\node_modules\async-lock\lib\index.js:171:12
[WindowsDriver@42c3 (2db8c7c9)]     at AsyncLock._promiseTry (C:\Users\mathieu.chouinardlav\.appium\node_modules\async-lock\lib\index.js:306:31)
[WindowsDriver@42c3 (2db8c7c9)]     at exec (C:\Users\mathieu.chouinardlav\.appium\node_modules\async-lock\lib\index.js:170:9)
[WindowsDriver@42c3 (2db8c7c9)]     at AsyncLock.acquire (C:\Users\mathieu.chouinardlav\.appium\node_modules\async-lock\lib\index.js:189:3)
[WindowsDriver@42c3 (2db8c7c9)]     at WindowsDriver.executeCommand (C:\Users\mathieu.chouinardlav\.appium\node_modules\@appium\base-driver\lib\basedriver\driver.ts:122:39)
[WindowsDriver@42c3 (2db8c7c9)]     at processTicksAndRejections (node:internal/process/task_queues:95:5)
[WindowsDriver@42c3 (2db8c7c9)]     at defaultBehavior (C:\Users\mathieu.chouinardlav\AppData\Roaming\npm\node_modules\appium\lib\appium.js:1109:14)
[WindowsDriver@42c3 (2db8c7c9)]     at AppiumDriver.executeWrappedCommand (C:\Users\mathieu.chouinardlav\AppData\Roaming\npm\node_modules\appium\lib\appium.js:1215:16)
[WindowsDriver@42c3 (2db8c7c9)]     at AppiumDriver.executeCommand (C:\Users\mathieu.chouinardlav\AppData\Roaming\npm\node_modules\appium\lib\appium.js:1121:17)
[WindowsDriver@42c3 (2db8c7c9)]     at asyncHandler (C:\Users\mathieu.chouinardlav\AppData\Roaming\npm\node_modules\appium\node_modules\@appium\base-driver\lib\protocol\protocol.js:393:19)

Looking at it, it seems like the server receives something extra, that “2db8c7c9-c45e-47e6-b9bc-19fa7d32cacd” thing… is that an elementId? I never provide elementId anywhere in my code, so not sure where that’s coming from… maybe that’s the source of the exception?

I tried without feeding any coordinates, which gave me this: Calling AppiumDriver.execute() with args: ["windows: click",[[{"times":2}]],"6abd11f5-10e6-428d-8077-50dd3540d586"]

And then the same error.

Hopefully the server log is appropriate, thank you for your response :slight_smile:

It looks like the client nests the arguments into two arrays. At least one of them is not expected, so the original handler receives zero arguments (e.g. the server expects either {"x":62,"y":292,"times":2} or [{"x":62,"y":292,"times":2}]).
Consider asking at the client forum what the correct argument format is (I assume it’s wdio).
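
The nesting can be checked directly against the logged argument list. The sketch below parses the list copied from the server log and counts the wrapping arrays. The two forward functions are hypothetical models of how a client wrapper might pass arguments along; which of them the actual client resembles is an assumption, not something established here. The trailing string starts with the same characters as the identifier in the [WindowsDriver@42c3 (2db8c7c9)] log prefix, which suggests it is a session identifier rather than an element id (an inference from the log only).

```typescript
// Argument list copied from the server log line in this thread.
const logged =
  '["windows: click",[[{"x":62,"y":292,"times":2}]],"2db8c7c9-c45e-47e6-b9bc-19fa7d32cacd"]';

const [script, scriptArgs, trailing] = JSON.parse(logged) as [string, unknown, string];

// Count how many single-element arrays wrap the options object.
function nestingDepth(value: unknown): number {
  let depth = 0;
  while (Array.isArray(value) && value.length === 1) {
    value = value[0];
    depth += 1;
  }
  return depth;
}

// Hypothetical model 1: the wrapper spreads its arguments, giving one array.
function forwardSpread(...args: unknown[]): unknown[] {
  return [...args];
}

// Hypothetical model 2: the wrapper passes its whole rest parameter along
// as a single argument, giving an array inside an array.
function forwardAsSingleArg(...args: unknown[]): unknown[] {
  return [args];
}

const options = { x: 62, y: 292, times: 2 };
console.log(script, nestingDepth(scriptArgs)); // windows: click 2
console.log(JSON.stringify(forwardSpread(options))); // [{"x":62,"y":292,"times":2}]
console.log(JSON.stringify(forwardAsSingleArg(options))); // [[{"x":62,"y":292,"times":2}]]
console.log(trailing.startsWith("2db8c7c9")); // true
```

A depth of 1 matches the second form the server is said to accept, while the logged depth of 2 matches the double-wrapped form.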

I feel like a bit of a fool saying this, but… I’m not sure what a “client” is in this infrastructure. I assume wdio is WebDriverIO… But I’ve little idea what that is. I’ve been using things like “By2”, “Actions”, “driver” as the… prefix? of my methods, but I have little idea what they represent or where they come from.

executeScript does seem to come from “selenium-appium” and refers to WebDriver.executeScript, which itself comes from “selenium-webdriver”, so it’s probably from WebDriver?

I’ll go dig around to see if I can find some sort of WebDriverIO forums, or if they have discussion boards or the like… however, in case it’s not a wdio thing, how can I know what my “client” is? Apologies for wasting your time with such basic questions.

No problem. Yes, I’ve used the abbreviation wdio for WebDriverIO. I had to guess it since the initial issue description did not contain any mentions of the actual client library.

Hey, so after asking around WebDriverIO, I’ve been promptly ignored every time I asked questions on this topic… I guess I’m coming back here because I actually get answers here.

You mentioned I haven’t provided any mentions of the actual client library; how can I tell what client library I’m using?

Here is a list of current Appium Client Libraries:

Thank you, I had missed that page during my searches. I’m using Node.js and have no clue what Nightwatch is, so clearly it’s WDIO. Where should I be asking questions like that, for how to write syntax or find out what’s wrong? I went on their official discord but was ignored repeatedly…

Also, one of the links Mykola shared has himself fixing the exact same error by recommending to update the driver, which seemingly worked. I’m pretty sure I’ve the latest versions, but just to make sure, was the driver being updated there the Windows Application Driver?

There is a clickable link on that page for each client. Each has different resources, but the page lists plenty of help and documentation. I would recommend familiarizing yourself with the GitHub page for starters.

Hey. I went to WDIO’s, and had a good conversation. Unfortunately, it seems I’m not using WDIO at all; nor Nightwatch. So I’m really uncertain what client library I’ve got going. “Selenium-webdriver” is the closest. I’ve contacted Selenium for help, but now I’m curious if it’s possible my setup is missing something? I’ve followed various installation guides but don’t have WDIO nor Nightwatch currently. Could that cause problems, and potentially, my problem?
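
One rough way to answer the "which client am I using" question is to look for known client packages among the project's dependencies. The sketch below is only a heuristic: the lookup table and the helper name detectClients are hypothetical, the table is deliberately incomplete, and the dependency list is an excerpt of the devDependencies posted earlier in this thread.

```typescript
// Hypothetical lookup table: npm package name -> client library it indicates.
const KNOWN_CLIENTS = new Map<string, string>([
  ["webdriverio", "WebdriverIO"],
  ["nightwatch", "Nightwatch"],
  ["selenium-webdriver", "Selenium WebDriver (JavaScript bindings)"],
]);

// Return the clients whose package appears among the given dependencies.
function detectClients(deps: Record<string, string>): string[] {
  const found: string[] = [];
  for (const pkg of Object.keys(deps)) {
    const client = KNOWN_CLIENTS.get(pkg);
    if (client !== undefined) {
      found.push(client);
    }
  }
  return found;
}

// Excerpt of the devDependencies posted earlier in this thread.
const devDependencies: Record<string, string> = {
  "jest": "^29.7.0",
  "selenium-appium": "^1.0.2",
  "selenium-webdriver": "^4.18.1",
  "wdio-ywinappdriver-service": "^0.2.54",
  "winappdriver": "^0.0.7",
};

console.log(JSON.stringify(detectClients(devDependencies)));
// ["Selenium WebDriver (JavaScript bindings)"]
```

With this list, only selenium-webdriver matches, which is consistent with the conclusion above that neither WDIO nor Nightwatch is installed. The wdio-ywinappdriver-service entry has "wdio" in its name but is not the webdriverio package itself, so the heuristic does not count it.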