How to prevent detection with Selenium in Java? [duplicate]

I'm trying to automate a very basic task on a website using Selenium and Chrome, but somehow the website detects when Chrome is driven by Selenium and blocks every request. I suspect that the website is relying on an exposed DOM variable like this one https://stackoverflow.com/a/41904453/648236 to detect a Selenium-driven browser.
My question is, is there a way I can make the navigator.webdriver flag false? I am willing to go so far as to try and recompile the selenium source after making modifications, but I cannot seem to find the NavigatorAutomationInformation source anywhere in the repository https://github.com/SeleniumHQ/selenium
Any help is much appreciated
P.S: I also tried the following from https://w3c.github.io/webdriver/#interface
Object.defineProperty(navigator, 'webdriver', {
get: () => false,
});
But it only updates the property after the initial page load. I think the site detects the variable before my script is executed.

First the update [1]
execute_cdp_cmd(): With the availability of the execute_cdp_cmd(cmd, cmd_args) command, you can now execute Chrome DevTools Protocol commands from Selenium. Using this feature you can modify navigator.webdriver to prevent Selenium from being detected.
Preventing Detection [2]
To prevent a Selenium-driven WebDriver from being detected, one approach is to apply any or all of the steps below:
Adding the argument --disable-blink-features=AutomationControlled
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--disable-blink-features=AutomationControlled')
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.website.com")
You can find a relevant detailed discussion in Selenium can't open a second page
Rotating the user-agent through execute_cdp_cmd() command as follows:
# Set Chrome/83.0.4103.53 as the user agent
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})
Change the property value of the navigator for webdriver to undefined
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
Exclude the collection of enable-automation switches
options.add_experimental_option("excludeSwitches", ["enable-automation"])
Turn-off useAutomationExtension
options.add_experimental_option('useAutomationExtension', False)
Sample Code [3]
Combining all the steps mentioned above, an effective code block would be:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
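# Note: execute_script only patches the document that is currently loaded, not pages opened later with driver.get()
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")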
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))
driver.get('https://www.httpbin.org/headers')
History
As per the W3C Editor's Draft the current implementation strictly mentions:
The webdriver-active flag is set to true when the user agent is under remote control which is initially set to false.
Further,
Navigator includes NavigatorAutomationInformation;
It is to be noted that:
The NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
The NavigatorAutomationInformation interface is defined as:
interface mixin NavigatorAutomationInformation {
readonly attribute boolean webdriver;
};
which returns true if webdriver-active flag is set, false otherwise.
Finally, the navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, so that alternate code paths can be triggered during automation.
Caution: Altering/tweaking the above mentioned parameters may block the navigation and get the WebDriver instance detected.
Update (6-Nov-2019)
As of the current implementation an ideal way to access a web page without getting detected would be to use the ChromeOptions() class to add a couple of arguments to:
Exclude the collection of enable-automation switches
Turn-off useAutomationExtension
through an instance of ChromeOptions as follows:
Java Example:
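import java.util.Collections;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");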
ChromeOptions options = new ChromeOptions();
options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
options.setExperimentalOption("useAutomationExtension", false);
WebDriver driver = new ChromeDriver(options);
driver.get("https://www.google.com/");
Python Example
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("https://www.google.com/")
Ruby Example
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument("--disable-blink-features=AutomationControlled")
driver = Selenium::WebDriver.for :chrome, options: options
Legends
1: Applies to Selenium's Python clients only.
2: Applies to Selenium's Python clients only.
3: Applies to Selenium's Python clients only.

ChromeDriver:
Finally discovered the simple solution for this with a simple flag! :)
--disable-blink-features=AutomationControlled
navigator.webdriver=true will no longer show up with that flag set.
For a list of things you can disable, check them out here

Do not use a CDP command to change the webdriver value, as it leads to an inconsistency that can later be used to detect WebDriver. Use the code below instead; this will remove any traces of webdriver.
options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")

Before (in browser console window):
> navigator.webdriver
true
Change (in selenium):
// C#
var options = new ChromeOptions();
options.AddExcludedArguments(new List<string>() { "enable-automation" });
// Python
options.add_experimental_option("excludeSwitches", ['enable-automation'])
After (in browser console window):
> navigator.webdriver
undefined
This does not work with ChromeDriver 79.0.3945.16 and above. See the release notes here
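The version cut-off mentioned above can be captured in a small helper that picks a strategy from the ChromeDriver version string. This is only a sketch: the function names are hypothetical, and the 79.0.3945.16 threshold is taken from the statement above rather than verified independently.

```python
# Hypothetical helper: choose an option strategy from a ChromeDriver version.
# The cut-off is an assumption based on the answer above, not an official rule.
CUTOFF = (79, 0, 3945, 16)


def parse_version(version: str) -> tuple:
    """Turn '79.0.3945.16' into (79, 0, 3945, 16) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def exclude_switches_works(chromedriver_version: str) -> bool:
    """True if excluding 'enable-automation' is still expected to hide navigator.webdriver."""
    return parse_version(chromedriver_version) < CUTOFF


def chrome_arguments(chromedriver_version: str) -> list:
    """Extra command-line arguments to add once the excludeSwitches trick stops working."""
    if exclude_switches_works(chromedriver_version):
        return []
    return ["--disable-blink-features=AutomationControlled"]
```

The returned list can be passed one item at a time to options.add_argument(); comparing tuples of integers avoids the pitfalls of comparing version strings lexically.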

Excluding the collection of enable-automation switches, as described in the 6-Nov-2019 update of the top-voted answer, doesn't work anymore as of April 2020. Instead I was getting the following error:
ERROR:broker_win.cc(55)] Error reading broker pipe: The pipe has been ended. (0x6D)
Here's what's working as of 6th April 2020 with Chrome 80.
Before (in the Chrome console window):
> navigator.webdriver
true
Python example:
options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")
After (in the Chrome console window):
> navigator.webdriver
undefined

Nowadays you can accomplish this with a CDP command:
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source": """
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
"""
})
driver.get(some_url)
By the way, you want to return undefined; false is a dead giveaway.

Finally, this solved the problem for ChromeDriver and Chrome versions greater than 79.
ChromeOptions options = new ChromeOptions();
options.addArguments("--disable-blink-features");
options.addArguments("--disable-blink-features=AutomationControlled");
ChromeDriver driver = new ChromeDriver(options);
Map<String, Object> params = new HashMap<String, Object>();
params.put("source", "Object.defineProperty(navigator, 'webdriver', { get: () => undefined })");
driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", params);

Since this question is about Selenium, a cross-browser solution for overriding navigator.webdriver would be useful. This could be done by patching the browser environment before any JS of the target page runs, but unfortunately no browser other than Chromium allows one to evaluate arbitrary JavaScript code after document load and before any other JS runs (Firefox is close with its Remote Protocol).
Before patching, we need to check what the default browser environment looks like. Before changing a property, we can see its default definition with Object.getOwnPropertyDescriptor():
Object.getOwnPropertyDescriptor(navigator, 'webdriver');
// undefined
This quick test shows that the webdriver property is not defined on navigator itself. It's actually defined on Navigator.prototype:
Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver');
// {set: undefined, enumerable: true, configurable: true, get: ƒ}
It's highly important to change the property on the object that owns it, otherwise the following can happen:
navigator.webdriver; // true if webdriver controlled, false otherwise
// this lazy patch is commonly found on the internet, it does not even set the right value
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
});
navigator.webdriver; // undefined
Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get.apply(navigator);
// true
A less naive patch would first target the right object and use the right property definition, but digging deeper we can find more inconsistencies:
const defaultGetter = Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get;
defaultGetter.toString();
// "function get webdriver() { [native code] }"
Object.defineProperty(Navigator.prototype, 'webdriver', {
set: undefined,
enumerable: true,
configurable: true,
get: () => false
});
const patchedGetter = Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get;
patchedGetter.toString();
// "() => false"
A perfect patch leaves no traces. Instead of replacing the getter function, it would be better to intercept the call to it and change the returned value. JavaScript has native support for that through the Proxy apply handler:
const defaultGetter = Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get;
defaultGetter.apply(navigator); // true
defaultGetter.toString();
// "function get webdriver() { [native code] }"
Object.defineProperty(Navigator.prototype, 'webdriver', {
set: undefined,
enumerable: true,
configurable: true,
get: new Proxy(defaultGetter, { apply: (target, thisArg, args) => {
// emulate getter call validation
Reflect.apply(target, thisArg, args);
return false;
}})
});
const patchedGetter = Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get;
patchedGetter.apply(navigator); // false
patchedGetter.toString();
// "function () { [native code] }"
The only inconsistency now is in the function name; unfortunately, there is no way to override the function name shown in the native toString() representation. Even so, it can pass generic regular expressions that search for spoofed browser native functions by looking for { [native code] } at the end of the string representation. To remove this inconsistency you can patch Function.prototype.toString and make it return valid native string representations for all native functions you patched.
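One possible way to do the Function.prototype.toString patch described above is sketched here. It is not the author's code: PATCH_SOURCE and install_patch are hypothetical names, the JavaScript is an untested illustration of the idea, and driver is assumed to be a Chromium-based Selenium driver that exposes execute_cdp_cmd.

```python
# Sketch only. The JavaScript remembers, for each patched function, the string
# the original native function reported, and makes toString() return that string.
PATCH_SOURCE = r"""
(() => {
  const nativeToString = Function.prototype.toString;
  const spoofed = new WeakMap();  // patched function -> string to report

  const defaultGetter =
    Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get;
  const patchedGetter = new Proxy(defaultGetter, {
    apply: (target, thisArg, args) => {
      Reflect.apply(target, thisArg, args);  // emulate getter call validation
      return false;
    }
  });
  spoofed.set(patchedGetter, nativeToString.call(defaultGetter));
  Object.defineProperty(Navigator.prototype, 'webdriver', {
    set: undefined, enumerable: true, configurable: true, get: patchedGetter
  });

  const patchedToString = new Proxy(nativeToString, {
    apply: (target, thisArg, args) =>
      spoofed.has(thisArg) ? spoofed.get(thisArg) : Reflect.apply(target, thisArg, args)
  });
  spoofed.set(patchedToString, nativeToString.call(nativeToString));
  Function.prototype.toString = patchedToString;
})();
"""


def install_patch(driver):
    """Register the script so it runs before any page script on every new document."""
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": PATCH_SOURCE}
    )
```

The script has to be registered before driver.get() is called; anything registered afterwards only applies to documents loaded later.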
To sum up, in selenium it could be applied with:
chrome.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {'source': """
    Object.defineProperty(Navigator.prototype, 'webdriver', {
        set: undefined,
        enumerable: true,
        configurable: true,
        get: new Proxy(
            Object.getOwnPropertyDescriptor(Navigator.prototype, 'webdriver').get,
            { apply: (target, thisArg, args) => {
                // emulate getter call validation
                Reflect.apply(target, thisArg, args);
                return false;
            }}
        )
    });
"""})
The Playwright project maintains forks of Firefox and WebKit that add features for browser automation; one of them is equivalent to Page.addScriptToEvaluateOnNewDocument. There is no Python implementation of the communication protocol, but it could be implemented from scratch.

Simple hack for python:
options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features=AutomationControlled")

As mentioned in the answer above (https://stackoverflow.com/a/60403652/2923098), the following option totally worked for me (in Java):
ChromeOptions options = new ChromeOptions();
options.addArguments("--incognito", "--disable-blink-features=AutomationControlled");

I would like to add a Java alternative to the cdp command method mentioned by pguardiario
Map<String, Object> params = new HashMap<String, Object>();
params.put("source", "Object.defineProperty(navigator, 'webdriver', { get: () => undefined })");
driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", params);
In order for this to work you need to use the ChromiumDriver class from the org.openqa.selenium.chromium package. From what I can tell, that package is not included in Selenium 3.141.59, so I used the Selenium 4 alpha.
Also, the excludeSwitches/useAutomationExtension experimental options do not seem to work for me anymore with ChromeDriver 79 and Chrome 79.

For those of you who've tried these tricks, please make sure to also check that the user-agent that you are using is the user agent that corresponds to the platform (mobile / desktop / tablet) your crawler is meant to emulate. It took me a while to realize that was my Achilles heel ;)
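A quick sanity check along these lines can catch a mismatched user agent before a run. The sketch below is a rough heuristic, not a complete user-agent parser; the function names are hypothetical and the substring rules are assumptions based on common user-agent conventions.

```python
def ua_device_class(user_agent: str) -> str:
    """Roughly classify a user-agent string as 'mobile', 'tablet' or 'desktop'."""
    ua = user_agent.lower()
    if "ipad" in ua:
        return "tablet"
    if "iphone" in ua or "ipod" in ua:
        return "mobile"
    if "android" in ua:
        # Assumption: Android phones include "Mobile" in the UA, tablets usually do not.
        return "mobile" if "mobile" in ua else "tablet"
    return "desktop"


def ua_matches(user_agent: str, intended: str) -> bool:
    """True if the user agent looks like the platform the crawler is meant to emulate."""
    return ua_device_class(user_agent) == intended
```

For example, the Chrome 83 Windows user agent used earlier in this thread is classified as desktop, so ua_matches(that_user_agent, "mobile") would flag the mismatch.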

Python
I tried most of the stuff mentioned in this post and I was still facing issues.
What saved me for now is https://pypi.org/project/undetected-chromedriver
pip install undetected-chromedriver
import undetected_chromedriver.v2 as uc
from time import sleep
from random import randint
driver = uc.Chrome()
driver.get('https://www.your_url.here')  # driver.get() needs a full URL including the scheme
driver.maximize_window()
sleep(randint(3,9))
A bit slow, but I will take slow over non-working.
I guess anyone interested could go over the source code and see what provides the win there.

If you use a Remote WebDriver, the code below will set navigator.webdriver to undefined.
Works with ChromeDriver 81.0.4044.122.
Python example:
options = webdriver.ChromeOptions()
# options.add_argument("--headless")
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')
driver = webdriver.Remote(
    'http://localhost:9515', desired_capabilities=options.to_capabilities())
script = '''
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
'''
driver.execute_script(script)

Use --disable-blink-features=AutomationControlled to disable navigator.webdriver

Related

How to Prevent Selenium 3.0 (Geckodriver) from Creating Temporary Firefox Profiles?

I'm running the latest version of Selenium WebDriver with Geckodriver. I want to prevent Selenium from creating temporary Firefox profiles in the temporary files directory when launching a new instance of WebDriver. Instead, I want to use the original Firefox profile directly. This has a double benefit. First, it saves time (it takes a significant amount of time for the profile to be copied to the temporary directory). Second, it ensures that cookies created during the session are saved to the original profile. Before Selenium started relying on Geckodriver, I was able to solve this problem by editing the class FirefoxProfile.class in SeleniumHQ as seen below:
public File layoutOnDisk() {
    File profileDir;
    if (this.disableTempProfileCreation) {
        profileDir = this.model;
        return profileDir;
    } else {
        try {
            profileDir = TemporaryFilesystem.getDefaultTmpFS().createTempDir("ABC", "XYZ");
            File userPrefs = new File(profileDir, "user.js");
            this.copyModel(this.model, profileDir);
            this.installExtensions(profileDir);
            this.deleteLockFiles(profileDir);
            this.deleteExtensionsCacheIfItExists(profileDir);
            this.updateUserPrefs(userPrefs);
            return profileDir;
        } catch (IOException var3) {
            throw new UnableToCreateProfileException(var3);
        }
    }
}
This would stop Selenium from creating a temporary Firefox Profile when the parameter disableTempProfileCreation was set to true.
However, now that Selenium is being controlled by Geckodriver this solution no longer works as the creation (and launch) of Firefox Profile is controlled by Geckodriver.exe (which is written in Rust language). How can I achieve the same objective with Geckodriver? I don't mind editing the source code. I'm using Java.
Thanks
Important Update:
I would like to thank everyone for taking the time to respond to this question. However, as stated in some of the comments, the first 3 answers do not address the question at all - for two reasons. First of all, using an existing Firefox Profile will not prevent Geckodriver from copying the original profile to a temporary directory (as indicated in the OP and clearly stated by one or more of the commentators below). Second, even if it did it is not compatible with Selenium 3.0.
I'm really not sure why 3 out of 4 answers repeat the exact same answer with the exact same mistake. Did they read the question? The only answer that even attempts to address the question at hand is the answer by @Life is complex; however, it is incomplete. Thanks.

UPDATED POST 05-30-2021
This is the hardest question that I have ever tried to answer on Stack Overflow, because it involves the interactions of several code bases written in multiple languages (Java, Rust and C++). This complexity makes the question potentially unsolvable.
My last crack at this likely unsolvable question:
In the code in your question you are modifying the file user.js. This file is still used by Selenium.
public FirefoxProfile() {
this(null);
}
/**
* Constructs a firefox profile from an existing profile directory.
* <p>
* Users who need this functionality should consider using a named profile.
*
* @param profileDir The profile directory to use as a model.
*/
public FirefoxProfile(File profileDir) {
this(null, profileDir);
}
@Beta
protected FirefoxProfile(Reader defaultsReader, File profileDir) {
if (defaultsReader == null) {
defaultsReader = onlyOverrideThisIfYouKnowWhatYouAreDoing();
}
additionalPrefs = new Preferences(defaultsReader);
model = profileDir;
verifyModel(model);
File prefsInModel = new File(model, "user.js");
if (prefsInModel.exists()) {
StringReader reader = new StringReader("{\"frozen\": {}, \"mutable\": {}}");
Preferences existingPrefs = new Preferences(reader, prefsInModel);
acceptUntrustedCerts = getBooleanPreference(existingPrefs, ACCEPT_UNTRUSTED_CERTS_PREF, true);
untrustedCertIssuer = getBooleanPreference(existingPrefs, ASSUME_UNTRUSTED_ISSUER_PREF, true);
existingPrefs.addTo(additionalPrefs);
} else {
acceptUntrustedCerts = true;
untrustedCertIssuer = true;
}
// This is not entirely correct but this is not stored in the profile
// so for now will always be set to false.
loadNoFocusLib = false;
try {
defaultsReader.close();
} catch (IOException e) {
throw new WebDriverException(e);
}
}
So in theory you should be able to modify capabilities.rs in the geckodriver source code. That file contains the temp_dir.
As I stated, this is only a theory, because when I looked at the Firefox source, temp_dir was spread throughout the code base.
ORIGINAL POST 05-26-2021
I'm not sure that you can prevent Selenium from creating a temporary Firefox Profile.
From the gecko documents:
"Profiles are created in the systems temporary folder. This is also where the encoded profile is extracted when profile is provided. By default geckodriver will create a new profile in this location."
The only solution that I see at the moment would require you modify the Geckodriver source files to prevent the creation of temporary folders/profiles.
I'm currently looking at the source. These files might be the correct ones, but I need to look at the source more:
https://searchfox.org/mozilla-central/source/browser/app/profile/firefox.js
https://searchfox.org/mozilla-central/source/testing/mozbase/mozprofile/mozprofile/profile.py
Here are some other files that need to be combed through:
https://searchfox.org/mozilla-central/search?q=tempfile&path=
This looks promising:
https://searchfox.org/mozilla-central/source/testing/geckodriver/doc/Profiles.md
"geckodriver uses [profiles] to instrument Firefox’ behaviour. The
user will usually rely on geckodriver to generate a temporary,
throwaway profile. These profiles are deleted when the WebDriver
session expires.
In cases where the user needs to use custom, prepared profiles,
geckodriver will make modifications to the profile that ensures
correct behaviour. See [Automation preferences] below on the
precedence of user-defined preferences in this case.
Custom profiles can be provided two different ways:
1. by appending --profile /some/location to the [args capability],
which will instruct geckodriver to use the profile in-place;
I found this question on trying to do this: how do I use an existing profile in-place with Selenium Webdriver?
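As a rough sketch of the first option in the quoted geckodriver documentation, the capabilities for a new session could carry the profile path in the args list as shown below. The function names and any path passed in are hypothetical, and whether the profile is really used in place depends on the geckodriver version, so treat this as an assumption to verify rather than a confirmed fix.

```python
import json


def firefox_capabilities(profile_dir: str) -> dict:
    """Build W3C capabilities that pass --profile <dir> to Firefox via the args capability."""
    return {
        "browserName": "firefox",
        "moz:firefoxOptions": {"args": ["--profile", profile_dir]},
    }


def new_session_payload(profile_dir: str) -> str:
    """JSON body for a WebDriver 'New Session' request using those capabilities."""
    return json.dumps({"capabilities": {"alwaysMatch": firefox_capabilities(profile_dir)}})
```

The same two arguments can be added through a client library's Firefox options object instead of building the dictionary by hand.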
Also here is an issue that was raised in selenium on Github concerning the temp directory. https://github.com/SeleniumHQ/selenium/issues/8645
Looking through the source of geckodriver v0.29.1 I found a file where the profile is loaded.
source: capabilities.rs
fn load_profile(options: &Capabilities) -> WebDriverResult<Option<Profile>> {
if let Some(profile_json) = options.get("profile") {
let profile_base64 = profile_json.as_str().ok_or_else(|| {
WebDriverError::new(ErrorStatus::InvalidArgument, "Profile is not a string")
})?;
let profile_zip = &*base64::decode(profile_base64)?;
// Create an emtpy profile directory
let profile = Profile::new()?;
unzip_buffer(
profile_zip,
profile
.temp_dir
.as_ref()
.expect("Profile doesn't have a path")
.path(),
)?;
Ok(Some(profile))
} else {
Ok(None)
}
}
source: marionette.rs
fn start_browser(&mut self, port: u16, options: FirefoxOptions) -> WebDriverResult<()> {
let binary = options.binary.ok_or_else(|| {
WebDriverError::new(
ErrorStatus::SessionNotCreated,
"Expected browser binary location, but unable to find \
binary in default location, no \
'moz:firefoxOptions.binary' capability provided, and \
no binary flag set on the command line",
)
})?;
let is_custom_profile = options.profile.is_some();
let mut profile = match options.profile {
Some(x) => x,
None => Profile::new()?,
};
self.set_prefs(port, &mut profile, is_custom_profile, options.prefs)
.map_err(|e| {
WebDriverError::new(
ErrorStatus::SessionNotCreated,
format!("Failed to set preferences: {}", e),
)
})?;
let mut runner = FirefoxRunner::new(&binary, profile);
runner.arg("--marionette");
if self.settings.jsdebugger {
runner.arg("--jsdebugger");
}
if let Some(args) = options.args.as_ref() {
runner.args(args);
}
// https://developer.mozilla.org/docs/Environment_variables_affecting_crash_reporting
runner
.env("MOZ_CRASHREPORTER", "1")
.env("MOZ_CRASHREPORTER_NO_REPORT", "1")
.env("MOZ_CRASHREPORTER_SHUTDOWN", "1");
let browser_proc = runner.start().map_err(|e| {
WebDriverError::new(
ErrorStatus::SessionNotCreated,
format!("Failed to start browser {}: {}", binary.display(), e),
)
})?;
self.browser = Some(Browser::Host(browser_proc));
Ok(())
}
pub fn set_prefs(
&self,
port: u16,
profile: &mut Profile,
custom_profile: bool,
extra_prefs: Vec<(String, Pref)>,
) -> WebDriverResult<()> {
let prefs = profile.user_prefs().map_err(|_| {
WebDriverError::new(
ErrorStatus::UnknownError,
"Unable to read profile preferences file",
)
})?;
for &(ref name, ref value) in prefs::DEFAULT.iter() {
if !custom_profile || !prefs.contains_key(name) {
prefs.insert((*name).to_string(), (*value).clone());
}
}
prefs.insert_slice(&extra_prefs[..]);
if self.settings.jsdebugger {
prefs.insert("devtools.browsertoolbox.panel", Pref::new("jsdebugger"));
prefs.insert("devtools.debugger.remote-enabled", Pref::new(true));
prefs.insert("devtools.chrome.enabled", Pref::new(true));
prefs.insert("devtools.debugger.prompt-connection", Pref::new(false));
}
prefs.insert("marionette.log.level", logging::max_level().into());
prefs.insert("marionette.port", Pref::new(port));
prefs.write().map_err(|e| {
WebDriverError::new(
ErrorStatus::UnknownError,
format!("Unable to write Firefox profile: {}", e),
)
})
}
}
After looking through the gecko source, it looks like mozprofile::profile::Profile comes from Firefox and not geckodriver.
It seems that you might have issues with profiles when you migrate to Selenium 4.
ref: https://github.com/SeleniumHQ/selenium/issues/9417
For Selenium 4 we have deprecated the use of profiles as there are other mechanisms that we can do to make the start up faster.
Please use the Options class to set preferences that you need and if you need to use an addon use the driver.install_addon("path/to/addon")
you can install selenium 4, which is in beta, via pip install selenium --pre
I noted in your code that you were writing to user.js, which is a custom file for Firefox. Have you considered creating one of these files manually outside of Gecko?
Also have you looked at mozprofile?

Thanks to the source code provided in the answer by Life is complex, I had the chance to look through the geckodriver source.
EXPLANATION
I believe the reason you could not find any rust_tmp in the source is that it is generated randomly by the Profile::new() function.
Looking deeper into the code structure, I saw that browser.rs is where the browser is actually loaded; it is called through marionette.rs. If you look carefully, the LocalBrowser::new method is called whenever a new session is initialized, and the profile is loaded at that point as well. In browser.rs, the block at lines 60 - 70 is what actually generates the profile for a new session instance. What needs to be done is to modify this path so that it loads your custom profile.
SHORT ANSWER
Download the zip file of geckodriver-0.30.0 and extract it with your preferred zip program :P
Look at src/browser.rs in the geckodriver source; at lines 60 - 70 you should see something like this:
let is_custom_profile = options.profile.is_some();
let mut profile = match options.profile {
Some(x) => x,
None => Profile::new()?,
};
Change it to your preferred folder (hopefully you know some Rust), for example:
/*
let mut profile = match options.profile {
Some(x) => x,
None => Profile::new()?,
};
*/
let path = std::path::Path::new("path-to-profile");
let mut profile = Profile::new_from_path(path)?;
Re-compile with your preferred Rust toolchain, for example:
cargo build
NOTE
I hope this info helps in some way. It is not comprehensive, but hopefully it is a good enough hint; for example, it is possible to write some extra code to load the profile path from an environment variable or from an argument, but I'm not a Rust developer, so I have not provided that here.
The above solution works fine for me, and I could load and use my profile directly. By the way, I work on Arch Linux with cargo 1.57.0.
TBH, this is the first time I have posted on Stack Overflow, so feel free to correct me if I'm wrong or the answer is unclear :P
Update
I worked with geckodriver 0.30.0, which is not the same as the geckodriver 0.29.1 mentioned by Life is complex. However, the change between the two versions is only that the code was split out, so in version 0.29.1 the equivalent code to modify is in the MarionetteHandler::start_browser method in src/marionette.rs.
Since my starting point was the answer by Life is complex, please look through it for more information.

I've come up with a solution that 1) works with Selenium 4.7.0--however, I don't see why it wouldn't work with 3.x as well, 2) allows the user to pass in an existing Firefox profile dynamically via an environment variable--and if this environment variable doesn't exist, simply acts "normally", and 3) if you do not want a temporary copy of the profile directory, simply do not pass the source profile directory to Selenium.
I downloaded Geckodriver 0.32.0 and made it so that you simply need to provide the Firefox profile directory via the environment variable FIREFOX_PROFILE_DIR. For example, in C#, before you create the FirefoxDriver, call:
Environment.SetEnvironmentVariable("FIREFOX_PROFILE_DIR", myProfileDir);
The change to Rust is in browser.rs, line 88, replacing:
let mut profile = match options.profile {
    ProfileType::Named => None,
    ProfileType::Path(x) => Some(x),
    ProfileType::Temporary => Some(Profile::new(profile_root)?),
};
with:
let mut profile = if let Ok(profile_dir) = std::env::var("FIREFOX_PROFILE_DIR") {
    Some(Profile::new_from_path(Path::new(&profile_dir))?)
} else {
    match options.profile {
        ProfileType::Named => None,
        ProfileType::Path(x) => Some(x),
        ProfileType::Temporary => Some(Profile::new(profile_root)?),
    }
};
You may refer to my Git commit to see the diff against the original Geckodriver code.

The new driver creates a new profile by default if no options are set. To use an existing profile, one way is to set the system property webdriver.firefox.profile before creating the Firefox driver. A small code snippet that creates a Firefox driver (given that you have the locations of geckodriver and the Firefox profile):
System.setProperty("webdriver.gecko.driver","path_to_gecko_driver");
System.setProperty("webdriver.firefox.profile", "path_to_firefox_profile");
WebDriver driver = new FirefoxDriver();
You could even set these system properties using the env. variables and skip defining them everywhere.
Another way to do this is to use the FirefoxOptions class which allows you to configure a lot of options. To start with, take a look at org.openqa.selenium.firefox.FirefoxDriver and org.openqa.selenium.firefox.FirefoxOptions. A small example:
FirefoxOptions options = new FirefoxOptions();
options.setProfile(new FirefoxProfile(new File("path_to_your_profile")));
WebDriver driver = new FirefoxDriver(options);
Hope this is helpful.

You can create a clean Firefox profile and name it SELENIUM.
Then, when initializing the WebDriver, load the profile you have already created through code, so that it won't create a new temp profile every time.
ProfilesIni allProfiles = new ProfilesIni();
FirefoxProfile desiredProfile = allProfiles.getProfile("SELENIUM");
WebDriver driver = new FirefoxDriver(desiredProfile);
That way, you assure that this profile will be used anytime you do the tests.
-Arjun

You can handle this by using --
FirefoxProfile profile = new FirefoxProfile(new File("D:\\Selenium Profile..."));
WebDriver driver = new FirefoxDriver(profile);
There is one more option but it inherits all the cookies, cache contents, etc. of the previous uses of the profile let’s see how it will be --
System.setProperty("webdriver.firefox.profile", "MySeleniumProfile");
WebDriver driver = new FirefoxDriver(...);
Hope this answers your question in short.

Robot Framework - set Protected Mode settings for IE

I'm encountering the following error: "Unexpected error launching Internet Explorer. Protected Mode settings are not the same for all zones. Enable Protected Mode must be set to the same value (enabled or disabled for all zones)." when opening IE using Selenium WebDriver.
In Java (using selenium-server 3.8.1), I solved this by using:
InternetExplorerOptions options = new InternetExplorerOptions();
options.setCapability(InternetExplorerDriver.INTRODUCE_FLAKINESS_BY_IGNORING_SECURITY_DOMAINS, true);
driver = new InternetExplorerDriver(options);
How do I do this for Robot Framework (using Java port of SeleniumLibrary: robotframework-seleniumlibrary-3.8.1.0-jar-with-dependencies)?
${ie_options}=    Create Dictionary    InternetExplorerDriver.INTRODUCE_FLAKINESS_BY_IGNORING_SECURITY_DOMAINS=true
Open Browser    ${url}    ie    None    None    ${ie_options}    None
I tried the one above but I still encounter the error. Changed it to ignoreProtectedModeSettings to no avail. Any ideas?

I have written a custom keyword which updates the Windows Registry to enable Protected Mode for all zones.
Below is the Python code:
from winreg import *

def Enable_Protected_Mode():
    """
    # 0 is the Local Machine zone
    # 1 is the Intranet zone
    # 2 is the Trusted Sites zone
    # 3 is the Internet zone
    # 4 is the Restricted Sites zone
    # CHANGING THE SUBKEY VALUE "2500" TO DWORD 0 ENABLES PROTECTED MODE FOR THAT ZONE.
    # IN THE CODE BELOW THAT VALUE IS WITHIN THE "SetValueEx" FUNCTION AT THE END AFTER "REG_DWORD".
    """
    try:
        keyVal = r'Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\1'
        key = OpenKey(HKEY_CURRENT_USER, keyVal, 0, KEY_ALL_ACCESS)
        SetValueEx(key, "2500", 0, REG_DWORD, 0)
    except Exception:
        print("Failed to enable protected mode")
You can write the same code in Java. Check here for more help.
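A rough Java counterpart, as a sketch only: Java has no standard registry API, so this version assumes the Windows `reg add` command-line tool is available and shells out to it. The class and method names are hypothetical, and looping over zones 1 to 4 is an assumption (the Python snippet above only touches zone 1).

```java
import java.io.IOException;
import java.util.List;

/** Sketch: builds and runs "reg add" commands that set value 2500 to DWORD 0. */
final class ProtectedModeRegistry {
    private static final String ZONES_KEY =
            "HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Internet Settings\\Zones\\";

    /** Returns the command line for one zone (0..4), without running it. */
    static List<String> regAddCommand(int zone) {
        if (zone < 0 || zone > 4) {
            throw new IllegalArgumentException("zone must be 0..4, got " + zone);
        }
        // /v = value name, /t = type, /d = data, /f = overwrite without prompting
        return List.of("reg", "add", ZONES_KEY + zone,
                "/v", "2500", "/t", "REG_DWORD", "/d", "0", "/f");
    }

    /** Runs the command for zones 1..4 (assumed range); Windows only. */
    static void enableProtectedModeForZones1To4() throws IOException, InterruptedException {
        for (int zone = 1; zone <= 4; zone++) {
            Process process = new ProcessBuilder(regAddCommand(zone)).inheritIO().start();
            if (process.waitFor() != 0) {
                throw new IllegalStateException("reg add failed for zone " + zone);
            }
        }
    }
}
```

Calling `ProtectedModeRegistry.enableProtectedModeForZones1To4()` before starting the IE driver would apply the same value to every zone, which is what the error message asks for.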

To do this directly in the Robot Framework:
${ie_dc} =    Evaluate
...    sys.modules['selenium.webdriver'].DesiredCapabilities.INTERNETEXPLORER
...    sys, selenium.webdriver
${ieOptions} =    Create Dictionary    ignoreProtectedModeSettings=${True}
Set To Dictionary    ${ie_dc}    se:ieOptions    ${ieOptions}
Open Browser    ${url}    ie    desired_capabilities=${ie_dc}
At some point the ignoreProtectedModeSettings got placed inside the se:ieOptions dictionary within the capabilities dictionary. You can see this if you debug Selenium's Python library, specifically webdriver/remote/webdriver.py and look at the response in start_session.

I was facing the same issue and tried Dinesh Pundkar's answer, but it did not work. Finally, I found https://stackoverflow.com/a/63543398/3297490 and it worked like a charm.
One thing to note, however: after running the VBS script I checked the IE settings, and the Protected Mode settings were still displayed as before; they had not actually returned to the normal levels.

Is there a non-remote way to specify geckodriver location when you cannot specify it by System property or Path?

In my application I cannot set geckodriver executable location using System.setProperty and I cannot set it in the path.
Why? Because my app is multi-tenant, and each tenant has its own directory where Firefox and Geckodriver are copied and run. This is due to bugs in Firefox + Geckodriver, where infinite JavaScript loops and several other situations cause Firefox to hang until it is killed manually. Sometimes quit fails to kill things completely as well. So we need to supply a custom geckodriver location within the JVM per tenant. Thus the problem.
So I am instead using:
driverService = new GeckoDriverService.Builder()
    .usingDriverExecutable(new File(geckoDriverBinaryPath))
    .build();
driverService.start();
RemoteWebDriver driver = new RemoteWebDriver(driverServiceUrl, capabilities);
But that is making me use a RemoteWebDriver when I am not remote.
Is there a better way to do this?

Rather than calling start() on the GeckoDriverService object, why not simply use the FirefoxDriver constructor that takes the service?
driverService = new GeckoDriverService.Builder()
    .usingDriverExecutable(new File(geckoDriverBinaryPath))
    .build();
WebDriver driver = new FirefoxDriver(driverService);

As the question stands, it is still too broad. There are some unknowns: how are you running this (JUnit, Maven, Jenkins)? I am also still not clear where this per-tenant geckoDriverBinaryPath comes from and how it is passed around.
What is wrong with just using:
System.setProperty("webdriver.gecko.driver", geckoDriverBinaryPath);
You could set an environment variable in your OS, something like export geckoDriverBinary=/some/path, and then read it back in your code using:
String geckoDriverBinaryPath = System.getenv("geckoDriverBinary");
System.setProperty("webdriver.gecko.driver", geckoDriverBinaryPath);
...
If you are running it from the command line, either directly or through Maven, you could pass the variable in as a JVM property, like -DgeckoDriverBinary=/some/path, and then read it back in your code using:
String geckoDriverBinaryPath = System.getProperty("geckoDriverBinary");
System.setProperty("webdriver.gecko.driver", geckoDriverBinaryPath);
...
If the different tenants have the path fixed, you could write a utility function that would detect which tenant it is being run on, and set the property accordingly.
This answer is probably going to get closed as not-answer, but more of a discussion. :(
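The lookup order discussed in this answer (JVM property, then environment variable, then a fixed per-tenant location) can be sketched as a small utility. Everything here is illustrative: the class name, the `tenantsRoot/<tenantId>/geckodriver` layout, and the tenant id parameter are assumptions, not something the question specifies. The sketch returns a path instead of calling System.setProperty, so it does not change JVM-wide state.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;

/** Sketch: resolves a geckodriver path without mutating global system properties. */
final class GeckoDriverLocator {
    // Same name as used in the answer above for both the property and the env variable.
    private static final String KEY = "geckoDriverBinary";

    /**
     * Resolution order: JVM property, environment variable, then the
     * (hypothetical) per-tenant directory tenantsRoot/tenantId/geckodriver.
     */
    static Path resolve(Properties props, Map<String, String> env,
                        Path tenantsRoot, String tenantId) {
        String fromProperty = props.getProperty(KEY);
        if (fromProperty != null && !fromProperty.isEmpty()) {
            return Paths.get(fromProperty);
        }
        String fromEnv = env.get(KEY);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return Paths.get(fromEnv);
        }
        return tenantsRoot.resolve(tenantId).resolve("geckodriver");
    }
}
```

A caller could use `GeckoDriverLocator.resolve(System.getProperties(), System.getenv(), Paths.get("/opt/tenants"), "tenant-a")` and hand the result's `toFile()` to `usingDriverExecutable(...)`, which keeps the location scoped to one driver service rather than to the whole JVM.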

Selenium WebDriver - changing browser's language on linux

Opening a browser (Google Chrome) in a different language through Selenium WebDriver works fine when running on a PC, as described here. But on Linux-based systems or macOS it simply doesn't work, and the browser opens in its default language. I tried different language codes, such as "es_ES" or "es-ES" instead of "es", but nothing helped. Does Linux need a different language code, or is there a different way to configure the web driver that does not use the "--lang" argument?
Thanks.

I haven't tried it, but I think you can change the setting in Chrome itself:
Settings -> Languages -> Add languages.
Add your language there and give it a try. Remove other languages if required.
For IE, refer to the link below:
http://www.reliply.org/info/internet/http-accept-lang.html
I also found some code at the link you shared. Have you tried it?
DesiredCapabilities jsCapabilities = DesiredCapabilities.chrome();
ChromeOptions options = new ChromeOptions();
Map<String, Object> prefs = new HashMap<>();
prefs.put("intl.accept_languages", language);
options.setExperimentalOption("prefs", prefs);
jsCapabilities.setCapability(ChromeOptions.CAPABILITY, options);
Source: Set Chrome's language using Selenium ChromeDriver

Maybe you also need to set prefs > intl > accept_languages: en-GB
"desiredCapabilities": {
  "browserName": "chrome",
  "chromeOptions": {
    "args": ["--lang=en-GB"],
    "prefs": {
      "intl": {
        "accept_languages": "en-GB"
      }
    }
  }
}

As you can read at developer.chrome.com, the way to set the language for Chrome is system-dependent. On Linux, an environment variable is required.
I created a shell script like this:
#!/bin/sh
LANGUAGE="en" "/home/plap/projects/pdf-exporter/chromedriver" "$@"
Then I use the path of the script in the place of the path of the real chromedriver executable.
Moreover, since I need to switch languages programmatically, I wrote code that generates a script like this for each new language required. The code then calls:
System.setProperty("webdriver.chrome.driver", executableFile.getAbsolutePath());
in a synchronized block together with new ChromeDriver(options)... Yes, it's horrible!
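Generating such a wrapper from Java could look roughly like the sketch below. It is an illustration only: the class name, the `chromedriver-<language>.sh` file name, and the language-code pattern are assumptions. The script body mirrors the one shown above, with `"$@"` used to forward arguments.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: writes one wrapper script per language that sets LANGUAGE for chromedriver. */
final class ChromedriverWrapper {

    /** Returns the script text; rejects language codes that are not like "es", "es_ES" or "es-ES". */
    static String script(String language, String chromedriverPath) {
        if (!language.matches("[A-Za-z]{2,3}([_-][A-Za-z]{2})?")) {
            throw new IllegalArgumentException("unexpected language code: " + language);
        }
        return "#!/bin/sh\n"
                + "LANGUAGE=\"" + language + "\" \"" + chromedriverPath + "\" \"$@\"\n";
    }

    /** Writes the script into dir, marks it executable, and returns its path. */
    static Path write(Path dir, String language, String chromedriverPath) {
        Path file = dir.resolve("chromedriver-" + language + ".sh");
        try {
            Files.write(file, script(language, chromedriverPath).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!file.toFile().setExecutable(true)) {
            throw new IllegalStateException("could not mark " + file + " as executable");
        }
        return file;
    }
}
```

The returned path can then be used wherever the real chromedriver path was expected, for example as the value of webdriver.chrome.driver.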

Using Selenium WebDriver JavascriptExecutor to manipulate one JS variable in separate scripts

I need to implement the following scenario:
1) Initialize a JS variable with JavascriptExecutor that will indicate whether some operation is done.
2) Do some ordinary manipulation of the rendered page.
3) Verify the change to the variable created in step (1).
For example:
jsc.executeScript("var test = false;");
Now, some manipulation is done.
And then:
String testVal = jsc.executeScript("return test;").toString();
I get the error:
org.openqa.selenium.WebDriverException: {"errorMessage":"Can't find
variable: test","request":{"headers":{"Accept":"application/json,
image/png","Connection":"Keep-Alive","Content-Length":"35","Content-Type":"application/json;
charset=utf-8","Host":"localhost:14025"},"httpVersion":"1.1","method":"POST","post":"{\"args\":[],\"script\":\"return
test;\"}","url":"/execute","urlParsed":{"anchor":"","query":"","file":"execute","directory":"/","path":"/execute","relative":"/execute","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/execute","queryKey":{},"chunks":["execute"]},"urlOriginal":"/session/7e2c8ab0-b781-11e4-8a54-6115c321d700/execute"}}
When I run them in the same execution, it works correctly:
String testVal = jsc.executeScript("var test = false; return test;").toString();
In the JavascriptExecutor documentation I found the explanation I needed:
Within the script, use document to refer to the current document. Note
that local variables will not be available once the script has
finished executing, though global variables will persist.
What is my alternative/workaround to this?

I'm not sure what the motivation behind it is, but you can use the globally available window object:
jsc.executeScript("window.test = false;");
String testVal = jsc.executeScript("return window.test;").toString();
It might also be a use case for executeAsyncScript(), see:
Understanding execute async script in Selenium
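The window-based workaround can be wrapped in a small helper so the variable name and value are passed as script arguments instead of being concatenated into the script text. This is only a sketch: `ScriptRunner` and `JsGlobals` are hypothetical names, and `ScriptRunner` is a stand-in with the same shape as `JavascriptExecutor.executeScript(String, Object...)` so that the example compiles without Selenium on the classpath.

```java
/** Hypothetical stand-in shaped like JavascriptExecutor.executeScript(String, Object...). */
interface ScriptRunner {
    Object run(String script, Object... args);
}

/** Sketch: stores and reads values on window so they survive between script calls. */
final class JsGlobals {
    private final ScriptRunner runner;

    JsGlobals(ScriptRunner runner) {
        this.runner = runner;
    }

    /** Assigns window[name] = value; arguments[...] refers to the values passed after the script. */
    void set(String name, Object value) {
        runner.run("window[arguments[0]] = arguments[1];", name, value);
    }

    /** Returns the current value of window[name]. */
    Object get(String name) {
        return runner.run("return window[arguments[0]];", name);
    }
}
```

With a real driver this would be constructed as `new JsGlobals(jsc::executeScript)`, after which `globals.set("test", false)` and a later `globals.get("test")` refer to the same global. The value is still lost when the page navigates, since a new document gets a new window object.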

How to prevent detection with Selenium in Java? [duplicate] - java

ChromeDriver:
Finally discovered the simple solution for this with a simple flag! :)
--disable-blink-features=AutomationControlled
navigator.webdriver=true will no longer show up with that flag set.
For a list of things you can disable, check them out here

Nowadays you can accomplish this with a CDP command:
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": """
        Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined
        })
    """
})
driver.get(some_url)
By the way, you want to return undefined; false is a dead giveaway.

Simple hack for Python:
options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features=AutomationControlled")

As mentioned in the comment above (https://stackoverflow.com/a/60403652/2923098), the following option worked for me in Java:
ChromeOptions options = new ChromeOptions();
options.addArguments("--incognito", "--disable-blink-features=AutomationControlled");

For those of you who've tried these tricks, please make sure to also check that the user-agent that you are using is the user agent that corresponds to the platform (mobile / desktop / tablet) your crawler is meant to emulate. It took me a while to realize that was my Achilles heel ;)

Use --disable-blink-features=AutomationControlled to disable navigator.webdriver
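Since the question is about Java, the two ideas from the comments above (the Chrome flag and the CDP script injection) can be collected in one place. The sketch below only builds the arguments, so it compiles without Selenium; the class name `WebdriverFlagPatch` is hypothetical. It assumes Selenium 4, where `ChromeDriver` offers `executeCdpCommand(String, Map<String, Object>)` and `ChromeOptions` offers `addArguments(List<String>)`.

```java
import java.util.List;
import java.util.Map;

/** Sketch: values to pass to ChromeOptions and to the CDP command (assumes Selenium 4). */
final class WebdriverFlagPatch {
    /** CDP method that registers a script to run before any page script on each new document. */
    static final String CDP_COMMAND = "Page.addScriptToEvaluateOnNewDocument";

    /** Parameters for CDP_COMMAND: redefine navigator.webdriver to return undefined. */
    static Map<String, Object> cdpParams() {
        String source =
                "Object.defineProperty(navigator, 'webdriver', { get: () => undefined });";
        return Map.of("source", source);
    }

    /** Chrome command-line arguments mentioned in the comments above. */
    static List<String> chromeArguments() {
        return List.of("--disable-blink-features=AutomationControlled");
    }
}
```

With a real driver, the intended use would be `options.addArguments(WebdriverFlagPatch.chromeArguments())` before creating the `ChromeDriver`, then `driver.executeCdpCommand(WebdriverFlagPatch.CDP_COMMAND, WebdriverFlagPatch.cdpParams())` before the first `driver.get(...)`, so the script is in place before the page's own checks run.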
