How to copy Google Translate's Chinese transliteration using Selenium? - java

I'm trying to extract Google Translate's pinyin transliteration of a Chinese word using Selenium but am having some trouble finding its WebElement.
For example, the word I look up is "事". My code would be as follows:
String word = "事";
WebDriver driver = new HtmlUnitDriver();
driver.get("http://translate.google.com/#zh-CN/zh-CN/" + word);
When I go to the actual page using my browser, I can see that its pinyin is "Shì" and that its id, according to Inspect Element is src-translit. However, when I go to view source, though the id="src-translit" is present, you don't see anything resembling "Shì" nearby. It's simply empty.
Thinking that the page had not had time to load properly, I implemented a waiting period of 30 seconds (kind of a long wait, I know, but I just wanted to know if it would work).
int timeoutInSeconds = 30;
WebDriverWait wait = new WebDriverWait(driver, timeoutInSeconds);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("src-translit")));
Unfortunately, even with the wait, the transliteration element's text still comes back empty.
WebElement transliteration = driver.findElement(By.id("src-translit"));
String pinyin = transliteration.getText();
My question, then, is: what's happened to the src-translit? Why won't it display in the html code and how can I go about finding it and copying it from Google Translate?

Sounds like javascript isn't being executed. Looking at the docs, you can enable javascript like this
HtmlUnitDriver driver = new HtmlUnitDriver();
driver.setJavascriptEnabled(true);
or
HtmlUnitDriver driver = new HtmlUnitDriver(true);
See if that makes a difference.
EDIT:
I still think the problem is related to javascript. When I run it using FirefoxDriver, it works fine: the AJAX request is made, and the src-translit element is updated with Shì.
Workaround:
In any case, monitoring the network traffic, you can see that when you want to translate 事 , it makes an AJAX call to
http://translate.google.com/translate_a/t?client=t&sl=zh-CN&tl=zh-CN&hl=en&sc=2&ie=UTF-8&oe=UTF-8&pc=1&oc=1&otf=1&rom=1&srcrom=1&ssel=0&tsel=0&q=%E6%B2%92%E4%BA%8B
Which returns JSON:
[[["事","事","Shì","Shì"]],,"zh-CN",,[["事",,false,false,0,0,0,0]],,,,[],10]
Maybe you could parse that instead for now.
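A minimal sketch of that parsing step, using only the JDK. The response above is not strict JSON (it contains empty array slots), so this pulls the quoted strings out of the first inner array with a regex instead of using a JSON parser. The class name TranslitParser and its methods are hypothetical, and treating the fourth string as the source transliteration is an assumption based on the sample response (there the third and fourth strings are identical).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium or any Google API.
final class TranslitParser {
    // Matches one double-quoted string, allowing backslash escapes inside it.
    private static final Pattern QUOTED =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the quoted strings of the first innermost array, in order. */
    static List<String> firstSentenceFields(String response) {
        List<String> fields = new ArrayList<>();
        Matcher m = QUOTED.matcher(response);
        int prevEnd = 0;
        while (m.find()) {
            // A ']' between two strings means the first inner array has closed.
            if (response.substring(prevEnd, m.start()).indexOf(']') >= 0) {
                break;
            }
            fields.add(m.group(1));
            prevEnd = m.end();
        }
        return fields;
    }

    /** Assumption: the fourth string is the source transliteration. */
    static String sourceTransliteration(String response) {
        List<String> fields = firstSentenceFields(response);
        return fields.size() >= 4 ? fields.get(3) : "";
    }
}
```

Calling TranslitParser.sourceTransliteration(body) on the response body shown above should give back the pinyin string; an empty string means the response did not have the assumed shape.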


Code sometimes works, sometimes doesn't; different error messages

I have the following issue: after logging in, the system loads the main page, which takes about 5 seconds. After that, the script should type into 3 fields and press Tab to load more information. The problem is that it sometimes works and sometimes does not; out of 5 attempts, only one worked. I don't know if it has to do with the time needed to load all the fields or something similar. I'm using IE 9 because the app only works on IE.
Here's the code:
`
System.setProperty("webdriver.ie.driver","C:\\Apps\\eclipse\\IEDriverServer.ex;
WebDriver driver = new InternetExplorerDriver();
driver.get("http://15.192.41.95/Cabina/asp/Login.asp");
WebElement text1 = driver.findElement(By.id("text1"));
text1.sendKeys("xxxx");
WebElement password1 = driver.findElement(By.id("password1"));
password1.sendKeys("xxxx");
WebElement aceptar = driver.findElement(By.id("ok1"));
aceptar.click();
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
driver.switchTo().frame(driver.findElement(By.name("menu")));
WebElement txtNumPolSol = driver.findElement (By.id("txtNumPolSol"));
txtNumPolSol.sendKeys("877885");
WebElement Text8 = driver.findElement(By.name("txtNumofic"));
Text8.sendKeys("228");
WebElement txtCveInc = driver.findElement(By.name("txtCveInc"));
txtCveInc.sendKeys("1");
WebElement clave = driver.findElement(By.id("txtCveInc"));
clave.sendKeys(Keys.TAB);
driver.switchTo().frame(driver.findElement(By.name("dest")));
WebElement txtNomrepo = driver.findElement(By.id("txtNomrepo"));
txtNomrepo.sendKeys("Jorge Villarreal");
driver.findElement(By.id("txtRelacion")).sendKeys("Conductor");
WebElement txtTelrepo = driver.findElement(By.id("txtTelrepo"));
txtTelrepo.sendKeys("83029090");`
Here are the different errors I got:
1) Unable to find element with name == txtOficina
2) Element is no longer valid
3) Unable to find element with id == txtCveInc (the field is there)
4) Unable to find element with name == txtCveInc (the field is there)
The steps leading up to the errors are:
1) Log in (so far so good)
2) The system loads the main page (the page has frames and loads all fields in about 5 seconds...)
3) The script types into txtNumPolSol, txtNumofic and txtCveInc (most of the errors are in the last two fields)
4) The script performs a tab
5) The system loads some information about the record and the script continues...
Note: Almost all the issues occur on step 3...
Thanks for your feedback!
This sounds like a timing problem. Your simulated user is proceeding faster than the page becomes ready. Depending on the timing of the page loading, different problems occur.
The solution is to add a wait after any step that triggers a DOM change that influences your next step but doesn't cause WebDriver to wait before returning. Google 'webdriver wait for element' to get lots of ways to do it.
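To illustrate the idea behind such a wait, here is a minimal polling sketch in plain Java. It is not Selenium's own WebDriverWait, just the same pattern: re-check a condition until it holds or a timeout expires. The class name Poll and its parameters are hypothetical; with WebDriver, the condition could be something like "driver.findElements(By.name("txtCveInc")) is not empty".

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper showing the polling pattern behind explicit waits.
final class Poll {
    /**
     * Re-checks the condition every intervalMillis until it holds or
     * timeoutMillis has elapsed. Returns true if the condition held in
     * time, false on timeout or if the thread was interrupted.
     */
    static boolean until(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
        }
    }
}
```

In a real test you would normally use WebDriverWait with an ExpectedConditions check rather than a hand-written loop; the sketch only shows why a wait placed after each DOM-changing step removes the dependence on page timing.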
I was also facing a similar problem, but I tried finding the elements with CSS selectors and XPaths instead of ids, and that worked for me.
The key here is to add at least 1 second of implicit wait after every action (sendKeys etc.). It will give driver.findElement enough time to find the element. I would suggest keeping this code in a test base class or a reusable method, though.
driver.manage().timeouts().implicitlyWait(1, TimeUnit.SECONDS);
It will help in making your webtests less flaky.

Selenium webdriver 2.47.1 how to wait for a page reload

I'm using WebDriver (Java) for a test where a page reloads itself when you log on. I've been unable to wait for the element to load because it is already there before the reload. So far the only way I've been able to get it to work is to use Thread.sleep. Is there a way to listen for a page refresh?
One way to solve this is to get a reference to an element you need that appears both on the login page and on the reloaded page.
Then you can wait on ExpectedConditions.stalenessOf for that element; once it succeeds, you can be sure that the element has been removed from the DOM and a new element is created. Well, the last part is not guaranteed by this method, but at least you know that the old element is gone.
The code could look something like this:
WebElement elementOldPage = driver.findElement(By.id("yourid"));
// ... do login etc. ...
WebDriverWait wait = new WebDriverWait(driver, 10);
wait.until(ExpectedConditions.stalenessOf(elementOldPage));
WebElement elementNewPage = driver.findElement(By.id("yourid"));
Building upon the accepted answer from Kim Schiller, one might be interested in the following piece of code. It is surely not perfect due to the sleeps, so feel free to suggest improvements to make it more bulletproof. Also note I'm no expert with Selenium.
The if branch waits for the top-level node in the HTML to go stale in case of a page reload. The else branch simply waits until the driver's URL matches the requested URL in case we load a different page.
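from time import sleep
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ExpectedConditions

# Assumes a module-level `driver` (a WebDriver instance) already exists.
def safe_page_load(url):
    if driver.current_url == url:
        # Same URL: wait for the old top-level node to go stale after the reload.
        tmp = driver.find_element_by_xpath('/html')
        driver.get(url)
        WebDriverWait(driver, 2).until(ExpectedConditions.staleness_of(tmp))
    else:
        # Different URL: poll until the driver reports the requested URL.
        driver.get(url)
        while driver.current_url != url:
            sleep(0.3)
    sleep(0.3)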
Happy if I could help someone.

WebDriver : SendKeys(Integer) NOT working in Firefox 29

I am using Firefox 29 and the WebDriver Java 2.41.0 bindings to automate test scenarios. I have one scenario that enters an integer into an input box, which was working fine with Firefox 28 and is now failing with v29, the latest Firefox version. The code I wrote for this is:
int inputString = 123456;
driver.FindElement(By.Id("tinymce")).SendKeys(inputString);
Please help me get past this.
This will be the result of this issue:
https://code.google.com/p/selenium/issues/detail?id=7291
Fixed by this revision in the Selenium code:
https://code.google.com/p/selenium/source/detail?r=afde40cbbf5c
The quick test below worked for me. I understand JS is not the right way to do browser simulation (one should always use WebDriver methods first, since they use the browser's native API), but I thought it would unblock you while the bug is being fixed in Selenium.
DesiredCapabilities desiredCapabilities = DesiredCapabilities.firefox();
desiredCapabilities.setCapability(CapabilityType.HAS_NATIVE_EVENTS,true);
WebDriver driver = new FirefoxDriver(desiredCapabilities);
driver.get("http://yizeng.me/2014/01/31/test-wysiwyg-editors-using-selenium-webdriver/");
WebElement frame = driver.findElement(By.id("tinymce-editor_ifr"));
driver.switchTo().frame(frame);
WebElement body = driver.findElement(By.id("tinymce"));
JavascriptExecutor js = (JavascriptExecutor)driver;
js.executeScript("arguments[0].innerHTML = '<h1>Heading</h1>Hello There'",body);
Can you check the following points:
Can you make sure there are no application changes? I mean, that the locator is returning a single node.
Can you also post the exception that you are getting? We need these details to troubleshoot the problem.
One thing I do before sending a value to a box is clear it. Also, the value I send is always a string; the parsing should be done by the page/code, since you need to validate what the user has entered.
But I do agree with skv, we need to see the actual error being thrown.
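On the "always send a string" point: in the Java bindings, WebElement.sendKeys is declared as sendKeys(CharSequence... keysToSend), so an int has to be converted to text first. A small sketch of that conversion follows; KeysText.type is a hypothetical stand-in for an input element that simply returns what would be typed, so the snippet runs without a browser.

```java
// Hypothetical stand-in for an input element; not part of Selenium.
final class KeysText {
    /** Mirrors the shape of sendKeys(CharSequence...) and returns the typed text. */
    static String type(CharSequence... keysToSend) {
        return String.join("", keysToSend);
    }
}
```

With a real element, the equivalent call would be element.sendKeys(String.valueOf(inputNumber)), where inputNumber is the int from the question (renamed here for clarity).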

How to get content to show up in WebDriver .get request?

I wanted to automate some processes on www.imgur.com, and I decided to use the Selenium WebDriver library for Java. I have been able to get much of my code to work with one hitch: when I access imgur directly, only a white screen shows up, and it will not change upon refresh. Accessing the sign-in page directly yields an SSL error.
System.setProperty("webdriver.chrome.driver","C:\\workspace\\Test\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
driver.get("https://www.imgur.com/signin");
WebElement username = driver.findElement(By.id("username"));
username.sendKeys("username");
WebElement password = driver.findElement(By.id("password"));
String pass = "password";
password.sendKeys(pass);
password.submit();
driver.get("http://www.imgur.com");
I have been able to work around this by using the links to imgur that Google searches provide, but adding more features will require that I be able to manage the URL directly.
Thanks in advance!
It's just http://imgur.com/, not http://www.imgur.com. That's why Google's links work, they are linking to the first one - a different url.
The www prefix is not required by any technical policy. Some choose to have urls both with and without the prefix point to the same server. Some choose to use only one or the other. It seems imgur is going without the prefix.
Here's a little more info on the www prefix:
http://en.wikipedia.org/wiki/World_Wide_Web#WWW_prefix
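To see that the two URLs really name different hosts, here is a small sketch using java.net.URI from the JDK. The class name HostCheck is hypothetical; the snippet only compares host names and makes no claim about how imgur's servers respond.

```java
import java.net.URI;

// Hypothetical helper for comparing the host part of two URLs.
final class HostCheck {
    /** Returns the host component, e.g. "www.imgur.com". */
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    /** True when both URLs point at exactly the same host name. */
    static boolean sameHost(String a, String b) {
        return hostOf(a).equalsIgnoreCase(hostOf(b));
    }
}
```

HostCheck.sameHost("http://www.imgur.com", "http://imgur.com/") is false, which is why a server or certificate configured for one name is not guaranteed to behave the same for the other.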

selecting pulldown in htmlunit

I am using HtmlUnit in Jython and am having trouble selecting a pulldown option. The page I am going to has a table with other AJAX links, and I can click on them and move around and it seems okay, but I can't figure out how to click on a pulldown menu that allows for more links on the page (this pulldown affects the AJAX table, so it's not redirecting me or anything).
Here's my code:
selectField1 = page.getElementById("pageNumSelection")
options2 = selectField1.getOptions()
theOption3 = options2[4]
This gets the option I want (I verified it's right), so I select it:
MoreOnPage = selectField1.setSelectedAttribute(theOption3, True)
And I am stuck here. I'm not sure if selecting it works or not because I don't get any message, and I'm not sure what to do next. How do I refresh the page to see the larger list? When clicking on links, all you have to do is find the link and then assign linkNameVariable.click() to a variable, and it works. But I'm not sure how to refresh a pulldown. When I try to use the webclient to create an XML page based on the select variable, I still get the old page.
To make it a bit easier, I used the HtmlUnit scripter and got some code that should work, but it's Java and I'm not sure how to port it to Jython. Here it is:
try
{
    page = webClient.getPage( url );
    HtmlSelect selectField1 = (HtmlSelect) page.getElementById("pageNumSelection");
    List<HtmlOption> options2 = selectField1.getOptions();
    // Find the option whose visible text is "100".
    HtmlOption theOption3 = null;
    for(HtmlOption option: options2)
    {
        if(option.getText().equals("100") )
        {
            theOption3 = option;
            break;
        }
    }
    selectField1.setSelectedAttribute(theOption3, true );
}
catch (Exception e)
{
    // Catch clause assumed; the original snippet ends before the try block is closed.
    e.printStackTrace();
}
Have a look at HtmlForm's getSelectByName:
HtmlSelect htmlSelect = form.getSelectByName("stuff[1].type");
HtmlOption htmlOption = htmlSelect.getOption(3);
htmlOption.setSelected(true);
Be sure that WebClient.setJavaScriptEnabled is called. The documentation seems to indicate that it is on by default, but I think this is wrong.
Alternatively, you can use WebDriver, which is a framework that supports both HtmlUnit and Selenium. I personally find the syntax easier to deal with than HtmlUnit.
If I understand correctly, the selection of an option in the select box triggers an AJAX call which, once finished, modifies some part of the page.
The problem here is that since AJAX is, by definition, asynchronous, you can't really know when the call is finished and when you may inspect the page again to find the new content.
HtmlUnit has a class named NicelyResynchronizingAjaxController, which you can pass an instance of to the WebClient's setAjaxController method. As indicated in the javadoc, using this AJAX controller will automatically make the asynchronous calls coming from a direct user interaction synchronous instead of asynchronous. Once the setSelectedAttribute method is called, you'll thus be able to see the changes made to the original page.
The other option is to use WebClient's waitForBackgroundJavaScript method after the selection is done, and inspect the page once the background JavaScript has ended or the timeout has been reached.
This isn't really an answer to the question because I've not used HtmlUnit much before, but you might want to look at Selenium, and in particular Selenium RC. With Selenium RC you are able to control the interactions with a page displayed in a native browser (Firefox, for example). It has developer APIs for Java and Python, amongst others.
I understand that HtmlUnit uses its own javascript and web browser rendering engine and I'm wondering whether that may be a problem.
