I need to automate this process:
- log in to a website
- navigate to a page
- retrieve a piece of information
This should happen in a Java program. I know there are some open source Java libraries (JSpider, crawler4j and others) but I cannot tell whether they fit my needs.
Is a spider/crawler what I need? If not, what do I need, and does it already exist?
Related
I have created a Java program that utilizes Chrome Driver, Selenium, and Java Excel API. The program is used to automate a few different processes on Google Chrome. Currently, setting up this automation is more complicated than I would like it to be: the user needs to download a zipped folder, unzip it, download the Java Runtime Environment, and launch the program using the executable.
My goal is to simplify the installation of the automation. Ideally, a user would come to a SharePoint website, fill out a form with the parameters of the automation (potentially upload an Excel Workbook), click an "execute" button, and the automation would run. As a result, the automation would run seamlessly across platforms (Windows and MacOS) without any modifications.
I have researched changing the programming language to achieve this functionality. I concluded that a different language could remove the need for a Java Runtime Environment download, but it would still require some type of installation process. Additionally, I have researched using HTML/JavaScript, but I concluded that this is not possible because the functionality (triggering a web automation from a website) could be used maliciously without the user's knowledge. Finally, I began researching containerization through Docker. This solution seems promising but I do not know enough about it to determine if it is the appropriate solution.
What would be the best route to achieve the results that I am looking for (outlined in the second paragraph)? I have access to enterprise-grade databases that I thought may be useful. Would it be possible to have the form trigger a virtual machine to run the automation on a remote database and then output the result to the user once it has finished?
Thank you in advance for any guidance you can provide. I do not know much about making a Java program into an enterprise-grade application so any information about what to research is extremely useful. Finally, please do not hesitate to correct my logic at any point in this question as I may have drawn the wrong conclusions from my research.
You want to look into packaging your Selenium code as a JAR file.
I want to build an application like TwitterCount. Does anyone know of an open source application like it? I am looking for an open source application that can show diagrams or graphs on a website. Specifically, I want to build a web application into which I can paste a URL and have it display some diagrams. My main difficulty is how to build a website like this, so I would like some open source code for reference. If it is implemented in Java, even better. Please give me some suggestions.
Twitter4J is a Twitter library for Java.
I have a random Twitter bot that I made if you want to see the source.
Here
Good luck!
My supervisor has tasked me with programmatically reducing a website's content by looking at the HTML tags to reveal only the core content. Importantly, this particular piece of the project must be written in Java.
Now having learnt about the differences between Plugins, Extensions, Applets, and Widgets, I think I want to use an Extension that calls a client-side Applet. My approach was going to be this:
1) Using the Google-Chrome API, I was going to display a button that the user can click.
2) If clicked, the action is to launch a new browser tab that has the Applet embedded within it.
3) The applet automatically sources the called tab's HTML code and filters it.
4) Once filtered, the reduced copy of the original site appears.
So I have a few questions. To start, is it even possible to use an Extension with an Applet? Moreover, is it possible for an applet to look at another tab's HTML code? If not, is it possible to just reload the original tab with the Applet now embedded within it and complete the function? Thanks.
JavaScript is already on most mobile web platforms. Java is not, and there is no reasonable way mobile customers will be able to install Java. Android, which runs on many but not all mobile devices, has a Java runtime environment and is basically a loader for Java apps. But an Apple iPhone is not an Android device... nor is a Windows Phone.
If you want to summarize content on the client, and in JavaScript, as I see it you have two choices:
1) Succeed with some inner burst of genius where dozens of the best expert PhDs in Natural Language Computing have just begun exploring how to extract "true meaning" from text; OR
2) look at document.title and be done with it.
The 2nd approach assumes that the authors of web pages set titles, and that each title is appropriate for summarizing the site. This isn't a perfect assumption, but it is OK most of the time. It is also a lot less expensive than #1.
With the 1st approach you can get a head start with a "natural language toolkit" that can do things like scan text for unusual words and phrases. To get a rough idea of the kinds of software that have been built in this area, review Wikipedia: Outline of natural language processing: Toolkits. A popular toolkit for Python is called NLTK. Whether you use a toolkit from Java or Python, it means working on the server, because the client will not have the storage, network speed, or CPU. For Python there are server-side app frameworks like Django or web2py that can make building out a server app faster, and for Java there are servlet frameworks. Ultimately you'll need a lot of help, training, or luck, and as I have hinted above, it can easily be beyond the capabilities of a small team of fresh hires, and certainly way beyond what a single new developer eager to prove his/her capabilities can do in a few weeks on their own with limited help.
Most web pages have titles set like this near the beginning of the downloaded HTML:
<head><title>My Furry Kittens!</title></head>
You don't need to write a parser. If you are running in the browser, the title has been parsed into the DOM or Document Object Model already. The string "My Furry Kittens!" in this example would be available in the global variable document.title.
If you like, you could put a button into a plugin and let people push it to summarize the website. Or they could just look up at the title, which is already on the page. Of course, if the goal is to scrape titles, one can avoid writing a parser and use a "fake" headless scriptable browser like PhantomJS or similar.
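Since the question asks for Java, here is a rough sketch of pulling the title out of already-downloaded HTML on the Java side. This is not a real HTML parser and not the browser's document.title: it is a regular-expression shortcut that assumes a single, well-formed <title> element and does not decode HTML entities. The class and method names (TitleSketch, extractTitle) are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a regex shortcut, not a full HTML parser.
class TitleSketch {

    // CASE_INSENSITIVE handles <TITLE>; DOTALL lets a title span several lines.
    // [^>]* tolerates attributes on the opening tag.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text of the first <title> element, with runs of whitespace
    // collapsed to single spaces, or empty if there is no usable title.
    static Optional<String> extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        String title = m.group(1).replaceAll("\\s+", " ").trim();
        return title.isEmpty() ? Optional.empty() : Optional.of(title);
    }
}
```

Calling TitleSketch.extractTitle on the kitten example above should give back the string My Furry Kittens!. For messy real-world pages a proper HTML parser or a headless browser is the safer choice.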
You can read more about document.title on the Mozilla Developer Network. MDN is a great reference for learning how web browsers work. They are the maintainers of the Mozilla Firefox browser. Most of what you can learn there will also work on Chrome, Internet Explorer, and various mobile platforms.
Good Luck!
How about implementing a local proxy server on the mobile device? The browser would just need to be configured to use the proxy, while the custom proxy implementation can transform the requested HTML however it likes.
How can I open a webpage from a Java application and enter a username and password into it? I have seen questions here where people have referred to Lobo or DJ Native Swing.
But as I am very new to Java, these libraries seem quite complex to me and I cannot find a good tutorial for them. Please point me to a good library with a solid tutorial that can serve as a starting point.
Note: I am developing a Java Swing application and want to show the user the page opening and the username and password being submitted.
What application do you develop? Web? Desktop/swing? Console? Mobile?
Take a look at Apache's HttpComponents project
It is not clear what you want to do.
1) Do you want to show it to the user?
Then see "WebKit browser in Java app on multiple platforms".
2) Do you want to fetch some data from a web page that requires login?
Then see "HttpClient login, search and get the XML content".
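For the second case (fetching data behind a login), a minimal sketch might look like the one below. It uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than Apache HttpComponents, and it assumes the site uses a plain HTML form login that sets a session cookie. The form field names username and password, and the class and method names, are hypothetical; check the real login form for the actual field names.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: form login followed by fetching a protected page.
class LoginFetchSketch {

    // Encode form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Build the POST that submits the login form.
    // ASSUMPTION: the form fields are called "username" and "password".
    static HttpRequest loginRequest(URI loginUri, String user, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", user);
        fields.put("password", password);
        return HttpRequest.newBuilder(loginUri)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }

    // Log in, then request the target page with the same client so that the
    // session cookie stored by the CookieManager is sent along.
    static String fetchAfterLogin(URI loginUri, URI pageUri, String user, String password)
            throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        client.send(loginRequest(loginUri, user, password),
                HttpResponse.BodyHandlers.discarding());
        HttpRequest pageRequest = HttpRequest.newBuilder(pageUri).GET().build();
        return client.send(pageRequest, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The returned string is the raw page body; extracting the piece of information you want from it is a separate step. Sites that log in via JavaScript, tokens hidden in the form, or a CAPTCHA will need more than this.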
Your question is not clear. Are you trying to write a Java application that will open a browser and enter data automatically?
If this is what you want, then you can use Selenium, which is available as a plugin for Mozilla Firefox. You can record your actions in the browser, and they will be saved as a JUnit test case.
You can then modify this JUnit test to read the username and password programmatically from a file (or whatever you are trying to do). This test can be run as a standalone Java application.
I would like to create a folder selector for my application which will only run on Internet Explorer Browsers (IE6+).
I would like to get the full folder path a user wishes to use via a HTML browse button or similar and then pass this to my server side application which is written in PHP! This can not be done via JavaScript for security reasons so I am looking for any other alternative that will work.
I had implemented a solution using a Java Applet but this did not work out as it didn't work on IE6, plus the browser security is locked down where I am deploying this app, meaning I am unlikely to get away with an applet.
The current solution is getting the user to paste the folder location into a text field, but this isn't acceptable any more.
Any implementation advice welcome!
Thanks all
If your app is IE only you can create a simple ActiveX Control with only one method:
HRESULT BrowseForFolder([out, retval] BSTR* folderName);
The ATL wizard will generate the major portion of the code.
But I would consider using Flash for this:
- It is more widely used than Java (I think so).
- ActiveX depends strongly on the browser's security options and is more annoying to install.
- Your task is rather small, so it can be implemented without deep skills, even if you are not familiar with Flash.