I want to open a webpage (whose URL is given as the commandline argument) and then want to save the content of that webpage as a .txt file.
Remember, I need the .txt file and not the source of the webpage.
I tried my hand with Selenium and it works fine. But now I want something that doesn't open a real browser, as opening the browser and loading a page in it is a time-consuming task.
I want to do it in Java.
By content, I mean the text (without markups) which we get when we save a webpage in IE by going to "Save As" and then selecting ".txt" as the output format of the file.
If I understand your question correctly, you want to render the page and copy the rendered text without opening a browser window.
For this, you'll need a headless browser. HTMLUnit would be a good choice.
To get the text content, you could do it like this (not tested) :
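// Classes come from com.gargoylesoftware.htmlunit (HtmlPage is in its .html subpackage).
WebClient c = new WebClient(BrowserVersion.INTERNET_EXPLORER_6);
// An HTML URL yields an HtmlPage, not a TextPage (that cast would fail at runtime).
HtmlPage page = c.getPage("yoururl");
String content = page.asText(); // visible text, without markup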
(see Javadoc)
Hmm, I'd even code that from scratch; it does not seem like a complex thing, and it might not even be worth adding a dependency on another library to your project:
Open a URLConnection to that URL
Get a stream from the connection and apply a regex to strip all the HTML from the data. If the page is not expected to be too large for your memory requirements :) read the page into a String and then apply the regex. Alternatively, give a shot to what's described here (I have no experience with the approach described there, though).
Save output to a txt.
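The three steps above could be sketched roughly like this. It is a minimal, untested-against-real-sites sketch: the class name PageToText, the default output name page.txt and the regexes are my own assumptions, the page is assumed to be UTF-8, and a regex is only a crude way to strip markup (it does not decode entities such as &amp; and will not match what a browser's "Save As .txt" produces on complicated pages).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PageToText {
    // Crude tag stripper: drops script/style blocks first, then any remaining tags,
    // and finally collapses runs of whitespace into single spaces.
    static String stripTags(String html) {
        return html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]+>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java PageToText <url> [output.txt]");
            return;
        }
        String out = args.length > 1 ? args[1] : "page.txt"; // hypothetical default name
        String html;
        // Step 1 and 2: open a URLConnection and read the whole page into a String.
        try (InputStream in = new URL(args[0]).openConnection().getInputStream()) {
            html = new String(in.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        }
        // Step 3: save the stripped text.
        Files.write(Paths.get(out), stripTags(html).getBytes(StandardCharsets.UTF_8));
    }
}
```

Note that this only sees the HTML the server sends; anything the page builds with JavaScript needs a headless browser instead.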
I am working on selenium. I need to print an html link in the form of link/hyperlink.
For example:
System.out.println("https://www.google.co.in");
Reporter.log("https://www.google.co.in");
The above prints the Google link as plain text, but how can we print it as a clickable hyperlink?
Is there any possible way to do this in Selenium and also in Java?
If you use System.out.println(...) then you can only output raw text to the standard output. Now, if you want to be able to click on links in this standard output, it is very important to know how you're viewing this output. The standard way would be on the console or, with logging, maybe in a log file.
Links have to be supported by whatever displays those links if you want to be able to click them; some consoles do that automatically but many don't. And the same goes for text viewers. This has nothing to do with Java or Selenium.
What may make sense is if you're creating HTML files and viewing those with Selenium; in that case, you could of course create links, as any HTML browser will support them. But I doubt that's what you're currently attempting. (Correct me if I'm wrong.)
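To make that last case concrete, here is a minimal sketch of producing link markup from Java. The class and method names (LinkMarkup, link, escape) are hypothetical, made up for illustration. The string it returns only becomes clickable when whatever displays it renders HTML (an HTML report or page, for instance); on a plain console it will still show up as raw tags.

```java
public class LinkMarkup {
    // Escapes the characters that are special inside HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Builds an anchor element whose visible text is the URL itself.
    static String link(String url) {
        String safe = escape(url);
        return "<a href=\"" + safe + "\">" + safe + "</a>";
    }

    public static void main(String[] args) {
        // prints <a href="https://www.google.co.in">https://www.google.co.in</a>
        System.out.println(link("https://www.google.co.in"));
    }
}
```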
I have simple code to execute commands from cmd in Windows XP.
I would like to display the output in the IE and Chrome browsers instead of Notepad.
Thank you for the tips,
PrintWriter output = new PrintWriter(
("C:\\Documents and Settings\\jszpakow\\Desktop\\ping.txt"));
Thank you very much for the advice. I'm new here, so I know that in the future I should be more specific. There's nothing wrong with Notepad++ or with a browser as a text viewer (however, when I create an HTML file the text is not raw like in Notepad).
My idea was to avoid opening CMD each time and copying the ping output into my case notes, which live in a browser-based system (built on Liferay).
My problem is that I need to paste this ping output into a specific textarea field of my case notes in the browser tab, but the URL and the textarea ID are different each time.
(source html) textarea id="xx:caseViewForm:caseViewTabView:caseNotesInput"
So maybe I can send the output to the clipboard buffer and paste it using Ctrl+V.
The other thing: when I tried to use an XML or DOCX file as the output, the file is created but I can't open it (I get a "corrupted file" message).
I am assuming that you don't want to write some sort of web server or web service. If you're just double-clicking on a file to see its output, give the output text file an .html extension instead.
You can even print out html tags and format the text to make it look nicer.
PrintWriter output = new PrintWriter("C:\\Documents and Settings\\jszpakow\\Desktop\\ping.html");
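If you do write the output as .html, one way to keep the line breaks and column alignment is to wrap the text in a pre element. The following is only a sketch under my own assumptions: the class name TextAsHtml and the helper toHtml are hypothetical, and the sample text in main is a placeholder, not real command output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class TextAsHtml {
    // Escapes HTML-special characters and wraps the text in <pre> so that
    // the browser shows line breaks and spacing exactly as in the raw text.
    static String toHtml(String text) {
        String escaped = text.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        return "<html><body><pre>" + escaped + "</pre></body></html>";
    }

    public static void main(String[] args) throws IOException {
        String sample = "first line\n  second line, indented <with brackets>\n"; // placeholder text
        String html = toHtml(sample);
        if (args.length < 1) {
            System.out.println(html); // no output path given: just show the markup
        } else {
            Files.write(Paths.get(args[0]), html.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```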
If you mean that you'd like to use a web browser as a text file viewer, browsers can do that by default. The URL format is file:///$path (with forward slashes and spaces encoded as %20), so in this case:
file:///C:/Documents%20and%20Settings/jszpakow/Desktop/ping.txt
Renaming the file extension to .html would make your browser the default viewer, but you'll lose the formatting of the raw text, because HTML collapses line breaks and runs of spaces.
However, if you mean you'd like to make the output available to other people on the web, you'd need a publicly-accessible web server to upload to. This means you'd either have to install and configure WAMP on your local machine, or get a web hosting account and FTP the file.
What's wrong with Notepad? You haven't used Notepad++, have you? Use it; it will be far better than any browser.
I have a JSF web application, and at some point I present the client with a big chunk of information. I want to have a "Save as" link that allows the client to save this information on their computer as a .txt file.
Information on how to achieve this or a good tutorial would be great.
Does this work for you? You would probably need to set the Content-Type to "application/octet-stream"; otherwise the client's browser will display your text file instead of offering the option to "Save as".
I believe your best bet may be to have that link actually generate an Ajax call to generate the text file and set it as the src attribute of an iframe on the page. That will trigger (I think) the file download box.
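For the header side of this, here is a small sketch of the response headers a download link would typically send. The class and method names (DownloadHeaders, forTextFile) are hypothetical, and the code deliberately stops short of the servlet/JSF plumbing, since that needs the container: in a real application you would copy these values onto the response (for example with HttpServletResponse.setHeader). Besides the octet-stream content type, a Content-Disposition header with "attachment" is what usually prompts the "Save as" dialog. The sketch assumes a plain ASCII file name without quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DownloadHeaders {
    // Returns the headers that ask the browser to save the body as a file
    // rather than display it. Assumes filename is plain ASCII without quotes.
    static Map<String, String> forTextFile(String filename) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "application/octet-stream");
        headers.put("Content-Disposition", "attachment; filename=\"" + filename + "\"");
        return headers;
    }

    public static void main(String[] args) {
        // Print the headers for a hypothetical file name.
        forTextFile("information.txt")
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```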
I generate an HTML file using a log4j WriterAppender. I also take snapshots of my screen using WebDriver. Now I wish to append them together.
Any idea how to do that?
Thanks!
Apologies for being unclear and daft. My situation is that I have an HTML file which is generated dynamically by my logger class, and there are some .png files which are also created dynamically. Now I want them to appear together in one file. Am I clear now? Please ask for more information if needed.
It's possible to embed graphics data in a couple of ways. Most modern browsers accept the data: url notation. An image can be embedded straight into a url.
I took an example from this site. Cut and paste the whole line into the url bar:
data:image/gif;base64,R0lGODlhEAAOALMAAOazToeHh0tLS/7LZv/0jvb29t/f3//Ub//ge8WSLf/rhf/3kdbW1mxsbP//mf///yH5BAAAAAAALAAAAAAQAA4AAARe8L1Ekyky67QZ1hLnjM5UUde0ECwLJoExKcppV0aCcGCmTIHEIUEqjgaORCMxIC6e0CcguWw6aFjsVMkkIr7g77ZKPJjPZqIyd7sJAgVGoEGv2xsBxqNgYPj/gAwXEQA7
You should see a folder graphic. Some older browsers don't accept this, and some, such as IE8, restrict it in various ways for security reasons (for example, to static content).
The second way of doing the same is for the server to serve multi-part MIME. Basically a server would shove out a multi-part mime document consisting of the HTML body and then any inline images base64 encoded as separate parts. This is more suitable for email HTML although it might work through a web browser.
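The data: URL approach could be sketched in Java along these lines. This is a minimal illustration, not a drop-in for any particular logger: the class name DataUriImage and the helper imgTag are hypothetical, and the alt text is assumed not to contain quotes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class DataUriImage {
    // Encodes the PNG bytes as Base64 and embeds them directly in an img tag,
    // so the HTML file no longer needs a separate image file.
    static String imgTag(byte[] png, String alt) {
        String base64 = Base64.getEncoder().encodeToString(png);
        return "<img src=\"data:image/png;base64," + base64 + "\" alt=\"" + alt + "\">";
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java DataUriImage <snapshot.png>");
            return;
        }
        byte[] png = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(imgTag(png, "snapshot"));
    }
}
```

Base64 makes the data roughly a third larger than the original file, so a log with many screenshots will grow quickly.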
It's not quite clear what you're asking here, but let's assume that you want to manually add an image to the log output HTML file.
If you want to include an image in your HTML file, just save the snapshot PNG file in a place relative to where the HTML is generated, then include it using standard HTML syntax:
<img src="images/snapshot.png" alt="snapshot description">
Update: the requirement is to add dynamically generated PNG files to a dynamically created HTML log file.
If one process is creating both the PNG and the log output, you should be fine - just keep note of the appropriate PNG filename and include it in the logger output in an IMG tag (as described above).
If they are generated by separate processes, this may be more difficult; you would need to either stick to a known naming convention, have the process generating the log query the filesystem to determine the appropriate PNG file to include, or build some sort of message-passing between the two processes.
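For the single-process case, a sketch of "keep note of the PNG filename and include it in an IMG tag" might look like the following. The names (SnapshotTag, imgTagFor) and the example paths are hypothetical; the idea is just to compute the image path relative to the HTML file so the pair can be moved together.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SnapshotTag {
    // Builds an img tag whose src is the PNG's path relative to the
    // directory that contains the HTML log file.
    static String imgTagFor(Path htmlFile, Path pngFile, String alt) {
        Path htmlDir = htmlFile.toAbsolutePath().getParent();
        Path relative = htmlDir.relativize(pngFile.toAbsolutePath());
        // URLs use forward slashes, also on Windows.
        String src = relative.toString().replace(File.separatorChar, '/');
        return "<img src=\"" + src + "\" alt=\"" + alt + "\">";
    }

    public static void main(String[] args) {
        // prints <img src="images/snapshot.png" alt="snapshot description">
        System.out.println(imgTagFor(Paths.get("logs/log.html"),
                Paths.get("logs/images/snapshot.png"), "snapshot description"));
    }
}
```

The returned string can then be passed to the logger like any other message, provided the appender's layout does not escape HTML.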
Please stop posting the same comment to each and any of the different answers given to you, when all of those answers basically tell you that the notion of concatenating two different file formats into a single file is not meaningful.
Let me repeat that again for clarity: copying a PNG file into an HTML document makes no sense.
You can save the PNG in a directory where it's accessible from the HTML document and add an img tag so it can be referenced (see the answer by stark). That would be the recommended way in terms of portability and of using the files as they were intended to be used.
If you really, really want to end up with a single file for whatever reason, there are basically two options. The first is to follow the advice of locka: encode the PNG image with Base64 and insert an img tag with a data URI at a meaningful position. This probably involves parsing the HTML "a little" to come up with a good place to insert it.
The other option is to not create HTML, but MHTML files. MHTML is a file format that allows saving HTML source code and resources like images into a single file. MHTML is supported by the most popular browsers nowadays; you may find info on the file format here: http://people.dsv.su.se/~jpalme/ietf/mhtml.html
In the code where you are generating the html you should just include the img using the img html tag
If you want the picture to appear in the HTML, add the tag
<img src="./img.png" /> to your HTML.
If you want the 2 files in one, you'll need to zip them into an archive or something?
It makes no sense to append an HTML file to a PNG file, or vice versa. Neither file format allows this, so if you do this you will end up with a "corrupt" document that a typical web browser or image viewer won't understand.
"I want them to appear together in one file".
That's still pretty vague, I'm afraid.
Assuming that you want the image to appear embedded in the HTML document when you open the HTML document in a browser, the simple solution is to create separate HTML and PNG files, and have the HTML file link to the PNG file using an <img> element.
If you want, you can bundle up the files (and others) as a ZIP or TAR file, so that you can deliver everything as a single file. However, a ZIP/TAR file typically needs to be extracted before the document can be viewed. (A typical web browser won't "display" a ZIP file. Rather it will open it in some kind of archive extractor or directory browser, allowing the user to access the individual files.)
It might also be possible to embed an image file in an HTML file by base64 encoding the image, and using embedded JavaScript to decode the image and then insert it into the DOM ... But this is probably way too complicated.
I have a Java applet that creates a JPEG file. I want to pass that file to a Javascript where it can display and print it. The only way I can think of doing this is to save the JPEG to a temporary storage area on the user's computer and then pass the path of the file to the Javascript, which picks it up and displays it. This raises two questions:
Where should the applet store the file? If you suggest the temporary internet files folder, then how do I find the path to that folder?
Is there a better way to do this? Can I pass the JPEG directly from java to javascript without first writing out to a disk?
Thank you in advance for your help.
To store a file on the user's machine your applet should be signed, and the user should grant the necessary permissions to your applet (through a special dialog window which is shown automatically).
Read this article about modifying DOM from applet
Another approach is to save your image on the server (pass it from your applet to the server) and then reload page (or use Ajax, but in this case you probably have to make ajax calls every few seconds to check if the image is available on the server).
Can't you just have an applet that displays the picture and prints it?
I don't think it'd be possible to do this in IE before IE8 (and it's wimpy even in IE8), but in other browsers your applet could make the image data available to Javascript (please don't say, "a Javascript"; it's like saying, "a FORTRAN" or "a Java") and then from Javascript you could create an <img> tag with a "data URI". See this reference: http://en.wikipedia.org/wiki/Data_URI_scheme
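On the Java side, producing such a data URI without writing the JPEG to disk could look roughly like this. It is a sketch under my own assumptions: the class and method names (JpegDataUri, toDataUri) are hypothetical, the image is assumed to be available as a BufferedImage, and how the string is handed over to the page script (for example through a public applet method) is left out.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.imageio.ImageIO;

public class JpegDataUri {
    // Encodes the image as JPEG entirely in memory and returns it as a
    // data URI that can be used as the src of an <img> element.
    static String toDataUri(BufferedImage image) {
        ImageIO.setUseCache(false); // keep ImageIO from using a temporary cache file
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(image, "jpg", buffer)) {
                throw new IOException("no JPEG writer available for this image type");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return "data:image/jpeg;base64,"
                + Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    public static void main(String[] args) {
        // A blank 16x16 image stands in for whatever the applet really draws.
        BufferedImage image = new BufferedImage(16, 16, BufferedImage.TYPE_INT_RGB);
        System.out.println(toDataUri(image));
    }
}
```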