Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Does a web crawler return only the text extracted from web pages? Say there are also some PDF/DOC files stored on the web server: can a web crawler crawl through them and return their content as well? Also, what would you suggest as a good open-source Java web crawler?
Thank you!
A web crawler doesn't extract the text itself. It simply returns the HTML with some transformations (UTF-8 conversion, for example) applied.
Seen that way, the file type doesn't matter to the crawler at the first hop: it fetches whatever the URL points to. For multiple hops, of course, it needs to look inside the documents for further links, and typical crawlers don't follow links inside PDF/DOC files.
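To illustrate the first-hop versus multi-hop point, here is a minimal sketch, assuming a crawler that decides from the Content-Type header whether a fetched body can be scanned for further links. The class and method names are made up for this example, and the regex-based link extraction is a simplification; a real crawler would use a proper HTML parser, and PDF/DOC bodies would need a separate text extractor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from any particular crawler library.
final class HopSketch {
    // Naive pattern for double-quoted href attributes; good enough for a demo only.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // True if the body is HTML, i.e. something this sketch can scan for links.
    static boolean canFollowLinks(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase(Locale.ROOT);
        return type.startsWith("text/html") || type.startsWith("application/xhtml+xml");
    }

    // Returns the link targets for the next hop, or an empty list for non-HTML content.
    static List<String> nextHop(String contentType, String body) {
        List<String> links = new ArrayList<>();
        if (!canFollowLinks(contentType)) {
            return links; // PDF, DOC, images: fetched and stored, but not followed
        }
        Matcher m = HREF.matcher(body);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/docs/report.pdf\">Report</a> <a href=\"/about.html\">About</a>";
        System.out.println(nextHop("text/html; charset=UTF-8", html)); // prints [/docs/report.pdf, /about.html]
        System.out.println(nextHop("application/pdf", "%PDF-1.4 ...")); // prints []
    }
}
```

The PDF link is discovered from the HTML page and can be fetched like any other URL, but nothing inside the PDF is followed further.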
Closed 3 years ago.
I have a table on my web page with thousands of entries, and it shows 20 entries at a time. I want to take screenshots of all the entries page by page and then create a video from them using code. What would be the best language and method for this task? I know nothing about this, so I am open to any language: Python, Java, Go, etc.
You can use the Selenium library in either Java or Python; it has tools for these operations.
Here is an example.
Closed 8 years ago.
What computer languages would I need to create a file sharing website? I think I would use HTML, CSS, Javascript, and PHP? Also how would I go about doing so?
You would most likely need a database on the server, queried with SQL. HTML, CSS and PHP (as the server-side language) will be required, and JavaScript is a great plus.
Mainly, what you would do is use move_uploaded_file and is_uploaded_file after a form POST. See here for more information. You'd store the file's name, its location, its uploader, etc. on your server.
You'll need an authentication system. The most primitive kind relies on session variables alone, while a more advanced one uses your SQL server, a hash function and the like.
Good luck!
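The "hash function" part of the authentication advice above could look roughly like this. The answer is about PHP; this sketch uses Java and the JDK's built-in PBKDF2 implementation purely to illustrate the idea of storing a salted hash instead of the password itself. The class name, iteration count and key length are assumptions chosen for the example, not values from the answer.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper: derives and checks salted password hashes.
final class PasswordHashSketch {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 256;       // 32-byte hash

    // A fresh random salt per user; stored next to the hash in the database.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the hash that would be stored instead of the plain password.
    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        }
    }

    // Re-derives the hash from the login attempt and compares in constant time.
    static boolean verify(char[] attempt, byte[] salt, byte[] expectedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), expectedHash);
    }
}
```

At registration you would store the salt and the result of `hash`; at login you would load both and call `verify` before setting any session variable.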
Closed 4 years ago.
I am just learning Java, and I want to make a simple application to access a website.
There is a website that I want to log in to through Java:
and then interact with it through my own interface. Basically, after logging in, I would fill in some text boxes and submit them.
I have looked in many places and studied the HTTP protocol, but I still can't make it work.
Can someone help me out?
Accessing a website, logging in and interacting with its forms is somewhat complex work, so it might not be the best choice for a first Java project.
But if you want to do it, you should probably use Apache HttpComponents/HttpClient.
There are useful examples at the above link as well, which may help you get started.
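To give a feel for what a form login involves, here is a minimal sketch. Note that it uses the JDK's own java.net.http.HttpClient (Java 11+) rather than Apache HttpClient, only so that it runs without extra dependencies. The class name, the login URL and the field names are placeholders; the real ones have to be read from the login form's HTML.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for posting HTML forms and keeping the session.
final class FormLoginSketch {
    // Encodes fields as application/x-www-form-urlencoded, the default for HTML forms.
    static String formEncode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Builds the POST request a browser would send when the form is submitted.
    static HttpRequest formPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }

    // The cookie manager keeps the session cookie, so later requests stay logged in.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }
}
```

You would create one client with `newSessionClient()`, send the login request with `client.send(request, HttpResponse.BodyHandlers.ofString())`, and then reuse the same client for the text-box submissions. Sites that add hidden tokens to their forms need those copied into the field map as well.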
Closed 7 years ago.
I am developing a web application that will be sold later on, and I was wondering how to add licensing to it, so that I can prevent piracy and limit its use to a given period of time.
You can run a centralized license server with a DB that holds each user's status, and expose the check as a web service so that any of your apps can consume it.
I've seen Java enterprise tools do the usual serial number/license file thing. Worked pretty well for them. All you'd need to do is put some static code in the application that would execute when the JavaEE container loads the WAR file and have that check the serial number.
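A license-file check of that kind might be sketched as follows. This assumes a made-up license format, `customer|expiryDate|signature`, signed by the vendor with an RSA private key and verified in the application with the matching public key; none of this comes from a specific product. The check could be called from whatever startup hook the container offers (a ServletContextListener is one option).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.time.LocalDate;
import java.util.Base64;

// Hypothetical license format: customer|yyyy-MM-dd|base64(signature).
// Assumes the customer name contains no '|' character.
final class LicenseSketch {
    // Vendor side: signs "customer|expiry" with the private key.
    static String issue(String customer, LocalDate expiry, PrivateKey key) {
        String payload = customer + "|" + expiry;
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(payload.getBytes(StandardCharsets.UTF_8));
            return payload + "|" + Base64.getEncoder().encodeToString(s.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not sign license", e);
        }
    }

    // Application side: valid only if the signature matches and the date has not passed.
    static boolean isValid(String license, PublicKey key, LocalDate today) {
        String[] parts = license.split("\\|");
        if (parts.length != 3) {
            return false;
        }
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update((parts[0] + "|" + parts[1]).getBytes(StandardCharsets.UTF_8));
            boolean signed = s.verify(Base64.getDecoder().decode(parts[2]));
            return signed && !today.isAfter(LocalDate.parse(parts[1]));
        } catch (GeneralSecurityException | RuntimeException e) {
            return false; // malformed or tampered license
        }
    }

    // Demo only: in practice the private key never ships with the application.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }
}
```

Because the expiry date is part of the signed payload, editing it in the license file invalidates the signature. A determined user can still patch the check out of the bytecode, so this raises the bar rather than making piracy impossible.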
Closed 8 years ago.
I would like to host my own version of the boilerpipe web API (http://code.google.com/p/boilerpipe/). The appspot site is http://boilerpipe-web.appspot.com/
Can someone give me directions on how to use the boilerpipe JAR to build such a web page myself?
I am the author of boilerpipe.
Boilerpipe's demo web application boilerpipe-web is not part of the boilerpipe-core jar.
To imitate its functionality you will need to write some Java Servlet around boilerpipe-core.
I'll probably release the source of boilerpipe-web at some point, so you won't have to bother with that.
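Until then, the shape of such a wrapper might look like the sketch below. It is not boilerpipe-web's code: to stay self-contained it uses the JDK's built-in com.sun.net.httpserver.HttpServer instead of a servlet container, and the extraction step is passed in as a plain function. The class name, the `/extract` path and the `url` parameter are assumptions for illustration; with boilerpipe-core on the classpath, the function is where a call such as `ArticleExtractor.INSTANCE.getText(...)` would go.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical endpoint: GET /extract?url=... returns the extracted text.
final class ExtractEndpointSketch {
    // Starts the server; the extractor maps a page URL to its main text.
    static HttpServer start(int port, Function<String, String> extractor) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("localhost", port), 0);
            server.createContext("/extract", exchange -> {
                String url = targetUrl(exchange.getRequestURI().getRawQuery());
                int status = (url == null) ? 400 : 200;
                String text = (url == null) ? "missing url parameter" : extractor.apply(url);
                byte[] body = text.getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
                exchange.sendResponseHeaders(status, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Pulls the decoded "url" parameter out of a raw query string, or null if absent.
    static String targetUrl(String rawQuery) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith("url=")) {
                return URLDecoder.decode(pair.substring(4), StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

For a quick local try, `ExtractEndpointSketch.start(8080, url -> "text of " + url)` serves a dummy extractor; swapping the lambda for a real extraction call is the only boilerpipe-specific part. A servlet version would move the same logic into a `doGet` method.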