A Java program to search a certain website - java

I want to write Java code to parse a certain website. Each result on the website appears at a specific URL.
How can I start? Is there a good library to use? Could I benefit from your experience in this field?

Search for "web crawler" and you'll find many examples of how to solve this (e.g. Crawler4J or Crawler).
Besides Java, you'll often stumble upon Python when it comes to scraping content from web pages. I'm not a Python guy, but it seems to fit the task.
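To give an idea of what the core of such a crawler does, here is a minimal sketch using only the JDK (Java 11+ for `java.net.http`). It is not how Crawler4J works internally; the class name `LinkExtractor`, the method names, and the example.com URLs are made up for illustration. The regular expression is a deliberate simplification, and a real crawler would use a proper HTML parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class LinkExtractor {
    // Simplified pattern: captures the href value of <a> tags.
    // Assumption: attributes are quoted; malformed HTML is not handled.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Downloads a page and returns its body as a string.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Returns every link target found in the given HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // With a URL argument the page is fetched; otherwise a canned sample is used.
        String html = args.length > 0
                ? fetch(args[0])
                : "<p><a href=\"http://example.com/result/1\">One</a> "
                + "<a class='x' href='http://example.com/result/2'>Two</a></p>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

A crawler repeats these two steps: fetch a page, collect its links, then visit the links that match the result URLs you care about. Libraries such as Crawler4J take care of the queueing, politeness delays, and duplicate detection around that loop.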

Related

Can I use Daring Fireball Markdown within my Java application

Here's the link to Daring Fireball Markdown. Can I use this API in my Java project? I read the whole site and did not find anything that could help me figure it out.
I also went to the Wikipedia page to see if there were more APIs that do the same thing, but they're for web languages like PHP or Ruby.
I want my users to be able to use Markdown, as this syntax is simpler for a non-programmer to type.
If not, is there any other API that does the same thing for Java? Would it be hard to implement such a thing myself, as I am a beginner programmer?
Thanks

Using XML in Java as language manager

So, I'd like to create a bilingual program. From what I have heard, XML files are the way to go...
I couldn't really find anything useful with Google (when you enter Java in Google, it sees the word "language" only as in "programming" instead of the desired "spoken"), so if any of you could direct me to a tutorial page that explains this topic further, or even show it to me here (it can't be that complicated, can it?), I'd be very grateful.
If XML is the way to go, that is! If any of you have better suggestions, I'm listening...
XML could be the way to go, but the usual approach to internationalising Java applications is to use ResourceBundles. The right keywords for Google are: Java i18n or Java Internationalization.
There is a basic Java tutorial that can get you started with i18n. If you are writing a web application, then you should check the documentation of your specific framework.
Java has a built-in internationalization system that uses properties files.
Java Internationalization API Tutorial
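As a rough illustration of the properties-based approach, here is a minimal sketch. In a real application the translations would usually live in files such as `Messages_en.properties` and `Messages_de.properties` on the classpath and be loaded with `ResourceBundle.getBundle("Messages", locale)`. To keep the example self-contained, the properties text is held in memory instead; the class name `I18nSketch`, the keys, and the fallback to English are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical class, named for this example only.
class I18nSketch {
    // Parses text in .properties format (key=value per line) into a bundle.
    static ResourceBundle bundleFrom(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One bundle per language code; stands in for Messages_xx.properties files.
    static final Map<String, ResourceBundle> BUNDLES = Map.of(
            "en", bundleFrom("greeting=Hello\nfarewell=Goodbye\n"),
            "de", bundleFrom("greeting=Hallo\nfarewell=Auf Wiedersehen\n"));

    // Looks up a key for the locale's language, falling back to English.
    static String translate(Locale locale, String key) {
        ResourceBundle bundle =
                BUNDLES.getOrDefault(locale.getLanguage(), BUNDLES.get("en"));
        return bundle.getString(key);
    }

    public static void main(String[] args) {
        System.out.println(translate(Locale.ENGLISH, "greeting"));
        System.out.println(translate(Locale.GERMAN, "greeting"));
        // No French bundle is defined here, so this uses the English text.
        System.out.println(translate(Locale.FRENCH, "farewell"));
    }
}
```

The point of the design is that the code only ever refers to keys such as `greeting`; adding a language means adding another properties file, not touching the program logic.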

FOSS Java library to generate *.mobi ebooks?

Firstly, there is an almost identical question, but the answer is not really satisfactory.
Is there a Java or Ruby library for generating MOBI ebook documents?
The answer basically gives a link to Amazon and discusses using command-line tools, which is not really satisfactory for a web app. I want a regular JAR file with an API that I can invoke without any nasty process invocation.
Does anyone know of a FOSS library that provides this functionality? I would rather have something like iText that allows me to build the document and then writes the MOBI file, rather than something that converts an existing PDF into MOBI.
The best I've been able to find is a Ruby library called KindleR. https://github.com/josh/kindler
I've only used it to convert basic HTML pages to MOBI, with pretty good success. I've never converted anything with more complicated formatting, so YMMV.

Continue with PHP or move to a Java framework

I'm building a website for my friend's startup.
I'm currently building it on PHP, MySQL, and Apache. Everything has gone pretty smoothly so far, but one of my friends recommends I move to a Java framework because that would be better when the site becomes bigger.
I don't have any Java knowledge. I have adequate knowledge of PHP, and I'm a fast learner.
Should I continue with PHP (can PHP be used for big sites?) or should I move to a Java framework?
Any suggestions please.
PHP can be used for big sites. Take a look at Facebook. End of story...
When someone says something like that, you need to have them justify what they are saying. PHP is scalable and versatile. Java has its strengths and weaknesses too, just like any other programming language.
Since you are a novice in PHP, you probably won't reuse much of the code that you are using to build the initial site. I know you plan to, but the likelihood that you will is very slim. Go with what you know.
PHP has been used by a plethora of sites. Google, Yahoo, Facebook, etc. all use some PHP on their sites.
I think regardless of whether you go with PHP or a Java solution you're going to have to learn a new set of frameworks/libraries. If you don't have experience building anything more than a toy website (including any you've done through education) there will be plenty to learn from both paths.
Research some frameworks for both PHP and Java and make up your own mind based on what you've read. If it's such a long project you'll have plenty of time to familiarise yourself with whichever option you choose.
Everything for a website is possible with PHP, so there is no need to worry. We can build heavy sites with PHP more easily than with a Java framework. You can use PHP frameworks like Zend Framework or CodeIgniter. They are scalable, easy to learn, and have a lot of components (libraries). They also provide client services to bigger sites such as Twitter, Amazon, and Yahoo.
:)
At least 1/3 of the top 20 sites on the web are using PHP in one way or another. Languages used by the rest include Python and Ruby. I don't see that any of them are using Java.
PHP is fine to use; I've made websites before using PHP...
I'd suggest you stick with PHP, but challenge yourself and expand your knowledge.
Yes. Big sites can be built using PHP. Examples are Digg.com and Facebook (which compiles its PHP to C++, but still).

Handling CSS and JavaScript when building a Java browser

My task is to create a simple web browser in Java.
So far it can only read HTML pages.
I'm using the standard JEditorPane component to display webpages.
Now I was wondering whether you could explain how I can manage to display at least some simple pages that contain CSS/JavaScript.
If you could point me to some useful links or appropriate examples, I would be very happy.
Well, my advice would be to look at open source rendering engines such as Gecko - https://developer.mozilla.org/en/Gecko_FAQ
You can embed Gecko with Java using the JREX library - http://jrex.mozdev.org/
Starting from scratch with a problem like this is a very big task, and as your username is AmateurProgrammer, I wouldn't recommend it.
There is already some prior art for the Java browser segment.
Concerning JavaScript, you will have to use a JavaScript interpreter in Java. A renowned one is Rhino (by Mozilla). Integrating it may prove to be an interesting challenge.
Concerning CSS, it seems the question has already been asked ...
