Automatic sitemap generation - Java

We have recently installed a Google Search Appliance in order to power our internal search (via the Java API), and all seems to be well; however, I have a question regarding 'automatic' sitemap generation that I'm hoping you guys may know the answer to.
We are aware of the GSA's ability to auto-generate sitemaps for each of its collections, but this process is rather manual, and considering that we have around 10 regional sites that need to be updated as often as possible, it's not ideal to have to log into the admin interface on a regular basis in order to export them to the site root where search engines can find them.
Unfortunately there doesn't seem to be any API support for this, at least none that I can find, so I was wondering if anyone had any ideas for a solution/workaround or, if all else fails, the best alternative.
At present I'm thinking that if we can get the full index back from the API in the form of a list, then we can write an XML file out the old-fashioned way using a cron job or similar; however, this seems like a bit of a clumsy solution - any better ideas?
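If it comes to that, the XML itself is trivial to produce. A minimal sketch of that fallback is below; the URLs and the output path are placeholders, and the list would come from whatever the search API actually returns:

    import java.io.PrintWriter;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Arrays;
    import java.util.List;

    public class SitemapWriter {

        public static void main(String[] args) throws Exception {
            // Placeholder list; in practice this would come from the GSA search API.
            List<String> urls = Arrays.asList(
                    "http://www.example-region1.com/",
                    "http://www.example-region1.com/about/");

            try (PrintWriter out = new PrintWriter(
                    Files.newBufferedWriter(Paths.get("sitemap.xml"), StandardCharsets.UTF_8))) {
                out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
                out.println("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");
                for (String url : urls) {
                    // Real code should XML-escape the URL before writing it.
                    out.println("  <url><loc>" + url + "</loc></url>");
                }
                out.println("</urlset>");
            }
        }
    }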

You could try the GSA Admin Toolkit, or simply write some code yourself that logs in to the administration page and then uses that session to invoke the sitemap export URL (which is basically what the Admin Toolkit does).
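For what it's worth, a very rough sketch of that second approach using only the standard library follows; the login endpoint, form field names and export URL are placeholders, since the real admin console endpoints depend on your GSA version (the Admin Toolkit source is the place to check):

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.CookieHandler;
    import java.net.CookieManager;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class GsaSitemapExport {

        public static void main(String[] args) throws Exception {
            // Keep the admin session cookie between requests.
            CookieHandler.setDefault(new CookieManager());

            // Hypothetical login endpoint and form fields -- check the real login form in your admin console.
            URL loginUrl = new URL("http://gsa.example.com:8000/EnterpriseController");
            HttpURLConnection login = (HttpURLConnection) loginUrl.openConnection();
            login.setRequestMethod("POST");
            login.setDoOutput(true);
            login.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            byte[] form = "userName=admin&password=secret".getBytes(StandardCharsets.UTF_8);
            try (OutputStream os = login.getOutputStream()) {
                os.write(form);
            }
            login.getInputStream().close(); // complete the request so the session cookie is stored

            // Hypothetical export URL for one collection's sitemap.
            URL exportUrl = new URL("http://gsa.example.com:8000/sitemap_export?collection=regional_site_1");
            try (InputStream in = exportUrl.openStream()) {
                Files.copy(in, Paths.get("sitemap.xml"), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }

From there a cron job (or a scheduled task on the appliance's host) can run the export and drop the file at the site root for each regional site.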

Related

Evernote: Get a list of all edits

I am trying to develop a simple statistics tool to analyse various behaviours of collaborators within an Evernote Notebook using the Evernote Java API.
I need information on which user edited which note, and when.
Even though the documentation is quite good, I am still unable to find the required functionality in the API.
(TLDR:)
Is there a way to access a list of edits of an Evernote note using the API?
I am not bound to the Java SDK, so if there is a way that requires another language, switching would be no problem.
Andreas - Did you look into these methods in the API?
NoteStore.getNote and NoteStore.getNoteApplicationData
It sounds like this would be a decent place to start at the very least. I cannot say for certain if this will return everything you are looking for though.
I hope this helps!
I'm not exactly sure what you are looking for but NoteStore#listNoteVersions might be the one you want. You can get a list of NoteVersionId and then use another API called NoteStore#getNoteVersion to get metadata to see which note is updated when.
Note that the API is probably only for premium accounts.
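A rough sketch of that approach with the Evernote Java SDK is below; the developer token and note GUID are placeholders, and the exact class and method names may differ between SDK versions, so treat this as a starting point rather than working code:

    import java.util.List;

    import com.evernote.auth.EvernoteAuth;
    import com.evernote.auth.EvernoteService;
    import com.evernote.clients.ClientFactory;
    import com.evernote.clients.NoteStoreClient;
    import com.evernote.edam.notestore.NoteVersionId;
    import com.evernote.edam.type.Note;

    public class NoteHistorySketch {

        public static void main(String[] args) throws Exception {
            // Placeholders -- use your own developer token and the GUID of the note to inspect.
            String token = "your-developer-token";
            String noteGuid = "your-note-guid";

            EvernoteAuth auth = new EvernoteAuth(EvernoteService.SANDBOX, token);
            NoteStoreClient noteStore = new ClientFactory(auth).createNoteStoreClient();

            // One entry per stored prior version of the note (a premium feature).
            List<NoteVersionId> versions = noteStore.listNoteVersions(noteGuid);
            for (NoteVersionId version : versions) {
                Note old = noteStore.getNoteVersion(
                        noteGuid, version.getUpdateSequenceNum(), false, false, false);
                System.out.println(old.getTitle() + " was updated at " + version.getUpdated());
            }
        }
    }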

How to modify search result page given by Solr?

I intend to make a niche search engine. I am using apache-nutch-1.6 as the crawler and apache-solr-3.6.2 as the searcher. I must say there is very little up-to-date information on the web about these technologies.
I followed this tutorial http://wiki.apache.org/nutch/NutchTutorial and have successfully installed Nutch and Solr on my Ubuntu system. I was also able to inject the seed URL into the webdb and perform the crawl.
Using the Solr interface at http://localhost:8983/solr/admin, I can also query the crawled results, but this is the output I receive: [screenshot omitted].
Am I missing something here? The earlier apache-nutch-0.7 had a WAR that generated clear HTML output like this: [screenshot omitted]. How do I achieve this? If anyone could point me to an up-to-date tutorial or guidebook, it would be highly appreciated.
A couple of things:
If you are just starting, do not use Solr 3.6; go straight to the latest 4.1+. A lot has changed and many new features have been added.
You seem to be saying that you will expose Solr + UI directly to the general web - that's a really bad idea, as Solr is completely unsecured and allows web-based delete queries. You really want a business layer in the middle.
With Solr 4.1 there is a nice admin UI, and there is also a /browse page that shows how to use Velocity to build pages backed by Solr. Or have a look at something like Project Blacklight for an example of how to put a UI over Solr.
I found the link below, which answered my query:
http://cmusphinx.sourceforge.net/2012/06/building-a-java-application-with-apache-nutch-and-solr/
I agree; after reading the content at that link, I was annoyed with myself.
The Solr package provides all the objects required to query Solr.
In fact, the essential jars are just solr-solrj-3.4.0.jar, commons-httpclient-3.1.jar and slf4j-api-1.6.4.jar.
Anyone can build a Java search engine using these objects to query the index and put a nice UI on top.
Thanks again.
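For anyone who lands here later, querying Solr through SolrJ ends up looking something like the sketch below; the Solr URL and the field names ("title", "url", "content") are the usual ones for a Nutch-populated index, but adjust them to your own schema:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class SolrSearchSketch {

        public static void main(String[] args) throws Exception {
            // Placeholder URL -- point this at the Solr instance Nutch indexes into.
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrQuery query = new SolrQuery("content:hotels");
            query.setRows(10);

            QueryResponse response = solr.query(query);
            for (SolrDocument doc : response.getResults()) {
                // "url" and "title" are typical Nutch fields; adjust to your schema.
                System.out.println(doc.getFieldValue("title") + " -> " + doc.getFieldValue("url"));
            }
        }
    }

Your own web front end then calls code like this and renders the results however you like, which also keeps Solr itself off the public web.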

Service similar to Airbrake.io for Java applications?

We made our own Java client for the airbrake.io API. This works fine, but Airbrake displays parameters and stack traces in a sort of Rails style, which is somewhat annoying. Does anyone know of similar services built for Java?
Example of how data is displayed:
Parameters
{"controller"=>"", "action"=>""}
Stacktrace
/testapp/app/models/user.rb:53:in `public'
/testapp/app/controllers/users_controller.rb:14:in `index'
UPDATE 2015-02-13: This service no longer exists. The GitHub account linked below is gone, as is the company website.
Have you tried using Coalmine? https://github.com/coalmine/coalmine_java It's meant to be used with the Coalmine service: https://getcoalmine.com/
I work at Coalmine and we have been using this internally for some time now. We just open-sourced the Java connector this week, and I would be happy to help you get started with it. You can send me an email at brad@builtfromsource.com
Have you tried http://code.google.com/p/hoptoad/ ? It's a little out of date, but it should only need its endpoint updated to http://api.airbrake.io .
A quick Google search led me to http://logdigger.com/ , which is designed specifically for Java sites.
I work at Airbrake, and I would be happy to work with you to make our site more Java-friendly. Please get in touch at ben@airbrake.io, and I'll see how we can better display Java-specific information.
Just adding to the others suggested here, but Raygun (http://raygun.io) has first class support for Java.
Read more here: http://raygun.io/java
I work for Mindscape, which built Raygun, so I can answer any questions you may have about it: jd@mindscape.co.nz. We already have a large number of organizations using Raygun with their Java apps, and Raygun also supports other platforms (.NET, Node, Rails, PHP, etc.).

Getting data from a website that needs you to log in (Java)

I don't even know if what I'm asking is possible and I don't know what to search for on Google.
Basically, there are multiple projects that would require me to fetch some data from websites. The example I'm thinking of right now is to grab my account info from a banking site, http://www.americanexpress.ca . I'd like to know how I'd make it so my login info is entered into the fields on the left and then grab the data from the resulting page. I'd then write methods to parse that data.
Obviously, this would need to be secure as I don't want my banking info stolen.
Sorry if the solution is obvious; I've never tried grabbing data from websites before.
As mentioned, Apache HttpClient is one option, though personally I've always found HtmlUnit to be a bit more convenient to work with (from an API standpoint) for doing things like this. HtmlUnit is built on top of HttpClient, and exposes a higher-level API for interacting with and manipulating page content.
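To give a feel for it, here is a minimal HtmlUnit sketch of that flow; the URL, form name and field names are made up, and you would look up the real ones in the login page's HTML (a fairly recent HtmlUnit is assumed, where WebClient is closeable):

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlForm;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    public class LoginScrapeSketch {

        public static void main(String[] args) throws Exception {
            try (WebClient webClient = new WebClient()) {
                HtmlPage loginPage = webClient.getPage("https://www.example-bank.com/login");

                // Hypothetical form and field names -- inspect the real page to find them.
                HtmlForm form = loginPage.getFormByName("loginForm");
                form.getInputByName("username").setValueAttribute("myUser");
                form.getInputByName("password").setValueAttribute("myPassword");

                // Clicking the submit button returns the page you land on after logging in.
                HtmlPage accountPage = form.getInputByName("submitButton").click();
                System.out.println(accountPage.asText());
            }
        }
    }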
You could use the Apache HttpClient (or a similar) library. It has all the required classes for you.

Parse JSON response from Google Maps page

I'm trying to find the best way of parsing the response from a "normal" (i.e. not using the API) Google Maps page in my java code.
Reason: I want to submit a query string requesting a listing (be it hotels, restaurants etc.) and then parse the JSON that comes back. I had looked into using the Google Maps API, but it doesn't seem to cover what I want to do, as this type of URL:
http://maps.google.de/maps/geo?q=address&output=xml&oe=utf8&sensor=false&key=...
is OK but this isn't:
http://maps.google.de/maps/geo?q=address+hotels&output=xml&oe=utf8&sensor=false&key=...
(due to the "+hotels" term). So I think the only option is to use a google maps response e.g.
http://maps.google.de/maps?q=address+hotels
and parse the JSON information that is included at the end. Does anyone have some hints as to how best accomplish this?
You should first make absolutely sure that the API doesn't support what you need. Checking the docs and maybe even reaching out to a real Googler might pay off. It strikes me as odd that their API wouldn't support something as simple as adding another term.
If you're forced to do it the "hard way", there are two main steps:
1) Find and learn a JSON parsing library for Java. I can recommend Jackson -- fast, sturdy, and it has just released version 1.0.0 (see the parsing sketch after these steps).
2) Teach your code to understand the spec Google uses in their response. This is by far the most challenging part. My apologies, but I know nothing about Google's spec in this area. If you can find official docs, that's best. Or find unofficial docs published by someone else who has had to do similar work. Otherwise, you may have to reverse engineer it.
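For the parsing side of step 1, here is a small Jackson sketch; the package names are from current Jackson releases (this answer predates them), and the field names in the sample string are invented, since Google's actual structure is exactly what you would have to reverse engineer:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class MapsJsonSketch {

        public static void main(String[] args) throws Exception {
            // Stand-in for the JSON blob scraped out of the Maps page; the real field
            // names are undocumented and would have to be reverse engineered.
            String json = "{\"results\":[{\"title\":\"Some Hotel\",\"addr\":\"Example Str. 1\"}]}";

            ObjectMapper mapper = new ObjectMapper();
            JsonNode root = mapper.readTree(json);

            // Walk the tree generically, since there is no official schema to bind to.
            for (JsonNode result : root.path("results")) {
                System.out.println(result.path("title").asText() + " @ " + result.path("addr").asText());
            }
        }
    }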
Re: the Google API docs: it does seem that what you're trying to do goes against Google's intention of making their product (i.e. a map) available to you, the developer, for custom enhancement (by adding business outlet information or whatever). There's plenty of material on the Google Maps API site describing this. But parsing their data (coming out of their database) and displaying it independently of their product seems to be rather different: section 10.12 of the terms explicitly covers this:
...code.google.com/intl/de/apis/maps/terms.html
However, there are apps out there (the "Around Me" iPhone app, for example) that seem to do just that: there might be a special arrangement between Google and Apple in that regard.
EDIT: Alternatively, you could look at this problem another way and use the Google Base API feed, since it allows you to build query strings specifying resource, distance, location etc. - i.e. it returns the data you require without using the Maps API (which you don't need anyway, given your description).
