Creating a knowledge base on top of provided webpages as a feed - java

I have some issues with my parts of final year projects. We are implementing a plagiarism detection framework. I'm working on internet sources detection part. Currently my internet search algorithm is completed. But I need to enhance it so that internet search delay is reduced.
My idea is like this:
First user is prompt to insert some web links as the initial knowledge feed for the system.
Then it crawl through internet and expand it's knowledge
Once the knowledge is fetch System don't need to query internet again. Can someone provide me some guidance to implement it? We are using Java. But any abstract detail will surely help me.

if the server side programming is you hand then you can manage a tabel having a boolean in database which shows whether the details were read before. every time your client connects to server, it will check the boolean first and if boolean was set false then it will mean that there is a need to send updates to client other wise no updates will be sent,
the boolean will become true every time when client downloads any data from server and will become false when ever the database is updated

I'm not quite sure that I understand what you're asking. Anyway:
if you're looking for a Java Web crawler, then you I recommend that you read this question
if you're looking for Java libraries to build a knowledge base (KB), then it really depends on (1) what kind of properties your KB should have, and (2) what kind of reasoning capabilities you expect from your KB. One option is to use the Jena framework, but this requires that you're comfortable with Semantic Web formalisms.
Good luck!

Related

How to get started reading data from magnetic card with a simple bi-direction swipe magnetic card reader?

Let's say I have a USB magnetic card reader
(http://image.ec21.com/image/szttce09/oimg_GC03950917_CA03950946/Triple_Track_USB_Magnetic_Credit_Card_Reader.jpg)
I am running windows 10 on my machine. All I want to do is read data from the magnetic card and use that data in a java Application. How can I do this ? I heard the java communications api is suitable for what I want. How exactly do I use this api or any other api to read data from the card reader and show that data in my java application. Assume I have eclipse opened. Now what? Do I import the communications api ? If yes , what do I do next ? If you can give a simple example of how to do this it would be greatly appreciated.
The first thing to do is search the internet for a Java library, which allows you to communicate with that device. Put in the model and serial number, and see what comes up. If nothing comes up, and that is quite likely to happen, you will need to find the native driver, and wrap it in Java.
This is not normally a straight forward or easy process. When dealing with third party hardware, most vendors don't supply a little Java library that you can use. If you can find the native drivers, you can wrap them from C to Java, but you might not even be able to find the drivers at all.
Start by going to the manufacturer's page, and looking around. If you find the driver, great. If you find documentation for the driver, even better. You will need to read the documentation, and understand how to use the driver from C code, which implicitly requires you to have a basic understanding of C.
Then you will need to link it in to Java, using the Java Native Interface.
In Summary: If it didn't say "For Java developers" in the description when you bought it, you're going to have to do a lot of work just to get it usable in Java.

How to input data programatically into a system that puts up a form but doesnt have data upload features

I'm not sure if this question is entirely appropriate for SO but it seemed to work better here than in SuperUser so apologies if its in the wrong place. Happy to move it if so.
I'm trying to figure out how I'd automate the input of data into a system that didn't accept data uploads, but rather used forms put up on a screen. Use cases are e.g. where an enterprise wide system does accept uploads but the user lacks admin rights to fill in data she is required to populate, or with very old and specialized legacy systems where the functionality just doesn't exist and a serial input-review-rollback-commit cycle is enforced.
I'm not a programmer by trade so this is partly thought experiment but also to answer a question that has arisen at a business that I'm involved in.
I'm reasonably familiar with python and java if libraries for keyboard emulation exist but would be happy interpreting a pseudo code response too.
Responses that point to existing providers of such functionality that is embeddable or that tell me if I'm barking up the wrong tree also gratefully accepted.
Once again apologies as I know this isnt intuitively the best spot for this. Please do point me to a better location if you know of one.
Thanks
Possible solutions exist but they're all pretty bad
Is it a desktop application or a web application? If it's a web application you can use ghost.py to automate the interaction and submission of new records/entries. This work will be a glorious bundle of fun for the lucky code jockey who draws the short straw.
If it's a desktop application, it will be a great deal more difficult. Is it on Windows? Linux? MacOSX? Is the software written in Java? Using the Swing toolkit? AWT? SWT?
If it is a native Windows application you might be able to use Autohotkey to automate desktop interaction. This can be as basic as automatic clicks in pre-recorded parts of the screen, automating TAB keypresses to move around the input cells and reading input text from a data file and writing that out into the input cells. This will be even more entertaining than the web-solution mentioned earlier: truly the necessary ingredients for an authentic war story worthy of the annals of internet lore.
This is likely to be a lot of intricate work, error-prone, and subject to failure in the future if the UI of the software is changed; and such changes are very likely. It would be a lot easier to help if you could add more detail to the question.
Before embarking down this road, if I were you I would beg and plead with the software vendor to either provide me with an upload API; I would even offer to pay the vendor to upload my data for me. I cannot imagine either of the solutions I mentioned will be any cheaper, unless the work time of your developers has no value.
Good luck.

Aggregating results from various sources - Java application architecture

I searched in google and stackoverflow for my problem, but couldn't find a good solution. Below is the description,
Our Java web application displays search results from our local database and external webservice API calls. So, the search logic should combine these results and display it in the result page. The problem is, the external API calls return the results slower than our local DB calls. Performance is crucial for our search results and the results should be live i.e. we should not cache or persist the external results in our local DB. Right now, we are spanning two threads, one for the DB call and another one for the exteral API, and combine these results and display it on the screen. But it kills the performance of our application, particularly when we call more than one external APIs.
Is there any architectural solution for this problem?
Any help would be greatly appreciated.
Thanks.
You cannot display data before you have it.
1) You can display your local data and as they come, add via ajax other data.
2) And if there are repeated questions, you could cache external answers for short time (and display them with warning that they are old and that they will be replaced by fresh answer) and as soon as fresh anwer arrive, push new answer.
With at least 1), system will be responsive, with 2) usable answer can be available imediately, even if its not current.
btw, if external source take long to answer, are you sure that their answer is not stale (eg. if they gather some data and wait for rest, then what they gathered so far can go stale)? So maybe (and maybe not) short term persisting is not as bad as you think.

System requirements for Cometd/Bayeux Usage on Android

I'm trying to implement a Cometd/Bayeux server on Android using iJetty. The Jetty implementation itself works just fine serving static pages along with servlets. I am trying to up the ante a bit and create a Bayeux application on the phone but I'm having some trouble. I can hit the page that has the dojo cometd scripts on it, but I am unable to subscribe to the channel. When I view firebug/chome developer tools, I see a series of posts/gets that last a couple of milliseconds (~14). However, when I run a cometd application on a normal machine, the posts/gets last several seconds (~14 seconds) before timing out and reopening the connection. This second scenario makes sense to me with my understanding of how continuation in HTTP works. So I'm thinking that something is not allowing those connections to hang open and prematurely returning a value and consequently closing the connection. I would post my source but I'm not sure what to post short of posting everything...(it is open source though so if you want to have a look it's at http://webtext-android.googlecode.com).
So my question is, does anybody think that there could be some underlying limitation imposed by the Android system that is preventing these servlets from working? Are there assumptions that are made by the Jetty Bayeux implementation with regards to the underlying system? Or is it more likely that somehow I have a bad implementation of the ContinuationCometdServelt? I should note that all of the posts/gets from the client return 200 OK messages so I'm not inclined to think that the Android system is simply terminating the connection.
I know this is a bit off the wall and I'm definitely trying to do something a bit out of the ordinary but any suggestions or tips would be greatly appreciated.
In case anybody discovers this and has similar problems (this applies to all cometd implementations regardless of host), I discovered that the issue was with using the Google js library. For some reason, the dojo scripts I was loading from Google (1.4) didn't have a valid implementation of cometd. I switched my dojo script to the one that was used by the jetty-1.6.23 example and it works perfectly.

Java VNC Applet vs Screen Capture

I am trying to make an application in which one component captures the screen of the user (for screen casting). I am aware that there are two options to achieve the same using a Java applet (please correct me if I am wrong). First is to use the java applet to take screen shots continuously and convert it into a video and upload it as a video file. And second is to create a java vnc server and record it as a .fbs file and play it using a player like: http://www.wizhelp.com/flashlight-vnc/index.html
I would like to know the best solution in terms of video quality, file size, cross-platform compatibility (windows and mac), firewall problems and finally ease of implementation.
I am very new to Java. Please tell me whats the best solution for my problem. Also, is it easy enough for me to program it on my own or should I get it developed via a freelancer. I have tons of programming experience (5+ years in LAMP) but none in Java.
Thank you very much.
I agree that this is pretty hard. I implemented those two solutions (VNC and onboard screen capture) plus a third (capture from an external VGA source via an Epiphan grabber) for a former employer. I had the best bandwidth-to-quality ratio with VNC, but I got higher framerate with VGA capture. In all three cases, I reduced the frames + capture times to PNGs and sequenced them in a QuickTime reference movie. Then I made flattened video (MPEG4 or SWF) of the results. In my case, I then synchronized the screen video with a DV stream.
In the end the technology worked (see a sample of the output) but our business model failed.
From what I know, the older versions of applet had security restrictions that may not allow for screen capture. Instead, a java application may be feasible.
Regarding the build-it-yourself vs the fire-a-coder, it depends on how you value your time compared to what you can find on a freelancer site.
I think you can find someone from India/Romania/Poland/Other countries that can make it for an affordable price
Given your Java knowledge and the difficulty of the task, have you considered taking an alternative approach? For example, how about a native VNC server for the end-user, which is just a small download and then they click "Run." And that native server is programmed to capture the screen and send it straight to your web server, which has a client like vnc2swf or other means of converting the VNC stream to a video or .fbs file? Does all that make sense?
Admittedly, without Java, you have to prepare one executable program per platform you want to support, however, I don't know. That still sounds easier to me. Consider Copilot.com. They are doing VNC but they still use small native apps for each platform.
Sorry but this seems the kind of job that requires a lot of experience. Even if you find code snippets all around the net to fix this and that, the overall result may be way worse than simply hiring an experienced Java programmer.

Categories