Turn HTML into XML and parse it -- Android Apps [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have been learning how to build Android apps this summer. I am currently working on XML parsing, which in this case is done in Java. I have a few questions that are mostly conceptual, and one specific one.
First, most of the examples I have seen use pages that are already in XML. Can I take a page in regular HTML format, have the program convert it to XML, and then parse it? Or is that what is normally done anyway?
Second, I could use an explanation of how the parser actually works and where it stores the data, so that I know how to extract the data once parsing is done.
For my specific example, I am trying to work with weather data from the NWS. My program will take the data from this page and, after some user input, take you to a page like this, which will sometimes have various alerts. I want to select certain ones, and this is where I could use help. I haven't coded anything for that yet because I don't know where to start.
If I need to clarify or rephrase anything here, let me know and I am happy to. I am trying to be a good contributor on here!

Yes, you can parse HTML, and there are many parsers available. There is a question about it here, Parse HTML in Android, and an answer about parsing HTML here: https://stackoverflow.com/a/7114346/826657
It is a bad idea, though: HTML tag names do not describe the data they hold, so you have to write a lot of code that searches attributes to find a specific piece of data. Prefer XML whenever it is available; it saves a lot of code and time.
Here is a post from Coding Horror which says that, in general, parsing HTML (with regular expressions in particular) is a bad idea.
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Here is an article that explains parsing an XML document using XmlPullParser: http://www.ibm.com/developerworks/library/x-android/
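To give a feel for how a parser hands the data back, here is a minimal sketch. It uses the DOM parser from javax.xml.parsers instead of XmlPullParser, because DOM is available in both Android and desktop Java. The feed layout (`feed`, `entry`, `event`, `title`) and the class name `AlertPicker` are made up for illustration and are not the real NWS format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AlertPicker {
    // Returns the <title> text of every <entry> whose <event> equals wantedEvent.
    // The element names are hypothetical; adapt them to the real feed.
    static List<String> titlesFor(String xml, String wantedEvent) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The DOM parser reads the whole document and keeps it as a tree in memory.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        List<String> titles = new ArrayList<>();
        NodeList entries = doc.getElementsByTagName("entry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            if (wantedEvent.equals(childText(entry, "event"))) {
                titles.add(childText(entry, "title"));
            }
        }
        return titles;
    }

    // Text of the first descendant element with the given name, or null if absent.
    private static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed>"
            + "<entry><event>Flood Warning</event><title>Flood Warning for County A</title></entry>"
            + "<entry><event>Heat Advisory</event><title>Heat Advisory for County B</title></entry>"
            + "</feed>";
        System.out.println(titlesFor(xml, "Flood Warning"));
    }
}
```

With DOM, the "saved data" is the `Document` tree, and you pull values out by walking it. A pull parser such as XmlPullParser works differently: it hands you one event at a time and you store whatever you need yourself.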

Related

How can I parse in Java an xml block to string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have an XML document which looks like this one:
<rootElement>
<title> randomString </title>
<subElement1>
<someInfo> info </someInfo>
<subElementTrash> trash </subElementTrash>
<someInfo1> info1 </someInfo1>
</subElement1>
<trash>
<subtrash> trash </subtrash>
</trash>
<date> 19.03.15 </date>
</rootElement>
I need to extract only title, /subElement1/someInfo, /subElement1/someInfo1 and date. The rest should be stored somewhere automatically, but without the elements that were already extracted. I should also be able to marshal everything back to the original XML.
It would be great if it can be done using annotation mapping.
Can someone point me in the right direction to search?
You are asking about parsing, but what you want is data extraction, data transformation, and finally storage in some undefined form. This is a very broad question with many possible approaches.
You can parse XML in Java using DOM, SAX, or StAX.
You can use XPath to extract the interesting information, but it will not split your document into the interesting part and 'the rest'.
You can define XSLT templates and run them through a Java Transformer to split your input document into the interesting part and 'the rest'.
You can use JAXB to map the XML into a Java model (using your favourite annotation mapping), and then build two further representations, one containing the interesting part and one containing 'the rest'. You can then save each representation to its own XML file.
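As a rough illustration of the XPath option only, the sketch below pulls single values out of a document shaped like the one in the question, using the javax.xml.xpath API that ships with the JDK. The class name `ExtractFields` and the helper `extract` are made up for this example, and it does not attempt to produce 'the rest' of the document.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ExtractFields {
    // Evaluates one XPath expression against the XML text and
    // returns the trimmed string value of the first matching node.
    static String extract(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml))).trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rootElement>"
            + "<title> randomString </title>"
            + "<subElement1><someInfo> info </someInfo><someInfo1> info1 </someInfo1></subElement1>"
            + "<date> 19.03.15 </date>"
            + "</rootElement>";
        System.out.println(extract(xml, "/rootElement/title"));
        System.out.println(extract(xml, "/rootElement/subElement1/someInfo"));
        System.out.println(extract(xml, "/rootElement/date"));
    }
}
```

Re-parsing the text for every expression keeps the sketch short; for a larger document you would parse once into a DOM `Document` and evaluate each expression against that.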

Best way to output Stanford NLP results [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Hi folks: I'm using the Stanford CoreNLP software to process hundreds of letters by different people (each about 10KB). After I get the output, I need to further process it and add information at the level of tokens, sentences, and letters. I'm quite new to NLP and was wondering what the typical or best way would be to output the pipeline results from Stanford CoreNLP to permit further processing?
I'm guessing the typical approach would be to output to XML. If I do, I estimate that will take about a GB of disk space, and I wonder, then, how quick and easy it would be to load that much XML back into Java for further processing and adding of information?
An alternative might be to have CoreNLP serialize the annotation objects it produces and load those back for processing. An advantage: not having to figure out how to convert a sentence parse string back into a tree for further processing. A disadvantage: annotation objects contain a lot of different types of objects I'm still quite rough on manipulating and the documentation on these in Stanford CoreNLP seems slim to me.
This is really a matter of what you want to do afterwards. Serialization is probably the fastest and most straightforward approach; the downside is that you need to understand the CoreNLP data structures.
If you want to read the results from another language, or into your own data structures, save them as XML.
I would go the first way.

How to drive XML generation from XSD and rules? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I am looking for a standard technology to drive the generation of an XML document based on an XSD and a set of rules. Basically I have XSDs that tell me what the XML should look like and what elements are mandatory or optional. What is not in the XSDs is a set of business rules that say things like "if such element's value is this, that other element is actually mandatory" or "if such element's value is that, that other element should be omitted".
What I have in mind is something that would process the XSDs along with the rules (maybe expressed in something like XPath) and call back my code to generate the mandatory values. The structure of the final document would change dynamically depending on the values of the elements driving the conditions.
I guess I could do something close to what I want with XSLT. I'd generate all the values and then use an XSLT stylesheet to enforce the conditions. But in my case some values may take a long time to produce, so I want to avoid computing unnecessary values, meaning values that will later be discarded by the business rules.
Does such a technology exist? FYI I am coding in Java but I am hoping to find a generic technology if possible.
Cheers,
Tom
The problem you describe can probably be handled by Schematron. It can be used alongside XML Schema, and if you already know XPath and XSLT you won't find it difficult to understand. It can specify complex relationships between unrelated nodes, based on values and context, that are beyond the abilities of XML Schema.
You can find the specification and many tutorials on the Schematron website.
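Schematron rules are written as XPath assertions. The sketch below is not Schematron itself, only a plain-Java stand-in that evaluates one such assertion with the JDK's XPath API, to show the kind of conditional rule involved. The document shape (`order`, `type`, `deliveryDate`) and the rule are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class RuleCheck {
    // Hypothetical business rule: "if type is 'express', deliveryDate is mandatory".
    // Written as an implication: not(condition) or requirement.
    static final String EXPRESS_NEEDS_DATE =
        "not(/order/type = 'express') or boolean(/order/deliveryDate)";

    // Returns true when the document satisfies the XPath rule.
    static boolean satisfies(String xml, String rule) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(
            rule, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        String ok = "<order><type>express</type><deliveryDate>2015-03-19</deliveryDate></order>";
        String bad = "<order><type>express</type></order>";
        System.out.println(satisfies(ok, EXPRESS_NEEDS_DATE));
        System.out.println(satisfies(bad, EXPRESS_NEEDS_DATE));
    }
}
```

Note that a check like this validates a document after the fact. To avoid computing expensive values that a rule would discard, the same condition would have to be evaluated on the partially built document before the dependent value is generated.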

Parse XML to a different format with SAX [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I have a huge problem with parsing an XML file into a different format.
I'm trying to get all the related data as described in this link: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
(I searched Stack Overflow before and found this link.)
I use the XMLReader interface to parse and the XML Serializer for the output.
I just need to convert my XML with one DTD to another XML with a different DTD. The difference is that most of the children that are elements in my source XML are attributes in the target XML. There are no new elements, only a different arrangement.
Does anyone have an idea how to handle this with a SAX parser?
You can use XMLFilters for that. See Elliotte Rusty Harold's book for explanation and examples:
The basic idea of filters is that an XMLReader, instead of receiving
XML text directly from a file, socket, or other source, receives
already parsed events from another XMLReader. It can change these
events before passing them along to the client application through the
usual methods of ContentHandler and the other callback interfaces. For
example, it can add a unique ID attribute to every element or delete
all elements in the SVG namespace from the input stream.
BTW, the mkyong tutorial glosses over how the characters method works, which tends to bite a lot of people when they find their element data getting truncated. There's a better tutorial on Oracle's site.
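Here is a minimal sketch of the filter idea, using the example from the quote (adding an ID attribute to every element) rather than the full element-to-attribute conversion from the question. It extends `XMLFilterImpl` from org.xml.sax.helpers and serializes the filtered events with an identity `Transformer`. The class name `IdFilter`, the helper `addIds` and the `e1`, `e2`, ... ID scheme are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class IdFilter extends XMLFilterImpl {
    private int counter = 0;

    IdFilter(XMLReader parent) {
        super(parent);
    }

    // Intercepts each start-element event and passes it on with an extra attribute.
    // Assumes the input elements do not already carry an "id" attribute.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        AttributesImpl withId = new AttributesImpl(atts);
        withId.addAttribute("", "id", "id", "CDATA", "e" + (++counter));
        super.startElement(uri, localName, qName, withId);
    }

    // Parses the XML text through the filter and returns the serialized result.
    static String addIds(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        IdFilter filter = new IdFilter(reader);

        // An identity transformer acts as the serializer for the filtered SAX events.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(
            new SAXSource(filter, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addIds("<a><b>x</b></a>"));
    }
}
```

Turning child elements into attributes needs more state than this: the filter has to hold back the parent's startElement until the children's text has been collected. Because the parser may deliver one element's text in several `characters` calls, that text should be appended to a buffer and only read in `endElement`.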

How to create google like instant search using JSP and servlets? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am working on a basic instant search function that searches the database and displays the results instantly, just like Google Instant. This here http://woorkup.com/2010/09/13/how-to-create-your-own-instant-search/ looks promising, but I want to know if there is a way to implement this using JSP and Java servlets.
Java and servlets alone will not be sufficient; you will need JavaScript on the client side. Basically, you attach a listener to the input field and send an AJAX request to a JSP that does the search and returns the results. You then only have to format the results and display them in a drop-down box below the input field.
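The server side of that round trip can be sketched in plain Java. The class below is a hypothetical helper, not a servlet: `matches` stands in for the database query by filtering an in-memory list, and `toJson` formats the result for the AJAX response. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class InstantSearch {
    // Case-insensitive prefix match, returning at most 'limit' items.
    // In a real application this would be a database query instead.
    static List<String> matches(List<String> items, String query, int limit) {
        List<String> result = new ArrayList<>();
        String needle = query == null ? "" : query.trim().toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return result; // nothing typed yet, nothing to suggest
        }
        for (String item : items) {
            if (result.size() >= limit) {
                break;
            }
            if (item.toLowerCase(Locale.ROOT).startsWith(needle)) {
                result.add(item);
            }
        }
        return result;
    }

    // Minimal JSON array of strings. Only backslashes and double quotes are
    // escaped; a JSON library would be the safer choice for arbitrary data.
    static String toJson(List<String> values) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                json.append(',');
            }
            String escaped = values.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
            json.append('"').append(escaped).append('"');
        }
        return json.append(']').toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("apple", "apricot", "banana");
        System.out.println(toJson(matches(items, "ap", 10)));
    }
}
```

In a servlet, `doGet` would read the typed text with `request.getParameter(...)`, call these two methods, and write the JSON string to the response; the JavaScript listener on the input field then renders the array as the drop-down.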
This is also a very good tutorial about instant search:
http://www.w3schools.com/php/php_ajax_livesearch.asp
It uses JavaScript and PHP. By reading or working through this tutorial you should get an idea of how instant search works, so I hope it helps even if you want to use JSP.
You can do this using jQuery. The jQuery UI autocomplete is nice and easy to implement:
http://jqueryui.com/demos/autocomplete/
As previous posters have pointed out, you will have to use JavaScript to do this. The least painful way to use JavaScript here is jQuery UI.
There is a fairly straightforward walkthrough here: http://blog.comperiosearch.com/2012/06/make-an-instant-search-application-using-json-ajax-and-jquery/
This is an oldie-but-goodie:
http://lab.abhinayrathore.com/autocomplete/
It combines Google, Bing, Yahoo, Wiki, Amazon, etc. in one instant autocomplete, and lets you easily add or remove websites.
