Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have only just started learning what regular expressions are, and I have very limited time!
I have a string in XML like <myid>1234</myid>. For now the XML is stored as plain text, although it used to be an XML document.
How can I write a pattern to extract 1234 from the <myid> tag?
If it really looks like this:
<myid>1234</myid>
...you can extract it like this:
Matcher match = Pattern.compile("<myid>(\\d+)</myid>").matcher(str);
...and then use the matcher repeatedly, getting the value from the capture group.
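A minimal sketch of that loop, assuming the text may contain several <myid> elements. The class name MyIdExtractor and the method extractIds are made up for illustration. Note that inside a Java string literal the backslash has to be doubled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MyIdExtractor {
    // Group 1 captures the digits between the opening and closing tag.
    private static final Pattern MY_ID = Pattern.compile("<myid>(\\d+)</myid>");

    /** Collects the value of every <myid>digits</myid> occurrence in the text. */
    static List<String> extractIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher match = MY_ID.matcher(text);
        while (match.find()) {          // each call advances to the next occurrence
            ids.add(match.group(1));    // the capture group holds the number
        }
        return ids;
    }

    public static void main(String[] args) {
        String str = "<myid>1234</myid> some text <myid>42</myid>";
        System.out.println(extractIds(str)); // prints [1234, 42]
    }
}
```

The pattern is deliberately strict: an attribute on the tag, whitespace around the number, or a non-numeric value would all make it miss.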
But there's a reason everyone is telling you to use a proper parser. There are lots of ways the above can fail, both matching inappropriately and failing to match when it should.
The proper solution is to make the XML valid, and then parse it, and use XPath or similar to read the values.
If you really have some tool requiring you to send it invalid XML, you need to replace that tool. More likely, though, it's some misunderstanding.
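For comparison, the parser-based route mentioned above might look roughly like this once the XML is well-formed. This is only a sketch using the XPath API that ships with the JDK; the wrapping <root> element and the class name MyIdXPath are assumptions for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MyIdXPath {
    /** Returns the trimmed text of every <myid> element, wherever it appears. */
    static List<String> readIds(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "//myid", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            ids.add(nodes.item(i).getTextContent().trim());
        }
        return ids;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Hypothetical input; the <root> wrapper is an assumption.
        String xml = "<root><myid>1234</myid><other>x</other><myid> 5678 </myid></root>";
        System.out.println(readIds(xml)); // prints [1234, 5678]
    }
}
```

Unlike the regex, this tolerates attributes and whitespace, and it fails loudly when the input is not well-formed.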
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I need to track a Java variable through a Java file: which variables it gets assigned to and which methods it is passed to.
How should I begin?
Should I parse the file line by line, or is there another method?
It looks like you have been asked to build a huge mansion, and you start by asking: "Should the shovel I use to dig the cellar be round or rectangular?" Meaning: if you don't understand that parsing a Java program requires more than "line by line" reading, then you are doomed to fail.
Anyway, depending on your underlying requirements, there are two possible answers:
As suggested by duffymo, you might want to learn to use an IDE, which lets you easily identify "variable usage" within a project and make modifications via refactoring.
Start using a fully fledged Java parser; like https://code.google.com/p/javaparser/wiki/UsingThisParser
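To give an idea of what working on a syntax tree instead of lines looks like, here is a rough sketch. It does not use the javaparser library linked above; it swaps in the Compiler Tree API (com.sun.source) that ships with the JDK, so it needs a JDK rather than a bare JRE. The class VariableUsageFinder and the sample source are invented for illustration, and the matching is by name only, so shadowed variables with the same name would be reported too.

```java
import com.sun.source.tree.AssignmentTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.IdentifierTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.tree.Tree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class VariableUsageFinder {
    /** In-memory source file, so the example needs nothing on disk. */
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String code) {
            super(URI.create("string:///Demo.java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Lists where the named variable is used as an initializer, assigned, or passed. */
    static List<String> findUsages(String source, String variable) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null without a JDK
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(source)));
        List<String> usages = new ArrayList<>();
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            private boolean isTarget(Tree tree) {
                return tree instanceof IdentifierTree
                        && ((IdentifierTree) tree).getName().contentEquals(variable);
            }

            @Override
            public Void visitVariable(VariableTree node, Void unused) {
                if (isTarget(node.getInitializer())) usages.add("initializes " + node.getName());
                return super.visitVariable(node, unused);
            }

            @Override
            public Void visitAssignment(AssignmentTree node, Void unused) {
                if (isTarget(node.getExpression())) usages.add("assigned to " + node.getVariable());
                return super.visitAssignment(node, unused);
            }

            @Override
            public Void visitMethodInvocation(MethodInvocationTree node, Void unused) {
                for (ExpressionTree arg : node.getArguments()) {
                    if (isTarget(arg)) usages.add("passed to " + node.getMethodSelect());
                }
                return super.visitMethodInvocation(node, unused);
            }
        };
        for (CompilationUnitTree unit : task.parse()) { // parse only, no type resolution
            scanner.scan(unit, null);
        }
        return usages;
    }

    public static void main(String[] args) throws IOException {
        String source = "class Demo { void run() { int x = 1; int y = x; int z; z = x; foo(x); }"
                + " void foo(int a) {} }";
        findUsages(source, "x").forEach(System.out::println);
    }
}
```

Because the code is only parsed, not type-checked, this cannot tell two different variables named x apart; an IDE or a parser with symbol resolution handles that properly.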
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have an XML document that looks like this:
<rootElement>
<title> randmonString </title>
<subElement1>
<someInfo> info </someInfo>
<subElementTrash> trash </subElementTrash>
<someInfo1> info1 </someInfo1>
</subElement1>
<trash>
<subtrash> trash </subtrash>
</trash>
<date> 19.03.15 </date>
</rootElement>
I need to extract only title, /subElement1/someInfo, /subElement1/someInfo1 and date. The rest should be stored somewhere automatically, but without the elements that were already extracted. I should also be able to marshal everything back into the original XML.
It would be great if this could be done using annotation mapping.
Can someone point me in the right direction?
You are asking about parsing, but then you want data extraction, data transformation and finally storage in some undefined form. This is a very broad question with many possible approaches.
You can parse XML in Java using DOM, SAX or StAX.
You can use XPath to extract the interesting information, but it will not divide your document into the interesting part and 'the rest'.
You can define XSLT templates and use them to initialize a Java Transformer, in order to split your input document into the interesting part and 'the rest'.
You can use JAXB to map the XML onto a Java model (using the annotation mapping you prefer), and then build two further representations, one containing the interesting part and one containing 'the rest'. Then you can save both representations to different XML files.
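As one hedged illustration of combining these options, the sketch below uses DOM plus XPath from the JDK: it reads the wanted values, removes those nodes from the tree, and serializes what is left as 'the rest'. The class name ExtractAndKeepRest and the exact paths are assumptions based on the sample document, and it does not cover marshalling back to the original layout, which would additionally require remembering where each removed element sat.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ExtractAndKeepRest {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Reads the text at each path and removes that node, so the document becomes 'the rest'. */
    static Map<String, String> extract(Document doc, List<String> paths)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> values = new LinkedHashMap<>();
        for (String path : paths) {
            Node node = (Node) xpath.evaluate(path, doc, XPathConstants.NODE);
            if (node != null) {
                values.put(path, node.getTextContent().trim());
                node.getParentNode().removeChild(node);
            }
        }
        return values;
    }

    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<rootElement><title> randomString </title>"
                + "<subElement1><someInfo> info </someInfo>"
                + "<subElementTrash> trash </subElementTrash></subElement1>"
                + "<trash><subtrash> trash </subtrash></trash>"
                + "<date> 19.03.15 </date></rootElement>");
        Map<String, String> wanted = extract(doc, List.of(
                "/rootElement/title", "/rootElement/subElement1/someInfo", "/rootElement/date"));
        System.out.println(wanted);         // the extracted values, keyed by path
        System.out.println(serialize(doc)); // the remaining document
    }
}
```

If annotation mapping is a hard requirement, JAXB remains the closer fit; this variant only shows the split itself.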
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
The schema for the XML file is changing, and I need to create a utility that takes an XML file in format A and converts it to format B. How can I do it?
I can't figure out where to start.
You will probably want to look into XSLT. You can write one stylesheet for each iteration of changes; hopefully you, or whoever is changing the XML, is versioning each change. If that is the case, you will easily be able to transform each version into the next.
If you do not have versions available for the XML, you will probably have to do very strict matching in your XSLTs.
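A small sketch of such a utility, using the javax.xml.transform API from the JDK. The formats are invented for the example: "format A" has a <fullname> element that "format B" calls <name>. The stylesheet copies everything unchanged (the identity template) and overrides only the element that changed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SchemaMigrator {
    // Hypothetical A -> B stylesheet: identity copy plus one rename.
    private static final String A_TO_B =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='@*|node()'>"
            + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='fullname'>"
            + "<name><xsl:apply-templates select='@*|node()'/></name>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the given stylesheet to the input document and returns the result as text. */
    static String transform(String stylesheet, String inputXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String formatA = "<person id='7'><fullname>Ada Lovelace</fullname><age>36</age></person>";
        System.out.println(transform(A_TO_B, formatA));
    }
}
```

With one stylesheet per schema version, a document can be brought up to date by calling transform once per step, feeding each result into the next stylesheet.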
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have been learning how to build Android apps this summer. I am currently working on XML parsing, which in this case is done in Java. I have a few questions that are mostly conceptual and one specific one.
First, most of the examples I have seen use pages that are already in XML. Can I take a page in regular HTML format, have the program turn it into XML, and then parse it? Or is that what is normally done anyway?
Secondly, I could use a little explanation of how the parser actually works and how it stores the data, so that I know how to extract the data from wherever it is stored once the parsing is done.
So for my specific example I am trying to work with some weather data from the NWS. My program will take the data from this page, and after some user input take you to a page like this, which sometimes will have various alerts. I want to select certain ones. This is what I could use help with. I haven't really coded anything on that yet because I don't know what I am doing.
If I need to clarify or rephrase anything in here, I am happy to; just let me know. I am trying to be a good contributor on here!
Yes, you can parse HTML, and many parsers are available. There is a question about it here, Parse HTML in Android, and an answer about parsing HTML here: https://stackoverflow.com/a/7114346/826657
It is a bad idea, though: HTML tag names do not describe the data they hold, so you will have to write a lot of code that searches attributes for a specific piece of data. Prefer XML whenever it is available; it saves a lot of code and time.
Here is an article from Coding Horror which argues that, in general, parsing HTML is a bad idea:
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Here is an article that explains parsing an XML document with XmlPullParser: http://www.ibm.com/developerworks/library/x-android/
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to start from a given file (e.g. a.html), and if I see a pattern like this:
<!--$include file="b.html"-->
I will go to that file (b.html) and take whatever it contains, and everything will be written into one final file (e.g. output.html).
If I see an include in b.html, I should follow that include too and take whatever it contains, repeating this recursively. How can I do this in Java?
Any ideas?
PS: It is similar to what jsp:include does, but I want to implement it myself. I will implement it as a Maven plugin, and I have already built the plugin itself. What I am looking for is whether to use recursion, and whether a regex pattern or some other, more efficient approach is the way to go.
You need to create a function that returns the list of files, e.g. getFileList(htmlFile:File): File[];
Create a function that reads the file line by line and a function that checks each line against a pattern like "^.*<!\\-\\-\\$include file\\=\"(.+)\\.(html|htm)\"\\s*\\-\\->.*$". This is a regular expression that matches the include directive you are looking for. Let's call the checking function checkRule(line:String):boolean
If checkRule returns true, get the file name and recursively invoke getFileList, passing it the file name just found.
Be careful about infinite loops. For example, if a.html includes b.html and b.html includes a.html, the recursion never ends. So you need to check the file list and skip any file that is already in it.
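The steps above can be sketched roughly as follows. This is not the exact getFileList/checkRule design: it inlines the included content directly instead of collecting a file list first, and it assumes include paths are relative to the including file. The class name IncludeResolver is made up, and Files.readString requires Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IncludeResolver {
    // Group 1 is the file name inside <!--$include file="..."-->.
    private static final Pattern INCLUDE =
            Pattern.compile("<!--\\$include file=\"([^\"]+)\"\\s*-->");

    /** Returns the file's content with every include directive replaced recursively. */
    static String resolve(Path file) throws IOException {
        return resolve(file.toAbsolutePath().normalize(), new ArrayDeque<>());
    }

    private static String resolve(Path file, Deque<Path> inProgress) throws IOException {
        if (inProgress.contains(file)) {
            return ""; // circular include: skip it instead of recursing forever
        }
        inProgress.push(file);
        try {
            String content = Files.readString(file);
            Matcher match = INCLUDE.matcher(content);
            StringBuilder out = new StringBuilder();
            int last = 0;
            while (match.find()) {
                out.append(content, last, match.start()); // text before the directive
                Path included = file.resolveSibling(match.group(1)).normalize();
                out.append(resolve(included, inProgress)); // expanded include
                last = match.end();
            }
            out.append(content, last, content.length());   // text after the last directive
            return out.toString();
        } finally {
            inProgress.pop();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("includes");
        Files.writeString(dir.resolve("a.html"), "<p>a</p><!--$include file=\"b.html\"-->");
        Files.writeString(dir.resolve("b.html"), "<p>b</p><!--$include file=\"a.html\"-->");
        String merged = resolve(dir.resolve("a.html"));
        Files.writeString(dir.resolve("output.html"), merged);
        System.out.println(merged); // prints <p>a</p><p>b</p>
    }
}
```

Tracking only the files currently being expanded, rather than every file ever seen, means the same fragment can still be included in several places. Silently skipping a cycle follows the advice above; throwing an exception there may be the better choice for a build plugin.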
Good luck!!!