String Operation to get particular value inside the string using java

I have a string as below.
<employees>
<emp>
<name>yaakobu</name>
<sal>$20000</sal>
<designation>Manager</designation>
</emp>
<emp>
<name>daaniyelu</name>
<sal>$2000</sal>
<designation>Operator</designation>
</emp>
<emp>
<name>paadam</name>
<sal>$7000</sal>
<designation>Engineer</designation>
</emp>
</employees>
I receive the above XML as a string. I was asked not to use an XML parser because of performance concerns. I need to get the second employee's salary ($2000) using Java string operations. Please give me some pointers.
Your help is appreciated.

Your string is xml.
Although it might be tempting to use regex or other string manipulation to extract data from xml - don't do it - it's a bad practice.
You should use some XML parser instead.

After you've done this using your string operations, give the following a try:
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
public class Main {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("test.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
// get the salary of the second employee (XPath positions start at 1)
XPathExpression expr = xpath.compile("//emp[2]/sal");
Object salary = expr.evaluate(doc, XPathConstants.STRING);
System.out.println(salary);
}
}
which should output:
$2000
I'm not guaranteeing it will be faster, but it won't differ all that much I think. And doing it like this will be far less fragile than doing this with indexOf(...) and substring(...) calls.
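Since the question starts from a String rather than a file, here is a minimal sketch of the same XPath idea applied to in-memory XML. The class name SalaryFromString and the helper salaryOf are made up for illustration, and the sample data is the XML from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SalaryFromString {
    // Hypothetical helper: returns the <sal> text of the n-th <emp> (1-based, as in XPath).
    static String salaryOf(String xml, int n) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Wrap the String in a reader; builder.parse(String) would treat it as a URI.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/employees/emp[" + n + "]/sal", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees>"
                + "<emp><name>yaakobu</name><sal>$20000</sal><designation>Manager</designation></emp>"
                + "<emp><name>daaniyelu</name><sal>$2000</sal><designation>Operator</designation></emp>"
                + "<emp><name>paadam</name><sal>$7000</sal><designation>Engineer</designation></emp>"
                + "</employees>";
        System.out.println(salaryOf(xml, 2)); // prints $2000
    }
}
```

Whether this is fast enough is something to measure on the real input rather than assume.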

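I doubt there will be performance issues using an XML parser, but if you want to do it by string parsing, use str.indexOf("<sal>", str.indexOf("<sal>") + 5), which returns the position of the second <sal> tag. The salary is then the text between that position plus 5 and the next </sal>.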
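The indexOf idea can be sketched as follows. The class SecondSalary and the method nthSalary are illustrative names, and the sketch assumes the tags are written exactly as <sal> and </sal>, with no attributes or extra whitespace.

```java
public class SecondSalary {
    // Hypothetical helper: content of the n-th <sal>...</sal> pair (1-based), or null if absent.
    static String nthSalary(String xml, int n) {
        final String open = "<sal>";
        final String close = "</sal>";
        if (n < 1) {
            return null;
        }
        int from = 0;
        for (int i = 0; i < n; i++) {
            int start = xml.indexOf(open, from);
            if (start < 0) {
                return null; // fewer than n opening tags
            }
            from = start + open.length(); // continue searching after this tag
        }
        int end = xml.indexOf(close, from);
        return end < 0 ? null : xml.substring(from, end);
    }

    public static void main(String[] args) {
        String xml = "<employees>"
                + "<emp><name>yaakobu</name><sal>$20000</sal></emp>"
                + "<emp><name>daaniyelu</name><sal>$2000</sal></emp>"
                + "<emp><name>paadam</name><sal>$7000</sal></emp>"
                + "</employees>";
        System.out.println(nthSalary(xml, 2)); // prints $2000
    }
}
```

This breaks as soon as the markup changes shape (attributes on the tag, comments, CDATA), which is the fragility the other answers warn about.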

Use an XML parser, or use the JAXB API to unmarshal the string into an object. You can do it this way:
private static Object getObject(String yourXml) throws Exception {
JAXBContext jcUnmarshal = null;
Unmarshaller unmarshal = null;
javax.xml.stream.XMLStreamReader rdr = null;
//Object obj = null;
try {
jcUnmarshal = JAXBContext.newInstance("com.test.dto");
unmarshal = jcUnmarshal.createUnmarshaller();
rdr = javax.xml.stream.XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(yourXml));
//obj = (Object) unmarshal.unmarshal(rdr);
return (Object) unmarshal.unmarshal(rdr);
} catch (JAXBException jaxbException) {
jaxbException.printStackTrace();
log.error(jaxbException);
throw new ServiceException(jaxbException.getMessage());
}
finally {
// rdr is still null if creating the reader failed, so guard the close
if (rdr != null) {
rdr.close();
}
}
//return obj;
}

You might use XStream:
http://x-stream.github.io/
You map your XML onto an object structure and read the value from there.
Check the samples; it is very easy to use.
That is, if you don't want to parse it yourself. :)

Related

Read few xml elements only in an efficient way

I want to read only a few XML tag values. I have written the code below. The XML is big and a bit complex, but I have simplified it for this example. Is there a more efficient way to solve this? I am using Java 8.
DocumentBuilderFactory dbfaFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = dbfaFactory.newDocumentBuilder();
Document doc = documentBuilder.parse("xml_val.xml");
System.out.println(doc.getElementsByTagName("date_added").item(0).getTextContent());
<item_list id="item_list01">
<numitems_intial>5</numitems_intial>
<item>
<date_added>1/1/2014</date_added>
<added_by person="person01" />
</item>
<item>
<date_added>1/6/2014</date_added>
<added_by person="person05" />
</item>
<numitems_current>7</numitems_current>
<manager person="person48" />
</item_list>

Use XPath and pass a specific expression to get the desired element:
public class MainJaxbXpath {
public static void main(String[] args) {
try {
FileInputStream fileIS;
fileIS = new FileInputStream("/home/luis/tmp/test.xml");
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
builder = builderFactory.newDocumentBuilder();
Document xmlDocument;
xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//item_list[@id=\"item_list01\"]//date_added[1]";
String nodeList =(String) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.STRING);
System.out.println(nodeList);
} catch (SAXException | IOException | ParserConfigurationException | XPathExpressionException e3) {
e3.printStackTrace();
}
}
}
Result:
1/1/2014
To look for more than one element in the same operation:
String expression01 = "//item_list[@id=\"item_list01\"]//date_added[1]";
String expression02 = "//item_list[@id=\"item_list02\"]//date_added[2]";
String expression = String.format("%s | %s", expression01, expression02);
NodeList nodeList =(NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
System.out.println(currentNode.getTextContent());
}
}

Some suggestions.
Firstly, don't use DOM. There's a wide range of dom-like XML tree representations available in Java; DOM is the first and the worst. Later third-party models like JDOM2 and XOM are much better designed.
Secondly, consider doing the whole thing in an XML-oriented language like XSLT or XQuery rather than in Java. In XQuery, using Saxon's XQuery API, this would be:
Processor proc = new Processor(false);
XQueryCompiler comp = proc.newXQueryCompiler();
XQueryExecutable exec = comp.compile("//date_added");
XQueryEvaluator eval = exec.load();
eval.setSource(new StreamSource(new File("/home/luis/tmp/test.xml")));
for (XdmItem item : eval.evaluate()) {
System.out.println(item.getStringValue());
}
But since the query is so simple, Saxon also has a direct map/reduce style API to access the tree. This would be:
Processor proc = new Processor(false);
XdmNode doc = proc.newDocumentBuilder().build(
new StreamSource(new File("/home/luis/tmp/test.xml")));
for (XdmItem item : doc.select(descendant("date_added")).asList()) {
System.out.println(item.getStringValue());
}
A suggestion that has nothing to do with efficiency: please use international standard dates. 1/6/2014 could be 1st June or 6th January. Writing it as 2014-06-01 (or 2014-01-06 if that's what you intended) not only avoids the kind of dangerous bugs that arise if you use an ambiguous format, it also means you can use standard date-and-time processing libraries, such as the XPath 2.0+ function library.
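To illustrate the point about dates, here is a small sketch using java.time, which is available in the Java 8 the question mentions. The class DateFormats and its parse helper are illustrative names only. An ISO 8601 date parses without any hint, while the ambiguous form means whatever the chosen pattern says it means.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateFormats {
    // Hypothetical helper: parse text with an explicit pattern.
    static LocalDate parse(String text, String pattern) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // ISO 8601 dates need no format hint at all.
        System.out.println(LocalDate.parse("2014-06-01")); // prints 2014-06-01

        // The same ambiguous text yields two different dates.
        System.out.println(parse("1/6/2014", "d/M/uuuu")); // prints 2014-06-01
        System.out.println(parse("1/6/2014", "M/d/uuuu")); // prints 2014-01-06
    }
}
```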

Not able to parse inner elements of XML using DocumentBuilderFactory in Java

I have a response as XML and I'm trying to parse it to get the inner details. I'm using DocumentBuilderFactory for this. The parent object is not null, but when I try to get the nested node list elements, it returns null. Am I missing anything?
Here is my response XML
ResponseXML
<DATAPACKET REQUEST-ID = "1">
<HEADER>
</HEADER>
<BODY>
<CONSUMER_PROFILE2>
<CONSUMER_DETAILS2>
<NAME>David</NAME>
<DATE_OF_BIRTH>1949-01-01T00:00:00+03:00</DATE_OF_BIRTH>
<GENDER>001</GENDER>
</CONSUMER_DETAILS2>
</CONSUMER_PROFILE2></BODY></DATAPACKET>
and I'm parsing it in the following way
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(responseXML));
Document doc = db.parse(is);
// Consumer details.
if(doc.getDocumentElement().getElementsByTagName("CONSUMER_DETAILS2") != null) {
Node consumerDetailsNode = doc.getDocumentElement().getElementsByTagName("CONSUMER_DETAILS2").item(0); // <-- this is coming as null
dateOfBirth = getNamedItem(consumerDetailsNode, "DATE_OF_BIRTH");
System.out.println("DOB:"+dateOfBirth);
}
getNamedItem
private static String getNamedItem(Node searchResultNode, String param) {
return searchResultNode.getAttributes().getNamedItem(param) != null ? searchResultNode.getAttributes().getNamedItem(param).getNodeValue() : "";
}
Any ideas would be greatly appreciated.

The easiest way to search for individual elements within an XML document is with XPath. It provides a search syntax similar to file system notation.
Here is a solution to the specific problem in your document:
EDIT: solution adapted to support multiple CONSUMER_PROFILE2 elements. You just need to get and process a NodeList instead of a single Node.
import java.io.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;
import org.xml.sax.*;
public class XpathDemo
{
public static void main(String[] args)
{
try {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document xmlDoc = builder.parse(new InputSource(new FileReader("C://Temp/xx.xml")));
// Selects all CONSUMER_PROFILE2 elements no matter where they are in the document
String cp2_nodes = "//CONSUMER_PROFILE2";
// Selects first DATE_OF_BIRTH element somewhere under current element
String dob_nodes = "//DATE_OF_BIRTH[1]";
// Selects text child node of current element
String text_node = "/child::text()";
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList dob_list = (NodeList)xPath.compile(cp2_nodes + dob_nodes + text_node)
.evaluate(xmlDoc, XPathConstants.NODESET);
for (int i = 0; i < dob_list.getLength() ; i++) {
Node dob_node = dob_list.item(i);
String dob_text = dob_node.getNodeValue();
System.out.println(dob_text);
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
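As for why the original getNamedItem helper finds nothing: getAttributes().getNamedItem(...) looks up attributes of a node, whereas DATE_OF_BIRTH in the response is a child element of CONSUMER_DETAILS2. If you prefer to stay with plain DOM calls, a sketch along these lines should work. The class ChildText and both helper names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildText {
    // Hypothetical helper: text of the first descendant element named tag, or "" if none.
    static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : "";
    }

    // Hypothetical helper: parses the response string and reads DATE_OF_BIRTH.
    static String dateOfBirth(String responseXML) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXML)));
        NodeList details = doc.getElementsByTagName("CONSUMER_DETAILS2");
        if (details.getLength() == 0) {
            return "";
        }
        return childText((Element) details.item(0), "DATE_OF_BIRTH");
    }

    public static void main(String[] args) throws Exception {
        String responseXML = "<DATAPACKET REQUEST-ID = \"1\"><HEADER></HEADER><BODY>"
                + "<CONSUMER_PROFILE2><CONSUMER_DETAILS2>"
                + "<NAME>David</NAME>"
                + "<DATE_OF_BIRTH>1949-01-01T00:00:00+03:00</DATE_OF_BIRTH>"
                + "<GENDER>001</GENDER>"
                + "</CONSUMER_DETAILS2></CONSUMER_PROFILE2></BODY></DATAPACKET>";
        System.out.println("DOB:" + dateOfBirth(responseXML));
    }
}
```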

How do I turn a single word into valid xml?

I have the following code which turns a string, that I pass into the function, into a document:
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
void toXml(String s)
{
documentBuild();
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
}
The problem is that some of the legacy code that I'm using passes a single word like RANDOM or FICTION into this toXml function. I would like to turn these calls into valid XML before trying to parse them. Right now, if I call the function with s = FICTION, it throws a SAXParseException. Could anyone advise me on the right way to do this? If you have any questions, let me know.
Thank you for your time
-Josh

This creates an XmlDocument with a single element named after the string passed in (note that this is C#/.NET, not Java):
string buildXml(string s) {
XmlDocument d = new XmlDocument();
d.AppendChild(d.CreateElement(s));
StringWriter sw = new StringWriter();
XmlTextWriter xw = new XmlTextWriter(sw);
d.WriteTo(xw);
return sw.ToString();
}
buildXml("Test"); // This will return <Test />
It's a bit ugly, but it creates the XML without you having to do any string work on your own ;)
You could put this in a try/catch in your method, so that if the input fails to load as XML directly, the string is passed to this function and the result is loaded instead.

Have you tried the seemingly obvious <FICTION/> or <FICTION></FICTION>?
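In Java, that suggestion could look roughly like the sketch below. WordToXml is an illustrative name, and the sketch assumes that any input not starting with '<' is a single word that is also a valid XML element name (no spaces, not starting with a digit).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WordToXml {
    // Hypothetical variant of toXml: wraps a bare word as an empty element before parsing.
    static Document toXml(String s) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String trimmed = s.trim();
        // Assumption: real XML input always starts with '<'; anything else is a bare word.
        String xml = trimmed.startsWith("<") ? trimmed : "<" + trimmed + "/>";
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document fromWord = toXml("FICTION");
        System.out.println(fromWord.getDocumentElement().getNodeName()); // prints FICTION

        Document fromXml = toXml("<genre>FICTION</genre>");
        System.out.println(fromXml.getDocumentElement().getNodeName()); // prints genre
    }
}
```

Checking the first character avoids using the parser's exception as control flow. Input that is neither XML nor a valid element name still fails with a SAXParseException.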

How do you traverse and store XML in Blackberry Java app?

I'm having a problem accessing the contents of an XML document.
My goal is this:
Take an XML source and parse it into a fair equivalent of an associative array, then store it as a persistable object.
the xml is pretty simple:
<root>
<element>
<category_id>1</category_id>
<name>Cars</name>
</element>
<element>
<category_id>2</category_id>
<name>Boats</name>
</element>
</root>
Basic java class below. I'm pretty much just calling save(xml) after http response above. Yes, the xml is properly formatted.
import java.io.IOException;
import java.util.Hashtable;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import java.util.Vector;
import net.rim.device.api.system.PersistentObject;
import net.rim.device.api.system.PersistentStore;
import net.rim.device.api.xml.parsers.DocumentBuilder;
import net.rim.device.api.xml.parsers.DocumentBuilderFactory;
public class database{
private static PersistentObject storeVenue;
static final long key = 0x2ba5f8081f7ef332L;
public Hashtable hashtable;
public Vector venue_list;
String _node,_element;
public database()
{
storeVenue = PersistentStore.getPersistentObject(key);
}
public void save(Document xml)
{
venue_list = new Vector();
storeVenue.setContents(venue_list);
Hashtable categories = new Hashtable();
try{
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory. newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
docBuilder.isValidating();
xml.getDocumentElement ().normalize ();
NodeList list=xml.getElementsByTagName("*");
_node=new String();
_element = new String();
for (int i=0;i<list.getLength();i++){
Node value=list.item(i).getChildNodes().item(0);
_node=list.item(i).getNodeName();
_element=value.getNodeValue();
categories.put(_element, _node);
}
}
catch (Exception e){
System.out.println(e.toString());
}
venue_list.addElement(categories);
storeVenue.commit();
}
The code above is the work in progress, and is most likely heavily flawed. However, I have been at this for days now. I can never seem to get all child nodes, or the name / value pair.
When I print out the vector as a string, I usually end up with results like this:
[{ = root, = element}]
and that's it. No "category_id", no "name"
Ideally, I would end up with something like
[{1 = cars, 2 = boats}]
Any help is appreciated.
Thanks

Here's a fixed version of your program. Changes that I made are as follows:
I removed the DocBuilder-stuff from the save() method. These calls are needed to construct a new Document. Once you have such an object (and you do since it is passed in as an argument) you don't need the DocumentBuilder anymore. A proper use of DocumentBuilder is illustrated in the main method, below.
_node,_element need not be fields. They get new values with each pass through the loop inside save so I made them local variables. In addition I changed their names to category and name to reflect their association with the elements in the XML document.
There's never a need to create a new String object by using new String(). A simple "" is enough (see the initialization of the category and name variables).
Instead of looping over everything (via "*"), the loop now iterates over the element elements. There is then an inner loop that iterates over the children of each element, namely its category_id and name elements.
In each pass of the inner loop we set either the category or the name variable, depending on the name of the node at hand.
The actual value assigned to these variables is obtained via node.getTextContent(), which returns the text between the node's enclosing tags.
class database:
public class database {
private static PersistentObject storeVenue;
static final long key = 0x2ba5f8081f7ef332L;
public Hashtable hashtable;
public Vector venue_list;
public database() {
storeVenue = PersistentStore.getPersistentObject(key);
}
public void save(Document xml) {
venue_list = new Vector();
storeVenue.setContents(venue_list);
Hashtable categories = new Hashtable();
try {
xml.getDocumentElement().normalize();
NodeList list = xml.getElementsByTagName("element");
for (int i = 0; i < list.getLength(); i++) {
String category = "";
String name = "";
NodeList children = list.item(i).getChildNodes();
for(int j = 0; j < children.getLength(); ++j)
{
Node n = children.item(j);
if("category_id".equals(n.getNodeName()))
category = n.getTextContent();
else if("name".equals(n.getNodeName()))
name = n.getTextContent();
}
categories.put(category, name);
System.out.println("category=" + category + "; name=" + name);
}
} catch (Exception e) {
System.out.println(e.toString());
}
venue_list.addElement(categories);
storeVenue.commit();
}
}
Here's a main method:
public static void main(String[] args) throws Exception {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
docBuilder.isValidating();
Document xml = docBuilder.parse(new File("input.xml"));
database db = new database();
db.save(xml);
}

Thank you so much. With only slight modification I was able to do exactly what I was looking for.
Here are the modifications I had to do:
Even though I am building in 1.5, getTextContent was not available. I had to use category = n.getFirstChild().getNodeValue(); to obtain the value of each node. Though there may have been a simple solution like updating my build settings, I am not familiar enough with BB requirements to know when it is safe to stray from the default recommended build settings.
In the main, I had to alter this line:
Document xml = docBuilder.parse(new File("input.xml"));
so that it was reading from an InputStream delivered from my web server, and not necessarily a local file - even though I wonder if storing the xml local would be more efficient than storing a vector full of hash tables.
...
InputStream responseData = connection.openInputStream();
Document xmlParsed = docBuilder.parse(responseData);
Obviously I skipped over the HTTP connection portion for the sake of keeping this readable.
Your help has saved me a full weekend of blind debugging. Thank you very much! Hopefully this post will help someone else as well.

//res/xml/input.xml
private static String _xmlFileName = "/xml/input.xml";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream inputStream = getClass().getResourceAsStream( _xmlFileName );
Document document = builder.parse( inputStream );

How do I extract child element from XML to a string in Java?

If I have an XML document like
<root>
<element1>
<child attr1="blah">
<child2>blahblah</child2>
</child>
</element1>
</root>
I want to get an XML string with the first child element. My output string would be
<element1>
<child attr1="blah">
<child2>blahblah</child2>
</child>
</element1>
There are many approaches, would like to see some ideas. I've been trying to use Java XML APIs for it, but it's not clear that there is a good way to do this.
thanks

You're right, with the standard XML API, there's not a good way - here's one example (may be bug ridden; it runs, but I wrote it a long time ago).
import javax.xml.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.w3c.dom.*;
import java.io.*;
public class Proc
{
public static void main(String[] args) throws Exception
{
//Parse the input document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File("in.xml"));
//Set up the transformer to write the output string
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
transformer.setOutputProperty("indent", "yes");
StringWriter sw = new StringWriter();
StreamResult result = new StreamResult(sw);
//Find the first child node - this could be done with xpath as well
NodeList nl = doc.getDocumentElement().getChildNodes();
DOMSource source = null;
for(int x = 0;x < nl.getLength();x++)
{
Node e = nl.item(x);
if(e instanceof Element)
{
source = new DOMSource(e);
break;
}
}
//Do the transformation and output
transformer.transform(source, result);
System.out.println(sw.toString());
}
}
It would seem like you could get the first child just by using doc.getDocumentElement().getFirstChild(), but the problem with that is if there is any whitespace between the root and the child element, that will create a Text node in the tree, and you'll get that node instead of the actual element node. The output from this program is:
D:\home\tmp\xml>java Proc
<?xml version="1.0" encoding="UTF-8"?>
<element1>
<child attr1="blah">
<child2>blahblah</child2>
</child>
</element1>
I think you can suppress the xml version string if you don't need it, but I'm not sure on that. I would probably try to use a third party XML library if at all possible.

Since this is the top Google result, for those of you who just want the basics:
public static String serializeXml(Element element) throws Exception
{
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
StreamResult result = new StreamResult(buffer);
DOMSource source = new DOMSource(element);
TransformerFactory.newInstance().newTransformer().transform(source, result);
// The transformer writes UTF-8 by default, so decode with the same charset
return buffer.toString("UTF-8");
}
I use this for debugging, which is most likely what you need it for.

I would recommend JDOM. It's a Java XML library that makes dealing with XML much easier than the standard W3C approach.

public String getXML(String xmlContent, String tagName){
String startTag = "<"+ tagName + ">";
String endTag = "</"+ tagName + ">";
int startposition = xmlContent.indexOf(startTag);
int endposition = xmlContent.indexOf(endTag, startposition);
if (startposition == -1){
return "ddd";
}
startposition += startTag.length();
if(endposition == -1){
return "eee";
}
return xmlContent.substring(startposition, endposition);
}
Pass your XML as a string to this method and, in your case, pass 'element1' as the tagName parameter. Note that it returns the content between the tags, not the tags themselves.

XMLBeans is an easy-to-use (once you get the hang of it) tool for dealing with XML without the annoyances of parsing.
It requires a schema for the XML file, but it also provides a tool to generate a schema from an existing XML file (depending on your needs, the generated one is probably fine).

If your xml has schema backing it, you could use xmlbeans or JAXB to generate pojo objects that help you marshal/unmarshal xml.
http://xmlbeans.apache.org/
https://jaxb.dev.java.net/

As the question is actually about the first occurrence of a string inside another string, I would use String class methods instead of an XML parser:
public static String getElementAsString(String xml, String tagName){
int beginIndex = xml.indexOf("<" + tagName);
int endIndex = xml.indexOf("</" + tagName, beginIndex) + tagName.length() + 3;
return xml.substring(beginIndex, endIndex);
}

You can use the following function to extract an XML block as a string by passing a suitable XPath expression:
private static String nodeToString(Node node) throws TransformerException
{
StringWriter buf = new StringWriter();
Transformer xform = TransformerFactory.newInstance().newTransformer();
xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
xform.transform(new DOMSource(node), new StreamResult(buf));
return(buf.toString());
}
public static void main(String[] args) throws Exception
{
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
XPath xPath = XPathFactory.newInstance().newXPath();
Node result = (Node)xPath.evaluate("A/B/C", doc, XPathConstants.NODE); //"A/B[id = '1']" //"//*[@type='t1']"
System.out.println(nodeToString(result));
}
