Java, Refactoring case - java

I was given an exercise in which I need to refactor several Java projects.
Only these two are left, and I truly have no idea how to refactor them.
csv.writer
public class CsvWriter {
public CsvWriter() {
}
public void write(String[][] lines) {
for (int i = 0; i < lines.length; i++)
writeLine(lines[i]);
}
private void writeLine(String[] fields) {
if (fields.length == 0)
System.out.println();
else {
writeField(fields[0]);
for (int i = 1; i < fields.length; i++) {
System.out.print(",");
writeField(fields[i]);
}
System.out.println();
}
}
private void writeField(String field) {
if (field.indexOf(',') != -1 || field.indexOf('\"') != -1)
writeQuoted(field);
else
System.out.print(field);
}
private void writeQuoted(String field) {
System.out.print('\"');
for (int i = 0; i < field.length(); i++) {
char c = field.charAt(i);
if (c == '\"')
System.out.print("\"\"");
else
System.out.print(c);
}
System.out.print('\"');
}
}
csv.writertest
public class CsvWriterTest {
@Test
public void testWriter() {
CsvWriter writer = new CsvWriter();
String[][] lines = new String[][] {
new String[] {},
new String[] { "only one field" },
new String[] { "two", "fields" },
new String[] { "", "contents", "several words included" },
new String[] { ",", "embedded , commas, included",
"trailing comma," },
new String[] { "\"", "embedded \" quotes",
"multiple \"\"\" quotes\"\"" },
new String[] { "mixed commas, and \"quotes\"", "simple field" } };
// Expected:
// -- (empty line)
// only one field
// two,fields
// ,contents,several words included
// ",","embedded , commas, included","trailing comma,"
// """","embedded "" quotes","multiple """""" quotes"""""
// "mixed commas, and ""quotes""",simple field
writer.write(lines);
}
}
test
public class Configuration {
public int interval;
public int duration;
public int departure;
public void load(Properties props) throws ConfigurationException {
String valueString;
int value;
valueString = props.getProperty("interval");
if (valueString == null) {
throw new ConfigurationException("monitor interval");
}
value = Integer.parseInt(valueString);
if (value <= 0) {
throw new ConfigurationException("monitor interval > 0");
}
interval = value;
valueString = props.getProperty("duration");
if (valueString == null) {
throw new ConfigurationException("duration");
}
value = Integer.parseInt(valueString);
if (value <= 0) {
throw new ConfigurationException("duration > 0");
}
if ((value % interval) != 0) {
throw new ConfigurationException("duration % interval");
}
duration = value;
valueString = props.getProperty("departure");
if (valueString == null) {
throw new ConfigurationException("departure offset");
}
value = Integer.parseInt(valueString);
if (value <= 0) {
throw new ConfigurationException("departure > 0");
}
if ((value % interval) != 0) {
throw new ConfigurationException("departure % interval");
}
departure = value;
}
}
public class ConfigurationException extends Exception {
private static final long serialVersionUID = 1L;
public ConfigurationException() {
super();
}
public ConfigurationException(String arg0) {
super(arg0);
}
public ConfigurationException(String arg0, Throwable arg1) {
super(arg0, arg1);
}
public ConfigurationException(Throwable arg0) {
super(arg0);
}
}
configuration.test
public class ConfigurationTest{
@Test
public void testGoodInput() throws IOException {
String data = "interval = 10\nduration = 100\ndeparture = 200\n";
Properties input = loadInput(data);
Configuration props = new Configuration();
try {
props.load(input);
} catch (ConfigurationException e) {
assertTrue(false);
return;
}
assertEquals(props.interval, 10);
assertEquals(props.duration, 100);
assertEquals(props.departure, 200);
}
@Test
public void testNegativeValues() throws IOException {
processBadInput("interval = -10\nduration = 100\ndeparture = 200\n");
processBadInput("interval = 10\nduration = -100\ndeparture = 200\n");
processBadInput("interval = 10\nduration = 100\ndeparture = -200\n");
}
@Test
public void testInvalidDuration() throws IOException {
processBadInput("interval = 10\nduration = 99\ndeparture = 200\n");
}
@Test
public void testInvalidDeparture() throws IOException {
processBadInput("interval = 10\nduration = 100\ndeparture = 199\n");
}
@Test
private void processBadInput(String data) throws IOException {
Properties input = loadInput(data);
boolean failed = false;
Configuration props = new Configuration();
try {
props.load(input);
} catch (ConfigurationException e) {
failed = true;
}
assertTrue(failed);
}
@Test
private Properties loadInput(String data) throws IOException {
InputStream is = new StringBufferInputStream(data);
Properties input = new Properties();
input.load(is);
is.close();
return input;
}
}

OK, here is some advice regarding the code.
CsvWriter
The bad thing is that you print everything to System.out, which makes the class hard to test without mocks. Instead, I suggest adding a PrintStream field that defines where all the data goes.
import java.io.PrintStream;
public class CsvWriter {
private final PrintStream printStream;
public CsvWriter() {
this.printStream = System.out;
}
public CsvWriter(PrintStream printStream) {
this.printStream = printStream;
}
...
You then write everything to this stream. The refactoring is easy because you can use find and replace (Ctrl+R in IDEA). Here is an example of how to do it.
private void writeField(String field) {
if (field.indexOf(',') != -1 || field.indexOf('\"') != -1)
writeQuoted(field);
else
printStream.print(field);
}
The rest of this class seems OK.
CsvWriterTest
First things first: don't check all the logic in a single method. Write small methods that each test a different case. It's OK to keep your current test, though; sometimes it's useful to check most of the logic in one complex scenario.
Also pay attention to the names of the test methods.
Obviously your test doesn't check the results. That's why we need the PrintStream functionality. We can build a PrintStream on top of a ByteArrayOutputStream instance, then construct a string from it and check that the content is valid. Here is how you can easily check what was written:
public class CsvWriterTest {
private ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
private PrintStream printStream = new PrintStream(byteArrayOutputStream);
#Test
public void testWriter() {
CsvWriter writer = new CsvWriter(printStream);
... old logic here ...
writer.write(lines);
String result = new String(byteArrayOutputStream.toByteArray());
Assert.assertTrue(result.contains("two,fields"));
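To make the capture idea concrete, here is a minimal self-contained sketch, assuming the writer has been changed to print to an injected PrintStream. It uses plain checks instead of JUnit so that it runs on its own. The class name CsvWriterSketch and the helper render are made up for this illustration and are not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Sketch of the refactored writer: all output goes to the injected stream.
class CsvWriterSketch {
    private final PrintStream out;

    CsvWriterSketch(PrintStream out) {
        this.out = out;
    }

    void write(String[][] lines) {
        for (String[] fields : lines) {
            writeLine(fields);
        }
    }

    private void writeLine(String[] fields) {
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.print(',');
            }
            writeField(fields[i]);
        }
        out.println();
    }

    private void writeField(String field) {
        if (field.indexOf(',') != -1 || field.indexOf('"') != -1) {
            // Quote the field and double any embedded quotes.
            out.print('"' + field.replace("\"", "\"\"") + '"');
        } else {
            out.print(field);
        }
    }

    // Hypothetical test helper: returns everything the writer produced.
    static String render(String[][] lines) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream stream = new PrintStream(buffer);
        new CsvWriterSketch(stream).write(lines);
        stream.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String nl = System.lineSeparator();
        String actual = render(new String[][] {
            {},
            { "two", "fields" },
            { "\"", "embedded \" quotes" }
        });
        // Compare the whole output, not just a substring.
        String expected = nl + "two,fields" + nl
                + "\"\"\"\",\"embedded \"\" quotes\"" + nl;
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected output: " + actual);
        }
        System.out.println("ok");
    }
}
```

Comparing the complete expected string catches mistakes that a contains() check would miss, such as a missing line break or an extra separator.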
Configuration
Make fields private
Make messages more concise
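One possible shape for the Configuration refactoring is sketched below. It assumes that the three near-identical blocks in load can be folded into small helpers; the names ConfigurationSketch, readPositiveInt and readMultipleOf are invented for this example. Wrapping NumberFormatException and assigning the fields only after every check has passed are design choices of this sketch, not behaviour of the original class.

```java
import java.util.Properties;

// Same checked exception as in the question, reduced to the constructors used here.
class ConfigurationException extends Exception {
    private static final long serialVersionUID = 1L;

    ConfigurationException(String message) {
        super(message);
    }

    ConfigurationException(String message, Throwable cause) {
        super(message, cause);
    }
}

class ConfigurationSketch {
    private int interval;
    private int duration;
    private int departure;

    void load(Properties props) throws ConfigurationException {
        // Read into locals first so a failed load leaves the object unchanged.
        int newInterval = readPositiveInt(props, "interval");
        int newDuration = readMultipleOf(props, "duration", newInterval);
        int newDeparture = readMultipleOf(props, "departure", newInterval);
        interval = newInterval;
        duration = newDuration;
        departure = newDeparture;
    }

    // Hypothetical helper: the "present, numeric, positive" check shared by all keys.
    private static int readPositiveInt(Properties props, String key)
            throws ConfigurationException {
        String raw = props.getProperty(key);
        if (raw == null) {
            throw new ConfigurationException("missing property: " + key);
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Assumption: the original let this exception escape unwrapped.
            throw new ConfigurationException(key + " is not an integer: " + raw, e);
        }
        if (value <= 0) {
            throw new ConfigurationException(key + " must be > 0, got " + value);
        }
        return value;
    }

    // Hypothetical helper: additionally requires a multiple of the interval.
    private static int readMultipleOf(Properties props, String key, int interval)
            throws ConfigurationException {
        int value = readPositiveInt(props, key);
        if (value % interval != 0) {
            throw new ConfigurationException(
                    key + " must be a multiple of interval " + interval + ", got " + value);
        }
        return value;
    }

    int getInterval() { return interval; }
    int getDuration() { return duration; }
    int getDeparture() { return departure; }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("interval", "10");
        props.setProperty("duration", "100");
        props.setProperty("departure", "200");
        ConfigurationSketch config = new ConfigurationSketch();
        config.load(props);
        System.out.println(config.getInterval() + " " + config.getDuration()
                + " " + config.getDeparture());
    }
}
```

With private fields, the existing tests would read the values through the getters instead of accessing props.interval directly.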
ConfigurationException
The serialVersionUID looks fine. It is needed for serialization and deserialization.
ConfigurationTest
Do not use assertTrue(false) or assertTrue(failed). Use Assert.fail(String) with an understandable message instead.
Tip: if you don't have much experience and need to refactor code like this, you may want to read some chapters of Effective Java, 2nd edition, by Joshua Bloch. The book is not very long, so you can read it in about a week, and it contains rules for writing clean and understandable code.


Apache HBase MapReduce job takes too much time while reading the datastore

I have set up an Apache HBase, Nutch and Hadoop cluster. I have crawled about 30 million documents. There are 3 workers and 1 master in the cluster. I have written my own HBase MapReduce job to read the crawled data and adjust some scores slightly based on some logic.
For this purpose, I combined the documents of the same domain, computed their effective bytes, and derived a score. Later, in the reducer, I assigned that score to each URL of that domain (via a cache). This portion of the job takes too much time, i.e., 16 hours. The following is the code snippet:
for ( int index = 0; index < Cache.size(); index++) {
String Orig_key = Cache.get(index);
float doc_score = log10;
WebPage page = datastore.get(Orig_key);
if ( page == null ) {
continue;
}
page.setScore(doc_score);
if (mark) {
page.getMarkers().put( Queue, Q1);
}
context.write(Orig_key, page);
}
If I remove the statement that reads the document from the datastore, the job finishes in only 2 to 3 hours. That is why I think the statement WebPage page = datastore.get(Orig_key); is causing this problem. Isn't it?
If that is the case, what is the best approach? The Cache object is simply a list that contains the URLs of the same domain.
DomainAnalysisJob.java
...
...
public class DomainAnalysisJob implements Tool {
public static final Logger LOG = LoggerFactory
.getLogger(DomainAnalysisJob.class);
private static final Collection<WebPage.Field> FIELDS = new HashSet<WebPage.Field>();
private Configuration conf;
protected static final Utf8 URL_ORIG_KEY = new Utf8("doc_orig_id");
protected static final Utf8 DOC_DUMMY_MARKER = new Utf8("doc_marker");
protected static final Utf8 DUMMY_KEY = new Utf8("doc_id");
protected static final Utf8 DOMAIN_DUMMY_MARKER = new Utf8("domain_marker");
protected static final Utf8 LINK_MARKER = new Utf8("link");
protected static final Utf8 Queue = new Utf8("q");
private static URLNormalizers urlNormalizers;
private static URLFilters filters;
private static int maxURL_Length;
static {
FIELDS.add(WebPage.Field.STATUS);
FIELDS.add(WebPage.Field.LANG_INFO);
FIELDS.add(WebPage.Field.URDU_SCORE);
FIELDS.add(WebPage.Field.MARKERS);
FIELDS.add(WebPage.Field.INLINKS);
}
/**
* Maps each WebPage to a host key.
*/
public static class Mapper extends GoraMapper<String, WebPage, Text, WebPage> {
@Override
protected void setup(Context context) throws IOException ,InterruptedException {
Configuration conf = context.getConfiguration();
urlNormalizers = new URLNormalizers(context.getConfiguration(), URLNormalizers.SCOPE_DEFAULT);
filters = new URLFilters(context.getConfiguration());
maxURL_Length = conf.getInt("url.characters.max.length", 2000);
}
@Override
protected void map(String key, WebPage page, Context context)
throws IOException, InterruptedException {
String reversedHost = null;
if (page == null) {
return;
}
if ( key.length() > maxURL_Length ) {
return;
}
String url = null;
try {
url = TableUtil.unreverseUrl(key);
url = urlNormalizers.normalize(url, URLNormalizers.SCOPE_DEFAULT);
url = filters.filter(url); // filter the url
} catch (Exception e) {
LOG.warn("Skipping " + key + ":" + e);
return;
}
if ( url == null) {
context.getCounter("DomainAnalysis", "FilteredURL").increment(1);
return;
}
try {
reversedHost = TableUtil.getReversedHost(key.toString());
}
catch (Exception e) {
return;
}
page.getMarkers().put( URL_ORIG_KEY, new Utf8(key) );
context.write( new Text(reversedHost), page );
}
}
public DomainAnalysisJob() {
}
public DomainAnalysisJob(Configuration conf) {
setConf(conf);
}
@Override
public Configuration getConf() {
return conf;
}
@Override
public void setConf(Configuration conf) {
this.conf = conf;
}
public void updateDomains(boolean buildLinkDb, int numTasks) throws Exception {
NutchJob job = NutchJob.getInstance(getConf(), "rankDomain-update");
job.getConfiguration().setInt("mapreduce.task.timeout", 1800000);
if ( numTasks < 1) {
job.setNumReduceTasks(job.getConfiguration().getInt(
"mapred.map.tasks", job.getNumReduceTasks()));
} else {
job.setNumReduceTasks(numTasks);
}
ScoringFilters scoringFilters = new ScoringFilters(getConf());
HashSet<WebPage.Field> fields = new HashSet<WebPage.Field>(FIELDS);
fields.addAll(scoringFilters.getFields());
StorageUtils.initMapperJob(job, fields, Text.class, WebPage.class,
Mapper.class);
StorageUtils.initReducerJob(job, DomainAnalysisReducer.class);
job.waitForCompletion(true);
}
@Override
public int run(String[] args) throws Exception {
boolean linkDb = false;
int numTasks = -1;
for (int i = 0; i < args.length; i++) {
if ("-rankDomain".equals(args[i])) {
linkDb = true;
} else if ("-crawlId".equals(args[i])) {
getConf().set(Nutch.CRAWL_ID_KEY, args[++i]);
} else if ("-numTasks".equals(args[i]) ) {
numTasks = Integer.parseInt(args[++i]);
}
else {
throw new IllegalArgumentException("unrecognized arg " + args[i]
+ " usage: updatedomain -crawlId <crawlId> [-numTasks N]" );
}
}
LOG.info("Updating DomainRank:");
updateDomains(linkDb, numTasks);
return 0;
}
public static void main(String[] args) throws Exception {
final int res = ToolRunner.run(NutchConfiguration.create(),
new DomainAnalysisJob(), args);
System.exit(res);
}
}
DomainAnalysisReducer.java
...
...
public class DomainAnalysisReducer extends
GoraReducer<Text, WebPage, String, WebPage> {
public static final Logger LOG = DomainAnalysisJob.LOG;
public DataStore<String, WebPage> datastore;
protected static float q1_ur_threshold = 500.0f;
protected static float q1_ur_docCount = 50;
public static final Utf8 Queue = new Utf8("q"); // Markers for Q1 and Q2
public static final Utf8 Q1 = new Utf8("q1");
public static final Utf8 Q2 = new Utf8("q2");
@Override
protected void setup(Context context) throws IOException,
InterruptedException {
Configuration conf = context.getConfiguration();
try {
datastore = StorageUtils.createWebStore(conf, String.class, WebPage.class);
}
catch (ClassNotFoundException e) {
throw new IOException(e);
}
q1_ur_threshold = conf.getFloat("domain.queue.threshold.bytes", 500.0f);
q1_ur_docCount = conf.getInt("domain.queue.doc.count", 50);
LOG.info("Conf updated: Queue-bytes-threshold = " + q1_ur_threshold + " Queue-doc-threshold: " + q1_ur_docCount);
}
@Override
protected void cleanup(Context context) throws IOException, InterruptedException {
datastore.close();
}
@Override
protected void reduce(Text key, Iterable<WebPage> values, Context context)
throws IOException, InterruptedException {
ArrayList<String> Cache = new ArrayList<String>();
int doc_counter = 0;
int total_ur_bytes = 0;
for ( WebPage page : values ) {
// cache
String orig_key = page.getMarkers().get( DomainAnalysisJob.URL_ORIG_KEY ).toString();
Cache.add(orig_key);
// do not consider those doc's that are not fetched or link URLs
if ( page.getStatus() == CrawlStatus.STATUS_UNFETCHED ) {
continue;
}
doc_counter++;
int ur_score_int = 0;
int doc_ur_bytes = 0;
int doc_total_bytes = 0;
String ur_score_str = "0";
String langInfo_str = null;
// read page and find its Urdu score
langInfo_str = TableUtil.toString(page.getLangInfo());
if (langInfo_str == null) {
continue;
}
ur_score_str = TableUtil.toString(page.getUrduScore());
ur_score_int = Integer.parseInt(ur_score_str);
doc_total_bytes = Integer.parseInt( langInfo_str.split("&")[0] );
doc_ur_bytes = ( doc_total_bytes * ur_score_int) / 100; //Formula to find ur percentage
total_ur_bytes += doc_ur_bytes;
}
float avg_bytes = 0;
float log10 = 0;
if ( doc_counter > 0 && total_ur_bytes > 0) {
avg_bytes = (float) total_ur_bytes/doc_counter;
log10 = (float) Math.log10(avg_bytes);
log10 = (Math.round(log10 * 100000f)/100000f);
}
context.getCounter("DomainAnalysis", "DomainCount").increment(1);
// if average bytes and doc count, are more than threshold then mark as q1
boolean mark = false;
if ( avg_bytes >= q1_ur_threshold && doc_counter >= q1_ur_docCount ) {
mark = true;
for ( int index = 0; index < Cache.size(); index++) {
String Orig_key = Cache.get(index);
float doc_score = log10;
WebPage page = datastore.get(Orig_key);
if ( page == null ) {
continue;
}
page.setScore(doc_score);
if (mark) {
page.getMarkers().put( Queue, Q1);
}
context.write(Orig_key, page);
}
}
}
In my testing and debugging, I found that the statement WebPage page = datastore.get(Orig_key); is the major cause of the long running time. The job took about 16 hours to complete, but when I replaced this statement with WebPage page = WebPage.newBuilder().build(); the time dropped to 6 hours. Is this due to I/O?

Why doesn't an object change when I send it through the writeObject method?

I'm writing a networking program in Java. As the title says, the object that the server sends is not the same as the one the client receives. I'm trying to replace the object that already exists in the client with the new one received from the server.
Here is my code. The first method is Server.sendIdea and the second is Client.rcvIdea.
void sendIdea(Idea _idea) throws IOException {
objectOS.flush();
Idea idea = _idea;
//when I look into 'idea' it's fine
objectOS.writeObject(idea);
}
..
Idea rcvIdea(int _ideaCode) throws ClassNotFoundException, IOException {
objectOS.writeObject("sendIdea");
objectOS.writeObject(_ideaCode);
Idea returnValue = (Idea) objectIS.readObject();
//when I look into 'returnValue', it is not the one 'sendIdea' has sent.
return returnValue;
}
As you can see, sendIdea(Idea _idea) sends an object of the class Idea using the writeObject method, and rcvIdea() receives the object using the readObject() method. (I don't think you need to know the Idea class in detail.) The client received some Ideas through this method at the start of the program, and that worked without problems. But when I try to receive the same Idea object after it has been slightly changed, the object on the client side does not change, even though the object that the server is about to send in sendIdea has been changed correctly. I spent about 5 hours trying to solve this problem. I checked all the code line by line and found nothing. I'm fairly sure the problem lies in the writeObject or readObject method. I tried objectOS.flush() to clear the stream, among many other things. I hope someone can help me find the problem. Below is some of the code from my program.
Client.class
package ideaOcean;
import java.awt.HeadlessException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.ArrayList;
import javax.swing.JOptionPane;
import data.Idea;
import data.Opinion;
import data.Profile;
public class Client {
Socket socket;
OutputStream os;
ObjectOutputStream objectOS;
InputStream is;
ObjectInputStream objectIS;
MainWindow mainWindow;
int idCode;
String email, password;
Profile myProfile;
ArrayList<Idea> myIdeas;
ArrayList<Opinion> myOpinions;
ArrayList<Integer> newIdeasCodes, hotIdeasCodes;
ArrayList<Idea> newIdeas, hotIdeas;
String command;
static final String SERVER_IP = "127.0.0.1";//
static final int SERVER_PORT_NUM = 5000;
public static void main(String[] args) {
Client client = new Client();
client.mainWindow = new MainWindow();
client.mainWindow.setVisible(true);
client.mainWindow.showLoginPg();
try {
while (!client.loginCheck()) {// login
continue;
}
} catch (HeadlessException | NumberFormatException | ClassNotFoundException | IOException e) {
e.printStackTrace();
}
System.out.println("[login complete]");
try {
client.myProfile = client.rcvProfile(client.idCode);// get myProfile
int i;
for (i = 0; i < client.myProfile.myIdeaCode.size(); i++) {
client.myIdeas.add(client.rcvIdea(client.myProfile.myIdeaCode.get(i)));
}
for (i = 0; i < client.myProfile.myOpinionCode.size(); i++) {
client.myOpinions.add(client.rcvOpinion(client.myProfile.myOpinionCode.get(i)));
}
// ***************************
} catch (ClassNotFoundException | IOException e1) {
e1.printStackTrace();
}
try {
client.rcvNewIdeas(12);
client.mainWindow.newOcean.floatingIdeas = client.newIdeas;
client.mainWindow.newOcean.arrangeFloatingPanels();
client.rcvHotIdeas(12);
client.mainWindow.hotOcean.floatingIdeas = client.hotIdeas;
client.mainWindow.hotOcean.arrangeFloatingPanels();
} catch (ClassNotFoundException | IOException e) {
e.printStackTrace();
}
client.mainWindow.setMyPg(client.myProfile, client.myIdeas, client.myOpinions);
client.mainWindow.showMainPg();
client.start();
}
public Client() {
try {
socket = new Socket(SERVER_IP, SERVER_PORT_NUM);
System.out.println("Connected to Server!");
os = socket.getOutputStream();
objectOS = new ObjectOutputStream(os);
is = socket.getInputStream();
objectIS = new ObjectInputStream(is);
myIdeas = new ArrayList<>();
myOpinions = new ArrayList<>();
newIdeasCodes = new ArrayList<>();
hotIdeasCodes = new ArrayList<>();
newIdeas = new ArrayList<>();
hotIdeas = new ArrayList<>();
} catch (UnknownHostException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
void start() {
while (true) {
try {
if (mainWindow.newBtnClicked) {
rcvNewIdeas(12);
mainWindow.newOcean.floatingIdeas = newIdeas;
mainWindow.newOcean.arrangeFloatingPanels();
mainWindow.newBtnClicked = false;
} else if (mainWindow.hotBtnClicked) {
rcvHotIdeas(12);
mainWindow.hotOcean.floatingIdeas = hotIdeas;
mainWindow.hotOcean.arrangeFloatingPanels();
mainWindow.hotBtnClicked = false;
} else if (mainWindow.newOcean.detailBtnClicked) {
updateIdeaDetailFrame(mainWindow.newOcean.clickedIdea);
mainWindow.newOcean.detailBtnClicked = false;
} else if (mainWindow.hotOcean.detailBtnClicked) {
updateIdeaDetailFrame(mainWindow.hotOcean.clickedIdea);
mainWindow.hotOcean.detailBtnClicked = false;
} else if (mainWindow.ideaDetailFrame.saveOpinionBtnClicked) {
sendOpinion(mainWindow.ideaDetailFrame.newOpinion);
updateIdeaDetailMainPanel(rcvIdea(mainWindow.ideaDetailFrame.idea.ideaCode));
mainWindow.ideaDetailFrame.saveOpinionBtnClicked = false;
} else if (mainWindow.writeIdeaPg.postIdeaBtnClicked) {
sendIdea(mainWindow.writeIdeaPg.thisIdea);
mainWindow.writeIdeaPg.postIdeaBtnClicked = false;
} else if (mainWindow.newOcean.plusBtnClicked) {
objectOS.writeObject("plusBtnClicked");
objectOS.writeObject(mainWindow.newOcean.plusMinusClickedIdeaCode);
mainWindow.newOcean.plusBtnClicked = false;
} else if (mainWindow.newOcean.minusBtnClicked) {
objectOS.writeObject("minusBtnClicked");
objectOS.writeObject(mainWindow.newOcean.plusMinusClickedIdeaCode);
mainWindow.newOcean.minusBtnClicked = false;
} else if (mainWindow.hotOcean.plusBtnClicked) {
objectOS.writeObject("plusBtnClicked");
objectOS.writeObject(mainWindow.hotOcean.plusMinusClickedIdeaCode);
mainWindow.hotOcean.plusBtnClicked = false;
} else if (mainWindow.hotOcean.minusBtnClicked) {
objectOS.writeObject("minusBtnClicked");
objectOS.writeObject(mainWindow.hotOcean.plusMinusClickedIdeaCode);
mainWindow.hotOcean.minusBtnClicked = false;
} else if (mainWindow.myBtnClicked) {
mainWindow.setMyPg(myProfile, myIdeas, myOpinions);
mainWindow.myBtnClicked = false;
}
} catch (ClassNotFoundException | IOException e) {
e.printStackTrace();
}
}
}
int i = 0;
Idea rcvIdea(int _ideaCode) throws ClassNotFoundException, IOException {
objectOS.writeObject("sendIdea");
objectOS.writeObject(_ideaCode);
Idea returnValue = (Idea) objectIS.readObject();
return returnValue;
}
Opinion rcvOpinion(int _opinionCode) throws ClassNotFoundException, IOException {
objectOS.writeObject("sendOpinion");
objectOS.writeObject(_opinionCode);
return (Opinion) objectIS.readObject();
}
Profile rcvProfile(int _idCode) throws IOException, ClassNotFoundException {
objectOS.writeObject("sendProfile");
objectOS.writeObject(_idCode);
return (Profile) objectIS.readObject();
}
void rcvNewIdeasCodes() throws ClassNotFoundException, IOException {
objectOS.writeObject("sendNewIdeasCodes");
newIdeasCodes = (ArrayList<Integer>) objectIS.readObject();
}
void rcvHotIdeasCodes() throws IOException, ClassNotFoundException {
objectOS.writeObject("sendHotIdeasCodes");
hotIdeasCodes = (ArrayList<Integer>) objectIS.readObject();
}
void rcvNewIdeas(int num) throws ClassNotFoundException, IOException {
int i;
rcvNewIdeasCodes();
newIdeas = new ArrayList<>();
if (num <= newIdeasCodes.size()) {
for (i = 0; i < num; i++) {
newIdeas.add(rcvIdea(newIdeasCodes.get(i)));
}
} else {
for (i = 0; i < newIdeasCodes.size(); i++) {
newIdeas.add(rcvIdea(newIdeasCodes.get(i)));
}
}
}
void rcvHotIdeas(int num) throws ClassNotFoundException, IOException {
int i;
rcvHotIdeasCodes();
hotIdeas = new ArrayList<>();
if (num <= hotIdeasCodes.size()) {
for (i = 0; i < num; i++) {
hotIdeas.add(rcvIdea(hotIdeasCodes.get(i)));
}
} else {
for (i = 0; i < hotIdeasCodes.size(); i++) {
hotIdeas.add(rcvIdea(hotIdeasCodes.get(i)));
}
}
}
void sendIdea(Idea _idea) throws IOException {
objectOS.writeObject("rcvIdea");
objectOS.writeObject(_idea);
}
void sendOpinion(Opinion _opinion) throws IOException {
objectOS.writeObject("rcvOpinion");
objectOS.writeObject(_opinion);
}
void sendProfile(Profile _profile) throws IOException {
objectOS.writeObject(_profile);
}
boolean loginCheck() throws HeadlessException, NumberFormatException, IOException, ClassNotFoundException {
objectOS.writeObject("loginCheck");// send command
while (!mainWindow.loginBtnClicked) {
continue;
}
mainWindow.loginBtnClicked = false;
email = mainWindow.emailField.getText().trim();
password = mainWindow.passwordField.getText().trim();
objectOS.writeObject(email);
objectOS.writeObject(password);
boolean valid;
valid = (boolean) objectIS.readObject();
if (valid == false) {
JOptionPane.showMessageDialog(mainWindow, "ID or Password is not correct");
mainWindow.emailField.setText("");
mainWindow.passwordField.setText("");
return false;
} else if (valid == true) {
idCode = (int) objectIS.readObject();
return true;
} else {
return false;
}
}
void updateIdeaDetailMainPanel(Idea clickedIdea) throws ClassNotFoundException, IOException {
ArrayList<Opinion> opinions = new ArrayList<>();
for (int j = 0; j < clickedIdea.opinionCode.size(); j++) {
opinions.add(rcvOpinion(clickedIdea.opinionCode.get(j)));
}
mainWindow.ideaDetailFrame.updateMainPanel(opinions);
}
void updateIdeaDetailFrame(Idea clickedIdea) throws ClassNotFoundException, IOException {
ArrayList<Opinion> opinions = new ArrayList<>();
for (int j = 0; j < clickedIdea.opinionCode.size(); j++) {
opinions.add(rcvOpinion(clickedIdea.opinionCode.get(j)));
}
mainWindow.ideaDetailFrame = new IdeaDetailFrame(clickedIdea, opinions);
mainWindow.ideaDetailFrame.setVisible(true);
}
}
Idea.class
package data;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
public class Idea implements Serializable {
private static final long serialVersionUID = 123123L;
public int idCode;
public int ideaCode;
public int plus = 0, minus = 0;
public String ideaName;
public String oneLineExp;
public String explanation;
public ArrayList<Integer> opinionCode;
public Date date;
public MyCanvas image;
int hotDegree;
public Idea(int _idCode,int _ideaCode, String _ideaName, String _oneLineExp, String _explanation, MyCanvas _image) {
this(_idCode,_ideaName,_oneLineExp,_explanation,_image);
ideaCode = _ideaCode;
}
public Idea(int _idCode, String _ideaName, String _oneLineExp, String _explanation, MyCanvas _image) {
this(_idCode,_ideaName,_oneLineExp,_explanation);
image = _image;
}
public Idea(int _idCode, String _ideaName, String _oneLineExp, String _explanation){
idCode = _idCode;
oneLineExp = new String(_oneLineExp);
ideaName = new String(_ideaName);
explanation = new String(_explanation);
date = new Date();
opinionCode = new ArrayList<>();
}
public void saveIdea() {
FileOutputStream fos = null;
ObjectOutputStream oos = null;
try {
fos = new FileOutputStream("Idea.dat");
oos = new ObjectOutputStream(fos);
oos.writeObject(this);
} catch (IOException e1) {
System.out.println("e1");
}
}
void addOpinionCode(int _opinion) {
opinionCode.add(opinionCode.size(), _opinion);
}
public void incPlus() {
plus++;
}
public void incMinus() {
minus++;
}
public int setHotDegree() {
hotDegree = plus - minus + opinionCode.size() * 2;
return hotDegree;
}
}
Opinion.class
package data;
import java.io.Serializable;
import java.util.Date;
public class Opinion implements Serializable{
int idCode;
public int opinionCode;//the intrinsic code of this opinion
public int commentedIdeaCode;
public String opinion;
public Date date;
int plus, minus;
public Opinion(int _idCode,int _commentedIdeaCode, String _opinion){
idCode = _idCode;
commentedIdeaCode = _commentedIdeaCode;
opinion = new String(_opinion);
date = new Date();
plus = 0;
minus = 0;
}// Opinion(int _idCode,int _commentedIdeaCode, String _opinion)
public Opinion(int _idCode,int _opinionCode,int _commentedIdeaCode, String _opinion){
this(_idCode, _commentedIdeaCode, _opinion);
opinionCode = _opinionCode;
}//Opinion(int _idCode,int _opinionCode,int _commentedIdeaCode, String _opinion)
void incPlus(){
plus++;
}
void incMinus(){
minus++;
}
}
ObjectOutputStream creates a graph of all objects already serialized, and uses references to previously serialized objects. Therefore, when you serialize an Idea instance multiple times, each time after the first, a reference to the first serialization is written instead of the full object.
You can use ObjectOutputStream.reset() after each serialization. This discards the object graph and forces ObjectOutputStream to create new object serializations, even for objects it had seen before.
Your sendIdea method should therefore look like this:
void sendIdea(Idea _idea) throws IOException {
objectOS.flush();
objectOS.reset();
Idea idea = _idea;
objectOS.writeObject(idea);
}
Very importantly, please note that after reset(), all object references are serialized anew. So if you have a complex object graph, you may end up with object duplicates after deserialization.
If you want to share transitive references for an object that is to be serialized multiple times, look into ObjectOutputStream.writeUnshared() instead.
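The effect of reset() can be reproduced without any sockets. The sketch below is a minimal illustration, assuming an in-memory byte array in place of the network connection and a made-up Note class standing in for Idea. It writes the same object twice, changes it in between, and reports what the reader sees for the second copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ResetDemo {
    // Hypothetical stand-in for the Idea class from the question.
    static class Note implements Serializable {
        private static final long serialVersionUID = 1L;
        int plus;
    }

    // Writes the same object twice, mutating it in between, and returns
    // the value of 'plus' that the reader sees for the second copy.
    static int secondReadPlus(boolean resetBetweenWrites)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            Note note = new Note();
            out.writeObject(note);   // full serialization, plus == 0
            note.plus = 5;
            if (resetBetweenWrites) {
                out.reset();         // forget the objects written so far
            }
            out.writeObject(note);   // without reset: only a back-reference
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();                       // first copy
            Note second = (Note) in.readObject();  // second copy
            return second.plus;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without reset: " + secondReadPlus(false)); // prints 0
        System.out.println("with reset:    " + secondReadPlus(true));  // prints 5
    }
}
```

Without reset(), the second readObject() returns the instance that was deserialized first, so the change made on the sending side never arrives.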

How to eliminate repeated code in a for-loop?

I have implemented two member functions in the same class:
private static void getRequiredTag(Context context) throws IOException
{
//repeated begin
for (Record record : context.getContext().readCacheTable("subscribe")) {
String traceId = record.get("trace_id").toString();
if (traceSet.contains(traceId) == false)
continue;
String tagId = record.get("tag_id").toString();
try {
Integer.parseInt(tagId);
} catch (NumberFormatException e) {
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
//repeated end
tagSet.add(tagId);
}
}
private static void addTagToTraceId(Context context) throws IOException
{
//repeated begin
for (Record record : context.getContext().readCacheTable("subscribe")) {
String traceId = record.get("trace_id").toString();
if (traceSet.contains(traceId) == false)
continue;
String tagId = record.get("tag_id").toString();
try {
Integer.parseInt(tagId);
} catch (NumberFormatException e) {
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
//repeated end
Vector<String> ret = traceListMap.get(tagId);
if (ret == null) {
ret = new Vector<String>();
}
ret.add(traceId);
traceListMap.put(tagId, ret);
}
}
I call these two member functions from two other member functions (so I can't merge them into one function):
private static void A()
{
getRequiredTag()
}
private static void B()
{
getRequiredTag()
addTagToTraceId()
}
tagSet is java.util.Set and traceListMap is java.util.Map.
I know the DRY principle and I really want to eliminate the repeated code, so I came up with this code:
private static void getTraceIdAndTagIdFromRecord(Record record, String traceId, String tagId) throws IOException
{
traceId = record.get("trace_id").toString();
tagId = record.get("tag_id").toString();
}
private static boolean checkTagIdIsNumber(String tagId)
{
try {
Integer.parseInt(tagId);
} catch (NumberFormatException e) {
return false;
}
return true;
}
private static void getRequiredTag(Context context) throws IOException
{
String traceId = null, tagId = null;
for (Record record : context.getContext().readCacheTable("subscribe")) {
getTraceIdAndTagIdFromRecord(record, traceId, tagId);
if (traceSet.contains(traceId) == false)
continue;
if (!checkTagIdIsNumber(tagId))
{
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
tagSet.add(tagId);
}
}
private static void addTagToTraceId(Context context) throws IOException
{
String traceId = null, tagId = null;
for (Record record : context.getContext().readCacheTable("subscribe")) {
getTraceIdAndTagIdFromRecord(record, traceId, tagId);
if (traceSet.contains(traceId) == false)
continue;
if (!checkTagIdIsNumber(tagId))
{
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
Vector<String> ret = traceListMap.get(tagId);
if (ret == null) {
ret = new Vector<String>();
}
ret.add(traceId);
traceListMap.put(tagId, ret);
}
}
It seems I just got a new repetition... I have no idea how to eliminate the repetition in this case. Could anybody give me some advice?
Update 2015-5-13 21:15:12:
Some people suggest a boolean argument to eliminate the repetition, but I know
Robert C. Martin's Clean Code Tip #12: Eliminate Boolean Arguments (you can google it for more details).
Could you comment on that?
The parts that change require the values of String tagId and String traceId, so we will start by extracting an interface that takes those parameters:
public interface PerformingInterface {
void accept(String tagId, String traceId);
}
Then extract the common parts into this method:
private static void doSomething(Context context, PerformingInterface perform) throws IOException
{
for (Record record : context.getContext().readCacheTable("subscribe")) {
// Read the IDs directly; assigning to method parameters would not reach the caller.
String traceId = record.get("trace_id").toString();
String tagId = record.get("tag_id").toString();
if (traceSet.contains(traceId) == false)
continue;
if (!checkTagIdIsNumber(tagId))
{
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
perform.accept(tagId, traceId);
}
}
Then call this method in two different ways:
private static void getRequiredTag(Context context) throws IOException {
doSomething(context, new PerformingInterface() {
@Override public void accept(String tagId, String traceId) {
tagSet.add(tagId);
}
});
}
private static void addTagToTraceId(Context context) throws IOException {
doSomething(context, new PerformingInterface() {
@Override public void accept(String tagId, String traceId) {
Vector<String> ret = traceListMap.get(tagId);
if (ret == null) {
ret = new Vector<String>();
}
ret.add(traceId);
traceListMap.put(tagId, ret);
}
});
}
Note that the anonymous classes above work in Java 7 and earlier. With Java 8 the same thing can be written more compactly using lambdas (BiConsumer is a functional interface defined in Java 8 and could replace PerformingInterface); the pre-Java 8 version just requires some more verbose code.
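For comparison, here is roughly what the Java 8 variant could look like. This is only a sketch under stated assumptions: the framework's Record and Context types are replaced by hypothetical stand-ins (each record is a plain Map<String, String>, the error counter is an int field), and TagCollector and forEachValidRecord are illustrative names that do not appear in the original code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;

class TagCollector {
    final Set<String> traceSet = new HashSet<>();
    final Set<String> tagSet = new HashSet<>();
    final Map<String, List<String>> traceListMap = new HashMap<>();
    int badTagIds = 0; // stand-in for the "tag_id not a number" counter

    // Shared loop: filter the records, then hand each valid (tagId, traceId) pair to the callback.
    void forEachValidRecord(List<Map<String, String>> records, BiConsumer<String, String> action) {
        for (Map<String, String> record : records) {
            String traceId = record.get("trace_id");
            String tagId = record.get("tag_id");
            if (!traceSet.contains(traceId)) {
                continue;
            }
            if (!isNumber(tagId)) {
                badTagIds++;
                continue;
            }
            action.accept(tagId, traceId);
        }
    }

    static boolean isNumber(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    void getRequiredTag(List<Map<String, String>> records) {
        forEachValidRecord(records, (tagId, traceId) -> tagSet.add(tagId));
    }

    void addTagToTraceId(List<Map<String, String>> records) {
        forEachValidRecord(records, (tagId, traceId) -> {
            List<String> traces = traceListMap.get(tagId);
            if (traces == null) {
                traces = new ArrayList<>();
                traceListMap.put(tagId, traces);
            }
            traces.add(traceId);
        });
    }
}
```

The two public methods now differ only in the lambda they pass, which is the part that actually varies between them.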
Some other issues with your code:
Way too many things are static
The Vector class is old; ArrayList is recommended instead (if you need synchronization, wrap it in Collections.synchronizedList)
Always use braces, even for one-liners
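As an illustration of the Vector point, the get / null check / put sequence from the question can be reduced to one call with Map.computeIfAbsent (Java 8) on top of ArrayList. This is a sketch only; TraceIndex and its method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TraceIndex {
    private final Map<String, List<String>> traceListMap = new HashMap<>();

    // Creates the list on first use, then appends to it.
    // If synchronization is needed, the lambda could return
    // Collections.synchronizedList(new ArrayList<>()) instead.
    void add(String tagId, String traceId) {
        traceListMap.computeIfAbsent(tagId, k -> new ArrayList<>()).add(traceId);
    }

    // Returns the trace IDs recorded for a tag, or an empty list if there are none.
    List<String> tracesFor(String tagId) {
        return traceListMap.getOrDefault(tagId, Collections.emptyList());
    }
}
```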
You could use a stream (haven't tested):
private static Stream<Record> validRecords(Context context) throws IOException {
return context.getContext().readCacheTable("subscribe").stream()
.filter(r -> {
if (!traceSet.contains(traceId(r))) {
return false;
}
try {
Integer.parseInt(tagId(r));
return true;
} catch (NumberFormatException e) {
context.getCounter("Error", "tag_id not a number").increment(1);
return false;
}
});
}
private static String traceId(Record record) {
return record.get("trace_id").toString();
}
private static String tagId(Record record) {
return record.get("tag_id").toString();
}
Then you could just do:
private static void getRequiredTag(Context context) throws IOException {
validRecords(context).map(r -> tagId(r)).forEach(tagSet::add);
}
private static void addTagToTraceId(Context context) throws IOException {
validRecords(context).forEach(r -> {
String tagId = tagId(r);
Vector<String> ret = traceListMap.get(tagId);
if (ret == null) {
ret = new Vector<String>();
}
ret.add(traceId(r));
traceListMap.put(tagId, ret);
});
}
tagId seems to be always null in your second attempt.
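The reason is that Java passes object references by value: assigning to a parameter inside getTraceIdAndTagIdFromRecord only changes that method's local copy, so the caller's variables stay null. A small demonstration (PassByValueDemo and its methods are illustrative names only):

```java
class PassByValueDemo {
    // Reassigning a parameter only changes the method's local copy of the reference.
    static void fill(String traceId) {
        traceId = "abc";
    }

    // Returning the value is one way to hand it back to the caller.
    static String read() {
        return "abc";
    }

    public static void main(String[] args) {
        String traceId = null;
        fill(traceId);
        System.out.println(traceId); // prints "null"
        traceId = read();
        System.out.println(traceId); // prints "abc"
    }
}
```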
Nevertheless, one approach would be to extract the code that collects tagIds (this seems to be the same in both methods) into its own method. Then, in each of the two methods just iterate over the collection of returned tagIds and do different operations on them.
for (String tagId : getTagIds(context)) {
// do method specific logic
}
EDIT
Now I noticed that you also use traceId in the second method. The principle remains the same, just collect Records in a separate method and iterate over them in the two methods (by taking tagId and traceId from records).
The solution with lambdas is the most elegant one, but without them it involves creating a separate interface and two anonymous classes, which is too verbose for this use case (honestly, here I would rather go with a boolean argument than with a strategy without lambdas).
Try this approach
private static void imYourNewMethod(Context context, boolean isAddTag) throws IOException {
for (Record record : context.getContext().readCacheTable("subscribe")) {
// Read the IDs directly; assigning to method parameters would not reach the caller.
String traceId = record.get("trace_id").toString();
String tagId = record.get("tag_id").toString();
if (traceSet.contains(traceId) == false)
continue;
if (!checkTagIdIsNumber(tagId))
{
context.getCounter("Error", "tag_id not a number").increment(1);
continue;
}
if(isAddTag){
Vector<String> ret = traceListMap.get(tagId);
if (ret == null) {
ret = new Vector<String>();
}
ret.add(traceId);
traceListMap.put(tagId, ret);
}else{
tagSet.add(tagId);
}
}
}
Call this method and pass one more boolean parameter: true if you want to add the tag to the trace ID, false if you only want to collect the required tags.

JAVA: JUNIT testing of class type with string

So I have a test for the addNewCustomer method, which works by reading customer data from a text file:
@Test
public void testAddNewCustomer() {
System.out.println("addNewCustomer");
try {
File nFile = new File("ProductData.txt");
File file = new File("CustomerData.txt");
Scanner scan = new Scanner(file);
ElectronicsEquipmentSupplier ees = new ElectronicsEquipmentSupplier(1, 1, InputFileData.readProductDataFile(nFile));
ees.addNewCustomer(InputFileData.readCustomerData(scan));
CustomerDetailsList expResult = ees.getDetails();
CustomerDetailsList result = ees.getDetails();
assertEquals(expResult, result);
} catch (IllegalCustomerIDException | IOException | IllegalProductCodeException e) {
fail(e.getMessage());
}
}
The problem I'm having is what to use as the expected result. I tried a string with the values that I thought would be entered, but the compiler then said I can't compare type String with type CustomerDetailsList. Any ideas?
public class CustomerDetailsList {
private final ArrayList<CustomerDetails> customerCollection;
public CustomerDetailsList() {
customerCollection = new ArrayList<>();
}
public void addCustomer(CustomerDetails newCustomer) {
customerCollection.add(newCustomer);
}
public int numberOfCustomers() {
return customerCollection.size();
}
public void clearArray() {
this.customerCollection.clear();
}
/**
*
* @param givenID the ID of a customer
* @return the customer’s details if found, exception thrown otherwise.
* @throws supplierproject.CustomerNotFoundException
*/
public CustomerDetails findCustomer(String givenID) throws CustomerNotFoundException {
CustomerNotFoundException notFoundMessage
= new CustomerNotFoundException("Customer was not found");
int size = customerCollection.size();
int i = 0;
boolean customerFound = false;
while (!customerFound && i < size) {
customerFound = customerCollection.get(i).getCustomerID().equals(givenID);
i++;
}
if (customerFound) {
return customerCollection.get(i - 1);
} else {
throw notFoundMessage;
}
}
@Override
public String toString() {
StringBuilder customerDets = new StringBuilder();
for (int i = 0; i < numberOfCustomers(); i++) {
customerDets.append(customerCollection.get(i).toString()).append("\n");
}
return customerDets.toString();
}
}
The list itself
Generally, you should test whether the new customer is in the list. However, expResult and result in your test are just the same, because both are obtained from ees.getDetails() after the customer has already been added. Therefore the assertion does not test anything.
Instead, you can test whether the customer list contains the customer with a given email (or some other unique property of that customer, such as the ID).
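A rough sketch of such a check, written without JUnit so that it compiles on its own. The classes below are minimal stand-ins for the ones in the question, and the CustomerDetails constructor taking only an ID is an assumption; the real class presumably has more fields. In a JUnit test the final boolean would be passed to assertTrue, or the found ID compared with assertEquals.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-ins for the question's classes (hypothetical shapes).
class CustomerDetails {
    private final String customerID;

    CustomerDetails(String customerID) {
        this.customerID = customerID;
    }

    String getCustomerID() {
        return customerID;
    }
}

class CustomerNotFoundException extends Exception {
    CustomerNotFoundException(String message) {
        super(message);
    }
}

class CustomerDetailsList {
    private final List<CustomerDetails> customers = new ArrayList<>();

    void addCustomer(CustomerDetails newCustomer) {
        customers.add(newCustomer);
    }

    int numberOfCustomers() {
        return customers.size();
    }

    CustomerDetails findCustomer(String givenID) throws CustomerNotFoundException {
        for (CustomerDetails c : customers) {
            if (c.getCustomerID().equals(givenID)) {
                return c;
            }
        }
        throw new CustomerNotFoundException("Customer was not found");
    }
}

class AddCustomerCheck {
    // True if adding a customer grows the list by one and makes it findable by its ID.
    static boolean addedCustomerIsFound(String id) {
        CustomerDetailsList list = new CustomerDetailsList();
        int before = list.numberOfCustomers();
        list.addCustomer(new CustomerDetails(id));
        try {
            return list.findCustomer(id).getCustomerID().equals(id)
                    && list.numberOfCustomers() == before + 1;
        } catch (CustomerNotFoundException e) {
            return false;
        }
    }
}
```

The expected value here is something known independently of the code under test (the ID that was put in), rather than a second call to the same getter.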

Reading multiple xml documents from a socket in java

I'm writing a client which needs to read multiple consecutive small XML documents over a socket. I can assume that the encoding is always UTF-8 and that there is optionally delimiting whitespace between documents. The documents should ultimately go into DOM objects. What is the best way to accomplish this?
The essence of the problem is that the parsers expect a single document in the stream and consider the rest of the content junk. I thought that I could artificially end the document by tracking the element depth, and creating a new reader using the existing input stream. E.g. something like:
// Broken
public void parseInputStream(InputStream inputStream) throws Exception
{
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document doc = documentBuilder.newDocument();
XMLEventWriter domWriter = xof.createXMLEventWriter(new DOMResult(doc));
XMLStreamReader xmlStreamReader = factory.createXMLStreamReader(inputStream);
XMLEventReader reader = factory.createXMLEventReader(xmlStreamReader);
int depth = 0;
while (reader.hasNext()) {
XMLEvent evt = reader.nextEvent();
domWriter.add(evt);
switch (evt.getEventType()) {
case XMLEvent.START_ELEMENT:
depth++;
break;
case XMLEvent.END_ELEMENT:
depth--;
if (depth == 0)
{
domWriter.add(eventFactory.createEndDocument());
System.out.println(doc);
reader.close();
xmlStreamReader.close();
xmlStreamReader = factory.createXMLStreamReader(inputStream);
reader = factory.createXMLEventReader(xmlStreamReader);
doc = documentBuilder.newDocument();
domWriter = xof.createXMLEventWriter(new DOMResult(doc));
domWriter.add(eventFactory.createStartDocument());
}
break;
}
}
}
However, running this on input such as <a></a><b></b><c></c> prints the first document and then throws an XMLStreamException. What's the right way to do this?
Clarification: Unfortunately the protocol is fixed by the server and cannot be changed, so prepending a length or wrapping the contents would not work.
Length-prefix each document (in bytes).
Read the length of the first document from the socket
Read that much data from the socket, dumping it into a ByteArrayOutputStream
Create a ByteArrayInputStream from the results
Parse that ByteArrayInputStream to get the first document
Repeat for the second document etc
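The steps above could be sketched as follows. The framing is an assumption, since the answer does not say how the length is encoded: here each document is preceded by its byte length as a 4-byte big-endian int. LengthPrefixedXmlReader is an illustrative name, and readFully into a byte array is used in place of the ByteArrayOutputStream mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class LengthPrefixedXmlReader {
    // Reads length-prefixed documents until the stream ends.
    // Assumed framing: 4-byte big-endian length, then that many bytes of XML.
    static List<Document> readAll(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        DataInputStream data = new DataInputStream(in);
        List<Document> documents = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = data.readInt();
            } catch (EOFException e) {
                break; // no further length prefix: the stream has ended
            }
            byte[] body = new byte[length];
            data.readFully(body); // blocks until the whole document has arrived
            documents.add(builder.parse(new ByteArrayInputStream(body)));
        }
        return documents;
    }
}
```

Because each parser call sees exactly one document in its own ByteArrayInputStream, the parser never reads past a document boundary on the socket.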
IIRC, XML documents can have comments and processing-instructions at the end, so there's no real way of telling exactly when you have come to the end of the file.
A couple of ways of handling the situation have already been mentioned. Another alternative is to put an illegal character or byte, such as NUL (zero), into the stream between documents. This has the advantage that you don't need to alter the documents and you never need to buffer an entire file.
Just replace the FileInputStream with whatever stream you have:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
public class LogParser {
private XMLInputFactory inputFactory = null;
private XMLStreamReader xmlReader = null;
InputStream is;
private int depth;
private QName rootElement;
private static class XMLStream extends InputStream
{
InputStream delegate;
StringReader startroot = new StringReader("<root>");
StringReader endroot = new StringReader("</root>");
XMLStream(InputStream delegate)
{
this.delegate = delegate;
}
public int read() throws IOException {
int c = startroot.read();
if(c==-1)
{
c = delegate.read();
}
if(c==-1)
{
c = endroot.read();
}
return c;
}
}
public LogParser() {
inputFactory = XMLInputFactory.newInstance();
}
public void read() throws Exception {
is = new XMLStream(new FileInputStream(new File(
"./myfile.log")));
xmlReader = inputFactory.createXMLStreamReader(is);
while (xmlReader.hasNext()) {
printEvent(xmlReader);
xmlReader.next();
}
xmlReader.close();
}
public void printEvent(XMLStreamReader xmlr) throws Exception {
switch (xmlr.getEventType()) {
case XMLStreamConstants.END_DOCUMENT:
System.out.println("finished");
break;
case XMLStreamConstants.START_ELEMENT:
System.out.print("<");
printName(xmlr);
printNamespaces(xmlr);
printAttributes(xmlr);
System.out.print(">");
if(rootElement==null && depth==1)
{
rootElement = xmlr.getName();
}
depth++;
break;
case XMLStreamConstants.END_ELEMENT:
System.out.print("</");
printName(xmlr);
System.out.print(">");
depth--;
if(depth==1 && rootElement.equals(xmlr.getName()))
{
rootElement=null;
System.out.println("finished element");
}
break;
case XMLStreamConstants.SPACE:
case XMLStreamConstants.CHARACTERS:
int start = xmlr.getTextStart();
int length = xmlr.getTextLength();
System.out
.print(new String(xmlr.getTextCharacters(), start, length));
break;
case XMLStreamConstants.PROCESSING_INSTRUCTION:
System.out.print("<?");
if (xmlr.hasText())
System.out.print(xmlr.getText());
System.out.print("?>");
break;
case XMLStreamConstants.CDATA:
System.out.print("<![CDATA[");
start = xmlr.getTextStart();
length = xmlr.getTextLength();
System.out
.print(new String(xmlr.getTextCharacters(), start, length));
System.out.print("]]>");
break;
case XMLStreamConstants.COMMENT:
System.out.print("<!--");
if (xmlr.hasText())
System.out.print(xmlr.getText());
System.out.print("-->");
break;
case XMLStreamConstants.ENTITY_REFERENCE:
System.out.print(xmlr.getLocalName() + "=");
if (xmlr.hasText())
System.out.print("[" + xmlr.getText() + "]");
break;
case XMLStreamConstants.START_DOCUMENT:
System.out.print("<?xml");
System.out.print(" version='" + xmlr.getVersion() + "'");
System.out.print(" encoding='" + xmlr.getCharacterEncodingScheme()
+ "'");
if (xmlr.isStandalone())
System.out.print(" standalone='yes'");
else
System.out.print(" standalone='no'");
System.out.print("?>");
break;
}
}
/**
* #param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
try {
new LogParser().read();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private static void printName(XMLStreamReader xmlr) {
if (xmlr.hasName()) {
System.out.print(getName(xmlr));
}
}
private static String getName(XMLStreamReader xmlr) {
if (xmlr.hasName()) {
String prefix = xmlr.getPrefix();
String uri = xmlr.getNamespaceURI();
String localName = xmlr.getLocalName();
return getName(prefix, uri, localName);
}
return null;
}
private static String getName(String prefix, String uri, String localName) {
String name = "";
if (uri != null && !("".equals(uri)))
name += "['" + uri + "']:";
if (prefix != null)
name += prefix + ":";
if (localName != null)
name += localName;
return name;
}
private static void printAttributes(XMLStreamReader xmlr) {
for (int i = 0; i < xmlr.getAttributeCount(); i++) {
printAttribute(xmlr, i);
}
}
private static void printAttribute(XMLStreamReader xmlr, int index) {
String prefix = xmlr.getAttributePrefix(index);
String namespace = xmlr.getAttributeNamespace(index);
String localName = xmlr.getAttributeLocalName(index);
String value = xmlr.getAttributeValue(index);
System.out.print(" ");
System.out.print(getName(prefix, namespace, localName));
System.out.print("='" + value + "'");
}
private static void printNamespaces(XMLStreamReader xmlr) {
for (int i = 0; i < xmlr.getNamespaceCount(); i++) {
printNamespace(xmlr, i);
}
}
private static void printNamespace(XMLStreamReader xmlr, int index) {
String prefix = xmlr.getNamespacePrefix(index);
String uri = xmlr.getNamespaceURI(index);
System.out.print(" ");
if (prefix == null)
System.out.print("xmlns='" + uri + "'");
else
System.out.print("xmlns:" + prefix + "='" + uri + "'");
}
}
A simple solution is to wrap the documents on the sending side in a new root element:
<?xml version="1.0"?>
<documents>
... document 1 ...
... document 2 ...
</documents>
You must make sure that you don't include the XML header (<?xml ...?>), though. If all documents use the same encoding, this can be accomplished with a simple filter which just ignores the first line of each document if it starts with <?xml.
Found this forum message (which you probably already saw), which has a solution that wraps the input stream and tests for one of two ASCII characters (see post).
You could try an adaptation on this by first converting to use a reader (for proper character encoding) and then doing element counting until you reach the closing element, at which point you trigger the EOM.
Hi
I also had this problem at work (so I won't post the resulting code). The most elegant solution that I could think of, and which works pretty nicely imo, is as follows:
Create a class for example DocumentSplittingInputStream which extends InputStream and takes the underlying inputstream in its constructor (or gets set after construction...).
Add a field with a byte array closeTag containing the bytes of the closing root node you are looking for.
Add a field int called matchCount or something, initialised to zero.
Add a field boolean called underlyingInputStreamNotFinished, initialised to true
On the read() implementation:
Check if matchCount == closeTag.length, if it does, set matchCount to -1, return -1
If matchCount == -1, set matchCount = 0, call read() on the underlying inputstream until you get -1 or '<' (the xml declaration of the next document on the stream) and return it. Note that for all I know the xml spec allows comments after the document element, but I knew I was not going to get that from the source so did not bother handling it - if you can not be sure you'll need to change the "gobble" slightly.
Otherwise read an int from the underlying inputstream (if it equals closeTag[matchCount] then increment matchCount, if it doesn't then reset matchCount to zero) and return the newly read byte
Add a method which returns the boolean on whether the underlying stream has closed.
All reads on the underlying input stream should go through a separate method where it checks if the value read is -1 and if so, sets the field "underlyingInputStreamNotFinished" to false.
I may have missed some minor points but I'm sure you get the picture.
Then in the using code you do something like, if you are using xstream:
DocumentSplittingInputStream dsis = new DocumentSplittingInputStream(underlyingInputStream);
while (dsis.underlyingInputStreamNotFinished()) {
MyObject mo = xstream.fromXML(dsis);
mo.doSomething(); // or something.doSomething(mo);
}
David
I had to do something like this, and during my research on how to approach it I found this thread. Even though it is quite old, I just replied (to myself) here, wrapping everything in its own Reader for simpler use.
I was faced with a similar problem. A web service I'm consuming will (in some cases) return multiple xml documents in response to a single HTTP GET request. I could read the entire response into a String and split it, but instead I implemented a splitting input stream based on user467257's post above. Here is the code:
public class AnotherSplittingInputStream extends InputStream {
private final InputStream realStream;
private final byte[] closeTag;
private int matchCount;
private boolean realStreamFinished;
private boolean reachedCloseTag;
public AnotherSplittingInputStream(InputStream realStream, String closeTag) {
this.realStream = realStream;
this.closeTag = closeTag.getBytes();
}
@Override
public int read() throws IOException {
if (reachedCloseTag) {
return -1;
}
if (matchCount == closeTag.length) {
matchCount = 0;
reachedCloseTag = true;
return -1;
}
int ch = realStream.read();
if (ch == -1) {
realStreamFinished = true;
}
else if (ch == closeTag[matchCount]) {
matchCount++;
} else {
matchCount = 0;
}
return ch;
}
public boolean hasMoreData() {
if (realStreamFinished == true) {
return false;
} else {
reachedCloseTag = false;
return true;
}
}
}
And to use it:
String xml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<root>first root</root>" +
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<root>second root</root>";
ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes());
AnotherSplittingInputStream splitter = new AnotherSplittingInputStream(is, "</root>");
BufferedReader reader = new BufferedReader(new InputStreamReader(splitter));
while (splitter.hasMoreData()) {
System.out.println("Starting next stream");
String line = null;
while ((line = reader.readLine()) != null) {
System.out.println("line ["+line+"]");
}
}
I use a JAXB approach to unmarshal messages from a stream that contains multiple documents:
MultiInputStream.java
public class MultiInputStream extends InputStream {
private final Reader source;
private final StringReader startRoot = new StringReader("<root>");
private final StringReader endRoot = new StringReader("</root>");
public MultiInputStream(Reader source) {
this.source = source;
}
@Override
public int read() throws IOException {
int count = startRoot.read();
if (count == -1) {
count = source.read();
}
if (count == -1) {
count = endRoot.read();
}
return count;
}
}
MultiEventReader.java
public class MultiEventReader implements XMLEventReader {
private final XMLEventReader reader;
private boolean isXMLEvent = false;
private int level = 0;
public MultiEventReader(XMLEventReader reader) throws XMLStreamException {
this.reader = reader;
startXML();
}
private void startXML() throws XMLStreamException {
while (reader.hasNext()) {
XMLEvent event = reader.nextEvent();
if (event.isStartElement()) {
return;
}
}
}
public boolean hasNextXML() {
return reader.hasNext();
}
public void nextXML() throws XMLStreamException {
while (reader.hasNext()) {
XMLEvent event = reader.peek();
if (event.isStartElement()) {
isXMLEvent = true;
return;
}
reader.nextEvent();
}
}
@Override
public XMLEvent nextEvent() throws XMLStreamException {
XMLEvent event = reader.nextEvent();
if (event.isStartElement()) {
level++;
}
if (event.isEndElement()) {
level--;
if (level == 0) {
isXMLEvent = false;
}
}
return event;
}
@Override
public boolean hasNext() {
return isXMLEvent;
}
@Override
public XMLEvent peek() throws XMLStreamException {
XMLEvent event = reader.peek();
if (level == 0) {
while (event != null && !event.isStartElement() && reader.hasNext()) {
reader.nextEvent();
event = reader.peek();
}
}
return event;
}
// The remaining XMLEventReader methods are not needed for unmarshalling here.
@Override
public String getElementText() throws XMLStreamException {
throw new UnsupportedOperationException();
}
@Override
public XMLEvent nextTag() throws XMLStreamException {
throw new UnsupportedOperationException();
}
@Override
public Object getProperty(String name) throws IllegalArgumentException {
throw new UnsupportedOperationException();
}
@Override
public void close() throws XMLStreamException {
throw new UnsupportedOperationException();
}
@Override
public Object next() {
throw new UnsupportedOperationException();
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
}
Message.java
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "Message")
public class Message {
public Message() {
}
@XmlAttribute(name = "ID", required = true)
protected long id;
public long getId() {
return id;
}
public void setId(long id) {
this.id = id;
}
@Override
public String toString() {
return "Message{id=" + id + '}';
}
}
Read multiple messages:
public static void main(String[] args) throws Exception{
StringReader stringReader = new StringReader(
"<Message ID=\"123\" />\n" +
"<Message ID=\"321\" />"
);
JAXBContext context = JAXBContext.newInstance(Message.class);
Unmarshaller unmarshaller = context.createUnmarshaller();
XMLInputFactory inputFactory = XMLInputFactory.newFactory();
MultiInputStream multiInputStream = new MultiInputStream(stringReader);
XMLEventReader xmlEventReader = inputFactory.createXMLEventReader(multiInputStream);
MultiEventReader multiEventReader = new MultiEventReader(xmlEventReader);
while (multiEventReader.hasNextXML()) {
Object message = unmarshaller.unmarshal(multiEventReader);
System.out.println(message);
multiEventReader.nextXML();
}
}
results:
Message{id=123}
Message{id=321}
