Java file encoding magic

Strange thing happened in Java Kingdom...
Long story short: I use Java API V3 to connect to QuickBooks and fetch the data from there (services, for example).
Everything goes fine except when a service contains Russian (or probably any non-Latin) symbols.
Here is the Java code that does it (I know it's far from perfect):
package com.mde.test;

import static com.intuit.ipp.query.GenerateQuery.$;
import static com.intuit.ipp.query.GenerateQuery.select;

import java.util.LinkedList;
import java.util.List;

import com.intuit.ipp.core.Context;
import com.intuit.ipp.core.ServiceType;
import com.intuit.ipp.data.Item;
import com.intuit.ipp.exception.FMSException;
import com.intuit.ipp.query.GenerateQuery;
import com.intuit.ipp.security.OAuthAuthorizer;
import com.intuit.ipp.services.DataService;
import com.intuit.ipp.util.Config;

public class TestEncoding {

    public static final String QBO_BASE_URL_SANDBOX = "https://sandbox-quickbooks.api.intuit.com/v3/company";

    private static String consumerKey = "consumerkeycode";
    private static String consumerSecret = "consumersecretcode";
    private static String accessToken = "accesstokencode";
    private static String accessTokenSecret = "accesstokensecretcode";
    private static String appToken = "apptokencode";
    private static String companyId = "companyidcode";

    private static OAuthAuthorizer oauth =
            new OAuthAuthorizer(consumerKey, consumerSecret, accessToken, accessTokenSecret);

    private static final int PAGING_STEP = 500;

    public static void main(String[] args) throws FMSException {
        List<Item> res = findAllServices(getDataService());
        System.out.println(res.get(1).getName());
    }

    public static List<Item> findAllServices(DataService service) throws FMSException {
        Item item = GenerateQuery.createQueryEntity(Item.class);
        List<Item> res = new LinkedList<>();
        // Page through all items, PAGING_STEP at a time, until a page comes back empty
        for (int skip = 0; ; skip += PAGING_STEP) {
            String query = select($(item)).skip(skip).take(PAGING_STEP).generate();
            List<Item> items = (List<Item>) service.executeQuery(query).getEntities();
            if (items.size() > 0) {
                res.addAll(items);
            } else {
                break;
            }
        }
        System.out.println("All services fetched");
        return res;
    }

    public static DataService getDataService() throws FMSException {
        Context context = getContext();
        if (context == null) {
            System.out.println("Context is null; something is wrong, dataService will also be null.");
            return null;
        }
        return getDataService(context);
    }

    private static Context getContext() {
        try {
            return new Context(oauth, appToken, ServiceType.QBO, companyId);
        } catch (FMSException e) {
            System.out.println("Context is not loaded");
            return null;
        }
    }

    protected static DataService getDataService(Context context) throws FMSException {
        // Point the SDK at the sandbox before creating the service
        Config.setProperty(Config.BASE_URL_QBO, QBO_BASE_URL_SANDBOX);
        return new DataService(context);
    }
}
This file is saved in UTF-8, and it prints something like:
All services fetched
РЎСЌСЂРІС‹СЃ, РѕС‚РЅСЋРґСЊ
But! When I save this file in UTF-8 with BOM... I get the correct data!
All services fetched
Сэрвыс, отнюдь
Can anybody explain what is happening? :)
(I use Eclipse to run the code.)

You are fetching data from a system that doesn't share the same byte ordering as you, so when you save the file with a BOM, it adds enough information to the file that programs reading it later will use the remote system's byte ordering.
When you save it without a BOM, the file is written in the remote system's byte ordering without any indication of the stored byte order, so you read it back with the local system's (different) byte order. This jumbles the bytes within the multi-byte characters, making the output appear as nonsense.
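Whatever the exact mechanism on your machine, this is what decoding bytes with the wrong charset looks like in Java. A minimal sketch (using windows-1251 as the "wrong" charset is an assumption on my part; any mismatched encoding shows a similar effect):
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String original = "Сэрвыс, отнюдь";
        // Encode the text as UTF-8 bytes...
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        // ...then decode those same bytes with a different charset,
        // which is what happens when a reader guesses the encoding wrong
        String garbled = new String(utf8Bytes, Charset.forName("windows-1251"));
        System.out.println(garbled); // mojibake, not the original text
        // Reversing the mistake recovers the original:
        String recovered = new String(
                garbled.getBytes(Charset.forName("windows-1251")),
                StandardCharsets.UTF_8);
        System.out.println(recovered); // Сэрвыс, отнюдь
    }
}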

Related

spring batch file writer to write directly to amazon s3 storage without PutObjectRequest

I'm trying to upload a file to Amazon S3. Instead of writing a local file first, I want to read the data from a database using Spring Batch and write it directly into S3 storage. Is there any way we can do that?
Spring Cloud AWS adds support for the Amazon S3 service to load and write resources with the resource loader and the s3 protocol. Once you have configured the AWS resource loader, you can write a custom Spring Batch writer like:
import java.io.OutputStream;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.core.io.ResourceLoader;
import org.springframework.core.io.WritableResource;

public class AwsS3ItemWriter implements ItemWriter<String> {

    private ResourceLoader resourceLoader;
    private WritableResource resource;

    public AwsS3ItemWriter(ResourceLoader resourceLoader, String resource) {
        this.resourceLoader = resourceLoader;
        this.resource = (WritableResource) this.resourceLoader.getResource(resource);
    }

    @Override
    public void write(List<? extends String> items) throws Exception {
        try (OutputStream outputStream = resource.getOutputStream()) {
            for (String item : items) {
                outputStream.write(item.getBytes());
            }
        }
    }
}
Then you should be able to use this writer with an S3 resource like s3://myBucket/myFile.log.
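For example, the wiring might look like this (a sketch only, not tested; it assumes Spring Cloud AWS's resource loader support is on the classpath so that s3:// locations resolve, and the bucket/file names are placeholders):
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ResourceLoader;

@Configuration
public class S3WriterConfig {

    @Autowired
    private ResourceLoader resourceLoader;

    @Bean
    public AwsS3ItemWriter itemWriter() {
        // The s3:// location is resolved by the AWS resource loader
        return new AwsS3ItemWriter(resourceLoader, "s3://myBucket/myFile.log");
    }
}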
Is there any way we can do that?
Please note that I did not compile or test the code above; I just wanted to give you a starting point for how to do it.
Hope this helps.
The problem is that the OutputStream will only contain the last list of items sent by the step...
I think you might need to write a temporary file on the file system and then send the whole file in a separate tasklet.
See this example:
https://github.com/TerrenceMiao/AWS/blob/master/dynamodb-java/src/main/java/org/paradise/microservice/userpreference/service/writer/CSVFileWriter.java
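A tasklet along those lines could look like this (a sketch; the bucket, key, and file are placeholders, and it assumes the AWS SDK v1 AmazonS3 client):
import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class S3UploadTasklet implements Tasklet {

    private final AmazonS3 amazonS3;
    private final String bucketName;
    private final String key;
    private final File file;

    public S3UploadTasklet(AmazonS3 amazonS3, String bucketName, String key, File file) {
        this.amazonS3 = amazonS3;
        this.bucketName = bucketName;
        this.key = key;
        this.file = file;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        // Upload the temporary file written by the previous step in one call
        amazonS3.putObject(bucketName, key, file);
        return RepeatStatus.FINISHED;
    }
}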
I had the same thing to do. Because Spring has no class that writes to a plain stream, I made one myself, like the example above.
You need two classes for this. A resource class which implements WritableResource and extends AbstractResource:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.springframework.core.io.AbstractResource;
import org.springframework.core.io.WritableResource;

public class S3Resource extends AbstractResource implements WritableResource {

    // Everything written to the resource is buffered in memory
    ByteArrayOutputStream resource = new ByteArrayOutputStream();

    @Override
    public String getDescription() {
        return null;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new ByteArrayInputStream(resource.toByteArray());
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        return resource;
    }
}
And your writer, which implements ItemWriter:
import java.io.OutputStream;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.transform.LineAggregator;
import org.springframework.core.io.WritableResource;

public class AmazonStreamWriter<T> implements ItemWriter<T> {

    private WritableResource resource;
    private LineAggregator<T> lineAggregator;
    private String lineSeparator;

    AmazonStreamWriter(WritableResource resource) {
        this.resource = resource;
    }

    public String getLineSeparator() {
        return lineSeparator;
    }

    public void setLineSeparator(String lineSeparator) {
        this.lineSeparator = lineSeparator;
    }

    public WritableResource getResource() {
        return resource;
    }

    public void setResource(WritableResource resource) {
        this.resource = resource;
    }

    public LineAggregator<T> getLineAggregator() {
        return lineAggregator;
    }

    public void setLineAggregator(LineAggregator<T> lineAggregator) {
        this.lineAggregator = lineAggregator;
    }

    @Override
    public void write(List<? extends T> items) throws Exception {
        try (OutputStream outputStream = resource.getOutputStream()) {
            // Aggregate every item into one buffer, then write it in a single call
            StringBuilder lines = new StringBuilder();
            for (T item : items) {
                lines.append(lineAggregator.aggregate(item)).append(lineSeparator);
            }
            outputStream.write(lines.toString().getBytes());
        }
    }
}
With this setup you write the item information you receive from your database to your custom resource via an OutputStream. The filled resource can then be used in one of your steps to open an InputStream and upload to S3 via the client.
I did it with: amazonS3.putObject(awsBucketName, awsBucketKey, resource.getInputStream(), new ObjectMetadata());
My solution may not be the perfect approach, but from here on you can optimize it.

SQL4306N Java stored procedure or user-defined function could not call Java method

We just upgraded from DB2 9.5 to 10.5; this process was working fine until the upgrade was performed on the server. When I run the jar file from a Linux server, I get the following error; however, when I run the exact same code from Eclipse on my Windows computer, it works just fine! I also get a similar error if I call this SP from DB2 Control Center. What is causing this, and how can I fix this error?
SQL4306N Java stored procedure or user-defined function "ESADBM.GETNEXTID",
specific name "WHDBRMM_UTILS" could not call Java method "GetNextID",
signature "(Ljava/lang/String;[I)V". SQLSTATE=42724
Explanation:
The Java method given by the EXTERNAL NAME clause of a CREATE PROCEDURE
or CREATE FUNCTION statement could not be found. Its declared argument
list may not match what the database expects, or it may not be a
"public" instance method.
User response:
Ensure that a Java instance method exists with the "public" flag and the
expected argument list for this call.
sqlcode: -4306
sqlstate: 42724.
Here is the code:
package pkgUtil_v4_0_0_0;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.hibernate.exception.JDBCConnectionException;

public class DBSequence {

    public static final String SEQ_CONTACTID = "ContactIDSeq";
    public static final String SEQ_PROJECTID = "ProjectIDSeq";
    public static final String SEQ_LOCATIONID = "LocationIDSeq";
    public static final String SEQ_SOURCEID = "SourceIDSeq";
    public static final String SEQ_SURVEYID = "SurveyIDSeq";
    public static final String SEQ_LOGICALSURVEYID = "WageAreaIDSeq";
    public static final String SEQ_WAGEDETAILID = "WageDetailIDSeq";
    public static final String SEQ_ORGID = "OrgIDSeq";
    public static final String SEQ_OFFICEID = "RegionNumberSeq";
    public static final String SEQ_LETTERID = "LetterIDSeq";
    public static final String SEQ_DODGEID = "DodgeIDSeq";
    public static final String SEQ_CRAFTID = "CraftIDSeq";
    public static final String SEQ_CRAFTTITLEID = "CraftTitleIDSeq";
    public static final String SEQ_ANALYSTID = "AnalystIDSeq";
    public static final String SEQ_LETTERTEMPLATEID = "LetterTemplateIDSeq";
    public static final String SEQ_RECRATESID = "RecRatesIDSeq";
    public static final String SEQ_BRIDGESCDID = "BridgeSCDIDSeq";

    public static String drvr = "";
    public static Connection con = null;

    // utility function
    public static int getNextId(Connection lcon, String sequence) throws SQLException {
        PreparedStatement stmt = null;
        int id = 0;
        String sql = "select next value for esadbm." + sequence + " from SYSIBM.sysdummy1";
        // System.out.println("String = " + sequence);
        stmt = lcon.prepareStatement(sql);
        ResultSet resultSet = stmt.executeQuery();
        if (resultSet.next()) {
            id = resultSet.getInt(1);
        }
        resultSet.close();
        stmt.close();
        return id;
    }

    // Stored Procedure Entry Point
    public static void getNextId(String sequence, int[] seq) throws SQLException, Exception {
        System.out.println("String = " + sequence);
        System.out.println("Array = " + seq);
        if (drvr.length() == 0) {
            drvr = "jdbc:default:connection";
            con = DriverManager.getConnection(drvr);
        }
        drvr = "";
        seq[0] = getNextId(con, sequence);
        con.close();
    }

    // test procedure
    public static void main(String args[]) throws SQLException, Exception {
        try {
            System.out.println("Connecting to DB " + args[0]);
            Class.forName("com.ibm.db2.jcc.DB2Driver");
            drvr = "jdbc:db2:" + args[0];
            // System.out.println(drvr + args[1] + args[2]);
            con = DriverManager.getConnection("jdbc:db2:" + args[0], args[1], args[2]);
            // System.out.println(con);
            System.out.println("DB Connection Successful");
            con = DriverManager.getConnection(drvr, args[1], args[2]);
            Statement st = con.createStatement();
            String query = "set schema = 'ESADBM'";
            st.execute(query);
            System.out.println("Getting ID");
            int id = getNextId(con, SEQ_SOURCEID);
            System.out.println("Returned : " + Integer.toString(id));
        } catch (ClassNotFoundException cnfe) {
            cnfe.printStackTrace();
        } catch (SQLException sqle) {
            sqle.printStackTrace();
        } catch (JDBCConnectionException e) {
            System.out.println("Unable to connect to database");
            e.printStackTrace();
        }
    }
}
Here is the stored procedure:
CREATE PROCEDURE "ESADBM "."GETNEXTID"
(
IN SEQUENCE CHARACTER(40),
OUT ID INTEGER
)
DYNAMIC RESULT SETS 0
SPECIFIC WHDBRA_UTILS
EXTERNAL NAME 'pkgUtil_v4_0_0_0.DBSequence!getNextId()'
LANGUAGE JAVA
PARAMETER STYLE JAVA
NOT DETERMINISTIC
FENCED THREADSAFE
MODIFIES SQL DATA
NO DBINFO;
Libraries for external routines, including Java classes and JAR files for Java routines, must be present in a certain location in the DB2 instance directory. When you upgrade your DB2 version, a new instance is created, but those libraries are not copied automatically (which, by the way, makes sense as there is a good chance that they need to be rebuilt).
The error message indicates that the instance cannot find the Java class file that implements GETNEXTID -- that would be DBSequence.class. The class needs to be copied to the sqllib/function directory in the DB2 10.5 instance home on the database server, as explained in the manual. You will probably also need to create pkgUtil_v4_0_0_0 under sqllib/function to preserve the package structure. Make sure you compile the Java source with the same JDK version as the one the DB2 instance uses to run it.
Once you do that, execute CALL SQLJ.REFRESH_CLASSES() in the DB2 client of your choice to make sure DB2 reloads the most current version. After that your stored procedure should work correctly.
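To verify from a JDBC client afterwards, a quick test call might look like this (a sketch; the connection details are placeholders):
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class CallGetNextId {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:db2:MYDB", "user", "password");
             CallableStatement cs = con.prepareCall("CALL ESADBM.GETNEXTID(?, ?)")) {
            cs.setString(1, "SourceIDSeq");
            cs.registerOutParameter(2, Types.INTEGER);
            cs.execute();
            System.out.println("Next ID: " + cs.getInt(2));
        }
    }
}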
Having said that, I don't really understand why you use such a convoluted way of retrieving a SQL sequence value.

Detect file type based on content

Tried the following:
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.tika.Tika;
import org.apache.tika.mime.MimeTypes;

/**
 * @author kiriti.k
 */
public class TikaFileTypeDetector {

    private final Tika tika = new Tika();

    public TikaFileTypeDetector() {
        super();
    }

    public String probeContentType(Path path) throws IOException {
        // Try to detect based on the file name only for efficiency
        String fileNameDetect = tika.detect(path.toString());
        if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileNameDetect;
        }
        // Then check the file content if necessary
        String fileContentDetect = tika.detect(path.toFile());
        if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileContentDetect;
        }
        // Specification says to return null if we could not
        // conclusively determine the file type
        return null;
    }

    public static void main(String[] args) throws IOException {
        // expects file path as the program argument
        if (args.length != 1) {
            printUsage();
            return;
        }
        Path path = Paths.get(args[0]);
        TikaFileTypeDetector detector = new TikaFileTypeDetector();
        // Analyse the file - first based on file name for efficiency.
        // If it cannot be determined by name, then analyse the content.
        String contentType = detector.probeContentType(path);
        System.out.println("File is of type - " + contentType);
    }

    public static void printUsage() {
        System.out.print("Usage: java -classpath ... "
                + TikaFileTypeDetector.class.getName()
                + " ");
    }
}
The above program checks based on the file extension only. How do I make it check the content (MIME type) as well and then determine the type? I am using tika-app-1.8.jar in NetBeans 8.0.2. What am I missing?
The code checks the file extension first and returns the MIME type based on that, if it finds a result. If you want it to check the content first, just switch the two statements:
public String probeContentType(Path path) throws IOException {
    // Check contents first
    String fileContentDetect = tika.detect(path.toFile());
    if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
        return fileContentDetect;
    }
    // Try file name only if content search was not successful
    String fileNameDetect = tika.detect(path.toString());
    if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
        return fileNameDetect;
    }
    // Specification says to return null if we could not
    // conclusively determine the file type
    return null;
}
Be aware that this may have a huge performance impact.
You can use Files.probeContentType(path).
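A minimal sketch of that JDK built-in (note that the result is implementation-dependent and, on many platforms, is based on the file name alone):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ProbeDemo {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]);
        // May return null if the type cannot be determined
        System.out.println(Files.probeContentType(path));
    }
}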

Mapping a URL to the most specific context root

I have the following situation:
I want to map incoming requests (I use a servlet filter to access them) to the appropriate applications. For this, I have a table where I map the applications to their context roots, e.g.:
/application1/ | Application1 Rootcontext
/application1/subcontext1 | Application1 Subcontext 1
/application1/subcontext2 | Application1 Subcontext 2
/application2/ | Application2
So when I have a request with the path /application1/subcontext1/someotherpath, I want to get Application1 Subcontext 1; when I have a request URL /application1/somepath, I want to get Application1 Rootcontext.
My first guess was to build some sort of tree from my context-root mappings (every part of the URL as a node), then split up the request URL and walk down the tree to find the most specific application mapping.
Would that be the best solution, or do you have any other suggestions for my problem?
Instead of a tree and walking forwards, you can keep your mappings in a Map<String, ApplicationContext> and walk backwards until you find the first non-null fit. This code should give you a rough idea of how to do it:
import java.util.HashMap;
import java.util.Map;

public class Main {

    public static final class ApplicationContext {
        private final String app;
        private final String ctx;

        public ApplicationContext(final String app, final String ctx) {
            this.app = app;
            this.ctx = ctx;
        }

        @Override
        public String toString() {
            return "ApplicationContext[" + app + "/" + ctx + "]";
        }
    }

    private static ApplicationContext ac(final String app, final String ctx) {
        return new ApplicationContext(app, ctx);
    }

    private static ApplicationContext getApplicationContext(final String url,
            final Map<String, ApplicationContext> urlMap) {
        // Drop one path segment at a time until we hit a mapping or run out of URL
        String specificUrl = url;
        ApplicationContext result = null;
        while (specificUrl != null && result == null) {
            result = urlMap.get(specificUrl);
            specificUrl = shortenUrl(specificUrl);
        }
        return result;
    }

    public static void main(final String[] args) throws Exception {
        final Map<String, ApplicationContext> urlMap = new HashMap<String, ApplicationContext>();
        urlMap.put("/application1", ac("Application1", "Root"));
        urlMap.put("/application1/subcontext1", ac("Application1", "SubContext1"));
        urlMap.put("/application1/subcontext2", ac("Application1", "SubContext2"));
        urlMap.put("/application1/subcontext2/subcontext3", ac("Application1", "SubContext3"));
        urlMap.put("/application2", ac("Application2", null));

        System.out.println(getApplicationContext("/application1/", urlMap));
        System.out.println(getApplicationContext("/application1/abc", urlMap));
        System.out.println(getApplicationContext("/application1/subcontext2/abc", urlMap));
    }

    private static String shortenUrl(final String url) {
        final int index = url.lastIndexOf('/');
        if (index > 0) {
            return url.substring(0, index);
        } else {
            return null;
        }
    }
}
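For reference, the main method above prints:
ApplicationContext[Application1/Root]
ApplicationContext[Application1/Root]
ApplicationContext[Application1/SubContext2]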
And a fiddle for it.

Simple K-Means doesn't handle iris.arff

I have the class below; I built it following the examples given on the wiki and in a thesis. Why can't SimpleKMeans handle the data? The class can print the DataSource dados, so there is nothing wrong with processing the file; the error is in the build.
package slcct;

import weka.clusterers.ClusterEvaluation;
import weka.clusterers.SimpleKMeans;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class Cluster {

    public String path;
    public Instances dados;
    public String[] options = new String[2];

    public Cluster(String caminho, int nclusters, int seed) {
        this.path = caminho;
        this.options[0] = String.valueOf(nclusters);
        this.options[1] = String.valueOf(seed);
    }

    public void ledados() throws Exception {
        DataSource source = new DataSource(path);
        dados = source.getDataSet();
        System.out.println(dados);
        if (dados.classIndex() == -1) {
            dados.setClassIndex(dados.numAttributes() - 1);
        }
    }

    public void imprimedados() {
        for (int i = 0; i < dados.numInstances(); i++) {
            Instance actual = dados.instance(i);
            System.out.println((i + 1) + " : " + actual);
        }
    }

    public void clustering() throws Exception {
        SimpleKMeans cluster = new SimpleKMeans();
        cluster.setOptions(options);
        cluster.setDisplayStdDevs(true);
        cluster.getMaxIterations();
        cluster.buildClusterer(dados);

        Instances ClusterCenter = cluster.getClusterCentroids();
        Instances SDev = cluster.getClusterStandardDevs();
        int[] ClusterSize = cluster.getClusterSizes();

        ClusterEvaluation eval = new ClusterEvaluation();
        eval.setClusterer(cluster);
        eval.evaluateClusterer(dados);

        for (int i = 0; i < ClusterCenter.numInstances(); i++) {
            System.out.println("Cluster#" + (i + 1) + ": " + ClusterSize[i] + " dados.");
            System.out.println("Centróide: " + ClusterCenter.instance(i));
            System.out.println("STDDEV: " + SDev.instance(i));
            System.out.println("Cluster Evaluation: " + eval.clusterResultsToString());
        }
    }
}
The error:
weka.core.WekaException: weka.clusterers.SimpleKMeans: Cannot handle any class attribute!
at weka.core.Capabilities.test(Capabilities.java:1097)
at weka.core.Capabilities.test(Capabilities.java:1018)
at weka.core.Capabilities.testWithFail(Capabilities.java:1297)
at weka.clusterers.SimpleKMeans.buildClusterer(SimpleKMeans.java:228)
at slcct.Cluster.clustering(Cluster.java:53)//Here.
at slcct.Clustering.jButton1ActionPerformed(Clustering.java:104)
I believe you need not set the class index, as you are doing clustering and not classification. Try following this guide for programmatic Java clustering.
In your "ledados()" function just remove the code block given below. It will work. Because you have no defined class in your data.
if(dados.classIndex()==-1){
dados.setClassIndex(dados.numAttributes()-1);
}
Your new function:
public void ledados() throws Exception {
    DataSource source = new DataSource(path);
    dados = source.getDataSet();
    System.out.println(dados);
}
You do not need a class attribute in the data when doing k-means clustering.
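If your ARFF does contain a class attribute that you want to keep for later use, an alternative (a sketch, assuming the standard Weka 3 filter API) is to strip it from a copy of the data before building the clusterer:
import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class ClusterHelper {

    // Build the clusterer on a copy of the data with the class attribute removed
    public static SimpleKMeans buildWithoutClass(Instances dados) throws Exception {
        Remove remove = new Remove();
        // Weka's string-based attribute indices are 1-based
        remove.setAttributeIndices(String.valueOf(dados.classIndex() + 1));
        remove.setInputFormat(dados);
        Instances semClasse = Filter.useFilter(dados, remove);

        SimpleKMeans cluster = new SimpleKMeans();
        cluster.buildClusterer(semClasse);
        return cluster;
    }
}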
