Directly convert a CSV file to a JSON file using the Jackson library - Java

I am using the following code:
CsvSchema bootstrap = CsvSchema.emptySchema().withHeader();
ObjectMapper mapper = new CsvMapper();
File csvFile = new File("input.csv"); // or from String, URL etc
Object user = mapper.reader(?).withSchema(bootstrap).readValue(new File("data.csv"));
mapper.writeValue(new File("data.json"), user);
My IDE reports an error, cannot find symbol: method withSchema(CsvSchema), but why? I took the code from some examples.
I don't know what to write into mapper.reader() as I want to convert any CSV file.
How can I convert any CSV file to JSON and save it to the disk?
What should I do next?

I think you should use MappingIterator to solve your problem. See the example below:
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
public class JacksonProgram {

    public static void main(String[] args) throws Exception {
        File input = new File("/x/data.csv");
        File output = new File("/x/data.json");

        List<Map<?, ?>> data = readObjectsFromCsv(input);
        writeAsJson(data, output);
    }

    public static List<Map<?, ?>> readObjectsFromCsv(File file) throws IOException {
        CsvSchema bootstrap = CsvSchema.emptySchema().withHeader();
        CsvMapper csvMapper = new CsvMapper();
        try (MappingIterator<Map<?, ?>> mappingIterator = csvMapper.readerFor(Map.class).with(bootstrap).readValues(file)) {
            return mappingIterator.readAll();
        }
    }

    public static void writeAsJson(List<Map<?, ?>> data, File file) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        mapper.writeValue(file, data);
    }
}
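For illustration (a hypothetical sample, not part of the original answer): given an input.csv such as

firstName,lastName,age
John,Doe,21

the code above writes a JSON array with one object per CSV row, and because the untyped schema carries no type information every value comes out as a string:

[{"firstName":"John","lastName":"Doe","age":"21"}]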
See this page: jackson-dataformat-csv for more information and examples.

Related

The method readerFor(Class) is undefined for the type CsvMapper

The source code:
import java.io.File;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
public class ConvertCSVtoJson {

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        File input = new File("C:\\Users\\prshanka\\Documents\\Report.csv");
        CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
        CsvMapper csvMapper = new CsvMapper();

        // Read data from CSV file
        List<Object> readAll = (csvMapper).readerFor(Map.class).with(csvSchema).readValues(input).readAll();

        ObjectMapper mapper = new ObjectMapper();
        mapper.enable(SerializationFeature.INDENT_OUTPUT);

        // Write each row as pretty-printed JSON to its own output file
        for (Object row : readAll) {
            Map<String, String> map = (Map<String, String>) row;
            String fileName = map.get("fileName");
            File output = new File("C://Users//prshanka//Documents//Target" + fileName + ".txt");
            mapper.writerWithDefaultPrettyPrinter().writeValue(output, row);
        }
    }
}
The line in which I am getting an error:
List<Object> readAll = (csvMapper).readerFor(Map.class).with(csvSchema).readValues(input).readAll();
The error log:
Error- The method readerFor(Class<Map>) is undefined for the type CsvMapper.
I have added all the dependencies to the build path, but it still does not work.
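No answer is shown for this question here, but one common cause (an assumption on my part, not stated in the post) is an older jackson-databind jar on the build path: readerFor(Class) was only added to ObjectMapper, which CsvMapper extends, in Jackson 2.5, so against an older databind version the method really is undefined. A quick way to see which version the project actually resolves is to run this one line, for example from a scratch class:

// Diagnostic sketch (my suggestion, not from the original post): prints the
// jackson-databind version on the classpath, e.g. "2.9.8". If it is older than 2.5,
// readerFor(Class) does not exist yet; aligning jackson-databind and
// jackson-dataformat-csv on the same 2.5+ version should fix the error.
System.out.println(com.fasterxml.jackson.databind.cfg.PackageVersion.VERSION);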

Is there any way to resolve a Java heap memory error when converting a large CSV file to JSON?

I'm using the code below to convert a CSV file to JSON, but I am getting a Java heap memory error.
Can anyone help me rewrite it using the Jackson streaming API?
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
public class JacksonProgram {

    public static void main(String[] args) throws Exception {
        File input = new File("C:\\Users\\shivamurthym\\Downloads\\FL_insurance_sample\\FL_insurance_sample.csv");
        File output = new File("C:\\Users\\shivamurthym\\data.json");

        List<Map<?, ?>> data = readObjectsFromCsv(input);
        writeAsJson(data, output);
    }

    public static List<Map<?, ?>> readObjectsFromCsv(File file) throws IOException {
        CsvSchema bootstrap = CsvSchema.emptySchema().withHeader();
        CsvMapper csvMapper = new CsvMapper();
        MappingIterator<Map<?, ?>> mappingIterator = csvMapper.reader(Map.class).with(bootstrap).readValues(file);
        return mappingIterator.readAll();
    }

    public static void writeAsJson(List<Map<?, ?>> data, File file) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        mapper.writeValue(file, data);
    }
}
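No answer is recorded here, but a common approach (a sketch of my own, not from the original thread) is to avoid readAll() entirely and stream instead: read one row at a time with MappingIterator and write each row out immediately with a SequenceWriter, so the whole CSV never has to fit in memory. Class and path names below are placeholders.

import java.io.File;
import java.util.Map;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SequenceWriter;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

public class StreamingCsvToJson {

    public static void main(String[] args) throws Exception {
        File input = new File("FL_insurance_sample.csv");  // placeholder paths
        File output = new File("data.json");

        CsvSchema bootstrap = CsvSchema.emptySchema().withHeader();
        CsvMapper csvMapper = new CsvMapper();
        ObjectMapper jsonMapper = new ObjectMapper();

        // readValues(...) yields rows lazily; writeValuesAsArray(...) opens a JSON array
        // and appends one value per write(), so only the current row is held in memory.
        try (MappingIterator<Map<?, ?>> rows = csvMapper.readerFor(Map.class).with(bootstrap).readValues(input);
             SequenceWriter writer = jsonMapper.writer().writeValuesAsArray(output)) {
            while (rows.hasNext()) {
                writer.write(rows.next());
            }
        }
    }
}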

Transforming JSON to XML without converting decimal to scientific notation

I am trying to convert JSON to XML in a middleware tool, using Jackson libraries for the transformation. The problem is that for decimal fields in the JSON (longer than 8 digits), the corresponding XML value is converted to scientific notation. For example, 8765431002.13 is converted to 8.76543100213E9.
I can convert the scientific notation back to plain decimal format if I know the name of the field, but in my case the middleware application will not know which fields arrive as decimals.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
public class JSONDataformat {

    public static void main(String[] args) {
        try {
            String jsonString = "{\"Field1\":18629920.68,\"Field3\":\"test\", \"Field2\":\"null\"}";

            ObjectMapper objectMapper = new ObjectMapper();
            ObjectMapper xmlMapper = new XmlMapper();

            JsonNode tree = objectMapper.readTree(jsonString);
            String jsonAsXml = xmlMapper.writer().writeValueAsString(tree);
            System.out.println(jsonAsXml);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output
<ObjectNode xmlns=""><Field1>1.862992068E7</Field1><Field3>test</Field3><Field2/></ObjectNode>
I expected the <Field1> value to be 18629920.68 in the above code.
You need to enable the USE_BIG_DECIMAL_FOR_FLOATS feature:
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS);
EDIT
import com.fasterxml.jackson.core.JsonGenerator.Feature;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.io.IOException;
public class Test {

    public static void main(String[] args) throws IOException {
        String jsonString = "{\"Field1\": 20121220.00,\"Field3\":\"test\", \"Field2\":\"null\"}";

        ObjectMapper jsonMapper = new ObjectMapper();
        jsonMapper.enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS);

        XmlMapper xmlMapper = new XmlMapper();

        JsonNode tree = jsonMapper.readTree(jsonString);
        String jsonAsXml = xmlMapper.writer().with(Feature.WRITE_BIGDECIMAL_AS_PLAIN).writeValueAsString(tree);
        System.out.println(jsonAsXml);
    }
}
The above code prints:
<ObjectNode><Field1>20121220</Field1><Field3>test</Field3><Field2>null</Field2></ObjectNode>

jackson-dataformat-csv: Mapping number value without POJO

I'm trying to parse a CSV file using jackson-dataformat-csv, and I want to map the numeric column to the Java Number type.
CsvSchema schema = CsvSchema.builder().setUseHeader(true)
        .addColumn("firstName", CsvSchema.ColumnType.STRING)
        .addColumn("lastName", CsvSchema.ColumnType.STRING)
        .addColumn("age", CsvSchema.ColumnType.NUMBER)
        .build();

CsvMapper csvMapper = new CsvMapper();
MappingIterator<Map<String, Object>> mappingIterator = csvMapper
        .readerFor(Map.class)
        .with(schema)
        .readValues(is);

while (mappingIterator.hasNext()) {
    Map<String, Object> entryMap = mappingIterator.next();
    Number age = (Number) entryMap.get("age");
}
I'm expecting entryMap.get("age") to be a Number, but I get a String instead.
My CSV file:
firstName,lastName,age
John,Doe,21
Error,Name,-10
I know that CsvSchema works fine with POJOs, but I need to process arbitrary CSV schemas, so I can't create a new Java class for every case.
Is there any way to parse CSV into a typed Map or array?
Right now it is not possible to configure Map deserialisation using CsvSchema. The process uses com.fasterxml.jackson.databind.deser.std.MapDeserializer, which currently does not check the schema. We could write a custom Map deserialiser. There is an issue on GitHub, CsvMapper does not respect CsvSchema.ColumnType when using @JsonAnySetter, where cowtowncoder answered:
At this point schema type is not used much for anything, but I agree
it should.
EDIT
I decided to take a closer look at what we can do with the fact that com.fasterxml.jackson.databind.deser.std.MapDeserializer is used behind the scenes. A custom Map deserialiser that takes care of types would be tricky to implement and register, but we can use our knowledge of ValueInstantiator. Let's define a new Map type which knows what to do with the ColumnType info:
class CsvMap extends HashMap<String, Object> {

    private final CsvSchema schema;
    private final NumberFormat numberFormat = NumberFormat.getInstance();

    public CsvMap(CsvSchema schema) {
        this.schema = schema;
    }

    @Override
    public Object put(String key, Object value) {
        value = convertIfNeeded(key, value);
        return super.put(key, value);
    }

    private Object convertIfNeeded(String key, Object value) {
        CsvSchema.Column column = schema.column(key);
        if (column.getType() == CsvSchema.ColumnType.NUMBER) {
            try {
                return numberFormat.parse(value.toString());
            } catch (ParseException e) {
                // leave it as it is
            }
        }
        return value;
    }
}
Since the new type has no no-arg constructor, we have to create a ValueInstantiator for it:
class CsvMapInstantiator extends ValueInstantiator.Base {

    private final CsvSchema schema;

    public CsvMapInstantiator(CsvSchema schema) {
        super(CsvMap.class);
        this.schema = schema;
    }

    @Override
    public Object createUsingDefault(DeserializationContext ctxt) {
        return new CsvMap(schema);
    }

    @Override
    public boolean canCreateUsingDefault() {
        return true;
    }
}
Example usage:
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.deser.ValueInstantiator;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.io.File;
import java.io.IOException;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.HashMap;
public class CsvApp {

    public static void main(String[] args) throws IOException {
        File csvFile = new File("./resource/test.csv").getAbsoluteFile();

        CsvSchema schema = CsvSchema.builder()
                .addColumn("firstName", CsvSchema.ColumnType.STRING)
                .addColumn("lastName", CsvSchema.ColumnType.STRING)
                .addColumn("age", CsvSchema.ColumnType.NUMBER)
                .build().withHeader();

        // Create schema-aware map module
        SimpleModule csvMapModule = new SimpleModule();
        csvMapModule.addValueInstantiator(CsvMap.class, new CsvMapInstantiator(schema));

        // Register the module
        CsvMapper csvMapper = new CsvMapper();
        csvMapper.registerModule(csvMapModule);

        // Get reader for CsvMap + schema
        ObjectReader objectReaderWithSchema = csvMapper
                .readerWithSchemaFor(CsvMap.class)
                .with(schema);

        MappingIterator<CsvMap> mappingIterator = objectReaderWithSchema.readValues(csvFile);
        while (mappingIterator.hasNext()) {
            CsvMap entryMap = mappingIterator.next();
            Number age = (Number) entryMap.get("age");
            System.out.println(age + " (" + age.getClass() + ")");
        }
    }
}
For the CSV payload below, the above code
firstName,lastName,age
John,Doe,21
Error,Name,-10.1
prints:
21 (class java.lang.Long)
-10.1 (class java.lang.Double)
It looks like a hack but I wanted to show this possibility.
You can use univocity-parsers for this sort of thing. It's faster and way more flexible:
CsvParserSettings settings = new CsvParserSettings(); // configure the parser if needed
CsvParser parser = new CsvParser(settings);

for (Record record : parser.iterateRecords(is)) {
    Short age = record.getShort("age");
}
To get a typed map, tell the parser what is the type of the columns you are working with:
parser.getRecordMetadata().setTypeOfColumns(Short.class, "age" /*, and other column names*/);

// to get 0 instead of nulls when the field is empty in the file:
parser.getRecordMetadata().setDefaultValueOfColumns("0", "age" /*, and other column names*/);

// then parse
for (Record record : parser.iterateRecords(is)) {
    Map<String, Object> map = record.toFieldMap();
}
Hope this helps
Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license)

How to unit test a custom Jackson JsonSerializer?

I wrote the following JsonSerializer to let Jackson serialize an array of integers into JSON:
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.JsonSerializer;
import com.fasterxml.jackson.databind.SerializerProvider;
import java.io.IOException;
public class TalkIdsSerializer extends JsonSerializer<TalkIds> {

    /**
     * Serializes a TalkIds object into the following JSON string:
     * Example: { "talk_ids" : [ 5931, 5930 ] }
     */
    @Override
    public void serialize(TalkIds talkIds, JsonGenerator jsonGenerator,
                          SerializerProvider provider) throws IOException {
        jsonGenerator.writeStartObject();
        jsonGenerator.writeArrayFieldStart(TalkIds.API_DICTIONARY_KEY);
        for (Integer talkId : talkIds.getTalkIds()) {
            jsonGenerator.writeNumber(talkId);
        }
        jsonGenerator.writeEndArray();
        jsonGenerator.writeEndObject();
    }
}
The class is used here:
@JsonSerialize(using = TalkIdsSerializer.class)
public class TalkIds { /* ... */ }
I want to test the behavior of the serializer and came up with the following:
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;
public class TalkIdsSerializerTest {

    protected final ArrayList<Integer> TALK_IDS =
            new ArrayList<>(Arrays.asList(5931, 5930));

    protected TalkIdsSerializer talkIdsSerializer;

    @Before
    public void setup() throws IOException {
        talkIdsSerializer = new TalkIdsSerializer();
    }

    @Test
    public void testSerialize() throws IOException {
        StringWriter stringWriter = new StringWriter();
        JsonGenerator jsonGenerator =
                new JsonFactory().createGenerator(stringWriter);

        TalkIds talkIds = new TalkIds();
        talkIds.add(TALK_IDS);

        talkIdsSerializer.serialize(talkIds, jsonGenerator, null);

        String string = stringWriter.toString(); // string is ""
        assertNotNull(string);
        assertTrue(string.length() > 0);

        stringWriter.close();
    }
}
However, nothing is written to the StringWriter. What am I doing wrong?
You need to flush() the generator
Method called to flush any buffered content to the underlying target (output stream, writer), and to flush the target itself as well.
http://fasterxml.github.io/jackson-core/javadoc/2.1.0/com/fasterxml/jackson/core/JsonGenerator.html#flush()
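Applied to the test in the question, that means (my adaptation of the question's code, not part of the original answer) flushing the generator before reading from the StringWriter:

talkIdsSerializer.serialize(talkIds, jsonGenerator, null);
jsonGenerator.flush(); // push buffered output through to the underlying StringWriter
String string = stringWriter.toString(); // now {"talk_ids":[5931,5930]} instead of ""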
I had a similar requirement, to test a custom serializer. I used an ObjectMapper to get the string directly (since you have already annotated TalkIds with @JsonSerialize). You can get the JSON string from the object as follows:
String json = new ObjectMapper().writeValueAsString(talkIds);
For me flush() changed nothing, so I changed the way I test it, following http://www.baeldung.com/jackson-custom-serialization.
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.module.SimpleModule;
import java.io.StringWriter;
//...
@Test
public void serialize_custom() throws Exception {
    ObjectMapper objectMapper = new ObjectMapper();

    // Register the custom serializer for the type it handles
    SimpleModule module = new SimpleModule();
    module.addSerializer(TalkIds.class, new TalkIdsSerializer());
    objectMapper.registerModule(module);

    StringWriter stringWriter = new StringWriter();
    TalkIds talkIds = new TalkIds();
    talkIds.add(TALK_IDS);

    objectMapper.writeValue(stringWriter, talkIds);

    assertTrue(stringWriter.toString().length() > 3);
}
