I'm totally new to reactive programming and I'm having trouble coding even an elementary task like the following. This method of a RestController should:
Take as a parameter a DoiReservationRequest object that represents a reservation of a yet-to-be-published DOI number (https://www.doi.org/). This reservation is meaningful only within our internal systems. The parameter is passed in the body of the POST request. The DOI reservation request is a simple object:
public record DoiReservationRequest(String doi) {
}
Check that there is no previous reservation of the same number and that the DOI number has not actually already been submitted and published. For this purpose, try to find submissions with the same DOI in DoiSubmissionRepository, which is defined as:
@EnableMongoRepositories
@Repository
public interface DoiSubmissionRepository extends ReactiveMongoRepository<DoiSubmission, String> {
    Flux<DoiSubmission> findAllByDoi(Publisher<String> doi);
}
DoiSubmission is itself defined as:
@Getter
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@AllArgsConstructor
@ToString
@Document
public final class DoiSubmission {

    @Id
    private String id;

    @Indexed
    private String doi;

    private Integer version;
    private String xml;
    private Date timestamp;
}
If no submission exists, save the reservation as a DOI submission with version 0 and empty XML content, and then return HTTP 201 with a body that for now is empty.
If submissions with the same DOI already exist (several versions of the same DOI number with different XML data), return HTTP 409 with a body, yet to be determined, that describes the error.
The code hangs indefinitely when a POST request is made:
@PostMapping("/api/v1/reservation/")
public Mono<ResponseEntity<String>> create(@RequestBody Publisher<DoiReservationRequest> doi) {
    return doiSubmissionRepository
            .findAllByDoi(Mono.from(doi)
                    .map(DoiReservationRequest::doi))
            .hasElements()
            .flatMap(hasElements -> {
                if (hasElements) {
                    return Mono.just(ResponseEntity.status(HttpStatus.CONFLICT).body(""));
                } else {
                    return Mono.from(doi)
                            .map(doiReservationRequest -> new DoiSubmission(
                                    UUID.randomUUID().toString(),
                                    doiReservationRequest.doi(), 0, "", new Date()))
                            .flatMap(doiSubmissionRepository::save)
                            .then(Mono.just(ResponseEntity.status(HttpStatus.OK).body("")));
                }
            });
}
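For reference, a minimal sketch of one possible restructuring, assuming the hang is caused by the request-body Publisher being subscribed to more than once (once inside findAllByDoi and again in the else branch; a WebFlux request body can only be consumed once). It binds the body to the record directly so it can be reused, and uses the status codes from the requirements above:

// Hedged sketch, not the confirmed fix: bind the body to the DTO so it can be
// reused freely, query by the plain DOI string, and branch on hasElements().
@PostMapping("/api/v1/reservation/")
public Mono<ResponseEntity<String>> create(@RequestBody DoiReservationRequest request) {
    return doiSubmissionRepository
            .findAllByDoi(Mono.just(request.doi()))
            .hasElements()
            .flatMap(hasElements -> {
                if (hasElements) {
                    return Mono.just(ResponseEntity.status(HttpStatus.CONFLICT).body(""));
                }
                var submission = new DoiSubmission(
                        UUID.randomUUID().toString(), request.doi(), 0, "", new Date());
                return doiSubmissionRepository.save(submission)
                        .thenReturn(ResponseEntity.status(HttpStatus.CREATED).body(""));
            });
}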
I am sorry for the long post, but I believe it is important to mention everything related to the issue.
I am dealing with a requirement for my web service that sends out notifications to 20k+ users at a time. Since this is quite a heavy task, I thought that making it async was probably the best approach, as it will take some time to process the data. This feature is available to a vast majority of users on the platform, so there can be multiple requests at once. The number of users that will receive a notification can vary from 1k to 20k+. The request processing takes quite a long time: I basically create a notification, assign it to the correct talents and then send it out in waves. This feature alone seems to have a massive impact on performance when there are multiple concurrent notification requests active at the same time, and I end up with an out-of-memory error. I am not sure if this can be optimized at all, or if I should choose a completely different approach.
I designed the system to act as follows:
I receive a notification request, which is created in a separate table
I receive a token that indicates which users should get the notification
I fetch the users via a mapped class that is used as a predicate inside of a findAll method (QueryDSL)
I created a relational table that contains the notificationId, talentId and an extra 'sent' column. Every talent that should receive the message is added to this table along with the notificationId
I have a @Scheduled method that picks up a portion of the notification/talent relations and sends out the notification periodically
My Async configuration class is as follows:
@Component
@Configuration
@EnableAsync
public class AsyncConfiguration implements AsyncConfigurer {

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setMaxPoolSize(5);
        executor.setCorePoolSize(5);
        executor.setThreadNamePrefix("asyncExec-");
        executor.initialize();
        return executor;
    }

    @Override
    public AsyncUncaughtExceptionHandler getAsyncUncaughtExceptionHandler() {
        return (ex, method, params) -> {
            System.out.println("Exception with message :" + ex.getMessage());
            System.out.println("Method :" + method);
            System.out.println("Number of parameters :" + params.length);
        };
    }
}
My user entity has the following mapping:
@OneToMany(mappedBy = "talent", cascade = CascadeType.ALL, orphanRemoval = true)
private List<TalentRoleNotificationRelations> roleNotifications;
My notification has the following mapping:
@OneToMany(mappedBy = "roleNotification", cascade = CascadeType.ALL, orphanRemoval = true)
private List<TalentRoleNotificationRelations> roleNotifications;
And my relational table entity is as follows:
@Entity
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class TalentRoleNotificationRelations implements Serializable {

    private static final long serialVersionUID = 1L;

    @EmbeddedId
    private TalentRoleNotificationIdentity identity;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("talentId")
    @JoinColumn(name = "talent_id")
    private Talent talent;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("roleNotificationId")
    @JoinColumn(name = "role_notification_id")
    private RoleNotification roleNotification;

    private Boolean sent;
}
And the composite key identity:
@Embeddable
@Builder
@AllArgsConstructor
@NoArgsConstructor
@EqualsAndHashCode
@Data
public class TalentRoleNotificationIdentity implements Serializable {

    private static final long serialVersionUID = 1L;

    @Column(name = "talent_id")
    private String talentId;

    @Column(name = "role_notification_id")
    private String roleNotificationId;
}
This mapping was done following the guide here, which shows how a many-to-many relation with an extra column should be implemented.
And the actual process of creating the notification
Controller method:
@PostMapping(value = "notify/{searchToken}/{roleId}", produces = MediaType.APPLICATION_JSON_VALUE)
public RoleNotificationInfo notifyTalentsOfNewRole(@PathVariable String searchToken, @PathVariable String roleId) {
    var roleProfileInfo = (RoleProfileInfo) Optional.ofNullable(searchToken)
            .map(token -> searchFactory.fromToken(token, RoleProfileInfo.class))
            .orElse(null);
    return productionService.notifyTalentsOfMatchingRole(roleProfileInfo, roleId);
}
Service method (this is where I believe the issue is, along with a questionable use of the many-to-many table mapping):
@Transactional(propagation = Propagation.REQUIRES_NEW)
@Async
public RoleNotificationInfo notifyTalentsOfMatchingRole(RoleProfileInfo roleProfileInfo, String roleId) {
    var predicate = Optional.ofNullable(userService.getPredicate(roleProfileInfo)).orElse(new BooleanBuilder());
    var role = roleDao.findById(roleId).orElseThrow(NoSuchRole::new);
    var isNotificationLimitReached = isNotificationLimitReached(roleId);
    if (!isNotificationLimitReached) {
        var notificationBody = notificationBodyDao.findById(NotificationBodyIdentifier.MATCHING_ROLE)
                .orElseThrow(NoSuchNotificationBody::new);
        var newNotification = RoleNotification.builder()
                .notificationBody(notificationBody)
                .role(role)
                .build();
        roleNotificationDao.saveAndFlush(newNotification);
        talentDao.findAll(predicate)
                .forEach(talent -> {
                    var identity = TalentRoleNotificationIdentity.builder()
                            .talentId(talent.getId())
                            .roleNotificationId(newNotification.getId())
                            .build();
                    var talentRoleNotification = TalentRoleNotificationRelations.builder()
                            .identity(identity)
                            .roleNotification(newNotification)
                            .talent(talent)
                            .sent(false)
                            .build();
                    talentRoleNotification.setIdentity(identity);
                    talentRoleNotification.setTalent(talent);
                    talentRoleNotificationRelationsDao.save(talentRoleNotification);
                });
        return dtoFactory.toInfo(newNotification);
    } else throw new RoleNotificationLimitReached();
}
Scheduled method that sends out the notifications:
@Transactional(propagation = Propagation.REQUIRES_NEW)
@Scheduled(fixedDelay = 5)
public void sendRoleMessage() {
    var unsentNotificationIds = talentRoleNotificationRelationsDao.findAll(
            QTalentRoleNotificationRelations.talentRoleNotificationRelations.sent.isFalse()
                    .and(QTalentRoleNotificationRelations.talentRoleNotificationRelations.talent.notificationToken.isNotNull()),
            PageRequest.of(0, 50)
    );
    unsentNotificationIds.forEach(unsentNotification -> {
        var talent = unsentNotification.getTalent();
        var roleNotification = unsentNotification.getRoleNotification();
        markAsSent(unsentNotification);
        talentRoleNotificationRelationsDao.saveAndFlush(unsentNotification);
        notificationPusher.push(notificationFactory.buildComposite(talent.getNotificationToken()), dtoFactory.toInfo(roleNotification));
    });
}
The notificationPusher method itself:
@Override
public void push(PushMessageComposite composite, RoleNotificationInfo roleNotificationInfo) {
    String roleId = roleNotificationInfo.getRoleId();
    String title = "New matching role!";
    var push = Message.builder()
            .setToken(composite.getMeta().getDeviceToken())
            .setAndroidConfig(AndroidConfig.builder()
                    .setNotification(AndroidNotification.builder()
                            .setTitle(title)
                            .setBody(roleNotificationInfo.getBody())
                            .setSound(composite.getMeta().getSound())
                            .build())
                    .build())
            .setApnsConfig(ApnsConfig.builder()
                    .setAps(Aps.builder()
                            .setAlert(ApsAlert.builder()
                                    .setTitle(title)
                                    .setBody(roleNotificationInfo.getBody())
                                    .build())
                            .setBadge(composite.getMeta().getBadge().intValue())
                            .setSound(composite.getMeta().getSound())
                            .build())
                    .build())
            .putData("roleId", roleId)
            .build();
    Try.run(() -> FirebaseMessaging.getInstance().sendAsync(push).get())
            .onFailure(e -> {
                log.error("Firebase Cloud Messaging failed during sendNotification", e);
                var talent = talentDao.findTalentByNotificationToken(composite.getMeta().getDeviceToken());
                talent.setNotificationToken(null);
                talentDao.save(talent);
            });
}
And the dto factory mapper method:
@Transactional(propagation = Propagation.MANDATORY)
public RoleNotificationInfo toInfo(RoleNotification source) {
    return RoleNotificationInfo.builder()
            .id(source.getId())
            .body(source.getNotificationBody().getBody())
            .created(source.getCreated())
            .roleId(source.getRole().getId())
            .build();
}
I am unsure where the problem lies. I assume it is due to the high number of users fetched by certain queries (20k+). I did some profiling, and these were the results:
Memory/CPU charts:
My question is: should this even be possible at all? Is there a different approach that would be much more efficient? Should I use an external service for this? Is the problem something very obvious that I do not see? I am not sure where to look. If anything in my code is unclear and needs further clarification, please let me know and I'll try to edit and format it as best I can.
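One pattern that usually keeps the memory footprint bounded in this kind of bulk insert (offered here only as a hedged sketch, not as the confirmed cause or fix) is to page through the matching talents instead of materializing all 20k+ entities at once, and to flush and clear the persistence context after each batch. The sketch below reuses the names from the service above; the batch size and the paged findAll(predicate, pageable) overload from Spring Data's QuerydslPredicateExecutor are assumptions:

// Hypothetical sketch only: process matching talents page by page and detach each
// processed batch so the persistence context never holds 20k+ entities at once.
@PersistenceContext
private EntityManager entityManager;

private static final int BATCH_SIZE = 500; // arbitrary batch size

private void createRelationsInBatches(Predicate predicate, RoleNotification notification) {
    Pageable pageable = PageRequest.of(0, BATCH_SIZE);
    Page<Talent> page;
    do {
        page = talentDao.findAll(predicate, pageable);
        page.forEach(talent -> {
            var identity = TalentRoleNotificationIdentity.builder()
                    .talentId(talent.getId())
                    .roleNotificationId(notification.getId())
                    .build();
            talentRoleNotificationRelationsDao.save(TalentRoleNotificationRelations.builder()
                    .identity(identity)
                    .talent(talent)
                    .roleNotification(notification)
                    .sent(false)
                    .build());
        });
        entityManager.flush();  // push the current batch to the database
        entityManager.clear();  // detach it so the memory can be reclaimed
        pageable = pageable.next();
    } while (page.hasNext());
}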
I have a working setup for Spring Cloud Kafka Streams with functional programming style.
There are two use cases, which are configured via application.properties.
Both of them work individually, but as soon as I activate both at the same time, I get a serialization error for the output stream of the second use case:
Exception in thread "ActivitiesAppId-05296224-5ea1-412a-aee4-1165870b5c75-StreamThread-1" org.apache.kafka.streams.errors.StreamsException:
Error encountered sending record to topic outputActivities for task 0_0 due to:
...
Caused by: org.apache.kafka.common.errors.SerializationException:
Can't serialize data [com.example.connector.model.Activity@497b37ff] for topic [outputActivities]
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
Incompatible types: declared root type ([simple type, class com.example.connector.model.Material]) vs com.example.connector.model.Activity
The last line here is important, as the "declared root type" is from the Material class, not the Activity class, which is probably the source of the error.
Again, when I only activate the second use case before starting the application, everything works fine. So I assume that the "Material" processor somehow interferes with the "Activities" processor (or its serializer), but I don't know when and where.
Setup
1.) use case: "Materials"
one input stream -> transformation -> one output stream
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {...}
application.properties
spring.cloud.stream.kafka.streams.binder.functions.processMaterials.applicationId=MaterialsAppId
spring.cloud.stream.bindings.processMaterials-in-0.destination=inputMaterialsRaw
spring.cloud.stream.bindings.processMaterials-out-0.destination=outputMaterials
2.) use case: "Activities"
two input streams -> joining -> one output stream
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {...}
application.properties
spring.cloud.stream.kafka.streams.binder.functions.processActivities.applicationId=ActivitiesAppId
spring.cloud.stream.bindings.processActivities-in-0.destination=inputActivitiesRaw
spring.cloud.stream.bindings.processActivities-in-1.destination=inputAssignees
spring.cloud.stream.bindings.processActivities-out-0.destination=outputActivities
The two processors are also defined as stream function in application.properties: spring.cloud.stream.function.definition=processActivities;processMaterials
Thanks!
Update - Here's how I use the processors in the code:
Implementation
// Material model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class MaterialRaw {
    private String id;
    private String name;
}

@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Material {
    private String id;
    private String name;
}
// Material processor
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {
    return materialsRawStream -> materialsRawStream.map((recordKey, materialRaw) -> {
        // some transformation
        final var newId = materialRaw.getId() + "---foo";
        final var newName = materialRaw.getName() + "---bar";
        final var material = new Material(newId, newName);
        // output
        return new KeyValue<>(recordKey, material);
    });
}
// Activity model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class ActivityRaw {
    private String id;
    private String name;
}

@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Assignee {
    private String id;
    private String assignedAt;
}

/**
 * Combination of `ActivityRaw` and `Assignee`
 */
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Activity {
    private String id;
    private Integer number;
    private String assignedAt;
}
// Activity processor
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {
    return (activitiesRawStream, assigneesStream) -> {
        final var joinWindow = JoinWindows.of(Duration.ofDays(30));
        final var streamJoined = StreamJoined.with(
                Serdes.String(),
                new JsonSerde<>(ActivityRaw.class),
                new JsonSerde<>(Assignee.class)
        );
        final var joinedStream = activitiesRawStream.leftJoin(
                assigneesStream,
                new ActivityJoiner(),
                joinWindow,
                streamJoined
        );
        final var mappedStream = joinedStream.map((recordKey, activity) -> {
            return new KeyValue<>(recordKey, activity);
        });
        return mappedStream;
    };
}
This turns out to be an issue with the way the binder infers Serde types when there are multiple functions with different outbound target types, one with Activity and another with Material in your case. We will have to address this in the binder. I created an issue here.
In the meantime, you can follow this workaround.
Create a custom Serde class as below.
public class ActivitySerde extends JsonSerde<Activity> {}
Then, explicitly use this Serde for the outbound of your processActivities function using configuration.
For example:
spring.cloud.stream.kafka.streams.bindings.processActivities-out-0.producer.valueSerde=com.example.so65003575.ActivitySerde
Please change the package to the appropriate one if you are trying this workaround.
Here is another recommended approach. If you define a bean of type Serde with the target type, it takes precedence, as the binder will match it against the KStream type. Therefore, you can also do it without defining the extra class from the workaround above.
@Bean
public Serde<Activity> activitySerde() {
    return new JsonSerde<>(Activity.class);
}
Here are the docs where all these details are explained.
You need to specify which binder to use for each function: s.c.s.bindings.xxx.binder=....
However, without that, I would have expected an error such as "multiple binders found but no default specified", which is what happens with message channel binders.
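Expanded for the bindings above, that configuration might look like the following (treat the binder name as an assumption; kstream is the usual name of the Kafka Streams binder type, as opposed to the message-channel kafka binder):

spring.cloud.stream.bindings.processMaterials-in-0.binder=kstream
spring.cloud.stream.bindings.processMaterials-out-0.binder=kstream
spring.cloud.stream.bindings.processActivities-in-0.binder=kstream
spring.cloud.stream.bindings.processActivities-in-1.binder=kstream
spring.cloud.stream.bindings.processActivities-out-0.binder=kstream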
I tried to get an entity via Data JPA & Data REST without HATEOAS.
The constraint is that I use the HATEOAS form, but sometimes I need a plain JSON response.
So I create the JSON by defining a controller path separate from the repository's endpoint, together with a separate DTO class.
This is my code:
@RepositoryRestController
public class MetricController {

    @Autowired
    private MetricRepository metricRepository;

    @RequestMapping(method = RequestMethod.GET, value = "/metrics/in/{id}")
    public @ResponseBody MetricDTO getMetric(@PathVariable Long id) {
        return MetricDTO.fromEntity(metricRepository.getOne(id));
    }
}
@RepositoryRestResource
public interface MetricRepository extends JpaRepository<Metric, Long> { }
@Setter
@Getter
@NoArgsConstructor
@AllArgsConstructor
public class MetricDTO {

    private SourceType sourceType;
    private String metricTypeField;
    private String metricType;
    private String instanceType;
    private String instanceTypeField;
    private List<String> metricIdFields;
    private List<String> valueFields;
    private Map<String, String> virtualFieldValueEx;

    public static MetricDTO fromEntity(Metric metric) {
        return new MetricDTO(
                metric.getSourceType(),
                metric.getMetricTypeField(),
                metric.getMetricType(),
                metric.getInstanceType(),
                metric.getInstanceTypeField(),
                metric.getMetricIdFields(),
                metric.getValueFields(),
                metric.getVirtualFieldValueEx()
        );
    }
}
This is the way I do it now, but I expect there are better options and patterns.
My question is whether this is the best way.
HATEOAS (Hypermedia as the Engine of Application State) is a constraint of the REST application architecture.
It basically means that anyone who consumes your REST endpoints can navigate between them with the help of links.
Let's take your example:
| HTTP Method | Relation (rel) | Link                            |
|-------------|----------------|---------------------------------|
| GET         | Up             | /metrics/in                     |
| GET         | Self           | /metrics/in/{id}                |
| GET         | SourceType     | /sourceType/{id}                |
| GET         | metricIdFields | /url for each in the JSON array |
| DELETE      | Delete         | /employee/{employeeId}          |
Use the org.springframework.hateoas.Links class to create such links in your DTOs.
In your DTO add:
public class MetricDTO {

    private Links links;

    // getters and setters
    // inside your setters, add SELF, GET, CREATE and DELETE links for the current resource
}
https://www.baeldung.com/spring-hateoas-tutorial
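For comparison, here is a sketch of the same idea using the current Spring HATEOAS types (EntityModel and WebMvcLinkBuilder, assuming spring-boot-starter-hateoas is on the classpath), where the controller attaches the links instead of storing them in the DTO:

import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.linkTo;
import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.methodOn;

import org.springframework.hateoas.EntityModel;
import org.springframework.hateoas.Link;

@GetMapping("/metrics/in/{id}")
public EntityModel<MetricDTO> getMetric(@PathVariable Long id) {
    MetricDTO dto = MetricDTO.fromEntity(metricRepository.getOne(id));
    return EntityModel.of(dto,
            // self link built from this very controller method
            linkTo(methodOn(MetricController.class).getMetric(id)).withSelfRel(),
            // "up" link to the collection resource (path taken from the table above)
            Link.of("/metrics/in").withRel("up"));
}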
I'm currently working on a SpringBoot API to interface with a MongoRepository, but I'm having trouble understanding how the JSON being passed becomes a Document for storage within Mongo. I currently have a simple API that stores a group of users:
@Document
@JsonInclude
public class Group {

    @Id
    @JsonView(Views.Public.class)
    private String id;

    @JsonView(Views.Public.class)
    private String name;

    @JsonView(Views.Public.class)
    private Set<GroupMember> groupMembers = new HashSet<>();
}
There are also setter and getter methods for each of the fields, although I don't know how necessary those are either (see questions at the end).
Here is the straightforward component I'm using:
@Component
@Path("/groups")
@Api(value = "/groups", description = "Group REST")
public class Groups {

    @Autowired
    private GroupService groupService;

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @ApiOperation(value = "Get all Groups", response = Group.class, responseContainer = "List")
    @JsonView(Views.Public.class)
    public List<Group> getAllGroups() {
        return groupService.getAllGroups();
    }

    @POST
    @Produces(MediaType.APPLICATION_JSON)
    @Consumes(MediaType.APPLICATION_JSON)
    @ApiOperation(value = "Create a Group", response = Group.class)
    @JsonView(Views.Detailed.class)
    public Group submitGroup(Group group) {
        return groupService.addGroup(group);
    }
}
Finally, I have a Service class:
#Service
public class GroupServiceImpl implements GroupService {
#Autowired
private GroupRepository groupRepository;
#Override
public Group addGroup(Group group) {
group.setId(null);
return groupRepository.save(group);
}
#Override
public List<Group> getAllGroups() {
return groupRepository.findAll();
}
}
The GroupRepository is simply an interface which extends MongoRepository<Group, String>.
Now, when I actually make a call to the POST method, with a body containing:
{
"name": "group001",
"groupMembers": []
}
I see that it properly inserts this group with a random Mongo UUID. However, if I try to insert GroupMember objects inside the list, I receive a null pointer exception. From this, I have two questions:
How does SpringBoot (Jackson?) know which fields to deserialize from the JSON being passed? I tested this after deleting the getter and setter methods, and it still works.
How does SpringBoot handle nested objects, such as the Set inside the class? I tested with List instead of Set, and it worked, but I have no idea why. My guess is that for each object that is both declared in my class and listed in my JSON object, SpringBoot is calling a constructor that it magically created behind the scenes, and one doesn't exist for the Set interface.
Suppose I'm adamant on using Set (the same user shouldn't show up twice anyway). What tools can I use to get SpringBoot to work as expected?
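For what it's worth, here is a purely hypothetical GroupMember sketch (its real fields are not shown in the question; the names below are invented) illustrating the two things a HashSet-based model generally needs: a constructor Jackson can use, and equals()/hashCode() so duplicates are actually rejected:

// Hypothetical element type: field names are invented for illustration only.
public class GroupMember {

    private String userId;
    private String displayName;

    public GroupMember() {
        // Jackson needs a no-args constructor (or a @JsonCreator) to build instances
    }

    // getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GroupMember)) return false;
        return java.util.Objects.equals(userId, ((GroupMember) o).userId);
    }

    @Override
    public int hashCode() {
        return java.util.Objects.hash(userId); // same key as equals()
    }
}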
It seems to me that a lot of the things that happen in Spring are very behind-the-scenes, which makes it difficult for me to understand why things work when they do. Not knowing why things work makes it difficult to construct things from scratch, which makes it feel as though I'm hacking together a project rather than actually engineering one. So my last question is something like, is there a guide that explains the wiring behind the scenes?
Finally, this is my first time working with Spring... so please excuse me if my questions are entirely off the mark, but I would appreciate any answers nonetheless.
I have a Customer entity that contains a list of Sites, as follows:
public class Customer {

    @Id
    @GeneratedValue
    private int id;

    @NotNull
    private String name;

    @NotNull
    @AccountNumber
    private String accountNumber;

    @Valid
    @OneToMany(mappedBy = "customer")
    private List<Site> sites;
}
public class Site {

    @Id
    @GeneratedValue
    private int id;

    @NotNull
    private String addressLine1;

    private String addressLine2;

    @NotNull
    private String town;

    @PostCode
    private String postCode;

    @ManyToOne
    @JoinColumn(name = "customer_id")
    private Customer customer;
}
I am in the process of creating a form to allow users to create a new Customer by entering the name & account number and supplying a CSV file of sites (in the format "addressLine1", "addressLine2", "town", "postCode"). The user's input needs to be validated and errors returned to them (e.g. "file is not CSV file", "problem on line 7").
I started off by creating a Converter to receive a MultipartFile and convert it into a list of Site:
public class CSVToSiteConverter implements Converter<MultipartFile, List<Site>> {

    public List<Site> convert(MultipartFile csvFile) {
        List<Site> results = new ArrayList<>();
        /* open MultipartFile and loop through line-by-line, adding into List<Site> */
        return results;
    }
}
This worked, but there is no validation (i.e. for when the user uploads a binary file or one of the CSV rows doesn't contain a town); there doesn't seem to be a way to pass an error back, and the converter doesn't seem to be the right place to perform validation anyway.
I then created a form-backing object to receive the MultipartFile and Customer, and put validation on the MultipartFile:
public class CustomerForm {

    @Valid
    private Customer customer;

    @SiteCSVFile
    private MultipartFile csvFile;
}
@Documented
@Constraint(validatedBy = SiteCSVFileValidator.class)
@Target(ElementType.FIELD)
@Retention(RetentionPolicy.RUNTIME)
public @interface SiteCSVFile {
    String message() default "{SiteCSVFile}";
    Class<?>[] groups() default {};
    Class<? extends Payload>[] payload() default {};
}
public class SiteCSVFileValidator implements ConstraintValidator<SiteCSVFile, MultipartFile> {

    @Override
    public void initialize(SiteCSVFile siteCSVFile) { }

    @Override
    public boolean isValid(MultipartFile csvFile, ConstraintValidatorContext cxt) {
        boolean wasValid = true;
        /* test csvFile for mimetype, open and loop through line-by-line, validating number of columns etc. */
        return wasValid;
    }
}
This also worked, but then I have to re-open the CSV file and loop through it again to actually populate the List<Site> within Customer, which doesn't seem very elegant:
@RequestMapping(value = "/new", method = RequestMethod.POST)
public String newCustomer(@Valid @ModelAttribute("customerForm") CustomerForm customerForm, BindingResult bindingResult) {
    if (bindingResult.hasErrors()) {
        return "NewCustomer";
    } else {
        /*
           validation has passed, so now we must:
           1) open customerForm.csvFile
           2) loop through it to populate customerForm.customer.sites
        */
        customerService.insert(customerForm.customer);
        return "CustomerList";
    }
}
My MVC config limits file uploads to 1MB:
@Bean
public MultipartResolver multipartResolver() {
    CommonsMultipartResolver multipartResolver = new CommonsMultipartResolver();
    multipartResolver.setMaxUploadSize(1000000);
    return multipartResolver;
}
Is there a Spring way of converting AND validating at the same time, without having to open the CSV file and loop through it twice, once to validate and again to actually read/populate the data?
IMHO, it is a bad idea to load the whole CSV into memory unless:
you are sure it will always be very small (and what if a user clicks on the wrong file?)
the validation is global (the only real use case, but that does not seem to apply here)
your application will never be used in a production context under serious load
You should either stick to the MultipartFile object, or use a wrapper exposing the InputStream (and possibly other information you might need) if you do not want to tie your business classes to Spring.
Then you carefully design, code and test a method that takes an InputStream as input, reads it line by line, and calls per-line methods to validate and insert the data. Something like:
class CsvLoader {

    @Autowired Verifier verifier;
    @Autowired Loader loader;

    void verifAndLoad(InputStream csv) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(csv))) {
            String line;
            while ((line = reader.readLine()) != null) {
                MyObj myObj = parseLine(line); // parseLine(): placeholder for mapping one CSV row to a domain object
                if (verifier.verify(myObj)) {
                    loader.load(myObj);
                } else {
                    // log the problem, eventually store the line for further analysis
                }
            }
        } // try-with-resources closes the stream
    }
}
That way, your application only uses the memory it really needs, and only loops once over the file.
Edit: some clarification of what I meant by wrapping the Spring MultipartFile.
First, I would split validation in two. Formal validation belongs in the controller layer and only checks that:
there is a Customer field
the file size and mimetype seem OK (e.g. size > 12 && mimetype == text/csv)
The validation of the content is, IMHO, business-layer validation and can happen later. In this pattern, SiteCSVFileValidator would only test the CSV for mimetype and size.
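A sketch of what that lighter validator might look like, for illustration only (getContentType() reports whatever the client declared, so this is a formal check rather than a guarantee):

public class SiteCSVFileValidator implements ConstraintValidator<SiteCSVFile, MultipartFile> {

    @Override
    public boolean isValid(MultipartFile csvFile, ConstraintValidatorContext ctx) {
        if (csvFile == null || csvFile.isEmpty()) {
            return false; // nothing was uploaded
        }
        // formal checks only: size and declared mimetype; content is validated later in the service
        return csvFile.getSize() > 12 && "text/csv".equals(csvFile.getContentType());
    }
}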
Normally, you avoid using Spring classes directly from business classes. If that is not a concern, the controller directly sends the MultipartFile to a service object, also passing the BindingResult so the error messages can be populated directly. The controller becomes:
@RequestMapping(value = "/new", method = RequestMethod.POST)
public String newCustomer(@Valid @ModelAttribute("customerForm") CustomerForm customerForm, BindingResult bindingResult) {
    if (bindingResult.hasErrors()) {
        return "NewCustomer"; // formal validation failed
    } else {
        /*
           validation has passed, so now we must:
           1) open customerForm.csvFile
           2) loop through it to validate each line and populate customerForm.customer.sites
        */
        customerService.insert(customerForm.customer, customerForm.csvFile, bindingResult);
        if (bindingResult.hasErrors()) {
            return "NewCustomer"; // content validation added errors
        } else {
            return "CustomerList";
        }
    }
}
In the service class we have:
public void insert(Customer customer, MultipartFile csvFile, Errors errors) {
    // loop through csvFile.getInputStream(), populating customer.sites and eventually adding errors to errors
    if (!errors.hasErrors()) {
        // actually insert through the DAO
    }
}
But then we have two Spring classes in a service-layer method. If that is a concern, just replace the line customerService.insert(customerForm.customer, customerForm.csvFile, bindingResult); with:
List<Integer> linesInError = new ArrayList<Integer>();
customerService.insert(customerForm.customer, customerForm.csvFile.getInputStream(), linesInError);
if (!linesInError.isEmpty()) {
    // populates bindingResult with convenient error messages
}
Then the service class only adds the numbers of the lines where errors were detected to linesInError.
But it only gets the InputStream, whereas it might need, say, the original file name. You can pass the name as another parameter, or use a wrapper class:
class CsvFile {

    private String name;
    private InputStream inputStream;

    CsvFile(MultipartFile file) throws IOException {
        name = file.getOriginalFilename();
        inputStream = file.getInputStream();
    }

    // public getters ...
}
and call
customerService.insert(customerForm.customer, new CsvFile(customerForm.csvFile), linesInError);
with no direct Spring dependencies.