Restrict repeated access to a public service - Java

I have the following situation: imagine there is a public REST service. What we don't want is for someone to be able to access this service many times in a short period of time, because they would be able to overload our database (essentially a DDoS attack, I presume?).
Is there a way to effectively protect against this type of attack? The technology we use is Spring/Spring Security.

If you are using Spring Boot, there is a fairly new open-source project which handles this:
https://github.com/weddini/spring-boot-throttling
It offers a declarative approach to throttling control over Spring services:
The @Throttling annotation helps you to limit the number of service method
calls per java.util.concurrent.TimeUnit for a particular user, IP
address, HTTP header/cookie value, or using Spring Expression Language
(SpEL).
Obviously this wouldn't prevent DDoS attacks at the web server level, but it would help limit access to long-running queries or implement a fair-usage policy.
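The same per-caller idea can be sketched in plain Java, independent of any library (a minimal illustration, not spring-boot-throttling's actual API; RequestCounter, the client key, and the explicit nowMillis clock parameter are all hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Fixed-window rate limiter: at most `limit` calls per client key per window.
class RequestCounter {
    private final int limit;
    private final long windowMillis;
    // per-client state: [0] = window start time, [1] = calls made in this window
    private final Map<String, long[]> windows = new ConcurrentHashMap<>();

    RequestCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the caller identified by clientKey is still under the limit.
    synchronized boolean tryAcquire(String clientKey, long nowMillis) {
        long[] state = windows.computeIfAbsent(clientKey, k -> new long[] { nowMillis, 0 });
        if (nowMillis - state[0] >= windowMillis) { // window expired: start a new one
            state[0] = nowMillis;
            state[1] = 0;
        }
        if (state[1] >= limit) {
            return false; // over the limit, reject (e.g. with HTTP 429)
        }
        state[1]++;
        return true;
    }
}
```

A servlet filter or interceptor keyed on the remote IP or the authenticated user would call tryAcquire once per request and short-circuit with 429 when it returns false.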

For those interested in the subject: spring-boot-throttling seems to be no longer maintained, so I took a look at bucket4j.
Its use is quite simple. There are 3 main objects:
Bucket: interface defining the total capacity of available tokens; it also provides the methods to consume tokens.
Bandwidth: class defining the limits of the bucket.
Refill: class defining the way the bucket is refilled with new tokens.
Example with a simple Spring Boot controller:
import java.time.Duration;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Bucket4j;
import io.github.bucket4j.Refill;

@RestController
public class TestLimit {
    private final Bucket bucket;

    public TestLimit() {
        // capacity of 120 tokens, refilled greedily at 120 tokens per minute
        Bandwidth limit = Bandwidth.classic(120, Refill.greedy(120, Duration.ofMinutes(1)));
        this.bucket = Bucket4j.builder().addLimit(limit).build();
    }

    @RequestMapping(path = "/test-limit/", method = RequestMethod.GET)
    public ResponseEntity<String> download() {
        if (this.bucket.tryConsume(1)) {
            return ResponseEntity.status(HttpStatus.OK).build();
        } else {
            return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
        }
    }
}
In this case, we have a limit of 120 requests per minute: a bucket capacity of 120 tokens and a refill rate of 120 tokens per minute.
If we exceed this limit, we will receive an HTTP 429 (TOO_MANY_REQUESTS) response.
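For intuition about what capacity and refill mean, here is a stripped-down token bucket in plain Java (a sketch of the idea only, not bucket4j's internals; TokenBucket and the explicit nowMillis parameter are illustrative):

```java
// Token bucket: at most `capacity` tokens, refilled continuously at a rate of
// `refillTokens` per `refillPeriodMillis`.
class TokenBucket {
    private final long capacity;
    private final long refillTokens;
    private final long refillPeriodMillis;
    private double tokens;          // currently available tokens
    private long lastRefillMillis;  // last time refill was accounted for

    TokenBucket(long capacity, long refillTokens, long refillPeriodMillis, long nowMillis) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodMillis = refillPeriodMillis;
        this.tokens = capacity;     // start full
        this.lastRefillMillis = nowMillis;
    }

    // Consume n tokens if available; otherwise reject the request.
    synchronized boolean tryConsume(long n, long nowMillis) {
        double refilled = (nowMillis - lastRefillMillis) * (double) refillTokens / refillPeriodMillis;
        tokens = Math.min(capacity, tokens + refilled); // never exceed capacity
        lastRefillMillis = nowMillis;
        if (tokens < n) {
            return false;
        }
        tokens -= n;
        return true;
    }
}
```

Unlike a fixed window, a token bucket tolerates short bursts up to the capacity while enforcing the average rate over time.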

Related

Find or produce data streams for simulations in RESTful API Web Service

At the time of this post, I am creating the roadmap for my MSc thesis, based on RESTful API web services using Java Spring Boot. The scope of my thesis covers IoT devices and the data they produce, and running simulations on that data to deliver results back to the clients.
In order to do so, I need to find a way to produce data at the input of my service for the aforementioned simulations to take place.
Can anyone give me a suggestion on how to approach this? Should I create a separate app for the data-stream creation?
You can create a sync service with a cron job or something similar:
@Configuration
@EnableScheduling
@EnableAsync
@Service
public class SynchService {

    // Runs at the top of every hour (Spring cron: second, minute, hour, day, month, day-of-week)
    @Scheduled(cron = "0 0 * * * ?")
    public void synchSomething() {
        // fetch or generate the next batch of data here
    }
}

How to stream large data from database via REST in Quarkus

I'm implementing a GET method in Quarkus that should send large amounts of data to the client. The data is read from the database using JPA/Hibernate, serialized to JSON, and then sent to the client. How can this be done efficiently without having the whole data set in memory? I tried the following three possibilities, all without success:
Use getResultList from JPA and return a Response with the list as the body. A MessageBodyWriter will take care of serializing the list to JSON. However, this will pull all data into memory which is not feasible for a larger number of records.
Use getResultStream from JPA and return a Response with the stream as the body. A MessageBodyWriter will take care of serializing the stream to JSON. Unfortunately this doesn't work because it seems the EntityManager is closed after the JAX-RS method has been executed and before the MessageBodyWriter is invoked. This means that the underlying ResultSet is also closed and the writer cannot read from the stream any more.
Use a StreamingOutput as Response body. The same problem as in 2. occurs.
So my question is: what's the trick for sending large data read via JPA with Quarkus?
Do your results have to be all in one response? How about making the client request the next results page until there is no next page, a typical REST API pagination exercise? The JPA backend will then fetch only that page from the database, so there is no moment when everything sits in memory.
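The pagination suggestion above reduces to simple arithmetic; the Paging helper below is hypothetical, and in a real JPA query the same numbers would feed setFirstResult and setMaxResults so the database, not the application, does the slicing:

```java
import java.util.List;

// Page extraction: what a paginated endpoint returns for one request.
class Paging {
    // Slice the full result into the requested page (empty past the end).
    static <T> List<T> page(List<T> all, int pageNumber, int pageSize) {
        int from = Math.min(pageNumber * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return all.subList(from, to);
    }

    // Whether the client should request another page.
    static boolean hasNext(int totalCount, int pageNumber, int pageSize) {
        return (long) (pageNumber + 1) * pageSize < totalCount;
    }
}
```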
Based on your requirements you have two options:
Option 1:
Take the HATEOAS approach (https://restfulapi.net/hateoas/), a standard pattern for exchanging large data sets over REST. With this approach, the server quickly responds to the first request with a set of HATEOAS URIs, where each URI represents one group of elements. You generate these URIs based on the data size, and the client code takes responsibility for calling them individually as REST APIs to get the actual data. In this option you can also consider a reactive style to get the advantages of streaming processing with a small memory footprint.
Option 2:
As suggested by @Serkan above, continuously stream the result set from the database to the client as the REST response. Here you need to check the timeout settings of any gateway between the client and the service; if there is no gateway, you are fine. You can take advantage of reactive programming at all layers to achieve continuous streaming: "DAO/data-access layer" --> "service layer" --> REST controller --> client. Spring Reactor is compliant with JAX-RS as well. https://quarkus.io/guides/getting-started-reactive. This is the best architectural style when dealing with large data processing.
Here you have some resources that can help you with this:
Using reactive Hibernate: https://quarkusio.zulipchat.com/#narrow/stream/187030-users/topic/Large.20datasets.20using.20reactive.20SQL.20clients
Paging vs Forward only ResultSets: https://knes1.github.io/blog/2015/2015-10-19-streaming-mysql-results-using-java8-streams-and-spring-data.html
The last article is for SpringBoot, but the idea can also be implemented with Quarkus.
------------Edit:
OK, I've worked out an example where I do a batch select. I did it with Panache, but you can do it easily also without it.
I'm returning a ScrollableResults, then using it in the REST resource to stream the data via SSE (server-sent events) to the client.
------------Edit 2:
I've added setFetchSize to the query. You should play with this number and set it between 1 and 50. With a value of 1, the DB rows will be fetched one by one, which mimics streaming most closely and uses the least memory, but the I/O between the DB and the app will be more frequent.
Using a StatelessSession is also highly recommended when doing bulk operations like this.
import static javax.ws.rs.core.MediaType.APPLICATION_JSON_TYPE;
import static javax.ws.rs.core.MediaType.SERVER_SENT_EVENTS;

import javax.persistence.Entity;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.sse.Sse;
import javax.ws.rs.sse.SseEventSink;

import org.hibernate.ScrollMode;
import org.hibernate.SessionFactory;

import io.quarkus.hibernate.orm.panache.PanacheEntity;

@Entity
public class Fruit extends PanacheEntity {
    public String name;
    // The query logic was moved from here to the REST resource,
    // otherwise you cannot close the session
}

@Path("/fruits")
public class FruitResource {

    @GET
    @Produces(SERVER_SENT_EVENTS)
    public void fruitsStream(@Context Sse sse, @Context SseEventSink sink) {
        var sf = Fruit.getEntityManager().getEntityManagerFactory().unwrap(SessionFactory.class);
        try (var session = sf.openStatelessSession();
             var scrollableResults = session.createQuery("select f from Fruit f")
                     .setFetchSize(1) // fetch rows one by one; tune between 1 and 50
                     .scroll(ScrollMode.FORWARD_ONLY)) {
            while (scrollableResults.next()) {
                sink.send(sse.newEventBuilder()
                        .data(scrollableResults.get(0))
                        .mediaType(APPLICATION_JSON_TYPE)
                        .build());
            }
            sink.close();
        }
    }
}
Then I call this REST endpoint like this (via httpie):
> http :8080/fruits --stream
data: {"id":9996,"name":"applecfcdd592-1934-4f0e-a6a8-2f88fae5d14c"}
data: {"id":9997,"name":"apple7f5045a8-03bd-4bf5-9809-03b22069d9f3"}
data: {"id":9998,"name":"apple0982b65a-bc74-408f-a6e7-a165ec3250a1"}
data: {"id":9999,"name":"apple2f347c25-d0a1-46b7-bcb6-1f1fd5098402"}
data: {"id":10000,"name":"apple65d456b8-fb04-41da-bf07-73c962930629"}
Hope this helps you.

Is routing API calls through my own RESTful API considered an acceptable strategy?

This question might be considered opinionated but I really can't seem to find a straight answer. So either I'm missing something or I'm asking the wrong questions.
So, I'm an undergrad student, new to Spring app development, and I'm currently creating an app with React as the frontend, supported by a RESTful API built with Spring for the necessary backend operations. Among other services, the backend API I'm building is used as a middleman forwarding calls to the Google Geocoding API and other 3rd-party APIs. **My biggest question since I started this is: is this a valid strategy?** (at least, something acceptable to industry professionals)
My Research
The biggest pro of this method is that it lets me hide my API keys from the client entirely, so they cannot be extracted and abused.
At the same time, this whole process adds (I believe) unnecessary complexity and delay to the overall responses, possibly resulting in a hindered user experience.
Something I'm not quite sure about yet is whether I'll have to add async capabilities to the services exposed by my own API in order to serve multiple users. (This last part might be entirely wrong, but I haven't been able to understand how multiple users are handled when they query the same endpoint concurrently.)
I have used Apache JMeter to test the performance of the app on concurrent POST calls, and it seems to be able to handle them in around 170 ms. (I'll post screenshots of the results below.)
Code
I'll try to include code demonstrating the controller that's responsible for the calls to the Geocoding API.
@RestController
@RequestMapping("/api")
@Slf4j
@RequiredArgsConstructor
public class GeocodingController {

    private final OkHttpClient httpClient = new OkHttpClient();

    @PostMapping(value = "/reversegeocoding")
    public String getReverseGeocode(@RequestBody LatLng latlng) throws IOException, ExecutionException, InterruptedException {
        String encodedLatLng = latlng.toString();
        Request request = new Request.Builder()
                .url("https://maps.googleapis.com/maps/api/geocode/json?" +
                        "language=en&result_type=street_address&latlng=" + encodedLatLng +
                        "&key=MY_API_KEY")
                .build();
        CallbackFuture future = new CallbackFuture();
        httpClient.newCall(request).enqueue(future);
        Response response = future.get();
        return response.body().string();
    }
}
The getReverseGeocode() method takes in as an argument the following object:
@Data
@AllArgsConstructor
@NoArgsConstructor
public class LatLng {

    double lat;
    double lng;

    @Override
    public String toString() {
        return lat + "," + lng;
    }
}
So, the Request Body is mapped onto the above object once the request arrives.
Finally the CallbackFuture is just an adapter class based on this answer.
public class CallbackFuture extends CompletableFuture<Response> implements Callback {

    @Override
    public void onFailure(Call call, IOException e) {
        super.completeExceptionally(e);
    }

    @Override
    public void onResponse(Call call, Response response) {
        super.complete(response);
    }
}
JMeter Results
Yeah, this is done all the time. Not all external APIs are set up to let you grant your users access directly. You may also have requirements for logging and/or access control. It does mean you need to dedicate your own resources to the calls, but unless you are expecting an excessive load it's not worth optimizing for too far in advance. Sometimes you can offload the proxy responsibilities to something like nginx instead, which can be more efficient than your application backend.
In my experience it is worthwhile to keep these proxies in their own separate package, isolated from your other code where possible. Then if you need to scale them independently of your main app, you can easily break them out.
Yes, this is an entirely valid strategy. There are many benefits to using this approach, and whilst it can feel like it's adding unnecessary complexity, the benefits often outweigh the cost.
Firstly, you're avoiding "leaky abstractions": the client shouldn't care how you implement a particular piece of functionality, they just care that it works! It also means that you should be able to change the implementation in the future without the client even knowing.
Secondly, you're decoupling your APIs from others. If the wrapped API changes, you can handle this yourself without the clients having to change their code (there will be times when this can't be done, but it offers a good defence).
Also, it gives you a convenient point to implement your own functionality around these APIs (rate limiting, access control, and logging, for example).
The @Async issue is not specific to wrapping third-party APIs; it's an issue for all of your endpoints that do blocking IO.
It is common. I recently did this with a third-party API, adding Cross-Origin headers so clients could perform cross-origin requests. The original JSON was also malformed in some cases, so it was possible to handle that scenario cleanly.
The first is very easy with Spring Boot, as you can just decorate the controller with @CrossOrigin(maxAge = 3600).

How to make a Spring Boot REST API stateful

I want to make a web application with REST and Spring Boot. A REST web service is stateless; I want to make it stateful, so that information the server sends to the client after the first request can be used in subsequent requests, or so that work done in the first or second request can be reused later.
Can we generate some session ID for this, which the client then sends to the server in subsequent requests? If yes, then:
If the state of some objects/beans changes (values get modified due to some manipulation), how can we save the state of those objects/beans to make the service stateful? What should the scope of these beans be, and of the classes/beans that call them, given that multiple clients or users will be using this web application?
RESTful APIs are not stateful by design; if you make them stateful using server-side state, then it's not REST!
What you need is a correlation ID, which is a recognised pattern in distributed system design. A correlation ID will let you tie requests together.
Sessions are typically an optimization to improve performance when running multiple servers. They improve performance by ensuring that a client's requests always get sent to the same server, which has cached that client's data.
If you only want to run a single server, you won't have to worry about sessions. There are two common approaches to solving this problem.
1. In Memory State
If the state you want to maintain is small enough to fit into memory, and you don't mind losing it in the event of a server crash or reboot, you can keep it in memory: create a Spring service which holds a data structure, then inject that service into your controllers and change the state in your HTTP handlers.
Services are singletons by default, so state stored in a service is accessible to all controllers, components, and user requests. A small pseudo-example is below.
Service Class
@Service
public class MyState {

    private final Map<String, Integer> sums = new HashMap<>();

    public synchronized int get(String key) {
        return sums.getOrDefault(key, 0);
    }

    public synchronized void add(String key, int val) {
        int sum = 0;
        if (sums.containsKey(key)) {
            sum = sums.get(key);
        }
        sum += val;
        sums.put(key, sum);
    }
}
Controller Class
@RestController
@RequestMapping("/sum")
public class FactoryController {

    @Autowired
    private MyState myState;

    @PostMapping(consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_JSON_VALUE)
    @ResponseStatus(HttpStatus.OK)
    public void saveFactory(@RequestBody KeyVal keyVal) {
        myState.add(keyVal.getKey(), keyVal.getValue());
    }
}
Note: This approach will not work if you are running multiple servers behind a load balancer, unless you use a more complex solution like a distributed cache. Sessions can be used to optimize performance in this case.
2. Database
The other option is to just use a database to store your state. This way you won't lose data if you crash or reboot. Spring supports the hibernate persistence framework and you can run a database like Postgres.
Note: If you are running multiple servers you will need a more complex solution since hibernate caches data in memory. You will have to plug hibernate into a distributed cache to synchronize in memory state across multiple servers. Sessions could be used as a performance optimization here.
Important
Whenever you are modifying state you need to make sure you are doing it in a thread safe manner, otherwise your state may be incorrect.
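Applied to the in-memory example above, one way to get that thread safety without synchronizing every method is to delegate the per-key read-modify-write to ConcurrentHashMap.merge, which updates each entry atomically (a sketch; ConcurrentSums is a hypothetical name, not taken from the answer):

```java
import java.util.concurrent.ConcurrentHashMap;

// Thread-safe counters without synchronized methods: merge() performs the
// read-modify-write for each key atomically.
class ConcurrentSums {
    private final ConcurrentHashMap<String, Integer> sums = new ConcurrentHashMap<>();

    void add(String key, int val) {
        sums.merge(key, val, Integer::sum); // atomic per-key update
    }

    int get(String key) {
        return sums.getOrDefault(key, 0);
    }
}
```

This also avoids contention between requests that touch different keys, since ConcurrentHashMap locks at a finer granularity than a single monitor.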

How to limit requests per USER in ServletAPI (Spring MVC)

How can I allow only one request per user to a microservice method with a specific URL @PathVariable?
My Controller
@RestController
@RequestMapping(value = "/rest/product", produces = "application/json;charset=UTF-8")
public class ProductRestController {

    @Autowired
    ProductService productService;

    @Autowired
    ProductAsm productAsm;

    @RequestMapping(value = "/ID/{ID}", method = RequestMethod.GET)
    public ResponseEntity<ProductResource> getProductID(@PathVariable("ID") Long ID, @AuthenticationPrincipal User user) {
        Product product = productService.getProduct(ID);
        if (product == null)
            return new ResponseEntity<>(HttpStatus.NOT_FOUND);
        return new ResponseEntity<>(productAsm.toResource(product), HttpStatus.OK);
    }
}
For example:
first request /rest/product/ID/2231 allowed for USER (with login="xaxa")
second request /rest/product/ID/2545 allowed for USER (with login="xaxa")
third request /rest/product/ID/2231 not allowed for USER (with login="xaxa")
What is the best way to implement this functionality? (Do I have to keep the requested URL together with the user login in the DB, or is there an existing solution?)
You could use AOP and implement your own aspect that is invoked Before your REST endpoint method.
This pointcut would read the ID provided in the request and try to find a Lock corresponding to that ID. Then the usual: try to acquire the lock and potentially wait.
The implementation could be based on Guava's Striped class, at least as a start.
There are several problems that need to be taken into consideration:
Striped could be replaced with some LRU cache for better memory management.
You would of course have to provide synchronization for the case when the same ID is accessed simultaneously by two requests.
It would work only for an application deployed on a single node.
It would not be a very good approach performance-wise; depending on your traffic, this may be an issue.
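A minimal sketch of the lock-per-ID idea, substituting a ConcurrentHashMap of ReentrantLocks for Guava's Striped (the hypothetical IdLocks below shares the caveats listed above: the lock map grows unboundedly, and it only works on a single node):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// One lock per resource ID: requests for the same ID are serialized (or
// rejected), while requests for different IDs proceed independently.
class IdLocks {
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Run `action` only if no other request currently holds the lock for `id`.
    boolean tryWithLock(Long id, Runnable action) {
        ReentrantLock lock = locks.computeIfAbsent(id, k -> new ReentrantLock());
        if (!lock.tryLock()) {
            return false; // another request for this ID is in flight
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

An AOP advice (or an interceptor) would extract the @PathVariable ID, call tryWithLock around the handler, and return an error status to the caller whenever it yields false.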
