Goal
Build a service that streams a zip file built on the fly. The zip file contains files downloaded via HTTP. The files are too big to fit into RAM. Using the filesystem should be avoided since it's unnecessary.
Problems
JAX-RS defines StreamingOutput, which seems to be the answer. However, it's not supported by Quarkus Reactive. It just returns ZipResource$ZipStreamingOutput@50a05849 as the response body.
Returning a Multi (from Mutiny) does not allow bridging it with Kotlin coroutines, because coroutines must be called from a suspend function. And if the method is marked with suspend, it just returns io.smallrye.mutiny.operators.multi.builders.EmitterBasedMulti@24ac00e7 as the response body.
Example code:
@GET
@Produces("application/zip")
suspend fun download(): Multi<String> {
    return flow { emit("hello") }.asMulti()
}
Also even if it worked, I don't see any option to specify response headers.
Criteria
Non-blocking streaming, so that concurrency is not limited by the number of threads
Kotlin coroutines are preferable since they are native to the language rather than framework-specific
Context
Kotlin 1.5.30
Quarkus 2.2.2.Final
Here is a simplified thread-blocking implementation, which I want to implement in a non-thread-blocking (reactive) way.
import java.io.IOException
import java.io.OutputStream
import java.net.URL
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream
import javax.ws.rs.GET
import javax.ws.rs.Path
import javax.ws.rs.Produces
import javax.ws.rs.WebApplicationException
import javax.ws.rs.core.Response
import javax.ws.rs.core.StreamingOutput
data class Entry(val url: String, val name: String)
@Path("/zip")
class ZipResource {
    @GET
    @Produces("application/zip")
    fun download(): Response? {
        val entries = listOf(
            Entry("http://link-to-a-source-file1", "file1.txt"),
            Entry("http://link-to-a-source-file2", "file2.txt"),
        )
        val contentDisposition = """attachment; filename="test.zip""""
        return Response.ok(ZipStreamingOutput(entries)).header("Content-Disposition", contentDisposition).build()
    }

    class ZipStreamingOutput(private val entries: List<Entry>) : StreamingOutput {
        @Throws(IOException::class, WebApplicationException::class)
        override fun write(output: OutputStream) {
            // use {} closes each stream even if a download fails midway
            ZipOutputStream(output).use { zipOut ->
                for (entry in entries) {
                    zipOut.putNextEntry(ZipEntry(entry.name))
                    URL(entry.url).openStream().use { it.transferTo(zipOut) }
                    zipOut.closeEntry()
                }
            }
        }
    }
}
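For reference, the core of the blocking version (copy each source into one ZipOutputStream through a small fixed buffer, so no whole file is ever held in memory) can be sketched with only the JDK. This is a minimal sketch, not the Quarkus solution: Source is a hypothetical stand-in for "open the HTTP download", and the I/O here is still thread-blocking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipStreamingSketch {

    /** Hypothetical stand-in for opening the HTTP download of one entry. */
    public interface Source {
        InputStream open() throws IOException;
    }

    /** Writes every source into one zip, copying through a fixed 8 KiB buffer. */
    static void writeZip(Map<String, Source> sources, OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            byte[] buffer = new byte[8192];
            for (Map.Entry<String, Source> e : sources.entrySet()) {
                zip.putNextEntry(new ZipEntry(e.getKey()));
                try (InputStream in = e.getValue().open()) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, n);
                    }
                }
                zip.closeEntry();
            }
        }
    }

    /** Reads a zip back into entry name -> text content, to check the result. */
    static Map<String, String> readZip(byte[] zipBytes) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                result.put(entry.getName(), new String(zin.readAllBytes(), StandardCharsets.UTF_8));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Source> sources = new LinkedHashMap<>();
        // In-memory streams stand in for the remote files.
        sources.put("file1.txt", () -> new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        sources.put("file2.txt", () -> new ByteArrayInputStream("world".getBytes(StandardCharsets.UTF_8)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeZip(sources, out);
        System.out.println(readZip(out.toByteArray()));
    }
}
```

The ByteArrayOutputStream is only there to make the sketch checkable; in a real service the OutputStream would be the response body, so only the buffer lives in memory.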
Related
Say you have a method:
public CompletableFuture<List<Integer>> getStuffAsync()
I want the same stream as:
Multi<Integer> stream = Multi
    .createFrom().completionStage(() -> getStuffAsync())
    .onItem().transformToIterable(Function.identity());
which is a stream of each integer in the list returned by the method specified at the beginning...
but without the onItem().transformToIterable(), hopefully something like:
Multi.createFrom().completionStageIterable(() -> getStuffAsync())
purely for aesthetic reasons and to save on valuable characters
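You can use Multi.createFrom().iterable() and pass in the result of the CompletableFuture. Note that get() blocks the calling thread until the future completes and throws checked exceptions.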
Multi<Integer> stream = Multi
.createFrom()
.iterable(getStuffAsync().get());
Alternatively use a utility method to create it.
Create this class Multi.java (or other name) in a package such as util:
package util;
import io.smallrye.mutiny.groups.*;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
public interface Multi<T> extends io.smallrye.mutiny.Multi<T> {
static <T> io.smallrye.mutiny.Multi<T> createFromCompletionStageIterable(CompletableFuture<? extends Iterable<T>> completableFuture) {
return MultiCreate.INSTANCE.completionStage(completableFuture).onItem().transformToIterable(Function.identity());
}
}
Then you can use your custom method like this:
Multi<Integer> stream = util.Multi
.createFromCompletionStageIterable(getStuffAsync());
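To show why the utility version is preferable to calling get(), here is a JDK-only sketch of the same idea: attach a callback to the future and hand each element on as it becomes available, without blocking. It deliberately does not use Mutiny; forEachItem and the body of getStuffAsync are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

final class FutureItemsSketch {

    /** Hypothetical stand-in for the getStuffAsync() method from the question. */
    static CompletableFuture<List<Integer>> getStuffAsync() {
        return CompletableFuture.supplyAsync(() -> List.of(1, 2, 3));
    }

    /**
     * Delivers each element of the future's iterable to the consumer once the
     * future completes. Nothing blocks here; the returned future completes
     * after the last element has been delivered.
     */
    static <T> CompletableFuture<Void> forEachItem(
            CompletableFuture<? extends Iterable<T>> future, Consumer<? super T> onItem) {
        return future.thenAccept(items -> items.forEach(onItem));
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        // join() is used only so the demo does not exit before the callback runs.
        forEachItem(getStuffAsync(), seen::add).join();
        System.out.println(seen); // prints [1, 2, 3]
    }
}
```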
Problem statement
I think the title says it all: I'm looking for a way to parse a String containing the body part of a multipart/form-data HTTP request. I.e. the contents of the string would look something like this:
--xyzseparator-blah
Content-Disposition: form-data; name="param1"

hello, world
--xyzseparator-blah
Content-Disposition: form-data; name="param2"

42
--xyzseparator-blah
Content-Disposition: form-data; name="param3"

blah, blah, blah
--xyzseparator-blah--
What I'm hoping to obtain is a parameters map, or something similar, like this:
parameters.get("param1"); // returns "hello, world"
parameters.get("param2"); // returns "42"
parameters.get("param3"); // returns "blah, blah, blah"
parameters.keys(); // returns ["param1", "param2", "param3"]
Further criteria
It would be best if I don't have to supply the separator (i.e. xyzseparator-blah in this case), but I can live with it if I do have to.
I'm looking for a library-based solution, possibly from a mainstream library (like "Apache Commons" or something similar).
I want to avoid rolling my own solution, but at the current stage, I'm afraid I will have to. Reason: while the example above seems trivial to split/parse with some string manipulation, real multipart request bodies can have many more headers. Besides that, I do not want to re-invent (and much less re-test!) the wheel :)
Alternative solution
If there were a solution, which satisfies the above criteria, but whose input is an Apache HttpRequest, instead of a String, that would be acceptable too.
(Basically I do receive an HttpRequest, but the in-house library I'm using is built such that it extracts the body of this request as a String, and passes that to the class responsible for doing the parsing. However, if need be, I could also work directly on the HttpRequest.)
Related questions
No matter how I try to find an answer through Google, here on SO, and on other forums too, the solution seems to be always to use commons fileupload to go through the parts. E.g.: here, here, here, here, here...
However, parseRequest method, used in that solution, expects a RequestContext, which I do not have (only HttpRequest).
The other way, also mentioned in some of the above answers, is getting the parameters from the HttpServletRequest (but again, I only have HttpRequest).
EDIT: In other words: I could include Commons Fileupload (I have access to it), but that would not help me, because I have an HttpRequest, and the Commons Fileupload needs RequestContext. (Unless there is an easy way to convert from HttpRequest to RequestContext, which I have overlooked.)
You can parse your String using Commons FileUpload by wrapping it in a class implementing org.apache.commons.fileupload.UploadContext, like below.
I recommend wrapping the HttpRequest as in your proposed alternative solution instead, though, for a couple of reasons. First, using a String means that the whole multipart POST body, including the file contents, needs to fit into memory. Wrapping the HttpRequest would allow you to stream it, with only a small buffer in memory at one time. Second, without the HttpRequest, you'll need to sniff out the multipart boundary, which would normally be in the Content-Type header (see RFC 1867).
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.FileItemFactory;
import org.apache.commons.fileupload.FileUpload;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;

public class MultiPartStringParser implements org.apache.commons.fileupload.UploadContext {

    public static void main(String[] args) throws Exception {
        String s = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        MultiPartStringParser p = new MultiPartStringParser(s);
        for (String key : p.parameters.keySet()) {
            System.out.println(key + "=" + p.parameters.get(key));
        }
    }

    private String postBody;
    private String boundary;
    private Map<String, String> parameters = new HashMap<String, String>();

    public MultiPartStringParser(String postBody) throws Exception {
        this.postBody = postBody;
        // Sniff out the multipart boundary: the first line is "--" + boundary.
        this.boundary = postBody.substring(2, postBody.indexOf('\n')).trim();
        // Parse out the parameters.
        final FileItemFactory factory = new DiskFileItemFactory();
        FileUpload upload = new FileUpload(factory);
        List<FileItem> fileItems = upload.parseRequest(this);
        for (FileItem fileItem : fileItems) {
            if (fileItem.isFormField()) {
                parameters.put(fileItem.getFieldName(), fileItem.getString());
            } // else it is an uploaded file
        }
    }

    public Map<String, String> getParameters() {
        return parameters;
    }

    // The methods below here are to implement the UploadContext interface.
    @Override
    public String getCharacterEncoding() {
        return "UTF-8"; // You should know the actual encoding.
    }

    // This is the deprecated method from RequestContext that unnecessarily
    // limits the length of the content to ~2GB by returning an int.
    @Override
    public int getContentLength() {
        return -1; // Don't use this
    }

    @Override
    public String getContentType() {
        // Use the boundary that was sniffed out above.
        return "multipart/form-data, boundary=" + this.boundary;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // Encode with the same charset that getCharacterEncoding() reports.
        return new ByteArrayInputStream(postBody.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public long contentLength() {
        // Length in bytes, not characters, so non-ASCII bodies are counted correctly.
        return postBody.getBytes(StandardCharsets.UTF_8).length;
    }
}
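On the boundary point: if the Content-Type header is available, reading the boundary from it is more robust than sniffing the first line of the body. A minimal JDK-only sketch of both approaches follows. BoundarySketch and its method names are made up for this illustration, and the regex is an assumption that covers the common quoted and unquoted forms, not a full header parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundarySketch {

    // Matches boundary=value or boundary="value" inside a Content-Type header value.
    private static final Pattern BOUNDARY =
            Pattern.compile("boundary=(?:\"([^\"]+)\"|([^;,\\s]+))", Pattern.CASE_INSENSITIVE);

    /** Preferred: take the boundary from the Content-Type header. */
    static Optional<String> fromContentType(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        Matcher m = BOUNDARY.matcher(contentType);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1) != null ? m.group(1) : m.group(2));
    }

    /** Fallback: the first line of a multipart body is "--" followed by the boundary. */
    static Optional<String> fromBody(String body) {
        if (body == null || !body.startsWith("--")) {
            return Optional.empty();
        }
        int eol = body.indexOf('\n');
        String firstLine = (eol < 0 ? body : body.substring(0, eol)).trim();
        return firstLine.length() > 2 ? Optional.of(firstLine.substring(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("multipart/form-data; boundary=xyzseparator-blah"));
        System.out.println(fromBody("--xyzseparator-blah\r\nContent-Disposition: form-data; name=\"param1\"\r\n"));
    }
}
```

Unlike a bare substring on indexOf('\n'), the fallback returns an empty Optional instead of throwing when the body has no line break or does not start with "--".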
I would like to expose a public API (a kind of Runnable) and let users implement it, and then execute that code against our servers (given the class name containing the code to run). Users provide a jar containing their implementation and should not have access to implementation details.
Below is a snippet illustrating the issue.
The public API:
package mypackage
trait MyTrait {
def run(i: Int) : Unit
}
A sample user's implementation:
package mypackage
object MyImpl extends MyTrait {
override def run(i : Int) : Unit = {
println(i)
}
}
The server-side code running the user's code:
package mypackage
import scala.reflect.runtime.{universe => ru}
object MyTest extends App {
val m = ru.runtimeMirror(getClass.getClassLoader)
val module = m.staticModule("mypackage.MyImpl")
val im = m.reflectModule(module)
val method = im.symbol.info.decl(ru.TermName("run")).asMethod
val objMirror = m.reflect(im.instance)
objMirror.reflectMethod(method)(42)
}
The above code works (printing "42"), but the design seems ugly to me.
In addition, it seems unsafe (the name could refer to a class instead of an object, to an object that does not exist, or to one that does not implement the correct interface).
What's the best way to achieve this?
I am using Scala 2.11.8.
Thanks for your help
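One possible direction, sketched in plain Java rather than Scala reflection: load the class by name, check it against the public interface with asSubclass, and then call it through the interface instead of looking the method up reflectively. Task, PrintingTask and load are hypothetical names for this illustration. The sketch assumes the user supplies a class with a public no-argument constructor; a Scala object compiles to a class holding its singleton in a static MODULE$ field, so it would need different handling.

```java
final class PluginLoaderSketch {

    /** Hypothetical public API, mirroring MyTrait from the question. */
    public interface Task {
        void run(int i);
    }

    /** Hypothetical user implementation with a public no-arg constructor. */
    public static final class PrintingTask implements Task {
        static int lastSeen = -1;

        @Override
        public void run(int i) {
            lastSeen = i;
            System.out.println(i);
        }
    }

    /**
     * Loads the named class and returns a new instance typed as Task.
     * Throws ClassNotFoundException if the class does not exist and
     * ClassCastException if it does not implement Task.
     */
    static Task load(String className, ClassLoader loader) throws ReflectiveOperationException {
        Class<? extends Task> type = Class.forName(className, true, loader).asSubclass(Task.class);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Task task = load(PrintingTask.class.getName(), PluginLoaderSketch.class.getClassLoader());
        task.run(42); // prints 42
    }
}
```

The failure cases from the question (missing class, wrong interface) surface as distinct exceptions at load time, before any user code runs.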
I am trying to learn the Play Framework 2.4. I am trying to get the time it takes to access different webpages asynchronously using Promise. Below is the code for that:
final long start = System.currentTimeMillis();
F.Function<WSResponse,Long> timing = new F.Function<WSResponse, Long>() {
    @Override
    public Long apply(WSResponse wsResponse) throws Throwable {
        return System.currentTimeMillis()-start;
    }
};
F.Promise<Long> google = ws.url("http://www.google.com").get().map(timing);
F.Promise<Long> yahoo = ws.url("http://www.yahoo.com").get().map(timing);
F.Promise<Long> bing = ws.url("http://www.bing.com").get().map(timing);
As you can see, I am using the get function to fetch the requested pages and putting each result in a Promise. Then I map it to a Long. What I can't work out is how to compose these three promises into one and, once all three are redeemed, map the result to JSON and return it. In earlier versions of Play this could be done with F.Promise.waitAll(google, yahoo, bing).map(...), but I am unable to do it in Play 2.4. Please advise.
EDIT 1: Based on the answer below, I used sequence like this:
return F.Promise.sequence(google, yahoo, bing).map(new F.Function<List<Long>, Result>() {
    @Override
    public Result apply(List<Long> longs) throws Throwable {
        Map<String, Long> data = new HashMap<String, Long>();
        data.put("google", google.get());
        data.put("yahoo", yahoo.get());
        data.put("bing", bing.get());
        return ok(Json.toJson(data));
    }
});
However, I am getting an error that the google.get() method cannot be resolved and that Json cannot be applied. What am I missing here?
EDIT 2: The Json error was fixed by using return ok((JsonNode) Json.toJson((Writes<Object>) data)); However, I am still not able to resolve the earlier error that the google.get() method cannot be resolved in the line data.put("google", google.get());
EDIT 3: It seems Play 2.4 has no get() method which returns the value of a Promise once it has been redeemed. What should I use then?
waitAll has been replaced with F.Promise.sequence.
From the docs
public static <A> F.Promise<java.util.List<A>> sequence(java.lang.Iterable<F.Promise<A>> promises)
Combine the given promises into a single promise for the list of results. The sequencing operations are performed in the default ExecutionContext.
Parameters:
promises - The promises to combine
Returns:
A single promise whose methods act on the list of redeemed promises
Update
Regarding the second half of the question, you don't need to call .get() because the promises have already completed.
In fact, you can get rid of the individual promise variables and just pass them directly into the sequence. The resulting list will contain results in the same order (Google first, then Yahoo, then Bing, in this case).
The whole thing should look something like this:
package controllers;
import java.util.HashMap;
import java.util.Map;
import play.libs.F;
import play.libs.Json;
import play.libs.ws.WS;
import play.libs.ws.WSResponse;
import play.mvc.Controller;
import play.mvc.Result;
import play.mvc.Results;
public class Application extends Controller {
public F.Promise<Result> index() {
final long start = System.currentTimeMillis();
final F.Function<WSResponse,Long> timing = response -> System.currentTimeMillis() - start;
return F.Promise.sequence(WS.url("http://www.google.com").get().map(timing),
WS.url("http://www.yahoo.com").get().map(timing),
WS.url("http://www.bing.com").get().map(timing))
.map(list -> {
Map<String, Long> data = new HashMap<>();
data.put("google", list.get(0));
data.put("yahoo", list.get(1));
data.put("bing", list.get(2));
return data;
})
.map(Json::toJson)
.map(Results::ok);
}
}
Finally, since Play 2.4 requires Java 8, this is a good opportunity to play around with lambdas!
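For readers who want to try the pattern without Play on the classpath, the same "combine several futures into one future of a list, in input order" idea can be sketched with the JDK's CompletableFuture. This is only an analogue of F.Promise.sequence, not Play's implementation; SequenceSketch and its sequence helper are hypothetical names for this illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class SequenceSketch {

    /** Combines the futures into one future whose list keeps the input order. */
    static <T> CompletableFuture<List<T>> sequence(List<CompletableFuture<T>> futures) {
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                // Every input is complete at this point, so join() returns immediately.
                .thenApply(done -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // Fixed values stand in for the measured response times.
        CompletableFuture<Long> google = CompletableFuture.supplyAsync(() -> 10L);
        CompletableFuture<Long> yahoo = CompletableFuture.supplyAsync(() -> 20L);
        CompletableFuture<Long> bing = CompletableFuture.supplyAsync(() -> 30L);

        Map<String, Long> data = sequence(List.of(google, yahoo, bing))
                .thenApply(list -> {
                    Map<String, Long> m = new HashMap<>();
                    m.put("google", list.get(0));
                    m.put("yahoo", list.get(1));
                    m.put("bing", list.get(2));
                    return m;
                })
                .join(); // only the demo blocks; a controller would return the future
        System.out.println(data.get("google") + " " + data.get("yahoo") + " " + data.get("bing"));
    }
}
```

As in the Play version, the values are read from the combined list by position, so no per-future get() call is needed.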
I've built a Scala application using Spray with Akka actors.
My problem is that requests are handled synchronously and the server can't manage many requests at once.
Is that normal behaviour? What can I do to avoid this?
This is my boot code:
object Boot extends App with Configuration {
// create an actor system for application
implicit val system = ActorSystem("my-service")
//context.actorOf(RoundRobinPool(5).props(Props[TestActor]), "router")
// create and start property service actor
val RESTService = system.actorOf(Props[RESTServiceActor], "my-endpoint")
// start HTTP server with property service actor as a handler
IO(Http) ! Http.Bind(RESTService, serviceHost, servicePort)
}
actor code:
class RESTServiceActor extends Actor
with RESTService {
implicit def actorRefFactory = context
def receive = runRoute(rest)
}
trait RESTService extends HttpService with SLF4JLogging{
val myDAO = new MyDAO
val AccessControlAllowAll = HttpHeaders.RawHeader(
"Access-Control-Allow-Origin", "*"
)
val AccessControlAllowHeadersAll = HttpHeaders.RawHeader(
"Access-Control-Allow-Headers", "Origin, X-Requested-With, Content-Type, Accept"
)
val rest = respondWithHeaders(AccessControlAllowAll, AccessControlAllowHeadersAll) {
respondWithMediaType(MediaTypes.`application/json`){
options {
complete {
""
}
} ~
path("some"/"path"){
get {
parameter('parameter){ (parameter) =>
ctx: RequestContext =>
handleRequest(ctx) {
myDAO.getResult(parameter)
}
}
}
}
}
}
/**
* Handles an incoming request and create valid response for it.
*
* @param ctx request context
* @param successCode HTTP Status code for success
* @param action action to perform
*/
protected def handleRequest(ctx: RequestContext, successCode: StatusCode = StatusCodes.OK)(action: => Either[Failure, _]) {
action match {
case Right(result: Object) =>
println(result)
ctx.complete(successCode,result.toString())
case Left(error: Failure) =>
case _ =>
ctx.complete(StatusCodes.InternalServerError)
}
}
}
I saw that:
Akka Mist provides an excellent basis for building RESTful web
services in Scala since it combines good scalability (enabled by its
asynchronous, non-blocking nature) with general lightweight-ness
Is that what I'm missing? Is Spray using it by default, or do I need to add it, and how?
I'm a bit confused about it. Any help is appreciated.
If you are starting from scratch, I suggest using Akka HTTP, documented at http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M4/scala/http/. It is a port of Spray, but using Akka Streams, which will be important moving forward.
As far as making your code completely asynchronous, the key pattern is to return a Future of your result, not the result data itself. In other words, RESTServiceActor should return a Future that yields the data, not the actual data. This allows Spray/Akka HTTP to accept additional connections, and the asynchronous completion of the service actor will return the results when they are finished.
Instead of sending result to the complete method:
ctx.complete(successCode,result.toString())
I used the future method:
import concurrent.Future
import concurrent.ExecutionContext.Implicits.global
ctx.complete(successCode,Future(Option(result.toString())))
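One caveat with the snippet above: result appears to be computed before it is wrapped, so the slow DAO call still runs on the handling thread. For the request to be truly non-blocking, the blocking call itself has to run inside the future, ideally on a pool reserved for blocking work. A JDK-only sketch of that idea follows; it is written in Java rather than Scala, and AsyncHandlerSketch, handle and the pool size are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class AsyncHandlerSketch {

    /**
     * Runs the blocking action on the given pool and returns immediately.
     * The caller gets a future of the rendered response body.
     */
    static <T> CompletableFuture<String> handle(Supplier<T> blockingAction, Executor pool) {
        return CompletableFuture
                .supplyAsync(blockingAction, pool)            // the slow call runs here
                .thenApply(result -> String.valueOf(result)); // cheap rendering step
    }

    public static void main(String[] args) {
        // Dedicated pool for blocking work; the size is an arbitrary assumption.
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<String> response = handle(() -> {
                // Stand-in for a slow DAO call such as myDAO.getResult(parameter).
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return 42;
            }, blockingPool);
            // The handling thread is free at this point; join() is only for the demo.
            System.out.println(response.join()); // prints 42
        } finally {
            blockingPool.shutdown();
        }
    }
}
```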